docs: map existing codebase

2026-04-27 09:13:11 +02:00
parent 15dc0d289a
commit 3e91d5713d
7 changed files with 2237 additions and 0 deletions
--- a/.planning/codebase/CONCERNS.md
+++ b/.planning/codebase/CONCERNS.md
@@ -0,0 +1,509 @@
+# Codebase Concerns
+
+**Analysis Date:** 2026-04-27
+
+## Tech Debt
+
+### Debug Console Logging in Production Code
+
+**Issue:** Production code contains debug console.log statements that should be removed before deployment
+
+**Files:**
+- `server/src/modules/characters/characters.service.ts` (4 console.log statements in updateHp method, lines 263, 283, 303, 310)
+- `client/src/features/battle/hooks/use-battle-socket.ts` (3 console.log statements)
+- `client/src/features/characters/components/alchemy-tab.tsx` (3 console.log statements)
+- `client/src/features/characters/components/character-sheet-page.tsx` (1 console.log statement)
+- `client/src/shared/hooks/use-character-socket.ts` (4 console.log statements)
+
+**Impact:** Exposes internal logic and data in browser/server console. Can leak sensitive information and degrade performance in high-volume operations.
+
+**Fix approach:** Implement proper structured logging (Winston, Pino, or similar) for development vs. production environments. Remove or replace all console.log statements with conditional logging.
+
+---
+
+### N+1 Query Problem in Character Details Endpoint
+
+**Issue:** The findOne method in CharactersService performs multiple sequential async API calls for translation enrichment, causing N+1 query pattern
+
+**Files:** `server/src/modules/characters/characters.service.ts` (lines 186-222)
+
+**Problem Details:**
+- Fetches items, formulas, and preparedItems from database (3 queries)
+- For EACH item, formula, and prepared item, calls `enrichEquipmentWithTranslations()` which queries the Translations table or Claude API
+- If a character has 50 items, this results in 50+ additional Prisma queries (or Claude API calls)
+
+**Impact:** Dramatically slow character sheet loading. Creates rate-limiting pressure on Claude API. Potential timeout on high-inventory characters.
+
+**Fix approach:**
+1. Use Prisma's `include` to load equipment relationships in the initial query
+2. Batch translation lookups: collect all unique equipment IDs, translate in one batch, then merge results
+3. Consider caching translations at the service layer for repeated requests
+
+---
+
+### Oversized Service Class
+
+**Issue:** CharactersService is 1,454 lines - violates single responsibility principle and makes testing difficult
+
+**Files:** `server/src/modules/characters/characters.service.ts`
+
+**Contains:**
+- Character CRUD operations
+- HP management with dying/wounded condition logic
+- Item management
+- Condition management
+- Resource management
+- Rest system logic
+- Alchemy system logic (vials, formulas, prepared items)
+- Multiple helper methods (getDyingConditions, getDeathThreshold, etc.)
+
+**Impact:** Hard to locate code, difficult to test individual features, increased risk of bugs when modifying one feature affecting another.
+
+**Fix approach:** Split into multiple focused services:
+- `CharactersCoreService` - CRUD operations
+- `CharacterHpService` - HP and dying/wounded logic
+- `CharacterAlchemyService` - Alchemy system
+- `CharacterRestService` - Rest and recovery logic
+
+---
+
+## Known Issues
+
+### Missing Level-Up System
+
+**Issue:** Level progression is mentioned in CLAUDE.md as "Noch zu implementieren" but no implementation exists
+
+**Files:** No implementation - feature completely absent
+
+**Impact:** Cannot advance character levels, blocking career progression mechanic.
+
+**Workaround:** Level can be updated via direct database manipulation or API patching, but no UI or validation logic.
+
+**Priority:** High - Critical gameplay feature
+
+---
+
+### Unused equipmentlevel.json File at Project Root
+
+**Issue:** File `equipmentlevel.json` (250KB+) exists at project root but is not referenced by any code
+
+**Files:** `/c/Users/ZielonkaA/Documents/Dimension-47/equipmentlevel.json`
+
+**Impact:** 
+- Takes up disk space
+- Ignored by git, clogs project root
+- Suggests abandoned equipment import/level-mapping system
+- Unknown purpose - may be partial/corrupted data
+
+**Fix approach:** 
+1. Determine original purpose (equipment level mapping?)
+2. Either integrate into equipment seed system or delete
+3. Check git history to understand why it was abandoned
+
+---
+
+## Security Considerations
+
+### WebSocket Gateway Authentication is Minimal
+
+**Issue:** Token validation in WebSocket connection handler is basic but lacks some best practices
+
+**Files:** `server/src/modules/characters/characters.gateway.ts` (lines 60-93)
+
+**Current implementation:**
+- Extracts token from `client.handshake.auth.token` or Authorization header
+- Verifies token signature with JWT_SECRET
+- Verifies user exists in database
+- No rate limiting on connection attempts
+- No IP whitelisting or additional authentication layers
+
+**Potential risks:**
+- Rapid connection attempts could brute-force token validation or create connection storms
+- No tracking of legitimate vs. malicious connection attempts
+- Token rotation/revocation not supported (token valid until expiry)
+
+**Recommendations:**
+1. Implement connection rate limiting per IP/user
+2. Add token revocation mechanism
+3. Consider requiring additional re-authentication for sensitive operations
+4. Log suspicious connection patterns
+
+---
+
+### Input Validation on Equipment Search Query
+
+**Issue:** Equipment search endpoint accepts query parameters without upper bounds, potentially enabling resource exhaustion
+
+**Files:** `server/src/modules/equipment/equipment.controller.ts` (lines 23-42)
+
+**Current implementation:**
+- Takes `limit` parameter (defaults to 50, user-configurable)
+- No maximum limit enforced
+- Accepts large pagination page numbers
+- `traits` parameter is comma-separated without validation on count
+
+**Potential risks:**
+- `?limit=999999999` could cause OOM or timeout
+- `?page=999999999999` could overflow internal calculations
+- Unbounded trait queries could cause DoS
+
+**Fix approach:**
+1. Set hard maximum limits: `limit` ≤ 500, `page` ≤ 10000
+2. Validate and cap traits array length ≤ 50
+3. Implement pagination cursor-based approach instead of offset-based
+4. Add rate limiting per user/IP on search endpoint
+
+---
+
+### CORS Configuration Overly Permissive in Development
+
+**Issue:** Development CORS allows all localhost origins, which is standard but could leak requests to unintended local services
+
+**Files:** `server/src/main.ts` (lines 26-47)
+
+**Current implementation:**
+```typescript
+if (nodeEnv === 'development' && /^https?:\/\/localhost(:\d+)?$/.test(origin)) {
+  return callback(null, true);
+}
+```
+
+**Impact:** Any localhost service can make cross-origin requests to the API. Low risk in pure development, but could leak credentials or data if other services run on localhost.
+
+**Fix approach:** In development, explicitly list expected origins (e.g., `http://localhost:5173`) rather than wildcard localhost.
+
+---
+
+## Performance Bottlenecks
+
+### Equipment Database with 5,482 Items - Unindexed Search
+
+**Issue:** Full-text search on Equipment table lacks proper indexing for performance
+
+**Files:** 
+- `server/src/modules/equipment/equipment.service.ts` (search implementation)
+- `server/prisma/schema.prisma` (Equipment model definition)
+- `server/prisma/seed-equipment.ts` (5,482 items)
+
+**Current queries use:**
+- `.contains()` filters on `name`, `summary`, `effect` fields
+- No full-text search index
+- No composite indexes for common filter combinations
+
+**Impact:** 
+- Search queries on large equipment set may be slow
+- If multiple users search simultaneously, database load spikes
+- Trait filtering requires collection of all results then filtering in application code (N+1 pattern with filters)
+
+**Improvement path:**
+1. Add PostgreSQL `@@@` full-text search with tsvector indexes
+2. Add composite indexes: `(category, itemCategory)`, `(level, name)`
+3. Use Prisma's `findRaw` for complex searches with proper SQL optimization
+4. Consider search caching with Redis for common queries
+
+---
+
+### Translation Caching at Item Retrieval Time
+
+**Issue:** Translations are fetched on-demand during character detail retrieval, causing repeated Claude API calls for same items
+
+**Files:** `server/src/modules/characters/characters.service.ts` (lines 126-147, 186-222)
+
+**Current behavior:**
+- First time a character's item is viewed: cache miss, Claude API called
+- If multiple characters have same item: Claude API called multiple times
+- No cache invalidation strategy or TTL
+
+**Impact:** 
+- Increased Claude API costs and latency
+- Rate limiting pressure if many characters queried
+- Slow character sheet loads
+
+**Fix approach:**
+1. Pre-cache translations for common items at server startup
+2. Implement TTL-based cache expiration (e.g., 30 days)
+3. Batch missing translations when loading characters with multiple items
+4. Cache at equipment level, not at character level
+
+---
+
+## Fragile Areas
+
+### Character Death/Resurrection State Management
+
+**Issue:** Dying/Wounded conditions interact with HP changes in complex ways that could have race conditions
+
+**Files:** `server/src/modules/characters/characters.service.ts` (lines 262-406)
+
+**Fragile interactions:**
+- When HP drops to 0: checks Wounded/Doomed values, calculates Dying value, adds/updates condition, broadcasts update
+- When HP increases from 0: removes Dying, adds/increments Wounded, broadcasts update
+- Multiple async operations between checking and updating
+- No transaction wrapping
+
+**Risks:**
+- If two updates happen simultaneously (WebSocket + rest endpoint), conditions could get corrupted
+- Wounded value calculation depends on database state that could change between read and write
+- Broadcasting happens after database update, potential sync issues if broadcast fails
+
+**Safe modification:**
+1. Wrap Dying/Wounded logic in Prisma transaction
+2. Add optimistic locking (version field) to Character model
+3. Consolidate HP update and condition management into single atomic operation
+4. Test scenarios: multiple simultaneous HP changes, network race conditions
+
+**Test coverage:** Currently untested (no test files in server codebase)
+
+---
+
+### Alchemy System State Consistency
+
+**Issue:** Alchemy state (vials, formulas, prepared items) spans multiple database models with complex interdependencies
+
+**Files:**
+- `server/src/modules/characters/characters.service.ts` (alchemy methods ~200+ lines)
+- `server/prisma/schema.prisma` (CharacterAlchemyState, CharacterFormula, CharacterPreparedItem, CharacterResource)
+
+**Fragile points:**
+- Prepared items created by `createPreparedItem()` with `isInfused: true`, then deleted by rest system
+- Versatile vials count stored in CharacterAlchemyState but actual vials tracked in CharacterResource
+- Formula learning updates both CharacterFormula and CharacterResource
+- No constraint ensuring consistency between these tables
+
+**Risk:** Orphaned records, inconsistent vial counts, invalid formula states
+
+**Safe modification:**
+1. Add unique constraint validation in service methods
+2. Use Prisma transactions for multi-table alchemy operations
+3. Add data integrity checks in rest system
+4. Create migration to validate existing data
+
+---
+
+## Test Coverage Gaps
+
+### No Backend Tests
+
+**Issue:** Server codebase has zero test files - all business logic untested
+
+**Files:** 
+- `server/src/` directory contains no `.test.ts` or `.spec.ts` files
+- Test infrastructure exists (Jest, @nestjs/testing configured) but unused
+
+**Untested functionality:**
+- Authentication and authorization guards
+- Character CRUD with access control
+- HP updates with Dying/Wounded condition logic (most critical)
+- Rest system with conditional resource resets
+- Alchemy system with formula learning and vial tracking
+- WebSocket message handlers (race conditions, auth)
+- Equipment search with filtering
+
+**Priority areas for testing:**
+1. Character death/resurrection state transitions (HIGH - bug risk)
+2. Authentication guards on protected endpoints (HIGH - security)
+3. Access control: ensure players can't edit others' characters (HIGH - security)
+4. Rest system calculations with various ability scores (MEDIUM)
+5. Alchemy vial/formula consistency (MEDIUM)
+
+**Fix approach:**
+1. Create test files for each module: `auth.service.spec.ts`, `characters.service.spec.ts`, etc.
+2. Start with authentication and authorization tests (security-critical)
+3. Add tests for HP/Dying state transitions (highest bug risk)
+4. Use Prisma's test database pattern for integration tests
+
+---
+
+### No Frontend Tests
+
+**Issue:** Client codebase has zero test files - no component or hook testing
+
+**Files:** 
+- `client/src/` directory contains no `.test.tsx` or `.spec.tsx` files
+- Test infrastructure missing (no Jest/Vitest config, no testing-library)
+
+**Untested components:**
+- Character sheet tabs (Inventory, Alchemy, Actions, Talents)
+- HP control component with temporary/current/max interactions
+- Condition management modal
+- Rest modal with preview calculation
+- WebSocket hooks for real-time sync
+- Item search and filtering UI
+
+**Impact:** Cannot verify UI correctly reflects data, no regression protection, refactoring is risky
+
+**Fix approach:**
+1. Install Vitest + React Testing Library
+2. Create test files for critical components: HP control, Rest modal, Item search
+3. Test WebSocket connection/disconnection handling
+4. Add snapshot tests for data-heavy components (character sheet)
+
+---
+
+## Missing Critical Features
+
+### Level-Up System
+
+**Issue:** Character advancement is incomplete
+
+**Problem:** 
+- Level field exists in database and UI, but no level-up mechanics
+- Cannot allocate ability score increases at certain levels
+- Cannot select new feats when leveling
+- Cannot add/remove skills when leveling
+
+**Files:** No implementation exists
+
+**Blocks:** Full character progression gameplay
+
+---
+
+## Dependencies at Risk
+
+### Claude API Integration as Critical Path
+
+**Issue:** German translation system depends on Claude API for on-demand translation calls
+
+**Files:** 
+- `server/src/modules/claude/claude.service.ts`
+- `server/src/modules/translations/translations.service.ts`
+
+**Risk points:**
+1. No fallback if API is unavailable - character sheets fail to load with translations
+2. Rate limiting: high-traffic scenarios could hit Claude API limits
+3. Cost: unbounded - every unique item/feat/spell triggers an API call
+4. Latency: API calls on the critical path of character sheet loading (3+ second potential delay)
+
+**Migration plan:**
+1. Pre-populate translations for all 5,482 equipment items on server startup
+2. Batch translate feats/spells during seeding
+3. Use Claude API only for user-created custom items
+4. Cache everything in database with TTL
+5. Implement graceful degradation: show English names if translation fails
+
+---
+
+### Prisma 7.2.0 - Check for Security/Critical Updates
+
+**Files:** 
+- `server/package.json` - `@prisma/client: ^7.2.0`
+- `server/prisma/schema.prisma` - Using latest client
+
+**Note:** Version is recent (January 2025+) but "^" allows minor updates. Monitor for critical patches in Prisma security advisories.
+
+---
+
+## Scaling Limits
+
+### Equipment Search at 5,482 items
+
+**Current capacity:** Linear search with `.contains()` filters
+
+**Limit:** Performance degrades beyond ~10,000 items, potential timeout at 50,000+
+
+**Scaling path:**
+1. Add PostgreSQL full-text search indexes (supports millions of items)
+2. Implement search result caching
+3. Consider Elasticsearch if full-text becomes bottleneck
+4. Pagination already uses `take/skip`, good foundation
+
+---
+
+### WebSocket Connections per Campaign
+
+**Issue:** No connection pooling limits, no per-campaign connection quota
+
+**Files:** `server/src/modules/characters/characters.gateway.ts`
+
+**Current behavior:**
+- Each connection stores socket in `connectedClients` Map
+- Broadcasts go to all sockets in character room
+- No limits on concurrent connections per campaign or globally
+
+**Potential issue:** If campaign has 100 characters and each has 10 connected users watching, that's 1,000 broadcast operations per update. At scale this causes CPU/memory pressure.
+
+**Scaling recommendations:**
+1. Set max connections per campaign (e.g., 500)
+2. Implement connection pooling/groups (watch list vs. active participants)
+3. Use socket.io's built-in room management more aggressively
+4. Monitor memory usage of `connectedClients` Map for leaks
+
+---
+
+## Database Schema Concerns
+
+### Character pathbuilderData Stored as JSON Blob
+
+**Issue:** Original Pathbuilder export is stored unstructured as JSON
+
+**Files:** `server/prisma/schema.prisma` (line 198: `pathbuilderData Json?`)
+
+**Potential issues:**
+- Cannot query Pathbuilder fields directly
+- Data validation relies on application code, not schema
+- Difficult to report or audit what data came from imports
+- No versioning if Pathbuilder export format changes
+
+**Impact:** Low - used primarily for reference, but limits data flexibility
+
+**Improvement path:**
+1. Parse and normalize Pathbuilder data into proper database fields
+2. Consider storing hash of original for audit trail
+3. Add migration to extract key fields from JSON for querying
+
+---
+
+## Environmental Configuration
+
+### JWT_SECRET Management
+
+**Issue:** JWT_SECRET passed via environment variable with no rotation mechanism
+
+**Files:** 
+- `server/src/modules/auth/strategies/jwt.strategy.ts`
+- `server/src/modules/characters/characters.gateway.ts`
+
+**Concern:**
+- If secret is leaked, all existing tokens become invalid
+- No way to rotate secret without invalidating all sessions
+- No secret versioning
+
+**Recommendation:**
+1. Implement secret versioning in token header
+2. Support multiple valid secrets during rotation period
+3. Add secret rotation schedule (e.g., quarterly)
+4. Store secrets in vaulting system, not .env
+
+---
+
+### ENV File Example Status
+
+**Files:** 
+- `server/.env.example`
+- `client/.env.example`
+
+**Status:** Confirmed present in CLAUDE.md - properly structured
+
+**Verification:** Users can reference templates when setting up environment
+
+---
+
+## API Documentation
+
+### Swagger API Docs Enabled but may Expose Sensitive Operations
+
+**Files:** `server/src/main.ts` (lines 49-64)
+
+**Current state:**
+- Swagger UI enabled at `/api/docs`
+- Available to all users with API access
+- Could expose internal API structure to unauthorized users
+
+**Recommendation:**
+- In production: require ADMIN role to view Swagger
+- In development: keep open for debugging
+- Add basic authentication to `/api/docs` endpoint
+
+---
+
+*Concerns audit: 2026-04-27*