# Server Consolidation & Security Hardening Plan

Audit-driven plan to (a) reduce 12 PM2 processes to 3 application servers + 1 auth server, (b) put every API endpoint behind real authentication, and (c) standardize on ESM across all Node services. Approach is "do it properly the first time" — no half-finished pieces, no deferred cleanup.

---

## Status (2026-05-23)

| Phase | Status | Notes |
|---|---|---|
| 1 — Decommission dead services | **Complete** | aircall/gorgias/clarity/legacy-auth-server deleted from repo + PM2 + Caddyfile + ecosystem.cjs |
| 2 — Build shared `lib/` | **Complete** | Lives at `inventory-server/shared/` (see Deviations). `/verify` endpoint live on auth-server |
| 3 — Convert auth-server + inventory-server to ESM | **Complete (code)** | All 58 server-side files ESM; verified 0 import failures on netcup. Pending: `npm install` on server + pm2 reload to actually run the new code. See Deviations #10–13 |
| 4 — Build `dashboard-server` (the merge) | Not started | klaviyo/meta/google/typeform still run as 4 separate PM2 apps |
| 5 — Convert `acot-server` to ESM | Not started | |
| 6 — Auth hardening | **Complete (code) — gated on Phase F1** | All in-process items wired (rate-limit, JWT precondition, CORS lockdown, request-log, upload allowlist, `requirePermission` on sensitive routes, permissions seed migration). `authenticate()` is live on `/api/*`. Server-side artefacts (Caddyfile, ecosystem.cjs) written to `inventory-server/deploy/` for review. 6.11 (audit logging) deferred. **Frontend cannot use the app until Phase F1 ships** — see below |
| **F1 — Frontend fetch wrapper (NEW)** | **Not started — CRITICAL** | Frontend uses raw `fetch()` in ~220 sites; only 7 send `Authorization: Bearer`. With Phase 6's `authenticate()` middleware live, every refresh 401s until the frontend uniformly attaches the token. See "Phase F1" below |
| 7 — Caddyfile final form | Partial | Proposed file at `inventory-server/deploy/Caddyfile.proposed`. Apply blocked on F1 (forward_auth would 401 every page load until then) |
| 8 — ecosystem.config.cjs final form | Partial | Proposed at `inventory-server/deploy/ecosystem.config.cjs.proposed`. Includes Phase 6.4 JWT_SECRET footgun fix and 6.10 lt-wordlist token move |

**Live PM2 count: 10** (down from 13). Target after Phase 4: 5 application apps + acot-phone-server + lt-wordlist-api.

**Apply order from current state:**

(a) `npm install` on netcup to install the new shared-module deps (`pino`, `pino-http`, `ioredis`, `express-rate-limit`, `jsonwebtoken`),
(b) ship Phase F1 frontend fetch wrapper,
(c) `pm2 reload inventory-server new-auth-server` (Phase 3+6 code goes live, requests carry tokens, app keeps working),
(d) apply `deploy/ecosystem.config.cjs.proposed` (Phase 6.4 + 6.10),
(e) apply `deploy/Caddyfile.proposed` (Phase 6.1 — edge gate).

---

## Goals

- Every public-facing endpoint requires a valid auth token (Caddy gate + per-server middleware + per-route permission checks for sensitive operations).
- Reduce service count from 12 PM2 processes to 4: `inventory-server`, `acot-server`, `dashboard-server`, `auth-server`.
- Standardize on ESM (`"type": "module"`) across all Node services.
- Decommission `aircall-server`, `gorgias-server`, `clarity-server`, and the legacy `auth-server` (port 3003).
- Eliminate dependency duplication: one Redis client, one Postgres pool helper, one logger, one auth middleware — shared across services.

## Non-goals

- Rewriting business logic. Route handlers move as-is unless they break under ESM or shared middleware.
- Switching auth providers (we keep JWT + bcrypt + Postgres).
- Replacing PM2 or Caddy.
- Migrating Klaviyo/Meta/Google/Typeform's external API contracts.
---

## Target architecture

```
                        ┌──────────────────────────┐
                        │  tools.acherryontop.com  │
                        │         (Caddy)          │
                        │   forward_auth gate ─────┼──► auth-server:3011
                        └────────────┬─────────────┘     /verify endpoint
                                     │
           ┌─────────────────────────┼────────────────────────┐
           ▼                         ▼                        ▼
┌─────────────────────┐  ┌──────────────────────┐  ┌─────────────────────┐
│  inventory-server   │  │  dashboard-server    │  │  acot-server        │
│  :3010 (ESM)        │  │  :3015 (ESM)         │  │  :3012 (ESM)        │
│                     │  │                      │  │                     │
│  /api/products      │  │  /api/klaviyo/*      │  │  /api/acot/*        │
│  /api/orders        │  │  /api/meta/*         │  │  (MySQL via SSH)    │
│  /api/analytics     │  │  /api/google-*/*     │  │                     │
│  /api/dashboard     │  │  /api/typeform/*     │  │                     │
│  ... (~25 routers)  │  │                      │  │                     │
└─────────────────────┘  └──────────────────────┘  └─────────────────────┘
           │                         │                        │
           ├── Postgres              ├── Postgres             └── MySQL (workpi,
           │   (inventory_db)        │   (klaviyo)                via ssh2 tunnel)
           │                         ├── Redis (shared client)
           └── shared lib/ ◄─────────┴── shared lib/
                 - auth middleware
                 - permission helper
                 - logger
                 - pg pool factory
                 - error formatter
                                     │
                         ┌───────────┴──────────┐
                         │  auth-server         │
                         │  :3011 (ESM)         │
                         │  /login, /me,        │
                         │  /verify, user mgmt  │
                         └──────────────────────┘
```

PM2 process count: **12 → 4** (plus `acot-phone-server` and `lt-wordlist-api`, which stay as-is — out of scope).

---

## Phase 1 — Decommission dead/leaving services

Status: **Complete (2026-05-23)**. All four services removed from repo, PM2, Caddyfile, and ecosystem.config.cjs. Frontend widgets (`AircallDashboard.jsx`, `GorgiasOverview.jsx`) and their dashboard.ts/Navigation.jsx/vite.config.ts wiring also removed. Verification: smoke-tested `https://tools.acherryontop.com/api/{aircall,gorgias,clarity}/*` → 404. Backups left at `/home/matt/{ecosystem.config.cjs,Caddyfile}.bak.2026-05-23`.
### To remove

| Service | Reason | Steps |
|---|---|---|
| `aircall-server` (3002) | Migrating off Aircall | `pm2 delete aircall-server`; remove from `ecosystem.config.cjs`; remove `/api/aircall/*` from Caddyfile; drop `inventory/dashboard/aircall-server/` directory; remove MongoDB connection from any frontend code; cancel Mongo if it was only feeding Aircall |
| `gorgias-server` (3006) | Migrating off Gorgias | same pattern; check frontend for `/api/gorgias/*` callers and delete the dashboards/widgets that use them |
| `clarity-server` (3009) | Already dead (no `.js` files, not in ecosystem) | remove `/api/clarity/*` from Caddyfile; delete `inventory/dashboard/clarity-server/` directory |
| `auth-server` (3003, legacy) | Replaced by `new-auth-server` on 3011 | grep entire codebase for `dashboard-auth` and `localhost:3003`; redirect or remove callers; `pm2 delete auth-server`; remove from ecosystem; remove `/dashboard-auth/*` from Caddyfile; delete `inventory/dashboard/auth-server/` directory |

### Verification before deletion

```bash
# from inventory/ root — find any references before removing
grep -rn "aircall\|/api/aircall" inventory/src/ inventory-server/src/
grep -rn "gorgias\|/api/gorgias" inventory/src/ inventory-server/src/
grep -rn "/dashboard-auth\|localhost:3003" inventory/src/ inventory-server/src/
grep -rn "/api/clarity" inventory/src/ inventory-server/src/
```

Any remaining callers must be deleted or repointed before the server is removed. Do **not** leave a 502 response in production.

### Database/secret cleanup

- Drop the MongoDB instance feeding Aircall (after confirming no other consumers).
- Rotate any Gorgias/Aircall API keys still in `.env` files (defense in depth — they'll be useless soon anyway, but commit hygiene matters).
- Remove `MONGODB_URI`, `AIRCALL_*`, `GORGIAS_*` from any `.env` files.

---

## Phase 2 — Build the shared `lib/`

Status: **Complete (2026-05-23)**.
All 11 modules written under `inventory-server/shared/` (NOT repo root — see Deviations). `/verify` endpoint added to auth-server in CJS form (will move to shared/auth/verify.js usage during Phase 3 ESM conversion). Smoke-tested with no-token / bad-token / expired-token / valid-token cases. No service consumes shared/ yet; that happens in Phases 3–5.

### Location

A single shared directory at the repo root: `shared/` (sibling of `inventory/` and `acot-phone/`). Each service imports from it via a relative path. We do **not** introduce npm workspaces yet — relative imports are fine for three consumers and avoid the npm-link / hoisting headaches.

### Modules to create

```
shared/
├── package.json          # "type": "module"
├── auth/
│   ├── middleware.js     # authenticate(), requirePermission(), requireAdmin()
│   └── verify.js         # verifyToken() — pure function, no Express dependency
├── db/
│   ├── pg.js             # createPool(envPrefix) — returns configured Pool
│   └── redis.js          # createRedis() — single client, lazy-connect
├── logging/
│   ├── logger.js         # pino-based, redacts Authorization/Cookie
│   └── request-log.js    # Express middleware, structured access log
├── errors/
│   └── handler.js        # consistent error envelope, no leak in prod
├── cors/
│   └── policy.js         # single allowed-origins list, exported as cors() options
└── rate-limit/
    └── login.js          # express-rate-limit config for /login
```

### Auth middleware spec (`shared/auth/middleware.js`)

```js
// Pseudocode — final implementation matches the existing pattern in
// inventory/auth/routes.js authenticate() but factored out.
import jwt from 'jsonwebtoken';

// loadUserCached(pool, userId) is a helper in this module (not shown here).
export function authenticate({ pool }) {
  return async (req, res, next) => {
    const header = req.headers.authorization;
    if (!header?.startsWith('Bearer ')) {
      return res.status(401).json({ error: 'Authentication required' });
    }
    try {
      const decoded = jwt.verify(header.slice(7), process.env.JWT_SECRET);
      // Short-circuit DB hit with an in-memory cache, 60s TTL keyed by user id
      const user = await loadUserCached(pool, decoded.userId);
      // A valid token for a user row that no longer exists is treated as invalid
      if (!user) return res.status(401).json({ error: 'Invalid token' });
      if (!user.is_active) return res.status(403).json({ error: 'Account inactive' });
      req.user = user;
      next();
    } catch {
      res.status(401).json({ error: 'Invalid token' });
    }
  };
}

// Both guards below assume authenticate() ran first; the optional chaining
// turns a missing req.user into a 403 instead of an unhandled TypeError.
export function requirePermission(code) {
  return (req, res, next) => {
    if (req.user?.is_admin) return next();
    if (req.user?.permissions?.includes(code)) return next();
    res.status(403).json({ error: 'Insufficient permissions' });
  };
}

export const requireAdmin = (req, res, next) =>
  req.user?.is_admin ? next() : res.status(403).json({ error: 'Admin only' });
```

### Why a 60s in-memory user cache

`forward_auth` in Caddy will call `auth-server` on every request. Each per-server `authenticate()` middleware also has a DB lookup to load permissions. Without caching, every API request becomes 1 SQL query for the user row + 1 for permissions. 60s TTL is short enough that deactivating a user takes effect within a minute, long enough that Klaviyo dashboards (which fire dozens of requests on load) don't hammer Postgres.

### Add to `auth-server`: a `/verify` endpoint

Caddy's `forward_auth` only needs "is this token valid? give me a user-id." Today's `/me` does that but with a full permissions join. Add a lightweight `/verify` that:

- Verifies JWT signature only (no DB hit).
- Returns `200` with `X-User-Id` and `X-User-Is-Admin` response headers (which Caddy `copy_headers` will pass to upstream).
- Returns `401` on bad token.
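
The three requirements above could be met by a handler along these lines. This is a minimal sketch, not the deployed code: `makeVerifyHandler` is a hypothetical name, the injected `verifyToken` argument stands in for the pure function planned in `shared/auth/verify.js`, and the `is_admin` claim is assumed to be embedded in the JWT (`userId` is the claim that `authenticate()` already reads). `import`/`export` lines are left out so the snippet runs on its own.

```javascript
// Hypothetical factory: verifyToken(token) must return the decoded claims
// or throw on a bad signature / expired token. No database access happens here.
function makeVerifyHandler(verifyToken) {
  return function verifyHandler(req, res) {
    const header = req.headers.authorization || '';
    if (!header.startsWith('Bearer ')) {
      return res.status(401).json({ error: 'Authentication required' });
    }
    try {
      const claims = verifyToken(header.slice(7));
      // Response headers that Caddy's copy_headers could forward upstream.
      res.set('X-User-Id', String(claims.userId));
      // Assumption: the token carries an is_admin claim.
      res.set('X-User-Is-Admin', claims.is_admin ? 'true' : 'false');
      return res.status(200).end();
    } catch {
      return res.status(401).json({ error: 'Invalid token' });
    }
  };
}
```

Because the handler only checks the signature, a deactivated user still passes `/verify` until the token expires; the per-server `authenticate()` check on `is_active` is what catches that case.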
**Decision: each service re-verifies the JWT independently.** Caddy's `forward_auth` is a fast first-pass reject for obviously bad tokens, but the security boundary is the per-server `authenticate()` middleware. Cost is negligible (one HMAC-SHA256 per request); the upside is that a misconfigured Caddyfile can never let an unauthenticated request reach a backend. Upstream services do **not** trust any `X-User-*` headers from Caddy — they parse the `Authorization` header themselves.

---

## Phase 3 — Convert `auth-server` and `inventory-server` to ESM

Status: **Complete (code) — 2026-05-23.** Both servers + all sub-trees converted to ESM. 58 importable .js files load cleanly on netcup (verified via dynamic-import sweep). Two latent bugs surfaced and fixed: `??`/`||` precedence in `shared/db/{pg,redis}.js`, and CJS named-import of `Pool` from `pg` in both auth files (now uses `import pg from 'pg'; const { Pool } = pg`). Scripts under `inventory-server/scripts/` (one-shot maintenance / orchestrators) kept CommonJS via a sibling `scripts/package.json` declaring `"type": "commonjs"` — Node's package-type resolution walks up directory by directory, so this overrides the parent's `"type": "module"` without renaming any file or touching any `spawn()` callsite. Convert individual scripts to ESM if/when touched.

Pending to actually go live: `npm install` on netcup (new deps: `pino`, `pino-http`, `ioredis`, `express-rate-limit`, `jsonwebtoken`) + `pm2 reload`. See "Phase F1" — the frontend fetch wrapper should ship in the same deploy or this immediately breaks the app.

### Mechanical conversion

Per service:

1. Add `"type": "module"` to `package.json`.
2. Convert `require()` → `import`. `module.exports` → `export` / `export default`.
3. Fix `__dirname`/`__filename` (use `import.meta.url` + `fileURLToPath`).
4. Convert any dynamic require (e.g., conditional plugin loading) to `await import()`.
5. Update any sub-imports that don't include the file extension — ESM requires `./foo.js`, not `./foo`.
6. Update `ecosystem.config.cjs` if any service entry depended on CJS semantics. The ecosystem file itself can stay `.cjs` — PM2 reads it as config, doesn't matter what the apps it spawns are.
7. Update nodemon config / scripts.

### Risk areas in inventory-server

- `routes/ai.js` does a lazy init (`aiRouter.initInBackground()` called from `server.js`) — confirm the export shape still works as a default export of an Express router with a sidecar function. May need to split into `export default router; export function initInBackground() {}`.
- Multer setup in `routes/import.js` — straightforward, no ESM-specific concerns.
- SSE setup in `server.js` — moves over cleanly, no module-system entanglement.
- The `child_process.spawn` calls for metrics calculation: ESM doesn't change `child_process` behavior, but if any spawned script uses `require()` of a sibling, that sibling must also be ESM (or stay CJS with a `.cjs` extension).

### Test strategy

- After conversion, `pm2 start ecosystem.config.cjs --only inventory-server` on the server, watch logs for require/import errors at startup.
- Hit `/health`, then the most exercised endpoints (`/api/products`, `/api/dashboard/overview`, `/api/analytics/...`).

If startup is clean and three smoke endpoints work, ESM conversion is done. Functional correctness is preserved because no logic changed.

### Auth-server

Already small (~200 LOC server.js + ~few hundred in routes.js + permissions.js). 1-day conversion. Add the new `/verify` endpoint as part of this work.

---

## Phase 4 — Build `dashboard-server` (the merge)

Status: **Not started.**

The big merge. Klaviyo + Meta + Google + Typeform → one ESM service. Highest-risk phase — see Rollback strategy for the per-vendor cutover plan.
### Layout

```
inventory/dashboard/
├── server.js            # entry: load env, init Postgres+Redis, mount routes, listen
├── package.json         # "type": "module", deps from all 4 source servers (deduped)
├── .env                 # KLAVIYO_*, META_*, GOOGLE_*, TYPEFORM_*, shared DB_*, REDIS_URL
├── routes/
│   ├── klaviyo/         # absorbed from dashboard/klaviyo-server/src/
│   ├── meta/            # absorbed from dashboard/meta-server/
│   ├── google/          # absorbed from dashboard/google-server/
│   └── typeform/        # absorbed from dashboard/typeform-server/
├── services/            # per-vendor API clients (Klaviyo SDK calls, etc.)
├── scripts/
│   └── import-campaign-products.js   # one-shot, moved from klaviyo-server/scripts/
└── logs/
```

### Mount points

```js
// server.js (sketch)
import express from 'express';
import cors from 'cors';
import { authenticate, requirePermission } from '../../shared/auth/middleware.js';
import { createPool } from '../../shared/db/pg.js';
import { createRedis } from '../../shared/db/redis.js';
import { logger, requestLog } from '../../shared/logging/index.js';
import corsPolicy from '../../shared/cors/policy.js';
import errorHandler from '../../shared/errors/handler.js';

import klaviyoRouter from './routes/klaviyo/index.js';
import metaRouter from './routes/meta/index.js';
import googleRouter from './routes/google/index.js';
import typeformRouter from './routes/typeform/index.js';

const app = express();
const pool = await createPool('KLAVIYO_DB'); // klaviyo has its own DB; others can share or have none
const redis = await createRedis();

app.use(requestLog);
app.use(cors(corsPolicy));
app.use(express.json({ limit: '10mb' }));

// Everything below this line requires a valid token.
app.use('/api', authenticate({ pool }));

app.use('/api/klaviyo', klaviyoRouter({ pool, redis }));
app.use('/api/meta', metaRouter({ redis }));
app.use('/api/google-analytics', googleRouter({ redis })); // matches Caddy /api/dashboard-analytics rewrite
app.use('/api/typeform', typeformRouter({ redis }));

// Mounted outside /api, so the health check stays unauthenticated.
app.get('/health', (req, res) => res.json({ ok: true }));

app.use(errorHandler);
app.listen(process.env.DASHBOARD_PORT || 3015);
```

### Per-vendor routers

Each vendor's existing route file becomes a factory that takes the shared `pool`/`redis` and returns an Express router. Replace each server's per-instance pool/redis with the injected one.

### Permission gates (sensitive routes only)

Authenticated-only is the default after `app.use('/api', authenticate(...))`. For sensitive operations, add `requirePermission` per route:

- Anything that mutates Klaviyo lists/segments → `requirePermission('klaviyo_write')`
- Triggering a campaign sync → `requirePermission('klaviyo_admin')`
- Read-only dashboards → no extra check beyond authenticate.

Define the new permission codes in the `permissions` table via a migration in Phase 6.

### Dependency dedup

**Decision: standardize on `ioredis`.** Klaviyo's larger codebase already uses it, and `ioredis` has better cluster/sentinel support if we ever need it. Update `meta`/`google`/`typeform` call sites — each is a handful of `get`/`set` calls, mechanical conversion. Remove the `redis` package from `dashboard-server`'s `package.json`.

### Env consolidation

Single `.env` at `inventory/dashboard/.env`, prefixed keys:

```
DASHBOARD_PORT=3015
KLAVIYO_API_KEY=...
KLAVIYO_DB_HOST=...
KLAVIYO_DB_NAME=...
META_ACCESS_TOKEN=...
GOOGLE_SERVICE_ACCOUNT_KEY=...
TYPEFORM_TOKEN=...
REDIS_URL=...
JWT_SECRET=...     # shared with auth-server; same secret means same tokens valid here
```

### Klaviyo's `scripts/import-campaign-products.js`

One-shot script — keep it, but run it from the merged dashboard-server's directory. Update the script's imports to ESM.
If it's run via cron, update the cron entry to the new path.

### Risk: shared error states

When all four vendors share a Redis client, a Redis hiccup affects all four. Make sure the connection has retry config (`ioredis` defaults are reasonable but verify) and that vendor routes degrade gracefully when Redis is unavailable (most use it as a cache, so cache-miss → fall through to upstream API is the right behavior).

---

## Phase 5 — Convert `acot-server` to ESM (stays standalone)

Status: **Not started.**

Largest single conversion (~5K LOC), but no merge involved.

### Special concern: ssh2 tunnel

`acot-server` opens an SSH tunnel via `ssh2` to access the production MySQL at `192.168.1.5:3309`. The tunnel must be:

- Established before the HTTP listener starts (so no requests fail with "no DB connection").
- Re-established on disconnect (`ssh2` connection's `close` event → recreate).
- Cleanly torn down on `SIGTERM`/`SIGINT` so PM2 restarts don't leak file descriptors.

Verify (or add) this lifecycle handling as part of the conversion. If it's already correct, conversion is mechanical; if not, this is a good moment to fix it.

### Test strategy

Same as inventory-server: start with PM2, smoke-test the most-used `/api/acot/*` endpoints, watch logs for unhandled rejection or tunnel-close events.

---

## Phase 6 — Auth hardening

Status: **Complete (code) — 2026-05-23. Application gated on Phase F1.**

All in-process hardening shipped alongside the Phase 3 ESM conversion. The `authenticate()` middleware is wired live on `/api/*` in inventory-server — **the moment that code reaches production, the frontend stops working until Phase F1 lands**, because today's frontend doesn't include `Authorization: Bearer` on the vast majority of fetch calls (see Phase F1 below for the diagnosis).
Per-item status:

| # | Item | Status | Where |
|---|---|---|---|
| 6.1 | Caddy `forward_auth` gate | **Proposed** — apply *after* F1 | `inventory-server/deploy/Caddyfile.proposed` |
| 6.2 | `requirePermission` on sensitive routes + permissions migration | **Done** | inline in `config.js`, `data-management.js`, `import.js`, `ai-prompts.js`, `ai-validation.js`, `templates.js`, `reusable-images.js`; codes seeded by `migrations/005_phase6_permission_codes.sql` |
| 6.3 | Login rate-limit + `/verify` rate-limit | **Done** | `auth/server.js` uses `shared/rate-limit/login.js` (`loginLimiter`, `verifyLimiter`) |
| 6.4 | JWT_SECRET as startup precondition + ecosystem footgun fix | **Done in code; proposed for ecosystem.cjs** | Both auth-server and inventory-server `process.exit(1)` if `JWT_SECRET` is unset. `inventory-server/deploy/ecosystem.config.cjs.proposed` removes the `JWT_SECRET: process.env.JWT_SECRET` override that was shadowing `.env` |
| 6.5 | Structured request logging w/ redaction | **Done** | `shared/logging/request-log.js` (pino-http, redacts Authorization/Cookie); mounted in both `auth/server.js` and `src/server.js` |
| 6.6 | CORS lockdown | **Done** | `src/middleware/cors.js` now re-exports `shared/cors/policy.js`. LAN wildcards (`192.168.*`, `10.*`) and `*` defaults gone |
| 6.7 | Upload hardening | **Done** | Exact-match MIME+extension allowlist on `routes/import.js` and `routes/reusable-images.js`; dead `multer({ dest })` removed from `routes/products.js` (no upload route was using it — strongest hardening was deletion) |
| 6.8 | Frontend token storage stays localStorage + XSS audit | **Audited** | Confirmed `dangerouslySetInnerHTML` is sanitized in `ProductEditor.tsx`. **Flagged: `ChatRoom.tsx:277,392` renders user-controlled chat content as raw HTML — real XSS vector, separate fix needed** |
| 6.9 | Remove debug middleware | **Done** | The header-dumping `app.use((req,res,next)=>{ console.log(... req.headers ...) })` block removed from `src/server.js`. Replaced with `shared/logging/request-log.js` (which redacts). |
| 6.10 | `lt-wordlist-api` token move | **Proposed for ecosystem.cjs** | `inventory-server/deploy/ecosystem.config.cjs.proposed` shows the entry without inline token; apply alongside rotating the secret value into `/opt/lt-wordlist-api/.env` |
| 6.11 | Audit logging for sensitive ops | **Deferred** | Out of scope for this pass per user direction. Existing `import_audit_log` and `product_editor_audit_log` tables stay as-is; generic `system_audit_log` table + middleware is its own project |

### 6.1 Caddy `forward_auth` gate

Add to the `tools.acherryontop.com` block, before the `@api_routes` handler:

```caddyfile
# Forward-auth gate for all API traffic
@needs_auth path /api/* /chat-api/*
handle @needs_auth {
    forward_auth localhost:3011 {
        uri /verify
        copy_headers Authorization
        # On 401/403, Caddy returns the auth-server's response body verbatim
    }
    # Existing per-vendor handle blocks remain below this line
}

# /auth-inv/* stays public (you need to log in!)
handle /auth-inv/* {
    uri strip_prefix /auth-inv
    reverse_proxy localhost:3011
}
```

The `forward_auth` directive subrequests `/verify` on the auth-server. If it returns 2xx, the request proceeds upstream. If 401/403, Caddy returns that response to the client and never hits the backend.

This is the **first** line of defense. Per-server middleware (`shared/auth/middleware.js`) is the **second** line — re-verifies the JWT independently. Defense in depth: a Caddyfile typo can't open a hole.

### 6.2 Per-route permission gates

After per-server `authenticate()`, add `requirePermission(code)` to destructive or sensitive routes.
Audit needed in:

- `inventory-server/src/routes/config.js` — global config writes → `admin`
- `inventory-server/src/routes/import.js` — uploads, deletes, generate-upc → `product_import`
- `inventory-server/src/routes/data-management.js` — CSV operations → `data_management`
- `inventory-server/src/routes/ai-prompts.js` — prompt edits → `ai_admin`
- `inventory-server/src/routes/templates.js` — template writes → `templates_write`
- `inventory-server/src/routes/reusable-images.js` — image management → `image_admin`
- `inventory-server/src/routes/products.js` — only one POST (`/resolve-identifiers`); evaluate whether it needs a permission code or authenticated-only is fine
- `inventory-server/src/routes/product-editor-audit-log.js` and `import-audit-log.js` — read-only by sensitive users → `audit_read`
- `dashboard-server` Klaviyo/Meta/Google/Typeform write endpoints → vendor-specific codes per above

Migration: a single SQL script that inserts the new permission codes into the `permissions` table and assigns them to existing admin users. Non-admin users get permissions explicitly granted via the user management UI.
```sql
INSERT INTO permissions (code, name) VALUES
  ('product_import', 'Product Import'),
  ('data_management', 'Data Management'),
  ('ai_admin', 'AI Settings Admin'),
  ('templates_write', 'Template Editing'),
  ('image_admin', 'Image Management'),
  ('audit_read', 'Audit Log Access'),
  ('klaviyo_write', 'Klaviyo Write'),
  ('klaviyo_admin', 'Klaviyo Admin'),
  ('meta_write', 'Meta Write'),
  ('google_write', 'Google Analytics Write'),
  ('typeform_write', 'Typeform Write'),
  ('acot_admin', 'ACOT Server Admin')
ON CONFLICT (code) DO NOTHING;
```

### 6.3 Rate limiting on login

`shared/rate-limit/login.js`:

```js
import rateLimit from 'express-rate-limit';

export const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,   // 15 minutes
  max: 10,                    // 10 attempts per IP per window
  message: { error: 'Too many login attempts, try again later' },
  standardHeaders: true,
  legacyHeaders: false,
});
```

Apply in `auth-server` on the `/login` route. Consider also rate-limiting `/verify` and `/me` (much higher cap, ~600/min — they're called legitimately by every page load).

### 6.4 JWT secret rotation

- Rotate `JWT_SECRET` to a fresh 32-byte random string as part of the deployment.
- Document that rotation logs out all users — acceptable for an internal tool, do it during off-hours.
- Add `JWT_SECRET` to the env var validation block in `auth-server/server.js` (refuse to start if not set).
- **Fix the existing footgun**: `/var/www/ecosystem.config.cjs` currently has `JWT_SECRET: process.env.JWT_SECRET` *after* `...inventoryEnv` in the new-auth-server block. This shadows the `.env` value with whatever the shell exported when PM2 was started — which has already silently diverged at least once (detected and fixed 2026-05-23 by a clean PM2 restart in a shell without JWT_SECRET exported). Delete that override line during rotation; let `.env` be the single source of truth.

### 6.5 Request logging

`shared/logging/request-log.js` — log method, path, status, duration, user-id (if authenticated). **Never** log `Authorization` or `Cookie` headers. Remove the current `server.js:79-87` debug middleware in inventory-server (it logs full headers including the bearer token).

### 6.6 CORS lockdown

Current `middleware/cors.js` allows `192.168.*.*` and `10.*.*.*` with `credentials: true`. Tighten to explicit known origins:

```js
origin: [
  'https://tools.acherryontop.com',
  'https://inventory.kent.pw',
  /^http:\/\/localhost:(5174|5175)$/,
]
```

If anyone genuinely needs LAN access, add their specific IP, not a whole `/16` or `/8` range.

### 6.7 Upload hardening

`POST /api/import/upload-image` (multer-backed) needs:

- File-size limit set on multer config (current limit may be defaulted — verify).
- MIME-type allowlist (image/jpeg, image/png, image/webp; reject everything else).
- Filename sanitization (no `..`, no absolute paths, generate UUID-based names server-side).
- The Caddy `/uploads/*` handler currently serves any file in the uploads directory publicly. Move this **behind** the auth gate: include `/uploads/*` in `@needs_auth`. If some images are referenced from public emails (Klaviyo newsletter), put **those** in a separate public bucket; everything else stays gated.

### 6.8 Frontend token storage

**Decision: stay on `localStorage`.** This is an internal tool with no untrusted user-generated HTML being rendered, so the XSS-token-theft surface is small. The `forward_auth` gate is the main security gap we're addressing; cookie-based auth would be a larger, separate project (cookie-parser, CSRF double-submit pattern, AuthContext refactor) that doesn't change the threat model for an internal tool with no public sign-up.

Sanity check during this refactor: grep the React codebase for `dangerouslySetInnerHTML`. If any usages exist, verify each one is rendering trusted (server-controlled, not user-supplied) content. If a user-supplied content path exists, that's a real XSS vector and needs separate remediation regardless of token-storage choice.
### 6.9 Remove debug middleware

[inventory-server/src/server.js:79-87](inventory-server/src/server.js#L79-L87) logs full request headers including `Authorization`. Delete this block. Replace with `shared/logging/request-log.js`.

### 6.10 `lt-wordlist-api` token

`ADD_WORD_TOKEN` is currently hardcoded in `/var/www/ecosystem.config.cjs`. Move to `/opt/lt-wordlist-api/.env`, rotate the token value, update any callers.

### 6.11 Audit logging for sensitive operations

Already have `import-audit-log` and `product-editor-audit-log` tables. Extend the pattern:

- Log `user_id`, `endpoint`, `params`, `result` for `config.js` writes and `data-management.js` operations.
- Schema: reuse the existing audit table pattern or add a generic `system_audit_log` table.
- Don't log request bodies wholesale (may contain large blobs); log the action and the target ID.

---

## Phase F1 — Frontend fetch wrapper (NEW — 2026-05-23)

Status: **Not started. CRITICAL. Blocks the Phase 3+6 deploy from being usable.**

### The discovery

While wiring `authenticate()` on `/api/*` in Phase 6.1/6.2, we audited the frontend's fetch usage and found:

- **7** call sites send `Authorization: Bearer ${token}` explicitly (in `AuthContext.tsx` for `/me` + `/login`, plus a couple of `settings/*` pages).
- **~220** other `fetch(...)` / `axios.*(...)` call sites across `inventory/src/services/`, `inventory/src/pages/`, `inventory/src/components/` send **no** Authorization header at all.
- There is no global fetch wrapper, axios interceptor, or service-worker shim that injects the token.

Today this works because nothing on the server checks. Caddy currently has no `forward_auth` gate (Phase 6.1 is a Caddyfile change that hasn't shipped yet) and the previous inventory-server had no `authenticate()` middleware. The frontend's auth model was "you log in once to get the token; the token is checked only by `/me`; everything else is implicitly trusted at the network layer."
With Phase 6 code in production, **every page refresh 401s** on the first API call after the next pm2 reload. The user explicitly accepted this when authorising the Phase 6 work — but the fix is its own deliverable, and shipping Phase 3+6 to PM2 without F1 in the same window means an outage window measured in *however long F1 takes* (not minutes). ### Recommended approach Add a single fetch wrapper at `inventory/src/utils/api.ts` (or similar) and migrate the ~220 call sites to use it. The wrapper: 1. Reads `localStorage.getItem('token')` on every call (cheap; localStorage is sync). 2. Merges `Authorization: Bearer ${token}` into the request headers if a token exists. 3. Intercepts 401 responses → fires `window.dispatchEvent(new Event('auth:logout'))` (a listener already exists in `AuthContext.tsx:117`) so the user gets bounced to `/login` cleanly instead of seeing broken pages. 4. Preserves the existing call shape — `apiFetch(url, init)` should be a drop-in for `fetch(url, init)` so the migration is mechanical. ```ts // inventory/src/utils/api.ts (sketch) export async function apiFetch(input: RequestInfo | URL, init: RequestInit = {}): Promise { const token = localStorage.getItem('token'); const headers = new Headers(init.headers); if (token && !headers.has('Authorization')) { headers.set('Authorization', `Bearer ${token}`); } const res = await fetch(input, { ...init, headers }); if (res.status === 401 && token) { // Token expired or revoked — bounce to /login. AuthContext already listens. 
    window.dispatchEvent(new Event('auth:logout'));
  }
  return res;
}
```

Same shape for axios:

```ts
// inventory/src/utils/apiClient.ts (sketch)
import axios from 'axios';

export const apiClient = axios.create();

// Attach the stored token to every outgoing request.
apiClient.interceptors.request.use((config) => {
  const token = localStorage.getItem('token');
  if (token) config.headers.Authorization = `Bearer ${token}`;
  return config;
});

// On 401, notify AuthContext, then let the caller see the error as usual.
apiClient.interceptors.response.use(
  (r) => r,
  (err) => {
    if (err?.response?.status === 401) window.dispatchEvent(new Event('auth:logout'));
    return Promise.reject(err);
  },
);
```

### Migration plan

1. Land the two wrapper modules above. ~50 LOC total.
2. Codemod or sed-loop: in `inventory/src/`, replace `fetch(` → `apiFetch(` (with the right import) and `axios.get/post/...` → `apiClient.get/post/...`. ~220 call sites: roughly a half-day of careful find-and-replace plus per-page verification. Spot-check the ones with custom `Content-Type` (multipart uploads especially) so the wrapper doesn't clobber multipart boundaries.
3. Leave the `AuthContext.tsx` `/login` and `/me` calls alone; they already work and migrating them adds no value.
4. Run the SPA: log in, exercise Overview / Products / Analytics / Dashboard / etc. with browser devtools open, watching for the `Authorization` header on every `/api/*` request.

### Sequencing with Phase 3+6 deploy

**Two options:**

A) **Ship F1 first** (recommended). Frontend goes out with the wrapper; nothing changes server-side. Then `pm2 reload` Phase 3+6. Zero-downtime, zero broken-page window.

B) **Ship together.** F1 and Phase 3+6 land in the same deploy. There is a brief window (seconds) where the frontend has the wrapper but the server hasn't reloaded yet; the wrapper just sends extra headers that the old server ignores. Safe.

Do **not** ship Phase 3+6 first and F1 second. That gives a broken app for as long as F1 takes.

### Out of scope (kept on `localStorage`)

Per Phase 6.8, we're not migrating to httpOnly cookie auth.
F1 is the minimum work to make the per-service `authenticate()` (Phase 6) actually usable. A future Phase F2 could move to cookies + CSRF double-submit, but that's a much larger change touching the AuthContext, the login flow, and every backend that reads tokens. Not justified for an internal tool with no public sign-up.

### Note on `/uploads/*` gating (Phase 6.7's Caddyfile change)

The proposed Caddyfile moves `/uploads/*` behind `forward_auth`. Most product images today are referenced from `<img>` tags in the SPA; those requests are made by the browser, which **does not include `Authorization` headers on image requests**. Fixing this is part of F1's scope too: either (a) keep `/uploads/*` public (revert that part of 6.7) and accept that uploaded images leak to anyone who guesses a URL, or (b) issue per-image signed URLs from the API and gate those at Caddy. Decide before applying the Caddyfile.

---

## Phase 7 — Caddyfile final form

Status: **Proposed (2026-05-23). Apply blocked on Phase F1.**

The full proposed file lives at `inventory-server/deploy/Caddyfile.proposed` and matches the spec below except that vendor handle blocks still point to per-vendor PM2 apps (Phase 4 hasn't merged them yet). See `inventory-server/deploy/README.md` for the apply commands (admin-API + sudo cp pattern from Phase 2 deviation #8).

After all phases, the `tools.acherryontop.com` block looks like:

```caddyfile
tools.acherryontop.com {
    import security_headers

    # Public: login endpoint
    handle /auth-inv/* {
        uri strip_prefix /auth-inv
        reverse_proxy localhost:3011
    }

    # Public: static frontend assets
    @static path *.js *.css *.png *.jpg *.jpeg *.gif *.ico *.svg *.woff *.woff2
    handle @static {
        header Cache-Control "public, max-age=2592000"
        root * /var/www/inventory/frontend/build
        file_server
    }

    # All API + uploads: auth gate first
    @gated path /api/* /chat-api/* /uploads/*
    handle @gated {
        forward_auth localhost:3011 {
            uri /verify
            copy_headers Authorization
        }

        # Uploaded files
        handle /uploads/* {
            root * /var/www/inventory
            file_server
        }

        # Vendor dashboard routes → merged dashboard-server
        handle /api/klaviyo/* {
            reverse_proxy localhost:3015
        }
        handle /api/meta/* {
            reverse_proxy localhost:3015
        }
        handle /api/google-analytics/* {
            reverse_proxy localhost:3015
        }
        handle /api/typeform/* {
            reverse_proxy localhost:3015
        }

        # ACOT-specific
        handle /api/acot/* {
            reverse_proxy localhost:3012
        }

        # Chat
        handle /chat-api/* {
            uri strip_prefix /chat-api
            reverse_proxy localhost:3014
        }

        # Catch-all: inventory-server
        handle /api/* {
            reverse_proxy localhost:3010
        }
    }

    handle /health {
        reverse_proxy localhost:3010
    }

    # SPA fallback
    handle {
        root * /var/www/inventory/frontend/build
        try_files {path} /index.html
        file_server
        encode gzip
    }

    handle_errors {
        respond "{err.status_code} {err.status_text}"
    }
}
```

Removed: `/dashboard-auth/*`, `/api/aircall/*`, `/api/gorgias/*`, `/api/clarity/*`, and the LAN/`Access-Control-Allow-Origin "*"` permissive defaults on `/api/*`.

Kept: `/apiv2/*` and `/apiv2-test/*` proxies to backend.acherryontop.com (out of scope, separate system).

---

## Phase 8 — ecosystem.config.cjs final form

Status: **Proposed (2026-05-23).**

The full proposed file is at `inventory-server/deploy/ecosystem.config.cjs.proposed`. It includes the Phase 6.4 `JWT_SECRET` shadow-override fix and the Phase 6.10 `lt-wordlist-api` token move.
It still lists the per-vendor PM2 apps until the Phase 4 merge ships; that is the only thing keeping the live PM2 count at 10 instead of 7 (the target 5 application apps plus the unchanged `acot-phone-server` and `lt-wordlist-api`).

```js
// commonSettings: shared PM2 options, assumed to be defined above this
// export in the real file (not shown in this excerpt).
module.exports = {
  apps: [
    {
      name: 'auth-server',
      script: './inventory/auth/server.js',
      cwd: '/var/www',
      env: { NODE_ENV: 'production', AUTH_PORT: 3011 },
      ...commonSettings,
    },
    {
      name: 'inventory-server',
      script: './inventory/src/server.js',
      cwd: '/var/www',
      env: { NODE_ENV: 'production', PORT: 3010, UPLOADS_DIR: '/var/www/inventory/uploads' },
      ...commonSettings,
    },
    {
      name: 'dashboard-server',
      script: './inventory/dashboard/server.js',
      cwd: '/var/www',
      env: { NODE_ENV: 'production', DASHBOARD_PORT: 3015 },
      ...commonSettings,
    },
    {
      name: 'acot-server',
      script: './inventory/dashboard/acot-server/server.js',
      cwd: '/var/www',
      env: { NODE_ENV: 'production', ACOT_PORT: 3012 },
      ...commonSettings,
    },
    {
      name: 'chat-server',
      script: './inventory/chat/server.js',
      cwd: '/var/www',
      env: { NODE_ENV: 'production', PORT: 3014 },
      ...commonSettings,
    },
    // acot-phone-server and lt-wordlist-api unchanged
  ],
};
```

Five entries instead of twelve. Each app loads its own `.env` from its directory (already handled by `dotenv.config`).

---

## Sequencing & dependencies

```
Phase 1 (decommission) ──┬─────────────────────────────────────────┐
                         │                                         │
                         ▼                                         │
               Phase 2 (shared lib/)                               │
                         │                                         │
          ┌──────────────┼──────────────┐                          │
          ▼              ▼              ▼                          ▼
      Phase 3a       Phase 3b        Phase 4            Phase 6 (auth hardening
  inventory-server  auth-server dashboard-server        runs alongside 3+4+5,
       to ESM    to ESM + /verify build & test          completes after them)
          │              │              │                          │
          └──────────────┼──────────────┘                          │
                         ▼                                         │
           Phase 5 (acot-server to ESM) ──────────────────────────►│
                                                                   ▼
                                                        Phase 7 (Caddy cutover)
                                                                   │
                                                                   ▼
                                                       Phase 8 (PM2 final state)
```

Phase 1 unblocks everything (fewer services to convert).

Phase 2 is the foundation; nothing else can start until shared `lib/` exists.

Phases 3–5 can run in parallel; they touch independent services.

Phase 6's sub-items can be developed alongside 3–5 but **enabled** only after them (no point adding `requirePermission` to a route that doesn't yet have `authenticate`).

**Phase F1 must precede the Phase 3+6 pm2 reload**: without the fetch wrapper, the SPA breaks the moment the new code goes live. Discovered during Phase 3+6 implementation; see Phase F1.

Phase 7 is the cutover: the Caddyfile flip happens after F1 ships AND after the `/uploads/*` gating decision in F1 is made.

Phase 8 is cleanup: remove dead PM2 entries.

Estimated effort, end-to-end: **~3 weeks of focused work** by one engineer. Phase 1 ≈ 1 day, Phase 2 ≈ 2 days, Phase 3 ≈ 3 days (both services), Phase 4 ≈ 5–7 days (the merge), Phase 5 ≈ 2–3 days, Phase 6 ≈ 3–4 days, Phase F1 ≈ 0.5–1 day, Phase 7+8 ≈ 1 day.

---

## Testing strategy

No formal test suite exists today (per CLAUDE.md). For a refactor this size, that's a gap to close, but writing tests retroactively for 15K LOC of routes is a separate, larger project. For this refactor:

### Manual smoke testing per phase

A checklist of representative endpoints to hit after each deploy:

- `inventory-server`: `/api/products`, `/api/dashboard/overview`, `/api/analytics/revenue`, `/api/orders`, `/api/purchase-orders`, `/api/import/list-uploads`, `/api/config/global`
- `dashboard-server`: `/api/klaviyo/campaigns`, `/api/meta/insights`, `/api/google-analytics/...`, `/api/typeform/responses`
- `acot-server`: `/api/acot/...` (top-3 endpoints by call volume; pull from access logs)
- `auth-server`: `/login`, `/me`, `/verify`

Each smoke test runs (a) without a token → expect 401, (b) with an invalid token → expect 401, (c) with a valid token → expect 2xx.

### Frontend integration check

After deploys, log into the SPA and exercise each major page (Overview, Products, Analytics, Dashboard, Klaviyo, Meta, etc.). If everything loads and dashboards populate, the auth + routing layer is intact.
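The three-case check from the manual smoke testing section (no token, invalid token, valid token) could be scripted along the following lines. This is only a sketch: `smokeCheck`, `FetchLike`, and `SmokeResult` are hypothetical names that do not exist in the repo, and the fetch function is passed in so the same check can run against a stub.

```typescript
// Minimal fetch-like signature so the check can run against the real fetch
// or a stub. All names here are illustrative, not part of the repo.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<{ status: number }>;

interface SmokeResult {
  endpoint: string;
  noToken: boolean;   // (a) no Authorization header -> 401
  badToken: boolean;  // (b) invalid token -> 401
  goodToken: boolean; // (c) valid token -> 2xx
}

async function smokeCheck(
  baseUrl: string,
  endpoint: string,
  validToken: string,
  fetchImpl: FetchLike,
): Promise<SmokeResult> {
  const url = `${baseUrl}${endpoint}`;
  const bearer = (t: string) => ({ headers: { Authorization: `Bearer ${t}` } });

  const anon = await fetchImpl(url);
  const bad = await fetchImpl(url, bearer('not-a-real-token'));
  const good = await fetchImpl(url, bearer(validToken));

  return {
    endpoint,
    noToken: anon.status === 401,
    badToken: bad.status === 401,
    goodToken: good.status >= 200 && good.status < 300,
  };
}
```

Against a live deploy this might be called once per gated endpoint in the checklist, for example `await smokeCheck('https://tools.acherryontop.com', '/api/products', token, fetch)`, with `token` taken from a real login. Public endpoints such as `/login` would need a different expectation for case (a).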
### Test scaffold during Phase 2 (committed)

While building `shared/`, set up `vitest` (lightweight, ESM-native, fast) as the standard test runner for the repo. Initial coverage focuses on the security-critical surface only:

- `shared/auth/verify.js`: known good token, expired token, wrong-signature token, malformed token, missing token.
- `shared/auth/middleware.js`: request with no header → 401; bad header → 401; valid token + inactive user → 403; valid token + missing permission → 403; valid token + correct permission → next() called with `req.user` populated.
- `shared/auth/middleware.js` user-cache TTL: same token within 60s → one DB hit; same token after 61s → two DB hits.

`package.json` gets a `"test": "vitest run"` script at the repo root and per-service.

Set up but don't backfill broader test coverage; that's a separate, larger project. The vitest scaffold gives future work a foothold; this refactor commits to having tests for the auth boundary specifically because that's what's load-bearing for the whole security model.

---

## Rollback strategy

Each phase produces an independently deployable state. Rollback per phase:

- **Phase 1**: re-add removed services to ecosystem; restore from git. Data deletions can't be rolled back this way, so only perform them after a week of stable production.
- **Phases 3, 5**: ESM conversion is per-service; if one service breaks, revert that service to the previous commit and `pm2 restart <app-name>`. Other services unaffected.
- **Phase 4**: the dashboard-server merge is the highest-risk change. Plan: deploy `dashboard-server` to a non-conflicting port (3015) while leaving the old per-vendor servers running. Cut over Caddy routes one vendor at a time (start with Meta, the smallest). If any vendor breaks, point Caddy back to the old server (still running) for that vendor, debug, retry. Only delete the old servers after all four are stable on `dashboard-server`.
- **Phases 6, 7**: Caddy config is git-tracked. `git revert` + `caddy reload` rolls back in seconds.
  Auth changes are additive (defense in depth): if `forward_auth` causes problems, comment it out and per-server middleware continues protecting routes.

---

## Out of scope (intentional)

These came up in the audit but aren't part of this refactor:

- `httpOnly` cookie auth ("Phase F2", deferred). Phase F1 keeps `localStorage` + Bearer header because that's the minimum to unblock the Phase 6 `authenticate()` rollout. A future move to cookie auth would touch `AuthContext`, every backend that reads tokens, and introduce CSRF concerns, making it a much larger project.
- Replacing PM2 with systemd or Docker.
- Test coverage beyond the auth-critical surface.
- `apiv2`/`apiv2-test` proxies to `backend.acherryontop.com`: separate system, not touched.
- `acot-phone-server` and `lt-wordlist-api`: staying as-is.
- Centralized observability stack (Prometheus, Grafana). The logger work in Phase 6.5 sets up the data, but shipping it somewhere is future work.
- ChatRoom XSS remediation (flagged during the Phase 6.8 audit: `inventory/src/components/chat/ChatRoom.tsx:277,392` renders user-controlled chat content via `dangerouslySetInnerHTML` without sanitization). Real vulnerability for an internal-but-multi-user tool; separate fix.

---

## Concrete deliverables

When this is done:

- 4 application PM2 processes instead of 12 (plus 2 unchanged: acot-phone, lt-wordlist).
- All `/api/*` and `/chat-api/*` requests gated at Caddy and re-verified at each upstream.
- Sensitive endpoints additionally gated by per-permission checks.
- One ESM standard across the entire Node codebase.
- One shared `lib/` for auth, logging, DB, errors, CORS.
- Login rate-limited.
- `JWT_SECRET` rotated.
- Old auth-server, Aircall, Gorgias, Clarity directories deleted from the repo.
- Caddyfile slimmed to one auth-gated block.
- Permission codes inserted into `permissions` table for granular authorization.
- No half-finished pieces, no `// TODO: add auth later` comments, no deferred secrets cleanup.
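To make the "per-permission checks" deliverable concrete, the behaviour the test scaffold expects from `requirePermission` (403 for an inactive user or a missing permission, `next()` otherwise) might be sketched roughly as below. This is not the code in `shared/auth/middleware.js`: the `Req`/`Res` shapes are simplified stand-ins for Express types, and the permission code used in any call (for example `'config:write'`) is an invented placeholder.

```typescript
// Hypothetical minimal shapes standing in for Express's Request/Response.
// The real middleware's signature and user object may differ.
interface AuthedUser {
  id: number;
  active: boolean;
  permissions: string[];
}
interface Req {
  user?: AuthedUser; // assumed to be populated by authenticate()
}
interface Res {
  status(code: number): Res;
  json(body: unknown): void;
}
type Next = () => void;

function requirePermission(code: string) {
  return (req: Req, res: Res, next: Next): void => {
    const user = req.user;
    if (!user) {
      // authenticate() did not run, or it rejected the token.
      res.status(401).json({ error: 'unauthenticated' });
      return;
    }
    if (!user.active || !user.permissions.includes(code)) {
      // Valid token, but the user is inactive or lacks the permission.
      res.status(403).json({ error: 'forbidden' });
      return;
    }
    next();
  };
}
```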
---

## Deviations from original plan (recorded during execution)

These are decisions made during Phase 1/2 implementation that amend the spec above. Future phases should follow the deviated path, not the original sketch.

1. **`shared/` location.** Original plan placed `shared/` at the repo root as a sibling of `inventory/` and `acot-phone/`. Implemented at `inventory-server/shared/` (= `/var/www/inventory/shared/` on the server) instead. Reason: the actual project root *is* `/var/www/inventory/`; placing shared/ outside it would have meant building a deployment story for it that doesn't exist. Import paths change accordingly:
   - From `inventory-server/{auth,src,chat}/server.js` → `../shared/...`
   - From `inventory-server/dashboard/{vendor}-server/server.js` → `../../shared/...`

2. **`/verify` response headers.** Plan specified `X-User-Id` + `X-User-Is-Admin`. Implemented as `X-User-Id` + `X-User-Username` (both available from the JWT payload). `X-User-Is-Admin` was dropped because `is_admin` isn't in the JWT today and returning it would require a DB lookup, violating the "no DB hit" principle. To restore `X-User-Is-Admin`, enrich the JWT payload at login time (one-line change in `auth/routes.js`) during Phase 6, then echo from `/verify`. Upstreams don't trust these headers anyway (they re-verify), so the omission is informational, not security-relevant.

3. **User cache key in `shared/auth/middleware.js`.** Plan sketch mentioned "60s TTL keyed by token jti". Implemented as keyed by `userId` instead: the JWT doesn't currently include a `jti` claim, and the cache's invalidation semantics are "this user was deactivated/changed permissions" (per-user), not "this token was revoked" (per-token). The plan's pseudocode already used `loadUserCached(pool, decoded.userId)` so this matches the spirit.

4. **Redis client safety.** `shared/db/redis.js` sets `enableOfflineQueue: false` and `lazyConnect: true`.
   The plan didn't specify these, but the defaults mean a Redis hiccup fails fast (route fall-through to upstream API as designed in Phase 4 risk notes) rather than queueing commands indefinitely.

5. **CORS allowed origins kept `https://acot.site`.** Plan example listed three origins; production has acot.site as a redirect to tools.acherryontop.com but also reaches the API directly in some flows. Kept it to avoid breakage. LAN wildcards (`192.168.*`, `10.*`) and `Access-Control-Allow-Origin "*"` are NOT included in the new `shared/cors/policy.js` per the plan's Phase 6.6 spirit, but the legacy `inventory-server/src/middleware/cors.js` still has them until services are migrated to consume `shared/cors/`.

6. **Defunct permission codes left in DB.** Removed the `dashboard:gorgias` and `dashboard:calls` Protected blocks from the frontend, but the corresponding permission rows in the `permissions` table are still there (assigned to some users). They're inert (no UI references them) but should be cleaned up alongside the Phase 6.2 permissions migration.

7. **PM2 process names retained `new-auth-server` (not `auth-server`).** Plan's Phase 8 final form names it `auth-server` (after the legacy 3003 one is removed). Decided to keep the existing `new-auth-server` name through Phase 2 to avoid a rename mid-stream. Phase 8 can rename if desired, but it's cosmetic; all wiring is by port (3011), not name.

8. **Caddyfile changes via admin API on `:2020`.** The Caddyfile is owned by root and matt has no passwordless sudo. Cutover used `curl -X POST .../load` on the Caddy admin port (which matt can hit), then a separate `sudo cp /home/matt/Caddyfile.new /etc/caddy/Caddyfile` step to persist the on-disk file. Future Caddyfile changes can follow the same pattern. Backup convention: `/etc/caddy/Caddyfile.bak.YYYY-MM-DD`.

9. **Path-naming.** Plan uses `inventory/` as the top-level (server-side path convention). Locally the equivalent is `inventory-server/`.
   Whenever the plan says `inventory/dashboard/foo/`, read that as `/var/www/inventory/dashboard/foo/` on the server or `inventory-server/dashboard/foo/` locally.

10. **Scripts directory kept CJS via package.json shim.** Original plan called for converting "any spawned script" to ESM alongside its caller. Implemented: added `inventory-server/scripts/package.json` with `"type": "commonjs"`. Node's package-type resolution walks up directory by directory, so this overrides the parent's `"type": "module"` for the entire `scripts/` tree (≈15 files including `import/*.js`, `metrics-new/utils/*`, the orchestrator scripts) without renaming any file or touching any `spawn()` callsite. Convert individual scripts to ESM when touched; don't bulk-migrate.

11. **`src/routes/products.js` had dead multer setup.** Phase 6.7 spec called for hardening the upload route in products.js. There was no upload route: the `multer({ dest })` instance and `importProductsFromCSV` import were dead code left over from a long-ago migration. Strongest 6.7 hardening was deletion: no upload handler = no attack surface. The two real upload paths (`/api/import/upload-image` and `/api/reusable-images/upload`) got tightened MIME+extension allowlists instead.

12. **Two pre-existing syntax errors in shared/db/ surfaced.** `shared/db/pg.js:13` and `shared/db/redis.js:22` both had `?? Number(...) || N`; mixing `??` and `||` without parentheses is a syntax error under the ECMAScript grammar. They passed Phase 2 because nothing imported them yet; the Phase 3 smoke test exposed them. Fixed with parens.

13. **`import { Pool } from 'pg'` doesn't work in ESM.** The `pg` package is CJS using `module.exports = { Pool, ... }`. Node's ESM-from-CJS interop fails to detect `Pool` as a named export via static analysis. The bulletproof pattern, now used everywhere: `import pg from 'pg'; const { Pool } = pg;`. Same idea for any future CJS-only deps. `src/utils/db.js` already had it; the two auth files needed the fix during execution.

14. **Frontend Bearer-header gap discovered (drives new Phase F1).** Phase 6 was specified assuming the frontend already sends `Authorization: Bearer` on every API call. It does not: only 7 of ~220 call sites do. Phase 6's `authenticate()` middleware is shipped and ready to enable, but until F1 lands the SPA will 401 on every page. The plan now has Phase F1 to address this explicitly; until then, the Phase 3+6 pm2 reload should not ship unless F1 ships in the same window.

15. **macOS NFS workflow note.** The `inventory-server/` directory locally is an NFS mount of `/var/www/inventory/` on netcup. Bulk operations (`find`/`grep -r`/mass `node --check`/`npm install`) hang or take minutes locally and pollute file listings with macOS AppleDouble `._*` sidecar files. Default to `ssh netcup` for any sweep across the tree; individual file edits via the editor are fine.
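As an illustration of deviation 3, a per-user cache with a 60-second TTL might look roughly like the following. It is a sketch under stated assumptions, not the code in `shared/auth/middleware.js`: `createUserCache`, `CachedUser`, and the injectable `now` clock are hypothetical, and the real helper is called as `loadUserCached(pool, decoded.userId)` rather than through a factory.

```typescript
// Hypothetical user shape; the real row loaded from Postgres may differ.
interface CachedUser {
  id: number;
  active: boolean;
  permissions: string[];
}

type UserLoader = (userId: number) => Promise<CachedUser>;

// Returns a lookup function that hits `loadUser` at most once per user per TTL.
// Keyed by userId (not by token), so a permission change or deactivation is
// picked up for every token belonging to that user once the entry expires.
function createUserCache(
  loadUser: UserLoader,
  ttlMs: number = 60_000,
  now: () => number = Date.now, // injectable clock, only to make TTL testable
) {
  const entries = new Map<number, { user: CachedUser; expiresAt: number }>();

  return async function loadUserCached(userId: number): Promise<CachedUser> {
    const hit = entries.get(userId);
    if (hit && hit.expiresAt > now()) {
      return hit.user; // still fresh: no DB hit
    }
    const user = await loadUser(userId); // expired or missing: one DB hit
    entries.set(userId, { user, expiresAt: now() + ttlMs });
    return user;
  };
}
```

Injecting the clock is a design choice made for this sketch so that the "within 60s → one DB hit, after 61s → two DB hits" case from the test scaffold can be exercised without real waiting; a vitest suite could get the same effect with fake timers instead.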