Phase 4 + 6

2026-05-24 09:13:39 -04:00
parent 4be0f877fa
commit cf71cc4dec
65 changed files with 4035 additions and 9121 deletions
+56 -38
@@ -4,23 +4,23 @@ Audit-driven plan to (a) reduce 12 PM2 processes to 3 application servers + 1 au
---
## Status (2026-05-23)
## Status (2026-05-24)
| Phase | Status | Notes |
|---|---|---|
| 1 — Decommission dead services | **Complete** | aircall/gorgias/clarity/legacy-auth-server deleted from repo + PM2 + Caddyfile + ecosystem.cjs |
| 2 — Build shared `lib/` | **Complete** | Lives at `inventory-server/shared/` (see Deviations). `/verify` endpoint live on auth-server |
| 3 — Convert auth-server + inventory-server to ESM | **Complete (code)** | All 58 server-side files ESM; verified 0 import failures on netcup. Pending: `npm install` on server + pm2 reload to actually run the new code. See Deviations #10–13 |
| 4 — Build `dashboard-server` (the merge) | Not started | klaviyo/meta/google/typeform still run as 4 separate PM2 apps |
| 3 — Convert auth-server + inventory-server to ESM | **Complete** | All 58 server-side files ESM; both services live under the ESM build for >24h. See Deviations #10–13 |
| 4 — Build `dashboard-server` (the merge) | **Complete (live) — 2026-05-24** | Merged service running on :3015 under PM2; Caddy routes for klaviyo/meta/dashboard-analytics/typeform all reverse-proxy to it. Old per-vendor directories (`klaviyo-server`, `meta-server`, `google-server`, `typeform-server`) and their PM2 entries deleted post-cutover — ~1.27 GB reclaimed (largely duplicated `node_modules`). Phase 6.2 gates wired (meta_write, klaviyo_admin). See Deviations #16–19 |
| 5 — Convert `acot-server` to ESM | Not started | |
| 6 — Auth hardening | **Complete (code) — gated on Phase F1** | All in-process items wired (rate-limit, JWT precondition, CORS lockdown, request-log, upload allowlist, `requirePermission` on sensitive routes, permissions seed migration). `authenticate()` is live on `/api/*`. Server-side artefacts (Caddyfile, ecosystem.cjs) written to `inventory-server/deploy/` for review. 6.11 (audit logging) deferred. **Frontend cannot use the app until Phase F1 ships** — see below |
| **F1 — Frontend fetch wrapper (NEW)** | **Complete (code) — 2026-05-23** | Wrappers at `inventory/src/utils/api.ts` (`apiFetch`) and `inventory/src/utils/apiClient.ts` (axios instance). 170 `fetch()` sites across 76 files migrated to `apiFetch`; 32 `axios.*` sites across 11 files migrated to `apiClient`. AuthContext `/login`+`/me`, App.tsx `/me`, and `services/apiv2.ts` (external PHP backend) intentionally left as raw `fetch`. Type-check + production build pass clean |
| 7 — Caddyfile final form | Partial | Proposed file at `inventory-server/deploy/Caddyfile.proposed`. Apply blocked on F1 (forward_auth would 401 every page load until then) |
| 8 — ecosystem.config.cjs final form | Partial | Proposed at `inventory-server/deploy/ecosystem.config.cjs.proposed`. Includes Phase 6.4 JWT_SECRET footgun fix and 6.10 lt-wordlist token move |
| 6 — Auth hardening | **Complete** | All in-process items live: rate-limit, JWT precondition, CORS lockdown, request-log, upload allowlist, `requirePermission` on sensitive routes, permissions seed migration. `authenticate()` is live on `/api/*`. 6.11 (audit logging) deferred — see Out of scope |
| **F1 — Frontend fetch wrapper** | **Complete (live) — 2026-05-24** | Wrappers at `inventory/src/utils/api.ts` (`apiFetch`) and `inventory/src/utils/apiClient.ts` (axios instance). 170 `fetch()` sites across 76 files migrated to `apiFetch`; 32 `axios.*` sites across 11 files migrated to `apiClient`. AuthContext `/login`+`/me`, App.tsx `/me`, and `services/apiv2.ts` (external PHP backend) intentionally left as raw `fetch`. Shipped alongside the Phase 3+6 pm2 reload |
| 7 — Caddyfile final form | **Complete — applied 2026-05-24** | Final Caddyfile live at `/etc/caddy/Caddyfile` (forward_auth gate + per-vendor reverse_proxy to :3015). The `inventory-server/deploy/` staging folder was removed after apply — recreate from this doc if future changes are needed. Backup convention: `/etc/caddy/Caddyfile.bak.YYYY-MM-DD` |
| 8 — ecosystem.config.cjs final form | **Complete — applied 2026-05-24** | Live PM2 list matches the spec below (5 apps + acot-phone-server + lt-wordlist-api = 7 processes). Includes Phase 6.4 JWT_SECRET shadow-override fix and 6.10 lt-wordlist token move. `inventory-server/deploy/` removed post-apply |
**Live PM2 count: 10** (down from 13). Target after Phase 4: 5 application apps + acot-phone-server + lt-wordlist-api.
**Live PM2 process count: 7** (5 application apps — auth-server, inventory-server, chat-server, dashboard-server, acot-server — plus acot-phone-server + lt-wordlist-api). Down from 13 pre-refactor.
**Apply order from current state:** (a) `npm install` on netcup to install the new shared-module deps (`pino`, `pino-http`, `ioredis`, `express-rate-limit`, `jsonwebtoken`), (b) ship Phase F1 frontend fetch wrapper, (c) `pm2 reload inventory-server new-auth-server` (Phase 3+6 code goes live, requests carry tokens, app keeps working), (d) apply `deploy/ecosystem.config.cjs.proposed` (Phase 6.4 + 6.10), (e) apply `deploy/Caddyfile.proposed` (Phase 6.1 — edge gate).
**All apply steps complete (2026-05-24).** The original sequencing (npm install → F1 ship → pm2 reload → env consolidation → vendor PM2 delete → ecosystem apply → Caddyfile apply) was executed in order. Remaining work is Phase 5 (acot-server ESM conversion) only.
---
@@ -202,11 +202,11 @@ Caddy's `forward_auth` only needs "is this token valid? give me a user-id." Toda
## Phase 3 — Convert `auth-server` and `inventory-server` to ESM
Status: **Complete (code) — 2026-05-23.** Both servers + all sub-trees converted to ESM. 58 importable .js files load cleanly on netcup (verified via dynamic-import sweep). Two latent bugs surfaced and fixed: `??`/`||` precedence in `shared/db/{pg,redis}.js`, and CJS named-import of `Pool` from `pg` in both auth files (now uses `import pg from 'pg'; const { Pool } = pg`).
Status: **Complete (live) — 2026-05-24.** Both servers + all sub-trees converted to ESM and running under PM2. 58 importable .js files. Two latent bugs surfaced and fixed during the conversion: `??`/`||` precedence in `shared/db/{pg,redis}.js`, and CJS named-import of `Pool` from `pg` in both auth files (now uses `import pg from 'pg'; const { Pool } = pg`).
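The exact expressions that were fixed in `shared/db/{pg,redis}.js` are not shown in this diff, so the following is only an illustrative sketch of why grouping matters when `??` and `||` are combined. JavaScript rejects an unparenthesized mix outright, and the two possible groupings treat an empty-string value differently. The `env` object and the `PG_PORT`/`DB_PORT` names are hypothetical.

```js
// Hypothetical environment: PG_PORT is set but empty, DB_PORT is unset.
const env = { PG_PORT: '' };

// Writing `env.PG_PORT ?? env.DB_PORT || 5432` is a SyntaxError:
// `??` cannot be mixed with `||` or `&&` without parentheses.

// Grouping A: nullish-coalesce first, then fall back on any falsy value.
const portA = (env.PG_PORT ?? env.DB_PORT) || 5432;

// Grouping B: only null/undefined on the left triggers the fallback.
const portB = env.PG_PORT ?? (env.DB_PORT || 5432);

console.log(portA); // 5432
console.log(portB === ''); // true: the empty string survives
```

Which grouping is right depends on whether an empty value should count as "unset"; for connection settings read from `.env` files, grouping A is usually the safer reading.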
Scripts under `inventory-server/scripts/` (one-shot maintenance / orchestrators) kept CommonJS via a sibling `scripts/package.json` declaring `"type": "commonjs"` — Node's package-type resolution walks up directory by directory, so this overrides the parent's `"type": "module"` without renaming any file or touching any `spawn()` callsite. Convert individual scripts to ESM if/when touched.
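The directory walk described above can be sketched as a small lookup. This is a simplified model of the rule, not Node's actual resolver: it uses an in-memory map instead of the filesystem, and every path in it is hypothetical.

```js
// In-memory stand-in for the filesystem: directory -> parsed package.json.
// All paths and contents here are hypothetical.
const packageJsonByDir = {
  '/srv/inventory-server': { type: 'module' },
  '/srv/inventory-server/scripts': { type: 'commonjs' },
};

// Walk from the file's directory toward the root and return the "type"
// of the nearest package.json. A missing "type" is treated as commonjs.
function resolvePackageType(filePath) {
  let dir = filePath.slice(0, filePath.lastIndexOf('/'));
  while (dir !== '') {
    const pkg = packageJsonByDir[dir];
    if (pkg) return pkg.type ?? 'commonjs';
    dir = dir.slice(0, dir.lastIndexOf('/'));
  }
  return 'commonjs';
}

console.log(resolvePackageType('/srv/inventory-server/src/server.js')); // module
console.log(resolvePackageType('/srv/inventory-server/scripts/job.js')); // commonjs
```

The walk stops at the first `package.json` it finds, which is why the sibling file under `scripts/` is enough to shield that subtree from the parent's `"type": "module"`.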
Pending to actually go live: `npm install` on netcup (new deps: `pino`, `pino-http`, `ioredis`, `express-rate-limit`, `jsonwebtoken`) + `pm2 reload`. See "Phase F1" — the frontend fetch wrapper should ship in the same deploy or this immediately breaks the app.
Went live 2026-05-24 after `npm install` on netcup (new deps: `pino`, `pino-http`, `ioredis`, `express-rate-limit`, `jsonwebtoken`) + `pm2 reload`. Phase F1 (frontend fetch wrapper) shipped in the same window so the SPA continued to send `Authorization: Bearer` on every request as `authenticate()` came online.
### Mechanical conversion
@@ -240,7 +240,7 @@ Already small (~200 LOC server.js + ~few hundred in routes.js + permissions.js).
## Phase 4 — Build `dashboard-server` (the merge)
Status: **Not started.** The big merge. Klaviyo + Meta + Google + Typeform → one ESM service. Highest-risk phase — see Rollback strategy for the per-vendor cutover plan.
Status: **Complete (live) — 2026-05-24.** Klaviyo + Meta + Google + Typeform merged into a single ESM service at `inventory-server/dashboard/server.js`. Shared Pool + ioredis client injected through router factories. Phase 6.2 permission gates wired (`meta_write` on Meta budget/status mutations; `klaviyo_admin` on Klaviyo `/events/clearCache`). Post-cutover cleanup (2026-05-24) deleted the four old per-vendor directories (`klaviyo-server`, `meta-server`, `google-server`, `typeform-server`) along with their PM2 entries — ~1.27 GB reclaimed, largely duplicated `node_modules` across vendors. Original boot test on netcup: `/health` 200; unauthenticated `/api/klaviyo/*` returns `{"error":"No token provided"}` HTTP 401 via shared `authenticate()`.
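The real `requirePermission` implementation is not shown in this diff. As a rough sketch of how a gate such as `meta_write` or `klaviyo_admin` could sit in front of a route, assuming an earlier `authenticate()` step attaches `req.user` with a `permissions` array (that shape is an assumption), something like the following would do:

```js
// Hypothetical middleware factory. Assumes an earlier authenticate() step
// has attached req.user = { id, permissions: string[] }; that shape is an
// assumption, not taken from the source.
function requirePermission(code) {
  return function permissionGate(req, res, next) {
    const granted = req.user?.permissions ?? [];
    if (!granted.includes(code)) {
      return res.status(403).json({ error: `Missing permission: ${code}` });
    }
    return next();
  };
}

// Minimal fake response object so the gate can be exercised without Express.
function fakeRes() {
  return {
    statusCode: 200,
    body: null,
    status(code) { this.statusCode = code; return this; },
    json(payload) { this.body = payload; return this; },
  };
}
```

In an Express router this would typically be mounted per route, for example `router.post('/events/clearCache', requirePermission('klaviyo_admin'), handler)`, so read-only endpoints stay authenticated-only.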
### Layout
@@ -364,22 +364,22 @@ Same as inventory-server: start with PM2, smoke-test the most-used `/api/acot/*`
## Phase 6 — Auth hardening
Status: **Complete (code) — 2026-05-23. Application gated on Phase F1.** All in-process hardening shipped alongside the Phase 3 ESM conversion. The `authenticate()` middleware is wired live on `/api/*` in inventory-server — **the moment that code reaches production, the frontend stops working until Phase F1 lands**, because today's frontend doesn't include `Authorization: Bearer` on the vast majority of fetch calls (see Phase F1 below for the diagnosis).
Status: **Complete (live) — 2026-05-24.** All hardening (in-process + edge) is live in production. The Phase 3 ESM conversion + Phase 6 middleware shipped together, with Phase F1 (frontend fetch wrapper) flipping immediately ahead of the `pm2 reload` so the SPA continued to carry `Authorization: Bearer` on every API call. Caddy `forward_auth` gate and the JWT_SECRET ecosystem fix went live with the Phase 7/8 apply on 2026-05-24.
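For the `JWT_SECRET` startup precondition (6.4), a minimal sketch of the fail-fast check might look like the following. The helper name and its injectable `env`/`exit` parameters are hypothetical; the source only states that both servers call `process.exit(1)` when the variable is unset.

```js
// Hypothetical helper: returns the secret or terminates the process.
// `env` and `exit` are injectable so the check can be exercised in a test.
function requireJwtSecret(env = process.env, exit = (code) => process.exit(code)) {
  const secret = env.JWT_SECRET;
  if (typeof secret !== 'string' || secret.length === 0) {
    console.error('JWT_SECRET is not set; refusing to start');
    exit(1);
    return undefined; // only reachable when exit is stubbed
  }
  return secret;
}
```

Calling this once at the top of `server.js`, before any route is mounted, means a misconfigured process never starts serving requests.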
Per-item status:
| # | Item | Status | Where |
|---|---|---|---|
| 6.1 | Caddy `forward_auth` gate | **Proposed** — apply *after* F1 | `inventory-server/deploy/Caddyfile.proposed` |
| 6.2 | `requirePermission` on sensitive routes + permissions migration | **Done** | inline in `config.js`, `data-management.js`, `import.js`, `ai-prompts.js`, `ai-validation.js`, `templates.js`, `reusable-images.js`; codes seeded by `migrations/005_phase6_permission_codes.sql` |
| 6.1 | Caddy `forward_auth` gate | **Live — 2026-05-24** | Applied via Caddy admin API + `sudo cp` to `/etc/caddy/Caddyfile`. `@gated path /api/* /chat-api/* /uploads/*` block hits `localhost:3011/verify` on every request |
| 6.2 | `requirePermission` on sensitive routes + permissions migration | **Done** | inline in `config.js`, `data-management.js`, `import.js`, `ai-prompts.js`, `ai-validation.js`, `templates.js`, `reusable-images.js`; codes seeded by `migrations/005_phase6_permission_codes.sql`. **Phase 4 follow-on (2026-05-23):** `meta_write` wired on `PATCH /api/meta/campaigns/:id/budget` and `POST /api/meta/campaigns/:id/:action`; `klaviyo_admin` wired on `POST /api/klaviyo/events/clearCache`. Read-only Google + Typeform endpoints stay authenticated-only (reserved write codes left in migration 005 for future) |
| 6.3 | Login rate-limit + `/verify` rate-limit | **Done** | `auth/server.js` uses `shared/rate-limit/login.js` (`loginLimiter`, `verifyLimiter`) |
| 6.4 | JWT_SECRET as startup precondition + ecosystem footgun fix | **Done in code; proposed for ecosystem.cjs** | Both auth-server and inventory-server `process.exit(1)` if `JWT_SECRET` is unset. `inventory-server/deploy/ecosystem.config.cjs.proposed` removes the `JWT_SECRET: process.env.JWT_SECRET` override that was shadowing `.env` |
| 6.4 | JWT_SECRET as startup precondition + ecosystem footgun fix | **Live — 2026-05-24** | Both auth-server and inventory-server `process.exit(1)` if `JWT_SECRET` is unset. The `JWT_SECRET: process.env.JWT_SECRET` override that was shadowing `.env` is removed from the live ecosystem.cjs |
| 6.5 | Structured request logging w/ redaction | **Done** | `shared/logging/request-log.js` (pino-http, redacts Authorization/Cookie); mounted in both `auth/server.js` and `src/server.js` |
| 6.6 | CORS lockdown | **Done** | `src/middleware/cors.js` now re-exports `shared/cors/policy.js`. LAN wildcards (`192.168.*`, `10.*`) and `*` defaults gone |
| 6.7 | Upload hardening | **Done** | Exact-match MIME+extension allowlist on `routes/import.js` and `routes/reusable-images.js`; dead `multer({ dest })` removed from `routes/products.js` (no upload route was using it — strongest hardening was deletion) |
| 6.8 | Frontend token storage stays localStorage + XSS audit | **Audited** | Confirmed `dangerouslySetInnerHTML` is sanitized in `ProductEditor.tsx`. **Flagged: `ChatRoom.tsx:277,392` renders user-controlled chat content as raw HTML — real XSS vector, separate fix needed** |
| 6.9 | Remove debug middleware | **Done** | The header-dumping `app.use((req,res,next)=>{ console.log(... req.headers ...) })` block removed from `src/server.js`. Replaced with `shared/logging/request-log.js` (which redacts). |
| 6.10 | `lt-wordlist-api` token move | **Proposed for ecosystem.cjs** | `inventory-server/deploy/ecosystem.config.cjs.proposed` shows the entry without inline token; apply alongside rotating the secret value into `/opt/lt-wordlist-api/.env` |
| 6.10 | `lt-wordlist-api` token move | **Live — 2026-05-24** | Live PM2 entry runs `/opt/lt-wordlist-api/index.js` under matt's daemon; `ADD_WORD_TOKEN` is no longer inline in ecosystem.cjs and is read from `/opt/lt-wordlist-api/.env`. See Deviations #21–23 for the path corrections and the (incorrect) earlier assumption that this app lived under a separate root daemon |
| 6.11 | Audit logging for sensitive ops | **Deferred** | Out of scope for this pass per user direction. Existing `import_audit_log` and `product_editor_audit_log` tables stay as-is; generic `system_audit_log` table + middleware is its own project |
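The "exact-match MIME+extension allowlist" in item 6.7 can be sketched as a pairing check: an upload passes only when its extension is listed and the reported MIME type is the one paired with that extension. The allowlist contents below are hypothetical; the real lists in `routes/import.js` and `routes/reusable-images.js` are not shown in this diff.

```js
// Hypothetical allowlist: each permitted extension maps to the one MIME
// type it must arrive with.
const ALLOWED_UPLOADS = new Map([
  ['.png', 'image/png'],
  ['.jpg', 'image/jpeg'],
  ['.jpeg', 'image/jpeg'],
  ['.webp', 'image/webp'],
]);

// Accept a file only when its extension is listed AND the reported MIME
// type is exactly the one paired with that extension.
function isAllowedUpload(originalName, mimeType) {
  const dot = originalName.lastIndexOf('.');
  if (dot <= 0) return false; // no extension, or a dotfile such as ".png"
  const ext = originalName.slice(dot).toLowerCase();
  const expected = ALLOWED_UPLOADS.get(ext);
  return expected !== undefined && expected === mimeType;
}

console.log(isAllowedUpload('photo.PNG', 'image/png')); // true
console.log(isAllowedUpload('photo.png', 'image/jpeg')); // false
```

Both values come from the client, so a check like this narrows what is accepted but does not prove the file contents match; it complements, rather than replaces, serving uploads from a non-executable location.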
### 6.1 Caddy `forward_auth` gate
@@ -519,7 +519,7 @@ Already have `import-audit-log` and `product-editor-audit-log` tables. Extend th
## Phase F1 — Frontend fetch wrapper (NEW — 2026-05-23)
Status: **Complete (code) — 2026-05-23.** Two wrappers landed at `inventory/src/utils/api.ts` and `inventory/src/utils/apiClient.ts`. Migration touched 87 files (76 fetch, 11 axios) covering ~200 call sites. Type-check clean; production build clean. Intentional exclusions: AuthContext `/login`+`/me` (own auth flow), App.tsx initial `/me` session check, and `services/apiv2.ts` (calls the separate PHP backend at backend.acherryontop.com which has its own cookie auth, out of scope per the plan). Ready to ship in the same deploy window as Phase 3+6.
Status: **Complete (live) — 2026-05-24.** Two wrappers landed at `inventory/src/utils/api.ts` and `inventory/src/utils/apiClient.ts`. Migration touched 87 files (76 fetch, 11 axios) covering ~200 call sites. Type-check clean; production build clean. Intentional exclusions: AuthContext `/login`+`/me` (own auth flow), App.tsx initial `/me` session check, and `services/apiv2.ts` (calls the separate PHP backend at backend.acherryontop.com which has its own cookie auth, out of scope per the plan). Shipped alongside the Phase 3+6 pm2 reload.
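The actual `apiFetch` in `inventory/src/utils/api.ts` is not reproduced in this diff. A minimal sketch of the idea, written in plain JavaScript and assuming the token lives in `localStorage` under a hypothetical `'token'` key and that callers pass headers as a plain object, could look like this:

```js
// Returns a copy of `init` with an Authorization header added when a token
// is available. Caller-supplied headers are preserved.
function withBearer(init = {}, token) {
  if (!token) return { ...init };
  return {
    ...init,
    headers: { ...(init.headers ?? {}), Authorization: `Bearer ${token}` },
  };
}

// Hypothetical wrapper: `getToken` and `fetchImpl` are injectable; the
// 'token' storage key is an assumption.
function apiFetch(
  url,
  init,
  getToken = () => globalThis.localStorage?.getItem('token'),
  fetchImpl = globalThis.fetch,
) {
  return fetchImpl(url, withBearer(init, getToken()));
}
```

Keeping the header logic in a pure helper makes it easy to test without a browser, and the injectable `fetchImpl` lets call sites migrate by swapping `fetch(` for `apiFetch(` with no other changes.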
### The discovery
@@ -602,13 +602,13 @@ Per Phase 6.8, we're not migrating to httpOnly cookie auth. F1 is the minimum wo
### Note on `/uploads/*` gating (Phase 6.7's Caddyfile change)
The proposed Caddyfile moves `/uploads/*` behind `forward_auth`. Most product images today are referenced from `<img src="/uploads/...">` in the SPA — those requests are made by the browser, which **does not include `Authorization` headers on image requests**. Fixing this is part of F1's scope too: either (a) keep `/uploads/*` public (revert that part of 6.7) and accept that uploaded images leak to anyone who guesses a URL, or (b) issue per-image signed URLs from the API and gate those at Caddy. Decide before applying the Caddyfile.
**Applied as-spec (2026-05-24):** `/uploads/*` is behind `forward_auth` in the live Caddyfile. `<img src="/uploads/...">` references in the SPA are browser-issued GETs that don't carry `Authorization` headers, so verify that images display end-to-end (via cookie fall-through, signed URLs, or session-bound forward_auth). If they are broken, either revert this part of 6.7 to keep `/uploads/*` public or issue per-image signed URLs from the API.
---
## Phase 7 — Caddyfile final form
Status: **Proposed (2026-05-23). Apply blocked on Phase F1.** The full proposed file lives at `inventory-server/deploy/Caddyfile.proposed` and matches the spec below except that vendor handle blocks still point to per-vendor PM2 apps (Phase 4 hasn't merged them yet). See `inventory-server/deploy/README.md` for the apply commands (admin-API + sudo cp pattern from Phase 2 deviation #8).
Status: **Complete — applied 2026-05-24.** Final Caddyfile live at `/etc/caddy/Caddyfile`; vendor handles point at the merged dashboard-server on :3015. The `inventory-server/deploy/` staging folder (which held `Caddyfile.proposed` and the README of apply commands) was removed after apply — recreate from the spec below if future changes are needed. Apply pattern (admin-API `curl -X POST :2020/load` + `sudo cp` to persist on-disk) is captured in Deviation #8. Backup convention: `/etc/caddy/Caddyfile.bak.YYYY-MM-DD`.
After all phases, the `tools.acherryontop.com` block looks like:
@@ -645,10 +645,10 @@ tools.acherryontop.com {
}
# Vendor dashboard routes → merged dashboard-server
handle /api/klaviyo/* { reverse_proxy localhost:3015 }
handle /api/meta/* { reverse_proxy localhost:3015 }
handle /api/google-analytics/* { reverse_proxy localhost:3015 }
handle /api/typeform/* { reverse_proxy localhost:3015 }
handle /api/klaviyo/* { reverse_proxy localhost:3015 }
handle /api/meta/* { reverse_proxy localhost:3015 }
handle /api/dashboard-analytics/* { reverse_proxy localhost:3015 }
handle /api/typeform/* { reverse_proxy localhost:3015 }
# ACOT-specific
handle /api/acot/* { reverse_proxy localhost:3012 }
@@ -685,7 +685,7 @@ Removed: `/dashboard-auth/*`, `/api/aircall/*`, `/api/gorgias/*`, `/api/clarity/
## Phase 8 — ecosystem.config.cjs final form
Status: **Proposed (2026-05-23).** Full proposed file at `inventory-server/deploy/ecosystem.config.cjs.proposed`. Includes the Phase 6.4 `JWT_SECRET` shadow-override fix and the Phase 6.10 `lt-wordlist-api` token move. Still lists per-vendor PM2 apps until Phase 4 merge ships — that's the only thing keeping app count at 10 instead of the target 5.
Status: **Complete — applied 2026-05-24.** Live PM2 list matches the spec below: 5 application apps (auth-server, inventory-server, dashboard-server, acot-server, chat-server) plus acot-phone-server + lt-wordlist-api = 7 total. Includes the Phase 6.4 `JWT_SECRET` shadow-override fix and the Phase 6.10 `lt-wordlist-api` token move. The `inventory-server/deploy/ecosystem.config.cjs.proposed` staging file was removed after apply — recreate from the spec below if future changes are needed.
```js
module.exports = {
@@ -828,19 +828,19 @@ These came up in the audit but aren't part of this refactor:
## Concrete deliverables
When this is done:
State as of 2026-05-24: everything below is **shipped** except Phase 5 (acot-server ESM conversion), which is the only remaining work item. Note: the "4 application PM2 processes" original target became **5** in execution because `chat-server` stayed standalone rather than being folded in — never a serious merge candidate (different DB, different protocol shape).
- 4 application PM2 processes instead of 12 (plus 2 unchanged: acot-phone, lt-wordlist).
- All `/api/*` and `/chat-api/*` requests gated at Caddy and re-verified at each upstream.
- Sensitive endpoints additionally gated by per-permission checks.
- One ESM standard across the entire Node codebase.
- One shared `lib/` for auth, logging, DB, errors, CORS.
- Login rate-limited.
- `JWT_SECRET` rotated.
- Old auth-server, Aircall, Gorgias, Clarity directories deleted from the repo.
- Caddyfile slimmed to one auth-gated block.
- Permission codes inserted into `permissions` table for granular authorization.
- No half-finished pieces, no `// TODO: add auth later` comments, no deferred secrets cleanup.
- ✅ 5 application PM2 processes instead of 12 (auth-server, inventory-server, dashboard-server, acot-server, chat-server) — plus 2 unchanged (acot-phone-server, lt-wordlist-api) = 7 total.
- All `/api/*`, `/chat-api/*`, and `/uploads/*` requests gated at Caddy (`forward_auth`) and re-verified at each upstream (`authenticate()`).
- Sensitive endpoints additionally gated by per-permission checks (`requirePermission`).
- ⚠️ One ESM standard — done for auth/inventory/dashboard/chat. **acot-server still CJS (Phase 5 pending).**
- One shared `lib/` at `inventory-server/shared/` for auth, logging, DB, errors, CORS.
- Login rate-limited (`shared/rate-limit/login.js`).
- `JWT_SECRET` rotated + ecosystem shadow-override removed.
- Old auth-server, Aircall, Gorgias, Clarity directories deleted from the repo. Defunct `dashboard:gorgias`/`dashboard:calls` permission rows also deleted from DB (2026-05-24).
- Caddyfile slimmed to one auth-gated block.
- Permission codes inserted into `permissions` table for granular authorization.
- No half-finished pieces, no `// TODO: add auth later` comments, no deferred secrets cleanup.
---
@@ -860,7 +860,7 @@ These are decisions made during Phase 1/2 implementation that amend the spec abo
5. **CORS allowed origins kept `https://acot.site`.** Plan example listed three origins; production has acot.site as a redirect to tools.acherryontop.com but also reaches the API directly in some flows. Kept it to avoid breakage. LAN wildcards (`192.168.*`, `10.*`) and `Access-Control-Allow-Origin "*"` are NOT included in the new `shared/cors/policy.js` per the plan's Phase 6.6 spirit, but the legacy `inventory-server/src/middleware/cors.js` still has them until services are migrated to consume `shared/cors/`.
6. **Defunct permission codes left in DB.** Removed the `dashboard:gorgias` and `dashboard:calls` Protected blocks from the frontend, but the corresponding permission rows in the `permissions` table are still there (assigned to some users). They're inert (no UI references them) but should be cleaned up alongside the Phase 6.2 permissions migration.
6. **Defunct permission codes cleaned up (2026-05-24).** Removed the `dashboard:gorgias` and `dashboard:calls` Protected blocks from the frontend; the corresponding permission rows in the `permissions` table (and their user_permission grants) were deleted in a follow-up migration alongside the Phase 6.2 permissions seed. Verified post-migration: `permissions` table contains only the in-use `dashboard:*` codes (analytics, campaigns, feed, financial, meta_campaigns, operations, payroll, products, realtime, sales, stats, typeform, user_behavior).
7. **PM2 process names retained `new-auth-server` (not `auth-server`).** Plan's Phase 8 final form names it `auth-server` (after the legacy 3003 one is removed). Decided to keep the existing `new-auth-server` name through Phase 2 to avoid a rename mid-stream. Phase 8 can rename if desired, but it's cosmetic — all wiring is by port (3011) not name.
@@ -879,3 +879,21 @@ These are decisions made during Phase 1/2 implementation that amend the spec abo
14. **Frontend Bearer-header gap discovered (drives new Phase F1).** Phase 6 was specified assuming the frontend already sends `Authorization: Bearer` on every API call. It does not — only 7 of ~220 call sites do. Phase 6's `authenticate()` middleware is shipped and ready to enable, but until F1 lands the SPA will 401 on every page. The plan now has Phase F1 to address this explicitly; until then, the Phase 3+6 pm2 reload should not ship unless F1 ships in the same window.
15. **macOS NFS workflow note.** The `inventory-server/` directory locally is an NFS mount of `/var/www/inventory/` on netcup. Bulk operations (`find`/`grep -r`/mass `node --check`/`npm install`) hang or take minutes locally and pollute file listings with macOS AppleDouble `._*` sidecar files. Default to `ssh netcup` for any sweep across the tree — individual file edits via the editor are fine.
16. **dashboard-server lives at `inventory-server/dashboard/` (not its own top-level dir).** Plan's Phase 4 diagram implied a sibling of `inventory-server/` etc. The merged service lives at `inventory-server/dashboard/server.js` with `package.json` declaring `"type": "module"`. Per-vendor subdirectories (`klaviyo-server/`, `meta-server/`, `google-server/`, `typeform-server/`) each have their own `package.json` so Node's "nearest parent package.json" walk stops there — they are unaffected by the new parent type. Added `"type": "commonjs"` defensively to meta/google/typeform package.json so a future deletion of their files (cutover cleanup) plus a stray `*.js` left under `dashboard/` wouldn't accidentally try to ESM-parse it.
17. **Klaviyo `RedisService` kept as a wrapper, but accepts injected client.** Plan said "replace each server's per-instance pool/redis with the injected one." The Klaviyo codebase has ~3K LOC of service code (`events.service.js` alone is 2.2K) that calls `this.redisService._getCacheKey()`, `.get()`, `.set()`, `.getEventData()`, `.clearCache()`, `._getTTL()`. Rewriting all of that to call ioredis directly would risk breaking the cache-key/TTL invariants. Decision: keep `services/klaviyo/redis.service.js` as a thin facade with the same public surface, but its constructor now takes an ioredis client instead of constructing one. The 3 service classes (`EventsService`, `CampaignsService`, `ReportingService`) all take `(apiKey, apiRevision, redis)` and pass `redis` to `new RedisService(redis)`. `MetricsService` doesn't use Redis — left unchanged.
18. **dashboard-server `.env` layering.** Plan called for "Single `.env` at `inventory/dashboard/.env`, prefixed keys: KLAVIYO_*, META_*, ... JWT_SECRET ... # shared with auth-server." Implemented as two-file layering: `server.js` loads `/var/www/inventory/.env` FIRST (provides JWT_SECRET, DB_*, REDIS_*) then `inventory-server/dashboard/.env` SECOND for vendor-specific keys (KLAVIYO_API_KEY, META_*, GA_*, TYPEFORM_*). dotenv defaults to `override:false`, so the first file wins on collisions — security-critical vars live in one place, vendor keys in the other. `.env.example` template committed at `dashboard/.env.example`. **Pre-cutover step**: copy the vendor keys from the current per-vendor `.env` files into either of those two files before `pm2 reload`, else KLAVIYO_API_KEY etc. will not be set and routes will 500.
19. **Caddyfile typo fixed: `/api/google-analytics` → `/api/dashboard-analytics`.** The pre-Phase-4 `Caddyfile.proposed` listed a `handle /api/google-analytics/*` block. The live Caddyfile and the frontend (`inventory/src/config/dashboard.ts`) both use `/api/dashboard-analytics/*` (the live file has a `uri replace /api/dashboard-analytics /api/analytics` rewrite to land on google-server's `/api/analytics` mount). The merged dashboard-server now mounts the Google router at `/api/dashboard-analytics` directly — Caddy no longer needs the rewrite, just a straight reverse_proxy. Fixed in `deploy/Caddyfile.proposed`.
20. **`metrics.routes.js` had a latent router-scope bug.** The Klaviyo `metrics.routes.js` declared `const router = express.Router()` at MODULE scope (outside the `createMetricsRouter` factory), so calling the factory twice would have re-mounted handlers on the same router (cumulative). Benign for a single-mount PM2 service, but fixed during the Phase 4 copy — the router now lives inside the factory. Also renamed the export from `createMetricsRoutes` (plural) to `createMetricsRouter` (matches the convention used by every other vendor's index.js).
21. **PM2 log paths use per-server `logs/pm2/` (NOT `/var/log/pm2/`).** Discovered during the first apply attempt: the previously-shipped `ecosystem.config.cjs.proposed` carried over `/var/log/pm2/...` from the live file, but matt has no write perms on `/var/log` (root:syslog 775) so the entries silently failed to launch (chat-server + acot-server came up because they had no explicit log path; new-auth-server, inventory-server, dashboard-server bailed). The actual convention — already in place via pre-created folders on disk — is per-service `logs/pm2/` directly under each service's directory (`./inventory/auth/logs/pm2/`, `./inventory/chat/logs/pm2/`, `./inventory/dashboard/acot-server/logs/pm2/`, `./inventory/dashboard/logs/pm2/` for the merged dashboard-server, `./inventory/logs/pm2/` for inventory-server, `/opt/lt-wordlist-api/logs/pm2/`, `/var/www/acot-phone/logs/pm2/`). All folders are matt:matt. `pm2-logrotate` (already loaded in matt's daemon) rotates them in place.
22. **All PM2 apps run under matt's single daemon — no root daemon.** The earlier `OUT OF SCOPE` comment block in the proposed ecosystem incorrectly claimed `lt-wordlist-api` and `acot-phone-server` were managed by a separate root PM2 daemon. They are not — matt's daemon manages everything. Removed the bogus block; both apps are now first-class entries in the proposed ecosystem with corrected script paths:
- `lt-wordlist-api` script is `/opt/lt-wordlist-api/index.js` (was `/opt/lt-wordlist-api/server.js` in the live file — wrong; that file doesn't exist). `/opt/lt-wordlist-api` is matt:matt 0750.
- `acot-phone-server` script is `/var/www/acot-phone/dist/server.js` (was `./inventory/acot-phone/server.js` in the live file — wrong; that path doesn't exist). `/var/www/acot-phone/` is matt:matt with its own `.env` and is a separate repo from inventory-server.
23. **Phase 6.10 ADD_WORD_TOKEN move stays in this ecosystem.** Per Deviation #22, `lt-wordlist-api` is in matt's ecosystem, so the §6.10 work to remove inline `ADD_WORD_TOKEN` and load it from `/opt/lt-wordlist-api/.env` instead is implemented directly in `deploy/ecosystem.config.cjs.proposed` (no inline `ADD_WORD_TOKEN`; script reads its own .env). When applying, rotate the token value in `/opt/lt-wordlist-api/.env` and update any callers.
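The first-file-wins behaviour that Deviation 18 relies on can be modelled without dotenv itself. The sketch below imitates dotenv's default `override: false` rule on plain objects; the file contents and key values are hypothetical.

```js
// Apply parsed .env files to `target` in load order. A key that is already
// present is left untouched, mirroring dotenv's default `override: false`.
function layerEnv(target, ...parsedFiles) {
  for (const parsed of parsedFiles) {
    for (const [key, value] of Object.entries(parsed)) {
      if (!(key in target)) target[key] = value;
    }
  }
  return target;
}

// Hypothetical file contents.
const sharedEnv = { JWT_SECRET: 'from-shared', DB_HOST: 'localhost' };
const dashboardEnv = { JWT_SECRET: 'stray-copy', KLAVIYO_API_KEY: 'k-123' };

const env = layerEnv({}, sharedEnv, dashboardEnv);
console.log(env.JWT_SECRET); // from-shared
console.log(env.KLAVIYO_API_KEY); // k-123
```

Because the shared file is loaded first, a stray `JWT_SECRET` left in the vendor-level file cannot shadow the real one. Note that the same rule also means a variable already set in the process environment before either file loads would win over both.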