# Import & Metrics Pipeline Fix Plan Fixes for issues found in a full review (2026-06-10) of the `full-update.js` pipeline: `inventory-server/scripts/full-update.js` → `import-from-prod.js` (6 importers in `scripts/import/`) → `calculate-metrics-new.js` (7 SQL modules in `scripts/metrics-new/`). Every issue below was verified against the code, and where marked **[verified-live]**, against the live MySQL source (`sg` on 192.168.1.5 via the acot-db tooling / `ssh workpi`) and live PostgreSQL (`inventory_db` — `ssh netcup`, then `psql -U inventory_readonly`, password in `/Users/matt/Dev/inventory/CLAUDE.md`). Write credentials for migrations: see `/var/www/inventory/.env` on netcup (`inventory_user`). ## Operational context (read first) - Local `inventory-server/` is **NFS-mounted** to `/var/www/inventory/` on the netcup server — edits appear on the server with no copy step. Run heavy validation/grep/find **on the server via `ssh netcup`**, not locally (NFS hangs + AppleDouble `._*` noise). - The PG server timezone is **Europe/Berlin**. The business operates in **America/Chicago**. This matters for Fix 2. - MySQL server is America/Chicago; the mysql2 driver is configured `timezone: '-05:00'` and corrected at runtime by `adjustDateForMySQL()` in `scripts/import/utils.js` (see `memory/TIMEZONE_ISSUE.md`). Don't "fix" that part — it already works. - Orders/PO/products imports are incremental by default (`INCREMENTAL_UPDATE !== 'false'`); a full orders sync = run with `INCREMENTAL_UPDATE=false` (5-year window). - Existing rebuild tooling: `scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (rebuilds `daily_product_snapshots` from `orders`/`receivings`). The full-pipeline order after data fixes: re-import → rebuild snapshots → `node scripts/calculate-metrics-new.js`. - Precedent: `scripts/metrics-new/migrations/002_fix_discount_double_counting.sql` documents the procedure used last time a discount formula changed. Follow the same pattern (migration doc + code fix + full re-import + rebuild). --- ## P0 — Data correctness (do both, then ONE re-import + rebuild) ### Fix 1: Item-level promo discounts dropped (~$26K / 30 days ≈ 10% of product revenue) [verified-live] **File:** `scripts/import/orders.js` — `order_totals` CTE (~lines 604-623) and the discount fetch in `processDiscountsBatch` (~lines 379-383). **Problem.** The discount applied to each PG `orders` row is: prorated `summary_discount_subtotal` + item-level promo discounts. The item-level part is gated: ```sql SUM(CASE WHEN COALESCE(md.discount_amount_subtotal, 0) > 0 THEN id.amount ELSE 0 END) ``` In the PHP source (`/Users/matt/Dev/acot/website/website/lib/neworder.class.php`): - `order_items.prod_price` is the **pre-promo** price; `summary_subtotal = Σ prod_price·qty` (line ~3087). - Item-level promo discounts live in `order_discount_items` with `which = 2`; they are applied to the order total via `summary_discount += amount + products_disc_sum` (line ~6567) — i.e. they are **not** part of `discount_amount_subtotal` and **not** baked into `prod_price`. - Live data (90 days): of 10,010 type-10 promo discounts, **8,070 have item rows but only 8 have `discount_amount_subtotal > 0`** — the gate zeroes essentially all item-level promo discounts. - Live impact (30 days): **$25,989 dropped** across 2,021 orders, vs only $13,574 captured via the prorated subtotal component. Order discount components, 30d: total $54,957 = $13,574 subtotal + $15,395 shipping + ~$25,989 item-level. (Shipping discounts correctly excluded from product revenue.) **Consequence.** `orders.discount` understated → `net_revenue`, `profit_30d`, `margin_30d` overstated by ~10% of revenue; `discounts_30d` / `discount_rate_30d` ~3x understated. Flows into daily snapshots, product/brand/vendor/category metrics, and dashboards. **Fix.** 1. In `processDiscountsBatch`, fetch only real item discounts: `SELECT order_id, pid, discount_id, amount FROM order_discount_items WHERE order_id IN (?) AND which = 2`. (`which=1` rows store prices of free promo-added items; `which=3` are usage records — neither is a discount amount.) 2. In the `order_totals` CTE, remove the gate — sum `id.amount` unconditionally: `SUM(COALESCE(id.amount, 0)) AS promo_discount_sum` (drop the join/CASE on `temp_main_discounts`; `temp_main_discounts` becomes unused and can be removed entirely along with its insert loop). 3. Sanity guard (optional, recommended): clamp final per-row discount to `price * quantity`. **Verification.** After a FULL orders re-import, for a recent 30-day window PG should satisfy: `SUM(discount)` ≈ MySQL `Σ summary_discount_subtotal` + `Σ order_discount_items.amount (which=2)` over the same orders (± rounding from proration). Spot-check an order with a type-10 promo: discount on the affected pid ≈ the `which=2` amount. Re-run migration 002's verification query too (pids 624756, 614513) to confirm no regression of the prior fix. ### Fix 2: Daily snapshots bucket sales by Europe/Berlin days, not business days [verified-live] **Files:** `scripts/metrics-new/update_daily_snapshots.sql` (SalesData join `o.date::date = _target_date` ~line 138; gap-fill and stale-detection aggregates at lines ~47-83); `scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (same pattern — check & fix); `scripts/metrics-new/update_product_metrics.sql` (`HistoricalDates` `MIN(o.date)::date` etc., lines ~131-147). **Problem.** `orders.date` is `timestamptz`; `::date` casts in the server TZ (**Europe/Berlin**, verified via `SHOW timezone`). Berlin is 7-8h ahead of Central, so every order placed after ~5 PM Central lands on the **next** snapshot day. This shifts a large evening slice of daily sales forward one day; skews `yesterday_sales`, day-of-week patterns (the forecast engine's DOW multipliers, daily-grain forecast accuracy — see `FORECAST_FIX_PLAN.md`), and is inconsistent with `stock_snapshots`, whose dates come from a Central-time MySQL cron. **Fix.** Bucket all order/receiving dates in business time. Replace every `o.date::date` / `received_date::date` used for *day bucketing* in the two snapshot SQL files with: ```sql (o.date AT TIME ZONE 'America/Chicago')::date ``` Apply consistently in: SalesData, ReceivingData, the gap-fill date lists, the stale-detection aggregates (they must match SalesData or every day looks permanently stale), and the rebuild script. `HistoricalDates` in update_product_metrics (first/last sold dates) should match too. Add an index to keep the per-day loop fast, e.g. `CREATE INDEX ON orders ( ((date AT TIME ZONE 'America/Chicago')::date) );` and equivalent on `receivings(received_date)`; check `EXPLAIN` on the SalesData query afterward. Note: `receivings.received_date` came from MySQL DATETIME (Central literal) inserted as timestamptz — it was interpreted in the *session* TZ at insert. Before converting, spot-check a few receivings against MySQL to confirm which TZ the stored instants actually represent; the conversion expression must yield the Central calendar day MySQL shows. Same check for `orders.date` (it originates from `_order.date_placed`, a TIMESTAMP column, so it should be a correct instant — `AT TIME ZONE 'America/Chicago'` is right for it). **Verification.** Pick 2-3 recent days; compare per-day `units_sold` totals in `daily_product_snapshots` against MySQL `SELECT date_placed_onlydate, SUM(qty_ordered) ... WHERE order_status >= 20 GROUP BY 1` (MySQL stores Central days). They should now match closely (small diffs from canceled-status timing). ### P0 execution order (single pass) 1. Land Fix 1 (orders.js) and Fix 2 (both snapshot SQL files + product-metrics date CTE). 2. Full orders re-import: `INCREMENTAL_UPDATE=false node scripts/import-from-prod.js` (or at minimum the orders step) — run on the server, it's long. 3. Rebuild snapshots: `psql -f scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (after confirming it contains the TZ fix). The hourly job's 90-day self-heal will NOT fix history beyond 90 days by itself; the explicit rebuild is required. 4. `node scripts/calculate-metrics-new.js`. 5. Expect dashboards to show: margins down ~8-10 points (real), daily sales curves shifted, DOW profile changed. Tell the user before/after numbers. --- ## P1 — Wrong or drifting numbers, fix soon ### Fix 3: Vendor avg lead time computed over a near-cartesian join **File:** `scripts/metrics-new/calculate_vendor_metrics.sql`, `VendorPOAggregates` (lines ~62-83). **Problem.** Joins each done-PO line to **every** receiving of the same (pid, supplier) after the PO date — a product received 10 times contributes 10 ever-growing lead times → overstated, busy-product- weighted vendor lead time. The per-product version in `update_periodic_metrics.sql` (lines 27-48) is correct (MIN receiving per PO within 180 days, then average). **Fix.** Reuse the periodic shape, aggregated to vendor: ```sql WITH po_first_receiving AS ( SELECT po.vendor, po.po_id, po.pid, po.date::date AS po_date, MIN(r.received_date::date) AS first_receive_date FROM purchase_orders po JOIN receivings r ON r.pid = po.pid AND r.supplier_id = po.supplier_id AND r.received_date >= po.date AND r.received_date <= po.date + INTERVAL '180 days' WHERE po.status = 'done' AND po.date >= CURRENT_DATE - INTERVAL '1 year' AND po.vendor IS NOT NULL AND po.vendor <> '' GROUP BY po.vendor, po.po_id, po.pid, po.date ) SELECT vendor, COUNT(DISTINCT po_id) AS po_count_365d, ROUND(AVG(GREATEST(1, first_receive_date - po_date)))::int AS avg_lead_time_days_hist FROM po_first_receiving GROUP BY vendor ``` **Verification.** For a few vendors compare old vs new values; new should be materially lower and roughly match `AVG(product_metrics.avg_lead_time_days)` for that vendor's products. ### Fix 4: Deleted order items & combined orders never reconciled in PG [verified-live] **File:** `scripts/import/orders.js`. **Problem.** The orders import upserts but never deletes: - Items removed from an order in MySQL (`DELETE FROM order_items ...` happens, e.g. neworder.class.php ~line 6500 for unpicked promo items, plus staff edits) leave stale rows in PG forever. May 2026 check: PG has 49,841 item rows vs MySQL 49,377 (+0.9%) — and PG should be ≤ MySQL. - Combining orders (`combine_orders`, neworder.class.php ~11946) sets the source orders to status 16 AND **zeroes `date_placed`**, then copies all items to a NEW order. Because the import query filters `o.date_placed >= …`, a combined source order can never be re-fetched, so its stale 'placed' rows would double-count with the new merged order. Currently latent (last combine 2024-07, predating current PG data — verified no stale rows exist today), but it will silently corrupt the day combining is used again. **Fix.** Two parts, both inside the orders import after the upsert phase: 1. **Item-set reconciliation** for re-imported orders: the import already knows the set of changed `orderIds` and inserted their current items into `temp_order_items`. Mirror the PO import's pattern (`purchase-orders.js` lines ~683-694): ```sql DELETE FROM orders o WHERE o.order_number = ANY($1) -- orders fetched this run AND NOT EXISTS (SELECT 1 FROM temp_order_items t WHERE t.order_id = o.order_number AND t.pid = o.pid); ``` 2. **Combined/cancelled sweep** that does NOT depend on `date_placed`: each run, fetch from MySQL `SELECT order_id, order_status FROM _order WHERE order_status IN (15,16) AND stamp > ?` (no date_placed filter) and update matching PG rows' `status`/`canceled` ('combined' rows are then excluded from metrics — see Fix 5). Cheap (small result set). **Verification.** Re-run the May-2026 row-count comparison (MySQL vs PG for one month) after one full run; counts should converge (PG ≤ MySQL, diff explained by TZ window edges only). ### Fix 5: 'combined' orders are counted as sales **Files:** `scripts/metrics-new/update_daily_snapshots.sql` (status filters, lines ~77, 120-134), `update_product_metrics.sql` (`HistoricalDates` line ~145, `LifetimeRevenue` line ~249), `backfill/rebuild_daily_snapshots.sql`. **Problem.** Sales filters exclude only `('canceled', 'returned')`. Status 16 'combined' = "merged into another order" — the new order carries the same items, so counting both double-counts. 826 combined orders exist in MySQL; today none are in PG (see Fix 4), but once Fix 4's sweep starts marking rows 'combined', the metrics filters must exclude them. **Fix.** Change every `NOT IN ('canceled', 'returned')` in the metrics SQL to `NOT IN ('canceled', 'returned', 'combined')`. Grep for the pattern in `scripts/metrics-new/` and `src/routes/` (dashboard endpoints replicate these filters — see CLAUDE.md analytics-filters note). ### Fix 6: Incremental sync watermark race (silent permanent misses) **Files:** `scripts/import/orders.js` (~772), `products.js` (~934), `purchase-orders.js` (~833). **Problem.** `sync_status.last_sync_timestamp` is set to `NOW()` *after* the import finishes. Any MySQL row modified between the source query and that write is below the new watermark but was never fetched → permanently skipped (until a full sync or the row changes again). Long imports widen the window; PG/MySQL clock skew adds to it. **Fix.** Capture the watermark **before** the source query and write that value: ```js const [[{ now: sourceNow }]] = await prodConnection.query('SELECT NOW() as now'); // ... do the import ... await localConnection.query( `INSERT INTO sync_status ... VALUES ('orders', $1) ON CONFLICT ... SET last_sync_timestamp = $1`, [sourceNow]); ``` Using MySQL's own clock also eliminates cross-server skew. Note `sourceNow` comes back through the mysql2 driver TZ conversion — verify round-tripping with `adjustDateForMySQL` produces a correct comparison value, or store `UTC_TIMESTAMP()` and compare against `CONVERT_TZ`-normalized stamps. Overlap (re-importing rows changed during the run) is harmless — everything is upserted. ### Fix 7: Stockout days / service level / fill rate / avg stock built on activity-only snapshots **Files:** `scripts/metrics-new/update_product_metrics.sql` — `SnapshotAggregates` (`stockout_days_30d`, `avg_stock_*_30d`, lines ~177-189), `ServiceLevels` (lines ~304-323), plus `calculate_sales_velocity` usage. **Problem.** `daily_product_snapshots` only has rows on days with sales/receivings. So: - A product that is out of stock (and therefore sells nothing) gets **no row** → `stockout_days_30d` ≈ 0 exactly when stockouts matter → `calculate_sales_velocity(sales, stockout_days)`'s adjustment is inert → velocity and replenishment understated for constrained products. - `service_level_30d` divides stockout days by COUNT(activity days), not 30. - `avg_stock_units_30d` / `avg_stock_cost_30d` average only activity days (biased toward in-stock days) → GMROI / stockturn / sell-through denominators biased. - `fill_rate_30d`'s `units_sold * 0.2` lost-sales heuristic is arbitrary — fine to keep, but document. **Fix.** Derive stock-presence metrics from `stock_snapshots` (full daily coverage from MySQL `snap_product_value`, imported by `stock-snapshots.js`) instead of `daily_product_snapshots`: ```sql StockCoverage AS ( SELECT pid, COUNT(*) FILTER (WHERE stock_quantity <= 0) AS stockout_days_30d, AVG(stock_quantity) AS avg_stock_units_30d, AVG(stock_value) AS avg_stock_cost_30d FROM stock_snapshots WHERE snapshot_date >= _current_date - INTERVAL '29 days' GROUP BY pid ) ``` Treat products absent from `stock_snapshots` for a day as unknown (NULL), not in-stock. Keep `daily_product_snapshots` for sales/revenue aggregates. `service_level_30d` denominator becomes the count of covered days. Note `stock_snapshots` has no `eod_stock_retail`; keep retail/gross averages on the old source or compute as `stock_quantity * current price` explicitly. **Verification.** Pick products that had a known stockout period; `stockout_days_30d` should now be > 0 and `sales_velocity_daily` should rise accordingly. --- ## P2 — Definition / robustness improvements ### Fix 8: Returns don't reduce COGS; LifetimeRevenue ignores returns `update_daily_snapshots.sql` SalesData: COGS accrues only on `quantity > 0` rows; return rows (negative qty — 15,875 rows live) subtract revenue but never COGS → margin understated in return-heavy periods. Add a returns-COGS term mirroring the sales-COGS COALESCE chain (`SUM(... WHEN quantity < 0 THEN cost * ABS(quantity))`) and subtract it in `cogs` (or store `returns_cogs` separately and use `cogs - returns_cogs` in profit). Also `LifetimeRevenue` in `update_product_metrics.sql` (line ~242) filters `quantity > 0` — include negative-qty rows so lifetime revenue nets out returns (drop the quantity filter; `price*quantity` is already signed, but check the `- discount` term sign for return rows). ### Fix 9: return_rate_30d definition `update_product_metrics.sql` line ~468: `returns / (sales + returns)` → industry standard is `returns / sales`. Change denominator to `NULLIF(sa.sales_30d, 0)`. ### Fix 10: GMROI not annualized Line ~466: `profit_30d / avg_stock_cost_30d` is a monthly GMROI (~1/12 of the conventional annual figure, benchmark ≥ 2-3). Either annualize (`* 12.17`) or rename the column/label "monthly". Decision for Matt; annualizing is recommended for comparability. Frontend displays must be checked either way. ### Fix 11: get_weighted_avg_cost is a lifetime WAC `db/functions.sql` (~line 81, deployed identically): averages ALL receivings ≤ date — decade-old costs weigh equally. Recommended: window to recent receivings, e.g. last 365 days falling back to lifetime when none. Used as fallback COGS when `o.costeach` is NULL, so impact is modest but real for long-lived SKUs. Apply with `CREATE OR REPLACE FUNCTION` in `db/functions.sql` AND on the live DB. ### Fix 12: exclude_from_forecast removes products from product_metrics entirely `update_product_metrics.sql` line ~627 (`WHERE s.exclude_forecast IS FALSE OR ... IS NULL`): the flag's name implies forecast-only, but excluded products get NO metrics row → vanish from brand/ vendor/category rollups and dashboards. Fix: always emit the row; instead NULL the forecast/replenishment columns when excluded (wrap those expressions in `CASE WHEN s.exclude_forecast THEN NULL ELSE ... END`). ### Fix 13: Incremental products import misses category-only changes `products.js` incremental WHERE (~lines 433-440) keys on `p.stamp`, `ci.stamp`, price/b2b dates — `product_category_index` changes don't bump any of those → PG `product_categories` goes stale. Also the `needs_update` comparison (~lines 604-625) doesn't compare `categories`, so even refetched rows skip the category rewrite. Fix both: add `t.categories IS NOT DISTINCT FROM p.categories` to the needs_update comparison (note: `products.categories` is the GROUP_CONCAT string — confirm PG column holds the same representation), and add a cheap full-sweep (e.g. weekly, or compare `COUNT(*) GROUP BY pid` hashes) OR include `EXISTS (SELECT 1 FROM product_category_index pci WHERE pci.pid = p.pid AND pci.stamp > ?)` in the incremental WHERE if that table has a stamp column — verify schema first (`DESCRIBE product_category_index`). ### Fix 14: PO/receivings OFFSET pagination over a moving filter `purchase-orders.js` (~lines 275-298, 447-470): `LIMIT/OFFSET` with a `date_updated > ?` predicate; concurrent updates shift rows between pages → silent skips. Fix: keyset pagination — `WHERE ... AND p.po_id > ? ORDER BY p.po_id LIMIT 500`, carrying the last seen po_id (drop OFFSET). Same for receivings on `receiving_id`. ### Fix 15: Status map gaps and unsafe defaults - `orders.js` orderStatusMap lacks 45 (`payment_pending`) and 67 (`remote_send`) → imported as numeric strings. Add both (mirror in `migrations/001_map_order_statuses.sql` as a follow-up update for existing rows). - `purchase-orders.js` `poStatusMap[po.status] || 'created'` (line ~335): an unknown *cancel-like* code would be treated as an open PO and inflate on-order FIFO. Default to a sentinel like `'unknown_'` instead, and make the FIFO/on-order CTEs in `update_product_metrics.sql` treat only the known-open statuses as open (they already whitelist open statuses — so the sentinel is safe there; just ensure nothing treats unknown as 'created'). Same for receivingStatusMap. ### Fix 16: Transactions issued through the pool wrapper land on arbitrary connections `categories.js` (lines ~17-152) and `daily-deals.js` (~27-130) call `query('BEGIN')` / `query('COMMIT')` on the wrapper, which checks out a client per call — BEGIN/work/COMMIT are not guaranteed to share a connection (works only by pool-LIFO accident). The categories `DISABLE TRIGGER` rides on this too. Fix: use the wrapper's `beginTransaction()/commit()/rollback()` (see `utils.js` lines 121-148) exactly as orders.js does. In categories.js also move the post-COMMIT `ENABLE TRIGGER` inside the transaction (DISABLE/ENABLE both inside), or drop the trigger toggling entirely if the trigger isn't actually problematic anymore. ### Fix 17: stock-snapshots import swallows batch errors → permanent holes `stock-snapshots.js` (~lines 153-155): a failed batch is logged and skipped, but the next incremental starts at `MAX(snapshot_date)` — the hole is never revisited. Fix: rethrow (fail the step) or collect failed date ranges and retry once, then fail if still failing. Also line ~168: `calculateRate(processedRows, startTime)` — arguments reversed (signature is `calculateRate(startTime, current)`, see `metrics-new/utils/progress.js:70`). ### Fix 18: Metrics cancellation targets an application_name that's never set `calculate-metrics-new.js` line ~180 cancels backends `WHERE application_name = 'node-metrics-calculator'`, but the Pool config never sets it → cancellation no-ops (the 30-min `statement_timeout` is the only real guard). Fix: add `application_name: 'node-metrics-calculator'` to both dbConfig branches. ### Fix 19: Aggregate-table change-detection lists miss cost-only changes `calculate_brand_metrics.sql` / `calculate_vendor_metrics.sql` / `calculate_category_metrics.sql` ON CONFLICT WHERE lists don't include `profit_30d`/`cogs_30d` — a cost revision with unchanged sales/revenue leaves stale rows (product_metrics has a 1-day staleness net; rollups don't). Add `... OR x.profit_30d IS DISTINCT FROM EXCLUDED.profit_30d OR x.cogs_30d IS DISTINCT FROM EXCLUDED.cogs_30d` to each, or add a `last_calculated < NOW() - INTERVAL '1 day'` net like product_metrics line ~707. ### Fix 20: Snapshot stale-detection only compares unit counts `update_daily_snapshots.sql` lines ~57-85: detects mismatches in `units_sold`/`units_received` only; price/discount/costeach corrections older than the 2-day recheck are never repaired. Add a revenue comparison to the stale check: compare `SUM(net_revenue)` per day against the equivalent recomputed from `orders` (ROUND both to 2dp to avoid float-noise churn). ### Fix 21: Category metrics positive-only revenue asymmetry `calculate_category_metrics.sql` (lines ~27-36, 64-73): revenue summed only when `> 0` while cogs/profit use COALESCE-all → margin numerator/denominator from different populations, and inconsistent with brand/vendor (plain COALESCE). Change the revenue/sales CASEs to `COALESCE(pm.revenue_7d, 0)` etc., matching brand_metrics. ### Fix 22 (decision needed): Demand-pattern & seasonality definitions - `classify_demand_pattern` (db/functions.sql): CV thresholds 0.2/0.5 + avg<1/day. Industry standard is Syntetos-Boylan: ADI ≥ 1.32 and CV² ≥ 0.49 quadrants (smooth/erratic/intermittent/lumpy). Today everything classifies sporadic/lumpy. If adopting SB: ADI = 30 / COUNT(days with sales), CV² computed on nonzero-demand sizes. Changes the vocabulary consumed by the forecast engine (`scripts/forecast/forecast_engine.py` reads `demand_pattern`) — coordinate before changing. - SeasonalityAnalysis (`update_product_metrics.sql` ~360): `month_avg = AVG(units_sold)` over rows with sales only → intensity, not volume. Use monthly totals (SUM, with zero months counted) / overall monthly average for the index. - Safety stock: currently static config units; `sales_std_dev_30d` exists but is unused. Optional upgrade: `safety = z * σ_d * sqrt(lead_time)` with z from a service-level setting. These change user-facing semantics — confirm with Matt before implementing. --- ## Verified non-issues (no action, or cleanup only) - **`costeach` fallback `price * 0.5`** (orders.js line ~615): fires on **2.1%** of item rows (729/34,833, last 30d, live-verified). Accepted by Matt — 50% margin is a fair estimate for these products. Optional: nothing. - **Missing-product order skips**: zero occurrences — MySQL has no orphan order_items (1-year check), PG products is a superset of MySQL products (687,579 vs 687,576), last 7 import runs all logged `totalSkipped: 0`. Cleanup only: remove the unused `importMissingProducts` import line at `orders.js:2` (the function itself stays in products.js — harmless utility). - **Status 30 'cancelled_old'** in `total_sold >= 20` filter: zero rows live in `_order` — safe. - **Duplicate (order_id, pid) order items**: none exist in MySQL — the upsert PK is safe. - **base_discount** in orders.js: computed/stored in temp table but unused since migration 002 — remove the column from temp table + queries for clarity (no behavior change). - **`full-update.js` `runScript`**: try/catch around `console.log` is dead code; per-step `status:'complete'` messages could confuse a UI parser. Cosmetic only — tidy if touching the file. ## Suggested implementation order | Step | Fixes | Re-import/rebuild needed | |---|---|---| | 1 | Fix 1 + Fix 2 (+ Fix 5 filters, Fix 8/9 while editing the same SQL) | FULL orders re-import → snapshot rebuild → metrics (once) | | 2 | Fix 4 + Fix 6 (orders.js reconciliation + watermarks; POs/products watermarks too) | no | | 3 | Fix 3, Fix 7 (metrics SQL only) | metrics run | | 4 | Fix 13-21 (robustness batch) | no | | 5 | Fix 10-12, Fix 22 after Matt's sign-off (definition changes) | metrics run | After step 1, expect: margin_30d down ~8-10 points, discounts_30d ~3x up, daily curves shifted to correct business days. Communicate before/after so the change isn't mistaken for a data incident. ## Reference: verification snippets used in the review ```sql -- MySQL: item-level discounts dropped by the gate (30d) SELECT COUNT(DISTINCT o.order_id), ROUND(SUM(odi.amount),2) FROM order_discount_items odi JOIN order_discounts od ON od.order_id=odi.order_id AND od.discount_id=odi.discount_id JOIN _order o ON o.order_id=odi.order_id WHERE odi.which=2 AND o.date_placed >= DATE_SUB(CURDATE(), INTERVAL 30 DAY) AND o.order_status >= 20 AND COALESCE(od.discount_amount_subtotal,0)=0; -- → 2,021 orders / $25,989 (2026-06-10) -- MySQL: costeach fallback frequency (30d) SELECT COUNT(*), SUM(CASE WHEN NOT EXISTS (SELECT 1 FROM order_costs oc WHERE oc.orderid=oi.order_id AND oc.pid=oi.prod_pid AND oc.pending=0) AND NOT EXISTS (SELECT 1 FROM product_inventory pi WHERE pi.pid=oi.prod_pid) THEN 1 ELSE 0 END) FROM order_items oi JOIN _order o ON o.order_id=oi.order_id WHERE o.order_status >= 20 AND o.date_placed >= DATE_SUB(CURDATE(), INTERVAL 30 DAY); -- → 729 / 34,833 = 2.1% (2026-06-10) -- PG: timezone check SHOW timezone; -- Europe/Berlin (2026-06-10) -- Row drift, May 2026: MySQL 49,377 items / PG 49,841 (+0.9%) ```