Files
inventory/IMPORT_METRICS_FIX_PLAN.md
T
2026-06-11 19:32:20 -04:00

450 lines
27 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Import & Metrics Pipeline Fix Plan
Fixes for issues found in a full review (2026-06-10) of the `full-update.js` pipeline:
`inventory-server/scripts/full-update.js``import-from-prod.js` (6 importers in `scripts/import/`)
`calculate-metrics-new.js` (7 SQL modules in `scripts/metrics-new/`).
Every issue below was verified against the code, and where marked **[verified-live]**, against the
live MySQL source (`sg` on 192.168.1.5 via the acot-db tooling / `ssh workpi`) and live PostgreSQL
(`inventory_db``ssh netcup`, then `psql -U inventory_readonly`, password in `/Users/matt/Dev/inventory/CLAUDE.md`).
Write credentials for migrations: see `/var/www/inventory/.env` on netcup (`inventory_user`).
## Operational context (read first)
- Local `inventory-server/` is **NFS-mounted** to `/var/www/inventory/` on the netcup server — edits
appear on the server with no copy step. Run heavy validation/grep/find **on the server via
`ssh netcup`**, not locally (NFS hangs + AppleDouble `._*` noise).
- The PG server timezone is **Europe/Berlin**. The business operates in **America/Chicago**. This
matters for Fix 2.
- MySQL server is America/Chicago; the mysql2 driver is configured `timezone: '-05:00'` and
corrected at runtime by `adjustDateForMySQL()` in `scripts/import/utils.js` (see
`memory/TIMEZONE_ISSUE.md`). Don't "fix" that part — it already works.
- Orders/PO/products imports are incremental by default (`INCREMENTAL_UPDATE !== 'false'`); a full
orders sync = run with `INCREMENTAL_UPDATE=false` (5-year window).
- Existing rebuild tooling: `scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (rebuilds
`daily_product_snapshots` from `orders`/`receivings`). The full-pipeline order after data fixes:
re-import → rebuild snapshots → `node scripts/calculate-metrics-new.js`.
- Precedent: `scripts/metrics-new/migrations/002_fix_discount_double_counting.sql` documents the
procedure used last time a discount formula changed. Follow the same pattern (migration doc +
code fix + full re-import + rebuild).
---
## P0 — Data correctness (do both, then ONE re-import + rebuild)
### Fix 1: Item-level promo discounts dropped (~$26K / 30 days ≈ 10% of product revenue) [verified-live]
**File:** `scripts/import/orders.js``order_totals` CTE (~lines 604-623) and the discount fetch in
`processDiscountsBatch` (~lines 379-383).
**Problem.** The discount applied to each PG `orders` row is:
prorated `summary_discount_subtotal` + item-level promo discounts. The item-level part is gated:
```sql
SUM(CASE WHEN COALESCE(md.discount_amount_subtotal, 0) > 0 THEN id.amount ELSE 0 END)
```
In the PHP source (`/Users/matt/Dev/acot/website/website/lib/neworder.class.php`):
- `order_items.prod_price` is the **pre-promo** price; `summary_subtotal = Σ prod_price·qty` (line ~3087).
- Item-level promo discounts live in `order_discount_items` with `which = 2`; they are applied to the
order total via `summary_discount += amount + products_disc_sum` (line ~6567) — i.e. they are **not**
part of `discount_amount_subtotal` and **not** baked into `prod_price`.
- Live data (90 days): of 10,010 type-10 promo discounts, **8,070 have item rows but only 8 have
`discount_amount_subtotal > 0`** — the gate zeroes essentially all item-level promo discounts.
- Live impact (30 days): **$25,989 dropped** across 2,021 orders, vs only $13,574 captured via the
prorated subtotal component. Order discount components, 30d: total $54,957 = $13,574 subtotal +
$15,395 shipping + ~$25,989 item-level. (Shipping discounts correctly excluded from product revenue.)
**Consequence.** `orders.discount` understated → `net_revenue`, `profit_30d`, `margin_30d` overstated
by ~10% of revenue; `discounts_30d` / `discount_rate_30d` ~3x understated. Flows into daily snapshots,
product/brand/vendor/category metrics, and dashboards.
**Fix.**
1. In `processDiscountsBatch`, fetch only real item discounts:
`SELECT order_id, pid, discount_id, amount FROM order_discount_items WHERE order_id IN (?) AND which = 2`.
(`which=1` rows store prices of free promo-added items; `which=3` are usage records — neither is a
discount amount.)
2. In the `order_totals` CTE, remove the gate — sum `id.amount` unconditionally:
`SUM(COALESCE(id.amount, 0)) AS promo_discount_sum` (drop the join/CASE on `temp_main_discounts`;
`temp_main_discounts` becomes unused and can be removed entirely along with its insert loop).
3. Sanity guard (optional, recommended): clamp final per-row discount to `price * quantity`.
**Verification.** After a FULL orders re-import, for a recent 30-day window PG should satisfy:
`SUM(discount)` ≈ MySQL `Σ summary_discount_subtotal` + `Σ order_discount_items.amount (which=2)`
over the same orders (± rounding from proration). Spot-check an order with a type-10 promo:
discount on the affected pid ≈ the `which=2` amount. Re-run migration 002's verification query too
(pids 624756, 614513) to confirm no regression of the prior fix.
### Fix 2: Daily snapshots bucket sales by Europe/Berlin days, not business days [verified-live]
**Files:** `scripts/metrics-new/update_daily_snapshots.sql` (SalesData join `o.date::date = _target_date`
~line 138; gap-fill and stale-detection aggregates at lines ~47-83);
`scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (same pattern — check & fix);
`scripts/metrics-new/update_product_metrics.sql` (`HistoricalDates` `MIN(o.date)::date` etc., lines ~131-147).
**Problem.** `orders.date` is `timestamptz`; `::date` casts in the server TZ (**Europe/Berlin**,
verified via `SHOW timezone`). Berlin is 7-8h ahead of Central, so every order placed after
~5 PM Central lands on the **next** snapshot day. This shifts a large evening slice of daily sales
forward one day; skews `yesterday_sales`, day-of-week patterns (the forecast engine's DOW
multipliers, daily-grain forecast accuracy — see `FORECAST_FIX_PLAN.md`), and is inconsistent with
`stock_snapshots`, whose dates come from a Central-time MySQL cron.
**Fix.** Bucket all order/receiving dates in business time. Replace every `o.date::date` /
`received_date::date` used for *day bucketing* in the two snapshot SQL files with:
```sql
(o.date AT TIME ZONE 'America/Chicago')::date
```
Apply consistently in: SalesData, ReceivingData, the gap-fill date lists, the stale-detection
aggregates (they must match SalesData or every day looks permanently stale), and the rebuild script.
`HistoricalDates` in update_product_metrics (first/last sold dates) should match too.
Add an index to keep the per-day loop fast, e.g.
`CREATE INDEX ON orders ( ((date AT TIME ZONE 'America/Chicago')::date) );` and equivalent on
`receivings(received_date)`; check `EXPLAIN` on the SalesData query afterward.
Note: `receivings.received_date` came from MySQL DATETIME (Central literal) inserted as timestamptz —
it was interpreted in the *session* TZ at insert. Before converting, spot-check a few receivings
against MySQL to confirm which TZ the stored instants actually represent; the conversion expression
must yield the Central calendar day MySQL shows. Same check for `orders.date` (it originates from
`_order.date_placed`, a TIMESTAMP column, so it should be a correct instant — `AT TIME ZONE
'America/Chicago'` is right for it).
**Verification.** Pick 2-3 recent days; compare per-day `units_sold` totals in
`daily_product_snapshots` against MySQL
`SELECT date_placed_onlydate, SUM(qty_ordered) ... WHERE order_status >= 20 GROUP BY 1`
(MySQL stores Central days). They should now match closely (small diffs from canceled-status timing).
### P0 execution order (single pass)
1. Land Fix 1 (orders.js) and Fix 2 (both snapshot SQL files + product-metrics date CTE).
2. Full orders re-import: `INCREMENTAL_UPDATE=false node scripts/import-from-prod.js` (or at minimum
the orders step) — run on the server, it's long.
3. Rebuild snapshots: `psql -f scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (after
confirming it contains the TZ fix). The hourly job's 90-day self-heal will NOT fix history beyond
90 days by itself; the explicit rebuild is required.
4. `node scripts/calculate-metrics-new.js`.
5. Expect dashboards to show: margins down ~8-10 points (real), daily sales curves shifted, DOW
profile changed. Tell the user before/after numbers.
---
## P1 — Wrong or drifting numbers, fix soon
### Fix 3: Vendor avg lead time computed over a near-cartesian join
**File:** `scripts/metrics-new/calculate_vendor_metrics.sql`, `VendorPOAggregates` (lines ~62-83).
**Problem.** Joins each done-PO line to **every** receiving of the same (pid, supplier) after the PO
date — a product received 10 times contributes 10 ever-growing lead times → overstated, busy-product-
weighted vendor lead time. The per-product version in `update_periodic_metrics.sql` (lines 27-48)
is correct (MIN receiving per PO within 180 days, then average).
**Fix.** Reuse the periodic shape, aggregated to vendor:
```sql
WITH po_first_receiving AS (
SELECT po.vendor, po.po_id, po.pid, po.date::date AS po_date,
MIN(r.received_date::date) AS first_receive_date
FROM purchase_orders po
JOIN receivings r ON r.pid = po.pid AND r.supplier_id = po.supplier_id
AND r.received_date >= po.date
AND r.received_date <= po.date + INTERVAL '180 days'
WHERE po.status = 'done' AND po.date >= CURRENT_DATE - INTERVAL '1 year'
AND po.vendor IS NOT NULL AND po.vendor <> ''
GROUP BY po.vendor, po.po_id, po.pid, po.date
)
SELECT vendor, COUNT(DISTINCT po_id) AS po_count_365d,
ROUND(AVG(GREATEST(1, first_receive_date - po_date)))::int AS avg_lead_time_days_hist
FROM po_first_receiving GROUP BY vendor
```
**Verification.** For a few vendors compare old vs new values; new should be materially lower and
roughly match `AVG(product_metrics.avg_lead_time_days)` for that vendor's products.
### Fix 4: Deleted order items & combined orders never reconciled in PG [verified-live]
**File:** `scripts/import/orders.js`.
**Problem.** The orders import upserts but never deletes:
- Items removed from an order in MySQL (`DELETE FROM order_items ...` happens, e.g.
neworder.class.php ~line 6500 for unpicked promo items, plus staff edits) leave stale rows in PG
forever. May 2026 check: PG has 49,841 item rows vs MySQL 49,377 (+0.9%) — and PG should be ≤
MySQL.
- Combining orders (`combine_orders`, neworder.class.php ~11946) sets the source orders to status 16
AND **zeroes `date_placed`**, then copies all items to a NEW order. Because the import query
filters `o.date_placed >= …`, a combined source order can never be re-fetched, so its stale
'placed' rows would double-count with the new merged order. Currently latent (last combine
2024-07, predating current PG data — verified no stale rows exist today), but it will silently
corrupt the day combining is used again.
**Fix.** Two parts, both inside the orders import after the upsert phase:
1. **Item-set reconciliation** for re-imported orders: the import already knows the set of changed
`orderIds` and inserted their current items into `temp_order_items`. Mirror the PO import's
pattern (`purchase-orders.js` lines ~683-694):
```sql
DELETE FROM orders o
WHERE o.order_number = ANY($1) -- orders fetched this run
AND NOT EXISTS (SELECT 1 FROM temp_order_items t
WHERE t.order_id = o.order_number AND t.pid = o.pid);
```
2. **Combined/cancelled sweep** that does NOT depend on `date_placed`: each run, fetch from MySQL
`SELECT order_id, order_status FROM _order WHERE order_status IN (15,16) AND stamp > ?`
(no date_placed filter) and update matching PG rows' `status`/`canceled`
('combined' rows are then excluded from metrics — see Fix 5). Cheap (small result set).
**Verification.** Re-run the May-2026 row-count comparison (MySQL vs PG for one month) after one full
run; counts should converge (PG ≤ MySQL, diff explained by TZ window edges only).
### Fix 5: 'combined' orders are counted as sales
**Files:** `scripts/metrics-new/update_daily_snapshots.sql` (status filters, lines ~77, 120-134),
`update_product_metrics.sql` (`HistoricalDates` line ~145, `LifetimeRevenue` line ~249),
`backfill/rebuild_daily_snapshots.sql`.
**Problem.** Sales filters exclude only `('canceled', 'returned')`. Status 16 'combined' = "merged
into another order" — the new order carries the same items, so counting both double-counts. 826
combined orders exist in MySQL; today none are in PG (see Fix 4), but once Fix 4's sweep starts
marking rows 'combined', the metrics filters must exclude them.
**Fix.** Change every `NOT IN ('canceled', 'returned')` in the metrics SQL to
`NOT IN ('canceled', 'returned', 'combined')`. Grep for the pattern in `scripts/metrics-new/` and
`src/routes/` (dashboard endpoints replicate these filters — see CLAUDE.md analytics-filters note).
### Fix 6: Incremental sync watermark race (silent permanent misses)
**Files:** `scripts/import/orders.js` (~772), `products.js` (~934), `purchase-orders.js` (~833).
**Problem.** `sync_status.last_sync_timestamp` is set to `NOW()` *after* the import finishes. Any
MySQL row modified between the source query and that write is below the new watermark but was never
fetched → permanently skipped (until a full sync or the row changes again). Long imports widen the
window; PG/MySQL clock skew adds to it.
**Fix.** Capture the watermark **before** the source query and write that value:
```js
const [[{ now: sourceNow }]] = await prodConnection.query('SELECT NOW() as now');
// ... do the import ...
await localConnection.query(
`INSERT INTO sync_status ... VALUES ('orders', $1) ON CONFLICT ... SET last_sync_timestamp = $1`,
[sourceNow]);
```
Using MySQL's own clock also eliminates cross-server skew. Note `sourceNow` comes back through the
mysql2 driver TZ conversion — verify round-tripping with `adjustDateForMySQL` produces a correct
comparison value, or store `UTC_TIMESTAMP()` and compare against `CONVERT_TZ`-normalized stamps.
Overlap (re-importing rows changed during the run) is harmless — everything is upserted.
### Fix 7: Stockout days / service level / fill rate / avg stock built on activity-only snapshots
**Files:** `scripts/metrics-new/update_product_metrics.sql` — `SnapshotAggregates`
(`stockout_days_30d`, `avg_stock_*_30d`, lines ~177-189), `ServiceLevels` (lines ~304-323),
plus `calculate_sales_velocity` usage.
**Problem.** `daily_product_snapshots` only has rows on days with sales/receivings. So:
- A product that is out of stock (and therefore sells nothing) gets **no row** → `stockout_days_30d`
≈ 0 exactly when stockouts matter → `calculate_sales_velocity(sales, stockout_days)`'s adjustment
is inert → velocity and replenishment understated for constrained products.
- `service_level_30d` divides stockout days by COUNT(activity days), not 30.
- `avg_stock_units_30d` / `avg_stock_cost_30d` average only activity days (biased toward in-stock
days) → GMROI / stockturn / sell-through denominators biased.
- `fill_rate_30d`'s `units_sold * 0.2` lost-sales heuristic is arbitrary — fine to keep, but document.
**Fix.** Derive stock-presence metrics from `stock_snapshots` (full daily coverage from MySQL
`snap_product_value`, imported by `stock-snapshots.js`) instead of `daily_product_snapshots`:
```sql
StockCoverage AS (
SELECT pid,
COUNT(*) FILTER (WHERE stock_quantity <= 0) AS stockout_days_30d,
AVG(stock_quantity) AS avg_stock_units_30d,
AVG(stock_value) AS avg_stock_cost_30d
FROM stock_snapshots
WHERE snapshot_date >= _current_date - INTERVAL '29 days'
GROUP BY pid
)
```
Treat products absent from `stock_snapshots` for a day as unknown (NULL), not in-stock. Keep
`daily_product_snapshots` for sales/revenue aggregates. `service_level_30d` denominator becomes the
count of covered days. Note `stock_snapshots` has no `eod_stock_retail`; keep retail/gross averages
on the old source or compute as `stock_quantity * current price` explicitly.
**Verification.** Pick products that had a known stockout period; `stockout_days_30d` should now be
> 0 and `sales_velocity_daily` should rise accordingly.
---
## P2 — Definition / robustness improvements
### Fix 8: Returns don't reduce COGS; LifetimeRevenue ignores returns
`update_daily_snapshots.sql` SalesData: COGS accrues only on `quantity > 0` rows; return rows
(negative qty — 15,875 rows live) subtract revenue but never COGS → margin understated in
return-heavy periods. Add a returns-COGS term mirroring the sales-COGS COALESCE chain
(`SUM(... WHEN quantity < 0 THEN cost * ABS(quantity))`) and subtract it in `cogs` (or store
`returns_cogs` separately and use `cogs - returns_cogs` in profit). Also `LifetimeRevenue` in
`update_product_metrics.sql` (line ~242) filters `quantity > 0` — include negative-qty rows so
lifetime revenue nets out returns (drop the quantity filter; `price*quantity` is already signed,
but check the `- discount` term sign for return rows).
### Fix 9: return_rate_30d definition
`update_product_metrics.sql` line ~468: `returns / (sales + returns)` → industry standard is
`returns / sales`. Change denominator to `NULLIF(sa.sales_30d, 0)`.
### Fix 10: GMROI not annualized
Line ~466: `profit_30d / avg_stock_cost_30d` is a monthly GMROI (~1/12 of the conventional annual
figure, benchmark ≥ 2-3). Either annualize (`* 12.17`) or rename the column/label "monthly".
Decision for Matt; annualizing is recommended for comparability. Frontend displays must be checked
either way.
### Fix 11: get_weighted_avg_cost is a lifetime WAC
`db/functions.sql` (~line 81, deployed identically): averages ALL receivings ≤ date — decade-old
costs weigh equally. Recommended: window to recent receivings, e.g. last 365 days falling back to
lifetime when none. Used as fallback COGS when `o.costeach` is NULL, so impact is modest but real
for long-lived SKUs. Apply with `CREATE OR REPLACE FUNCTION` in `db/functions.sql` AND on the live DB.
### Fix 12: exclude_from_forecast removes products from product_metrics entirely
`update_product_metrics.sql` line ~627 (`WHERE s.exclude_forecast IS FALSE OR ... IS NULL`): the
flag's name implies forecast-only, but excluded products get NO metrics row → vanish from brand/
vendor/category rollups and dashboards. Fix: always emit the row; instead NULL the
forecast/replenishment columns when excluded (wrap those expressions in
`CASE WHEN s.exclude_forecast THEN NULL ELSE ... END`).
### Fix 13: Incremental products import misses category-only changes
`products.js` incremental WHERE (~lines 433-440) keys on `p.stamp`, `ci.stamp`, price/b2b dates —
`product_category_index` changes don't bump any of those → PG `product_categories` goes stale. Also
the `needs_update` comparison (~lines 604-625) doesn't compare `categories`, so even refetched rows
skip the category rewrite. Fix both: add `t.categories IS NOT DISTINCT FROM p.categories` to the
needs_update comparison (note: `products.categories` is the GROUP_CONCAT string — confirm PG column
holds the same representation), and add a cheap full-sweep (e.g. weekly, or compare
`COUNT(*) GROUP BY pid` hashes) OR include `EXISTS (SELECT 1 FROM product_category_index pci WHERE
pci.pid = p.pid AND pci.stamp > ?)` in the incremental WHERE if that table has a stamp column —
verify schema first (`DESCRIBE product_category_index`).
### Fix 14: PO/receivings OFFSET pagination over a moving filter
`purchase-orders.js` (~lines 275-298, 447-470): `LIMIT/OFFSET` with a `date_updated > ?` predicate;
concurrent updates shift rows between pages → silent skips. Fix: keyset pagination —
`WHERE ... AND p.po_id > ? ORDER BY p.po_id LIMIT 500`, carrying the last seen po_id (drop OFFSET).
Same for receivings on `receiving_id`.
### Fix 15: Status map gaps and unsafe defaults
- `orders.js` orderStatusMap lacks 45 (`payment_pending`) and 67 (`remote_send`) → imported as
numeric strings. Add both (mirror in `migrations/001_map_order_statuses.sql` as a follow-up update
for existing rows).
- `purchase-orders.js` `poStatusMap[po.status] || 'created'` (line ~335): an unknown *cancel-like*
code would be treated as an open PO and inflate on-order FIFO. Default to a sentinel like
`'unknown_<code>'` instead, and make the FIFO/on-order CTEs in `update_product_metrics.sql` treat
only the known-open statuses as open (they already whitelist open statuses — so the sentinel is
safe there; just ensure nothing treats unknown as 'created'). Same for receivingStatusMap.
### Fix 16: Transactions issued through the pool wrapper land on arbitrary connections
`categories.js` (lines ~17-152) and `daily-deals.js` (~27-130) call `query('BEGIN')` /
`query('COMMIT')` on the wrapper, which checks out a client per call — BEGIN/work/COMMIT are not
guaranteed to share a connection (works only by pool-LIFO accident). The categories
`DISABLE TRIGGER` rides on this too. Fix: use the wrapper's `beginTransaction()/commit()/rollback()`
(see `utils.js` lines 121-148) exactly as orders.js does. In categories.js also move the
post-COMMIT `ENABLE TRIGGER` inside the transaction (DISABLE/ENABLE both inside), or drop the
trigger toggling entirely if the trigger isn't actually problematic anymore.
### Fix 17: stock-snapshots import swallows batch errors → permanent holes
`stock-snapshots.js` (~lines 153-155): a failed batch is logged and skipped, but the next
incremental starts at `MAX(snapshot_date)` — the hole is never revisited. Fix: rethrow (fail the
step) or collect failed date ranges and retry once, then fail if still failing. Also line ~168:
`calculateRate(processedRows, startTime)` — arguments reversed (signature is
`calculateRate(startTime, current)`, see `metrics-new/utils/progress.js:70`).
### Fix 18: Metrics cancellation targets an application_name that's never set
`calculate-metrics-new.js` line ~180 cancels backends `WHERE application_name =
'node-metrics-calculator'`, but the Pool config never sets it → cancellation no-ops (the 30-min
`statement_timeout` is the only real guard). Fix: add `application_name: 'node-metrics-calculator'`
to both dbConfig branches.
### Fix 19: Aggregate-table change-detection lists miss cost-only changes
`calculate_brand_metrics.sql` / `calculate_vendor_metrics.sql` / `calculate_category_metrics.sql`
ON CONFLICT WHERE lists don't include `profit_30d`/`cogs_30d` — a cost revision with unchanged
sales/revenue leaves stale rows (product_metrics has a 1-day staleness net; rollups don't). Add
`... OR x.profit_30d IS DISTINCT FROM EXCLUDED.profit_30d OR x.cogs_30d IS DISTINCT FROM
EXCLUDED.cogs_30d` to each, or add a `last_calculated < NOW() - INTERVAL '1 day'` net like
product_metrics line ~707.
### Fix 20: Snapshot stale-detection only compares unit counts
`update_daily_snapshots.sql` lines ~57-85: detects mismatches in `units_sold`/`units_received` only;
price/discount/costeach corrections older than the 2-day recheck are never repaired. Add a
revenue comparison to the stale check: compare `SUM(net_revenue)` per day against the equivalent
recomputed from `orders` (ROUND both to 2dp to avoid float-noise churn).
### Fix 21: Category metrics positive-only revenue asymmetry
`calculate_category_metrics.sql` (lines ~27-36, 64-73): revenue summed only when `> 0` while
cogs/profit use COALESCE-all → margin numerator/denominator from different populations, and
inconsistent with brand/vendor (plain COALESCE). Change the revenue/sales CASEs to
`COALESCE(pm.revenue_7d, 0)` etc., matching brand_metrics.
### Fix 22 (decision needed): Demand-pattern & seasonality definitions
- `classify_demand_pattern` (db/functions.sql): CV thresholds 0.2/0.5 + avg<1/day. Industry standard
is Syntetos-Boylan: ADI ≥ 1.32 and CV² ≥ 0.49 quadrants (smooth/erratic/intermittent/lumpy).
Today everything classifies sporadic/lumpy. If adopting SB: ADI = 30 / COUNT(days with sales),
CV² computed on nonzero-demand sizes. Changes the vocabulary consumed by the forecast engine
(`scripts/forecast/forecast_engine.py` reads `demand_pattern`) — coordinate before changing.
- SeasonalityAnalysis (`update_product_metrics.sql` ~360): `month_avg = AVG(units_sold)` over rows
with sales only → intensity, not volume. Use monthly totals (SUM, with zero months counted) /
overall monthly average for the index.
- Safety stock: currently static config units; `sales_std_dev_30d` exists but is unused. Optional
upgrade: `safety = z * σ_d * sqrt(lead_time)` with z from a service-level setting.
These change user-facing semantics — confirm with Matt before implementing.
---
## Verified non-issues (no action, or cleanup only)
- **`costeach` fallback `price * 0.5`** (orders.js line ~615): fires on **2.1%** of item rows
(729/34,833, last 30d, live-verified). Accepted by Matt — 50% margin is a fair estimate for these
products. Optional: nothing.
- **Missing-product order skips**: zero occurrences — MySQL has no orphan order_items (1-year check),
PG products is a superset of MySQL products (687,579 vs 687,576), last 7 import runs all logged
`totalSkipped: 0`. Cleanup only: remove the unused `importMissingProducts` import line at
`orders.js:2` (the function itself stays in products.js — harmless utility).
- **Status 30 'cancelled_old'** in `total_sold >= 20` filter: zero rows live in `_order` — safe.
- **Duplicate (order_id, pid) order items**: none exist in MySQL — the upsert PK is safe.
- **base_discount** in orders.js: computed/stored in temp table but unused since migration 002 —
remove the column from temp table + queries for clarity (no behavior change).
- **`full-update.js` `runScript`**: try/catch around `console.log` is dead code; per-step
`status:'complete'` messages could confuse a UI parser. Cosmetic only — tidy if touching the file.
## Suggested implementation order
| Step | Fixes | Re-import/rebuild needed |
|---|---|---|
| 1 | Fix 1 + Fix 2 (+ Fix 5 filters, Fix 8/9 while editing the same SQL) | FULL orders re-import → snapshot rebuild → metrics (once) |
| 2 | Fix 4 + Fix 6 (orders.js reconciliation + watermarks; POs/products watermarks too) | no |
| 3 | Fix 3, Fix 7 (metrics SQL only) | metrics run |
| 4 | Fix 13-21 (robustness batch) | no |
| 5 | Fix 10-12, Fix 22 after Matt's sign-off (definition changes) | metrics run |
After step 1, expect: margin_30d down ~8-10 points, discounts_30d ~3x up, daily curves shifted to
correct business days. Communicate before/after so the change isn't mistaken for a data incident.
## Reference: verification snippets used in the review
```sql
-- MySQL: item-level discounts dropped by the gate (30d)
SELECT COUNT(DISTINCT o.order_id), ROUND(SUM(odi.amount),2)
FROM order_discount_items odi
JOIN order_discounts od ON od.order_id=odi.order_id AND od.discount_id=odi.discount_id
JOIN _order o ON o.order_id=odi.order_id
WHERE odi.which=2 AND o.date_placed >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
AND o.order_status >= 20 AND COALESCE(od.discount_amount_subtotal,0)=0;
-- → 2,021 orders / $25,989 (2026-06-10)
-- MySQL: costeach fallback frequency (30d)
SELECT COUNT(*),
SUM(CASE WHEN NOT EXISTS (SELECT 1 FROM order_costs oc WHERE oc.orderid=oi.order_id
AND oc.pid=oi.prod_pid AND oc.pending=0)
AND NOT EXISTS (SELECT 1 FROM product_inventory pi WHERE pi.pid=oi.prod_pid)
THEN 1 ELSE 0 END)
FROM order_items oi JOIN _order o ON o.order_id=oi.order_id
WHERE o.order_status >= 20 AND o.date_placed >= DATE_SUB(CURDATE(), INTERVAL 30 DAY);
-- → 729 / 34,833 = 2.1% (2026-06-10)
-- PG: timezone check
SHOW timezone; -- Europe/Berlin (2026-06-10)
-- Row drift, May 2026: MySQL 49,377 items / PG 49,841 (+0.9%)
```