Import/calculations improvements

Forecast improvements
2026-06-11 19:32:20 -04:00 · 2026-06-11 14:55:33 -04:00
24 changed files with 2062 additions and 446 deletions
@@ -0,0 +1,343 @@
+# Forecast Accuracy Fix Plan
+
+**Written:** 2026-06-10, from a code + live-data review of the forecasting pipeline.
+**Goal:** eliminate the systematic ~1.7–2x over-forecast bias, recover demand the model currently ignores, and fix the accuracy measurement so improvements are visible and long-lead forecasts are validated.
+
+Read this whole document before starting. Fixes are grouped into phases; each phase is independently deployable and has its own validation step. Line numbers are as of 2026-06-10 — re-locate by function name if the file has drifted.
+
+---
+
+## 1. Diagnosis summary (measured 2026-06-10)
+
+The dashboard headline is **202% WMAPE**. Decomposition of that number, all measured against `forecast_accuracy` run 129 and ad-hoc queries:
+
+| Finding | Evidence |
+|---|---|
+| Daily-grain WMAPE has a ~190% *floor* for this catalog | Avg demand ≈ 0.11 units/product/day. A perfect rate forecast of intermittent demand scores ≈ 2e^−λ ≈ 190%. A trivial trailing-30d-average naive forecast scores **204%** on the same products/days; the engine scores 221% (slightly *worse than naive*). |
+| Same forecasts at 21-day-per-product grain: **109%**; bias-corrected: **75%** | Half the headline is metric grain, most of the rest is bias. |
+| Aggregate over-forecast **+70%** (227,690 forecast vs 133,861 actual units) | Portfolio daily ratio is 1.5–2.5x on most days. |
+| Decay phase 2.47x over (fc 51,675 / act 20,915) | Root cause F1: velocity inflated **4.07x** (measured: 1.353 vs true 0.332 units/day) by averaging over sparse snapshot rows. |
+| Preorder phase 2.15x over (fc 67,212 / act 31,189) | Root cause F4: launch curve applied at age=0 starting *today*, ignoring that the product hasn't arrived. |
+| Mature phase 1.69x over (fc 57,857 / act 34,313) | Root causes F2 (history edge truncation) + F3 (seasonal double-count). |
+| Dormant products sold **16,180 units** (~11% of demand) against zero forecasts | Root cause F5; also excluded from the headline metric, so invisible. |
+| All 879,800 accuracy samples are in the **1–7d lead bucket** | Root cause F7: archiving design only ever saves yesterday's slice. 30–90d forecasts (what purchasing uses) are never validated. |
+| Launch phase is healthy: WMAPE 100%, bias −6%, beats naive | The lifecycle-curve concept works; its calibration inputs are broken. Don't redesign it. |
+
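The ~190% floor in the first table row comes from the expected absolute error of a Poisson count around its own mean. A minimal sketch, assuming demand per product-day is Poisson with a single rate `lam` (the function name and the single-rate assumption are illustrative, not taken from the engine), checks the closed form numerically:

```python
import math


def poisson_wmape_floor(lam: float, max_k: int = 200) -> float:
    """Expected |X - lam| / lam for X ~ Poisson(lam).

    This is the WMAPE a forecaster would score if it predicted the true
    rate exactly and actual demand were Poisson around that rate.
    """
    pmf = math.exp(-lam)              # P(X = 0)
    expected_abs_err = 0.0
    for k in range(max_k + 1):
        expected_abs_err += abs(k - lam) * pmf
        pmf *= lam / (k + 1)          # P(X = k+1) = P(X = k) * lam / (k+1)
    return expected_abs_err / lam


for lam in (0.05, 0.11, 0.5):
    numeric = poisson_wmape_floor(lam)
    closed_form = 2 * math.exp(-lam)  # valid for lam < 1
    print(f"lam={lam}: numeric={numeric:.4f} closed_form={closed_form:.4f}")
```

For λ < 1 the ratio equals 2e^(−λ): about 1.79 at λ = 0.11 and about 1.90 at λ = 0.05. A catalog that mixes many rates sums the per-product errors before dividing, so treat this as an order-of-magnitude check on the quoted floor.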
+**Key data fact** underlying several fixes: `daily_product_snapshots` is **activity-based and sparse** — only ~500–1,800 of ~38K products have a row on a given day. Verified: every pid-day with an order DOES have a snapshot row and units match (5,234/5,234 pid-days, 8,980 vs 8,984 units over 7 days). So *missing row = zero sales*, and any query that aggregates over only the rows that exist is averaging over sold-days.
+
+---
+
+## 2. Environment & operational notes
+
+- **Files:** engine is `inventory-server/scripts/forecast/forecast_engine.py`; orchestrator `run_forecast.js` in the same dir; consumer endpoints in `inventory-server/src/routes/dashboard.js` (`/forecast/metrics` ~line 308, `/forecast/accuracy` ~line 647); overview UI in `inventory/src/components/overview/ForecastMetrics.tsx` and `ForecastAccuracy.tsx`.
+- **Local `inventory-server/` is NFS-mounted to `/var/www/inventory/` on the netcup server.** Edits made locally appear on the server immediately — no copy step. Do NOT run bulk `grep`/`find`/`node --check` over `inventory-server/` locally (the mount hangs); `ssh netcup` and run them there.
+- **Avoid the glob tool** for search in this repo; use bash (`grep`/`rg` via ssh for server-side trees).
+- **Scheduling:** the engine runs daily at **09:30:01 server time** (runs table is conclusive), but the cron entry is NOT in matt's crontab, `/etc/cron.d`, or pm2. Likely root's crontab (`sudo crontab -l` to confirm). You do not need to touch the schedule for these fixes; just know a run fires at 09:30 daily and occasionally skips days (e.g. 2026-06-07/08).
+- **Manual test runs:** `ssh netcup`, then `cd /var/www/inventory/scripts/forecast && node run_forecast.js`. Takes ~3.5–4 min. Safe to run any time: the engine TRUNCATEs and rebuilds `product_forecasts`, archives prior past-dated rows, and records a new `forecast_runs` row. Python deps live in the server venv (`venv/`); `run_forecast.js` handles env + venv automatically.
+- **DB access for validation:** `ssh netcup`, then `PGPASSWORD=6D3GUkxuFgi2UghwgnUd psql -h localhost -U inventory_readonly -d inventory_db`. The engine itself connects with the write user via env vars loaded from `/var/www/inventory/.env` — schema changes should be made idempotently *inside the engine code* (the file already uses `CREATE TABLE IF NOT EXISTS` / `CREATE INDEX IF NOT EXISTS`; use `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` the same way) so no manual migration is needed.
+- **Python gotchas already handled in this file (don't regress):** numpy types must go through the registered psycopg2 adapters; `pd.Series.combine_first()` keeps zeros over real data — use `reindex(..., fill_value=0.0)`.
+- Engine runtime budget: currently ~212–227s. Phases 1–2 shouldn't move it meaningfully; Phase 3's extra archiving adds one INSERT…SELECT. If runtime balloons past ~6 min, investigate before shipping.
+- `--backfill` mode (`backfill_accuracy_data`) is an in-sample backtest using the *old* formulas. **Do not run it anymore**; there is enough real out-of-sample history. Updating it to match the new logic is optional/low priority (F11).
+
+---
+
+## Phase 1 — Bias bugs in the engine (no schema changes)
+
+### F1. Decay velocity: stop averaging over sparse snapshot rows
+
+**Where:** `forecast_engine.py`, `batch_load_product_data()`, the decay query (~lines 697–710).
+
+**Problem:** `AVG(COALESCE(dps.units_sold, 0))` runs over only the snapshot rows that exist — mostly sold-days. Measured inflation on the current 975 decay products: **4.07x** (1.353 vs 0.332 true units/day). This feeds `compute_scale_factor()` for the decay phase and is the single largest bias source.
+
+**Fix:** divide the sum by calendar days in the window, clipped to the product's age (decay products are 14–60 days old, so a 20-day-old product's window is 20 days, not 30):
+
+```sql
+SELECT dps.pid,
+    SUM(COALESCE(dps.units_sold, 0))::float
+      / GREATEST(LEAST(30, (CURRENT_DATE - pm.date_first_received::date)), 1) AS avg_daily
+FROM daily_product_snapshots dps
+JOIN product_metrics pm ON pm.pid = dps.pid
+WHERE dps.pid = ANY(%s)
+  AND dps.snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
+  AND dps.snapshot_date >= pm.date_first_received::date
+GROUP BY dps.pid, pm.date_first_received
+```
+
+No Python-side changes needed; `data['decay_velocity']` keeps the same shape. Products with zero snapshot rows in the window still get no entry → existing `scale = 1.0` fallback applies (acceptable: decay classification requires `sales_velocity_daily > 0`, so truly dead products don't reach this path).
+
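The inflation mechanism in F1 can be reproduced without a database. The sketch below uses made-up snapshot rows for one hypothetical 20-day-old product; the function names are illustrative and only mirror the arithmetic of the old and new SQL:

```python
def sparse_row_average(units_by_day: dict) -> float:
    """Old behaviour: average over only the snapshot rows that exist."""
    if not units_by_day:
        return 0.0
    return sum(units_by_day.values()) / len(units_by_day)


def calendar_day_average(units_by_day: dict, age_days: int, window: int = 30) -> float:
    """Fixed behaviour: total units over calendar days, clipped to product age."""
    days = max(min(window, age_days), 1)   # same as GREATEST(LEAST(30, age), 1)
    return sum(units_by_day.values()) / days


# Hypothetical product: received 20 days ago, snapshot rows on 4 days only.
rows = {"2026-06-01": 2, "2026-06-04": 1, "2026-06-07": 1, "2026-06-09": 2}

old_rate = sparse_row_average(rows)         # 6 units / 4 rows
new_rate = calendar_day_average(rows, 20)   # 6 units / 20 calendar days
print(old_rate, new_rate, old_rate / new_rate)
```

The ratio between the two rates is simply calendar days divided by days with a row, which is why sparser products are inflated more.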
+### F2. Mature history: reindex over the full calendar window
+
+**Where:** `forecast_engine.py`, `forecast_mature()` (~lines 833–836).
+
+**Problem:** `hist.set_index('snapshot_date').resample('D').sum()` only spans first-snapshot → last-snapshot. Interior gaps correctly become zeros, but **leading and trailing quiet periods are absent**, so the Holt level is fitted on the product's busy span. A marginal mature product whose activity clusters in 2 of the last 8 weeks gets a level ~4x too high.
+
+**Fix:** replace the resample with an explicit reindex over the full `EXP_SMOOTHING_WINDOW` ending yesterday:
+
+```python
+hist = history_df.copy()
+hist['snapshot_date'] = pd.to_datetime(hist['snapshot_date'])
+hist = hist.set_index('snapshot_date')['units_sold']
+full_index = pd.date_range(
+    end=pd.Timestamp(date.today() - timedelta(days=1)),
+    periods=EXP_SMOOTHING_WINDOW, freq='D')
+series = hist.reindex(full_index, fill_value=0.0).values.astype(float)
+```
+
+Notes: (pid, snapshot_date) is unique in `daily_product_snapshots`, so no duplicate-index risk. `observed_mean` and the `cap` recompute over the full window automatically (intended — the cap gets correspondingly tighter). Mature products are by definition >60 days old, so the 60-day window never predates first receipt. Do NOT use `combine_first` (see gotchas above).
+
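To see why the edge truncation matters, the following dependency-free sketch compares the mean over the first-to-last-snapshot span with the mean over the full window. It uses plain dicts instead of pandas and an invented product whose sales cluster in 2 of 8 weeks; the helper names are hypothetical, and the mean is only a stand-in for the Holt level:

```python
from datetime import date, timedelta


def busy_span(sparse: dict) -> list:
    """Old behaviour: daily values only between first and last snapshot row."""
    first, last = min(sparse), max(sparse)
    n_days = (last - first).days + 1
    return [float(sparse.get(first + timedelta(days=i), 0.0)) for i in range(n_days)]


def dense_window(sparse: dict, end: date, periods: int) -> list:
    """Fixed behaviour: `periods` daily values ending at `end`, missing days as 0.0."""
    start = end - timedelta(days=periods - 1)
    return [float(sparse.get(start + timedelta(days=i), 0.0)) for i in range(periods)]


end = date(2026, 6, 9)
# Hypothetical product: one unit per day for 14 consecutive days inside a 56-day window.
sparse = {end - timedelta(days=d): 1 for d in range(20, 34)}

old_series = busy_span(sparse)
new_series = dense_window(sparse, end, 56)
old_mean = sum(old_series) / len(old_series)
new_mean = sum(new_series) / len(new_series)
print(old_mean, new_mean)
```

Here the truncated series averages four times the full-window series, in line with the ~4x figure quoted in the problem statement.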
+### F3. Stop double-applying the monthly seasonal index
+
+**Where:** `forecast_engine.py`, `generate_all_forecasts()` — the `seasonal_multipliers` pre-compute (~lines 959–961) and application (~line 1050).
+
+**Problem:** every per-product calibration (decay velocity, mature Holt level, launch first-week scale, preorder rate, slow-mover velocity) is fitted on *raw recent actuals*, which already embed the current month's seasonal level. The forecast then multiplies by the **absolute** monthly index of the target date. Example from the live indices (`forecast_runs.phase_counts` for run 129): May = 1.224 (sale month), June = 0.982. Early-June forecasts were calibrated on May-sale-inflated velocities and barely discounted — a structural ~25% over-forecast at that transition, and it'll be worse around November (1.316).
+
+**Fix:** apply the seasonal index *relative to the calibration period*. Compute a calibration index as the average monthly index over the trailing 30 calendar days (robust at month boundaries), then divide:
+
+```python
+today = date.today()
+trailing = [today - timedelta(days=i) for i in range(1, 31)]
+calibration_index = float(np.mean([monthly_indices.get(d.month, 1.0) for d in trailing]))
+seasonal_multipliers = [
+    monthly_indices.get(d.month, 1.0) / max(calibration_index, 0.1)
+    for d in forecast_dates
+]
+```
+
+Leave the DOW multipliers absolute — every calibration is a multi-week average and therefore DOW-neutral, so reshaping by absolute DOW indices is correct.
+
+**Optional sub-fix (same area, low priority):** the monthly indices are computed from a single trailing 365-day window, so each month appears once and YoY growth contaminates "seasonality". A cheap improvement is widening `SEASONAL_LOOKBACK_DAYS` to 730 and averaging the two observations of each month. Do this only after the main fixes are validated.
+
+### Phase 1 validation
+
+Deploy (edit locally; NFS propagates), run the engine manually once, wait for 3–5 daily cycles, then:
+
+```sql
+-- Portfolio ratio per day (target: drifts from ~2.0 toward 0.8–1.3)
+WITH ranked AS (
+  SELECT pfh.pid, pfh.forecast_date, pfh.forecast_units, pfh.lifecycle_phase,
+    ROW_NUMBER() OVER (PARTITION BY pfh.pid, pfh.forecast_date ORDER BY fr.started_at DESC) rn
+  FROM product_forecasts_history pfh
+  JOIN forecast_runs fr ON fr.id = pfh.run_id
+  WHERE pfh.forecast_date >= CURRENT_DATE - 7)
+SELECT r.forecast_date, round(SUM(r.forecast_units),0) AS fc,
+  SUM(COALESCE(dps.units_sold,0)) AS act,
+  round(SUM(r.forecast_units)/NULLIF(SUM(COALESCE(dps.units_sold,0)),0),2) AS ratio
+FROM ranked r
+LEFT JOIN daily_product_snapshots dps ON dps.pid = r.pid AND dps.snapshot_date = r.forecast_date
+WHERE r.rn = 1 AND r.lifecycle_phase != 'dormant'
+GROUP BY 1 ORDER BY 1;
+```
+
+Also check `forecast_accuracy` `by_phase` rows for the newest run: decay bias should fall from +0.35 toward ~0, mature from +0.17 toward ~0. (Accuracy lags ~1 day behind each fix since it evaluates yesterday's forecasts.)
+
+---
+
+## Phase 2 — Demand the model currently ignores or mistimes
+
+### F4. Preorder: forecast the preorder rate until arrival, launch curve after
+
+**Where:** `forecast_engine.py` — `batch_load_product_data()` (add arrival dates), `generate_all_forecasts()` preorder branch (~lines 1005–1009), and `forecast_from_curve()` (or a small wrapper).
+
+**Problem:** preorder products run the launch curve from `age=0` starting **today**, i.e. full first-week launch sales while the product is still weeks from arriving. Actual preorder-period sales are a much slower trickle.
+
+**Fix:**
+
+1. Batch-load each preorder product's expected arrival from `purchase_orders` (line-item grain: it has `pid` and `expected_date` directly). Open statuses verified against live data: `created`, `ordered`, `electronically_sent`, `receiving_started` (~705 open line items currently have a future `expected_date`):
+
+```sql
+SELECT pid, MIN(expected_date) AS expected_arrival
+FROM purchase_orders
+WHERE pid = ANY(%s)
+  AND status IN ('created', 'ordered', 'electronically_sent', 'receiving_started')
+  AND expected_date IS NOT NULL
+  AND expected_date >= CURRENT_DATE
+GROUP BY pid
+```
+
+Fallbacks, in order: (a) an open PO with a *past* `expected_date` → assume arrival in 7 days; (b) no PO at all → arrival in 14 days (and log a counter of how many hit this default).
+
+2. In the preorder branch, build the daily array piecewise. Let `days_until_arrival = (expected_arrival - today).days`:
+   - Days `0 .. days_until_arrival-1`: flat observed preorder daily rate = `preorder_sales[pid] / max(preorder_days[pid], 1)` (both already batch-loaded), clamped to ≤ the curve's scaled week-0 daily value.
+   - Days `days_until_arrival .. horizon`: `forecast_from_curve(curve_info, scale, age_days=0, ...)` shifted so the curve's day 0 lands on the arrival date (i.e. pass `horizon_days - days_until_arrival` and offset into the output array).
+   - Keep the existing `compute_scale_factor('preorder', ...)` for the post-arrival curve; the pre-arrival segment doesn't use it.
+
+This is consistent with how the reference curves were built: historical preorder units were recorded on their **order dates** (pre-arrival), so week-0 of the fitted curves reflects post-receipt orders, not the backlog.
+
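The piecewise construction in step 2 can be sketched as follows. This is an illustration under stated assumptions, not the engine's code: `curve_daily` is a hypothetical list of already-scaled launch-curve values standing in for the output of `forecast_from_curve()`, and its first element is used as the "week-0 daily value" for the clamp.

```python
def build_preorder_forecast(preorder_rate, curve_daily, days_until_arrival, horizon_days):
    """Flat preorder rate until arrival, then the launch curve from age 0.

    curve_daily: hypothetical scaled launch-curve values, index 0 = arrival day.
    """
    pre_days = max(min(days_until_arrival, horizon_days), 0)
    # Clamp the pre-arrival trickle so it never exceeds the first scaled curve value.
    flat = min(preorder_rate, curve_daily[0]) if curve_daily else preorder_rate
    post_days = horizon_days - pre_days
    post = list(curve_daily[:post_days])
    # Assumption: if the curve is shorter than the remaining horizon, hold its last value.
    if post_days > len(post):
        tail = post[-1] if post else flat
        post += [tail] * (post_days - len(post))
    return [flat] * pre_days + post


daily = build_preorder_forecast(0.5, [3.0, 2.0, 1.0, 1.0], days_until_arrival=2, horizon_days=5)
print(daily)
```

The output always has exactly `horizon_days` entries, so it can replace the existing daily array without changing downstream shapes. A product arriving beyond the horizon gets the flat rate for every day.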
+### F5. Dormant products: small positive rate instead of hard zero, and count them
+
+**Where:** `forecast_engine.py` — `generate_all_forecasts()` dormant branch (~lines 1040–1042), `batch_load_product_data()`, and `compute_accuracy()`.
+
+**Problem:** all ~28K dormant products are forecast at exactly 0, yet they sold 16,180 units in the eval window (~11% of all demand) — restocks, promos, long-tail. Worse, dormant is *excluded* from the headline accuracy filter, so this miss is invisible.
+
+**Fix (cheap version, do this now):**
+
+1. Batch-load a trailing-180-day order rate for dormant products (11,362 of them have ≥1 sale in 180d — verified):
+
+```sql
+SELECT o.pid, SUM(o.quantity) / 180.0 AS rate
+FROM orders o
+WHERE o.pid = ANY(%s)
+  AND o.canceled IS DISTINCT FROM TRUE
+  AND o.date >= CURRENT_DATE - INTERVAL '180 days'
+GROUP BY o.pid
+```
+
+2. Dormant branch: if the product has a rate > 0, forecast it flat with `method = 'velocity'`; else keep zeros with `method = 'zero'`. Apply the same DOW/seasonal multipliers as everything else (automatic — they're applied after the branch).
+3. In `compute_accuracy()`, add a second overall row: `metric_type='overall', dimension_value='all_incl_dormant'` with no dormant filter (keep the existing `'all'` row unchanged for trend continuity). One extra entry in the `dimensions`/`filter_clauses` dicts.
+
+**Upgrade path (optional, Phase 4):** replace flat rates for `slow_mover` + dormant-with-sales with TSB (Teunter–Syntetos–Babai), the standard intermittent-demand method with obsolescence handling. Per product over a daily series `d_t` (build it from snapshots the F2 way — full calendar reindex):
+
+```
+if d_t > 0:  p_t = p_{t-1} + β·(1 − p_{t-1});  z_t = z_{t-1} + α·(d_t − z_{t-1})
+else:        p_t = p_{t-1}·(1 − β);            z_t = z_{t-1}
+forecast = p_T · z_T   (flat across horizon)
+```
+
+Start with α=0.1, β=0.05; initialize p = (nonzero days / total days) and z = mean of nonzero demands. Scope: slow_mover (~6K) + dormant with 180d sales (~11K), with each series built from up to 180 days of snapshots (the rows are sparse, so the volume stays manageable). Only do this once Phase 3 measurement exists to prove it beats the flat rates.
+
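The TSB recursion above translates directly into a few lines of Python. The sketch below follows the stated update rules and initialization; the function name and the choice to return 0.0 for an all-zero series are assumptions for illustration:

```python
def tsb_forecast(demand, alpha=0.1, beta=0.05):
    """Teunter-Syntetos-Babai flat forecast for a dense daily demand series.

    p tracks the probability of a demand day, z the size of a nonzero demand.
    The series must already include zero days (full calendar reindex).
    """
    nonzero = [d for d in demand if d > 0]
    if not nonzero:
        return 0.0                       # assumption: no history means no forecast
    p = len(nonzero) / len(demand)       # initial demand probability
    z = sum(nonzero) / len(nonzero)      # initial demand size
    for d in demand:
        if d > 0:
            p += beta * (1.0 - p)
            z += alpha * (d - z)
        else:
            p *= (1.0 - beta)            # probability decays on every zero day
    return p * z


recent_sale = tsb_forecast([0, 0, 0, 0, 1])
stale_sale = tsb_forecast([1, 0, 0, 0, 0])
print(recent_sale, stale_sale)
```

The two example series contain the same total demand, but the one whose sale is older forecasts lower because `p` keeps decaying through the trailing zeros. That decay is the obsolescence handling that a flat trailing rate lacks.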
+### Phase 2 validation
+
+After 3–5 cycles: preorder `by_phase` bias should drop from +0.85 to below +0.3. The new `all_incl_dormant` row should appear, and the gap between its `total_actual_units` and the `'all'` row's (the dormant actuals) should be largely *covered* by forecasts rather than missed entirely (dormant `bias` rising from −1.36 toward ~−0.3 or better).
+
+---
+
+## Phase 3 — Fix the measurement (schema + engine + API + UI)
+
+> Without this phase you cannot see whether Phases 1–2 worked except by ad-hoc SQL, the lead-time chart stays a single bucket forever, and the dashboard keeps displaying a number with a 190% floor in red.
+
+### F7. Archive long-lead forecasts so 15/30/60/90d accuracy exists
+
+**Where:** `forecast_engine.py` — `archive_forecasts()` (~lines 1086–1154), `compute_accuracy()` CTE (~lines 1201–1228).
+
+**Problem:** the current design archives only *past-dated* rows of the previous run before truncation. With daily runs, that's only ever the 1-day-ahead slice — all 879,800 accuracy samples sit in the '1-7d' bucket and the longer buckets in the UI chart can never populate. Purchasing decisions ride on 30–60d forecasts that are never validated.
+
+**Fix:**
+
+1. Keep the existing past-date archiving exactly as is (it provides dense short-lead coverage).
+2. After `generate_all_forecasts()` completes, additionally archive a **sampled set of future leads** from the new run, non-dormant only, attributed to the *current* run id (correct attribution, unlike the past-date path which attributes to the previous run):
+
+```sql
+INSERT INTO product_forecasts_history
+    (run_id, pid, forecast_date, forecast_units, forecast_revenue,
+     lifecycle_phase, forecast_method, confidence_lower, confidence_upper, generated_at)
+SELECT %(run_id)s, pid, forecast_date, forecast_units, forecast_revenue,
+    lifecycle_phase, forecast_method, confidence_lower, confidence_upper, generated_at
+FROM product_forecasts
+WHERE lifecycle_phase != 'dormant'
+  AND forecast_date - CURRENT_DATE IN (7, 14, 30, 60, 89)
+ON CONFLICT (run_id, pid, forecast_date) DO NOTHING
+```
+
+Volume: ~10K non-dormant products × 5 leads ≈ 50K rows/day; the existing 90-day prune (`forecast_date < CURRENT_DATE - 90`) bounds steady state at a few million rows. Note future-dated rows survive until their date passes + 90 days — that's intended.
+
+3. **CRITICAL companion change** in `compute_accuracy()`: the accuracy CTE must now exclude not-yet-realized rows, or future-dated archives get scored against actual=0:
+
+```sql
+FROM product_forecasts_history pfh
+JOIN forecast_runs fr ON fr.id = pfh.run_id
+WHERE pfh.forecast_date < CURRENT_DATE          -- ADD THIS
+```
+
+4. **Dedup semantics change.** Today's `ROW_NUMBER() OVER (PARTITION BY pid, forecast_date ORDER BY started_at DESC)` keeps only the latest (= shortest-lead) row per pid/date, which would silently discard all the new long-lead rows. Restructure:
+   - Compute `lead_days = forecast_date - started_at::date` and the lead bucket *inside* `ranked_history`.
+   - For `by_lead_time`: dedup `PARTITION BY pid, forecast_date, lead_bucket` (one sample per pid/date/bucket, latest run wins within a bucket).
+   - For everything else (`overall`, `by_phase`, `by_method`, `daily`, and the new weekly metric below): restrict to `lead_days BETWEEN 0 AND 6` and keep the existing per-(pid, date) dedup. This preserves the current meaning of the headline metrics (short-lead) while the lead-time table becomes real.
+
+### F8. Track a naive baseline (forecast value-added)
+
+**Where:** `archive_forecasts()` (both INSERT paths), `compute_accuracy()`, `forecast_accuracy` schema, `/forecast/accuracy` endpoint.
+
+**Problem:** the engine currently *loses* to a trailing-average naive forecast (221% vs 204% daily WMAPE) and nothing on the dashboard would ever reveal that. Every accuracy improvement should be judged as value-over-naive.
+
+**Fix:**
+
+1. Schema (idempotent, in the ensure blocks): `ALTER TABLE product_forecasts_history ADD COLUMN IF NOT EXISTS naive_units NUMERIC(10,2);` and `ALTER TABLE forecast_accuracy ADD COLUMN IF NOT EXISTS naive_wmape NUMERIC(10,4), ADD COLUMN IF NOT EXISTS fva NUMERIC(10,4);`
+2. Populate `naive_units` during both archive INSERTs via a join — naive = flat trailing-28-day average daily units as of archive time (28 days = DOW-balanced; information available at generation; same value at every lead, which is exactly what a naive baseline means):
+
+```sql
+LEFT JOIN (
+    SELECT o.pid, SUM(o.quantity) / 28.0 AS naive_daily
+    FROM orders o
+    WHERE o.canceled IS DISTINCT FROM TRUE
+      AND o.date >= CURRENT_DATE - INTERVAL '28 days' AND o.date < CURRENT_DATE
+    GROUP BY o.pid
+) nv ON nv.pid = pf.pid
+-- select COALESCE(nv.naive_daily, 0) AS naive_units
+```
+
+3. In `compute_accuracy()`, add to each dimension's aggregate: `SUM(ABS(naive_units - actual_units)) / NULLIF(SUM(actual_units),0) AS naive_wmape` and store `fva = 1 - wmape / naive_wmape` (NULL-safe). Rows archived before this change have `naive_units` NULL — treat NULL as excluded (`FILTER (WHERE naive_units IS NOT NULL)` on the naive sums) rather than as zero.
+4. Endpoint: include `naiveWmape` and `fva` in the `overall` (and per-phase) payload of `/dashboard/forecast/accuracy` in `dashboard.js`.
+
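The aggregate in step 3 can be checked on a handful of rows before it is written as SQL. This sketch assumes each row is a `(forecast_units, naive_units, actual_units)` tuple with `naive_units` set to `None` for rows archived before the column existed; the function names are illustrative:

```python
def wmape(predicted, actual):
    """SUM(|pred - act|) / SUM(act), or None when there are no actual units."""
    total_actual = sum(actual)
    if total_actual == 0:
        return None
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / total_actual


def accuracy_with_naive(rows):
    """Return (engine_wmape, naive_wmape, fva) for (forecast, naive_or_None, actual) rows."""
    engine = wmape([f for f, _, _ in rows], [a for _, _, a in rows])
    # Rows without a naive value are excluded from the naive sums, not treated as zero.
    with_naive = [(n, a) for _, n, a in rows if n is not None]
    naive = wmape([n for n, _ in with_naive], [a for _, a in with_naive])
    # NULL-safe: no FVA when either side is missing or the naive error is zero.
    fva = None if engine is None or not naive else 1.0 - engine / naive
    return engine, naive, fva


rows = [(1.0, 2.0, 1.0), (0.0, 1.0, 2.0), (3.0, None, 1.0)]
print(accuracy_with_naive(rows))
```

A negative FVA, as in this toy data, means the engine is doing worse than the naive baseline. Note that the engine WMAPE here covers all rows while the naive WMAPE covers only rows with a naive value; restricting both to the same rows would be a stricter comparison.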
+### F9. Weekly-grain headline metric + bias as a percentage
+
+**Where:** `compute_accuracy()`, `/forecast/accuracy` endpoint, `ForecastAccuracy.tsx`.
+
+**Problem:** daily-grain WMAPE on this catalog has a ~190% floor — as a headline it's noise. The informative numbers are (a) weekly-per-product WMAPE (currently ~109%, target ~70–85% post-fix) and (b) aggregate bias, which the UI currently renders as `+0.108 units` — indistinguishable from zero while the reality is +70%.
+
+**Fix:**
+
+1. New metric in `compute_accuracy()`: `metric_type='overall_weekly', dimension_value='all'`. Definition: using the short-lead deduped rows (lead ≤ 6, non-dormant), aggregate per `(pid, date_trunc('week', forecast_date))` keeping only complete weeks (`COUNT(*) = 7`), then `WMAPE = SUM(ABS(fc_week − act_week)) / SUM(act_week)`, excluding pid-weeks where both are 0. Store sample_size = number of pid-weeks. Compute `naive_wmape`/`fva` the same way from `naive_units`.
+2. Endpoint: expose as `overallWeekly`; also add a weekly variant to the `accuracyTrend` query (`metric_type='overall_weekly'`). The trend will start empty (old runs lack the row) — that's fine; don't backfill.
+3. `ForecastAccuracy.tsx`:
+   - Headline WMAPE → `overallWeekly.wmape`, labeled "WMAPE (weekly)". Keep daily WMAPE available in a tooltip if desired.
+   - Color thresholds for weekly grain: green ≤ 60, yellow ≤ 90, red above (tunable; document that they're calibrated for intermittent retail demand).
+   - Replace the bias row: show `(totalForecast / totalActual − 1)` as a signed percentage labeled "Forecast vs actual" (both totals already arrive in `overall`). Keep MAE.
+   - Add a "vs naive" line: naive weekly WMAPE and FVA. FVA > 0 = engine adds value.
+   - The lead-time chart needs no code change — buckets will populate as F7 rows mature (7d lead evaluable after 7 days, 30d after 30, etc.).
+4. `confidenceLevel` in `/forecast/metrics` (`dashboard.js` ~line 360) is "share of products forecast via lifecycle curves", not confidence. It only feeds a per-day tooltip field: rename the JSON field to `curveCoverage` and update the one consumer in `ForecastMetrics.tsx`, or leave it and add a comment; low priority.
+
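The weekly metric defined in step 1 is easy to prototype outside SQL. The sketch below assumes one row per pid-day, already deduped and restricted to short leads, and uses Monday as the week start to match PostgreSQL's `date_trunc('week', ...)`; the function name and row layout are hypothetical:

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_wmape(rows):
    """rows: (pid, forecast_date, forecast_units, actual_units), one per pid-day.

    Returns (wmape or None, number of pid-weeks scored).
    """
    weeks = defaultdict(lambda: [0, 0.0, 0.0])   # day count, forecast sum, actual sum
    for pid, day, fc, act in rows:
        week_start = day - timedelta(days=day.weekday())   # Monday of that week
        bucket = weeks[(pid, week_start)]
        bucket[0] += 1
        bucket[1] += fc
        bucket[2] += act
    # Keep complete weeks only, and drop pid-weeks where forecast and actual are both 0.
    scored = [(fc, act) for n, fc, act in weeks.values()
              if n == 7 and (fc != 0 or act != 0)]
    total_actual = sum(act for _, act in scored)
    if total_actual == 0:
        return None, len(scored)
    return sum(abs(fc - act) for fc, act in scored) / total_actual, len(scored)


monday = date(2026, 6, 1)
rows = []
for i in range(7):
    day = monday + timedelta(days=i)
    rows.append((1, day, 1.0, 7.0 if i == 2 else 0.0))   # lumpy demand, right weekly total
    rows.append((2, day, 0.5, 1.0))                       # steady under-forecast
for i in range(3):
    rows.append((3, monday + timedelta(days=i), 1.0, 1.0))  # incomplete week, dropped
print(weekly_wmape(rows))
```

Product 1 shows why the grain matters: every individual day is wrong, yet the weekly total is exact, so it contributes zero error at weekly grain.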
+### Phase 3 validation
+
+- Next run after deploy: `forecast_accuracy` contains `overall_weekly` and `fva` values; `/dashboard/forecast/accuracy` returns them; the overview popover renders weekly WMAPE, bias %, and the naive comparison.
+- After 7/14/30 days: `by_lead_time` rows appear for '8-14d', '15-30d', '31-60d' buckets respectively (61-90d after ~60 days).
+- Confirm engine runtime still < ~5 min and `product_forecasts_history` growth ≈ 50–70K rows/day.
+
+---
+
+## Phase 4 — Optional / after the above is proven
+
+- **F6. TSB for slow movers + dormant** (spec in F5). Gate on Phase 3 measurement: ship only if weekly FVA improves on those phases.
+- **F10. Confidence-margin source:** `load_accuracy_margins()` feeds daily-grain per-phase WMAPE (clamped to 1.0) into the intervals, so every interval is ±100% — uninformative. Once `overall_weekly` exists, add per-phase weekly rows (`by_phase_weekly`) and source margins from those instead.
+- **F11.** Update or delete `backfill_accuracy_data()` (it encodes the old formulas). Until then, just don't run `--backfill`.
+- **F12.** `compute_dow_indices()` weights by revenue but the multipliers are applied to units — switch `SUM(o.price * o.quantity)` to `SUM(o.quantity)`. Tiny effect.
+- **F13.** Longer term: for reorder decisions the right target is P(lead-time demand > stock), not a point forecast. Evaluate quantile (pinball) loss at lead-time horizons using the existing confidence-interval columns. Design separately.
+
+---
+
+## 4. Success criteria
+
+1. Rolling-14-day portfolio forecast/actual ratio within **0.8–1.25** (currently 1.5–2.5).
+2. Weekly-grain WMAPE ≤ **90%** and **FVA > 0** (engine beats naive) sustained for 2+ weeks.
+3. Decay/preorder/mature per-phase bias within ±0.1 units/day (currently +0.35 / +0.85 / +0.17).
+4. `all_incl_dormant` actuals covered: dormant bias better than −0.4 (currently −1.36, i.e. 100% miss).
+5. Lead-time buckets through 31–60d populated with ≥10K samples each within ~6 weeks.
+6. Launch phase stays healthy (bias within ±0.15, WMAPE not degraded) — regression guard for F3/F4 changes.
+
+## 5. Re-measurement appendix
+
+The naive-vs-engine comparison used in the diagnosis (rerun any time; adjust dates):
+
+```sql
+WITH ranked AS (
+  SELECT pfh.pid, pfh.forecast_date, pfh.forecast_units, pfh.lifecycle_phase,
+    ROW_NUMBER() OVER (PARTITION BY pfh.pid, pfh.forecast_date ORDER BY fr.started_at DESC) rn
+  FROM product_forecasts_history pfh
+  JOIN forecast_runs fr ON fr.id = pfh.run_id
+  WHERE pfh.forecast_date BETWEEN CURRENT_DATE - 9 AND CURRENT_DATE - 1),
+eng AS (SELECT * FROM ranked WHERE rn = 1 AND lifecycle_phase != 'dormant'),
+naive AS (
+  SELECT o.pid, SUM(o.quantity)/30.0 AS naive_daily FROM orders o
+  WHERE o.canceled IS DISTINCT FROM TRUE
+    AND o.date >= CURRENT_DATE - 39 AND o.date < CURRENT_DATE - 9
+  GROUP BY o.pid)
+SELECT e.lifecycle_phase, COUNT(*) AS n, SUM(COALESCE(dps.units_sold,0)) AS actual,
+  round(SUM(e.forecast_units),0) AS engine_fc, round(SUM(COALESCE(nv.naive_daily,0)),0) AS naive_fc,
+  round(SUM(ABS(e.forecast_units - COALESCE(dps.units_sold,0)))/NULLIF(SUM(COALESCE(dps.units_sold,0)),0),2) AS engine_wmape,
+  round(SUM(ABS(COALESCE(nv.naive_daily,0) - COALESCE(dps.units_sold,0)))/NULLIF(SUM(COALESCE(dps.units_sold,0)),0),2) AS naive_wmape
+FROM eng e
+LEFT JOIN naive nv ON nv.pid = e.pid
+LEFT JOIN daily_product_snapshots dps ON dps.pid = e.pid AND dps.snapshot_date = e.forecast_date
+GROUP BY ROLLUP(e.lifecycle_phase) ORDER BY 1;
+```
+
+Baseline numbers to beat (June 1–9, 2026): engine 221% / naive 204% daily WMAPE; engine_fc/actual = 1.82; per-phase table in §1.
@@ -0,0 +1,449 @@
+# Import & Metrics Pipeline Fix Plan
+
+Fixes for issues found in a full review (2026-06-10) of the `full-update.js` pipeline:
+`inventory-server/scripts/full-update.js` → `import-from-prod.js` (6 importers in `scripts/import/`)
+→ `calculate-metrics-new.js` (7 SQL modules in `scripts/metrics-new/`).
+
+Every issue below was verified against the code, and where marked **[verified-live]**, against the
+live MySQL source (`sg` on 192.168.1.5 via the acot-db tooling / `ssh workpi`) and live PostgreSQL
+(`inventory_db` — `ssh netcup`, then `psql -U inventory_readonly`, password in `/Users/matt/Dev/inventory/CLAUDE.md`).
+Write credentials for migrations: see `/var/www/inventory/.env` on netcup (`inventory_user`).
+
+## Operational context (read first)
+
+- Local `inventory-server/` is **NFS-mounted** to `/var/www/inventory/` on the netcup server — edits
+  appear on the server with no copy step. Run heavy validation/grep/find **on the server via
+  `ssh netcup`**, not locally (NFS hangs + AppleDouble `._*` noise).
+- The PG server timezone is **Europe/Berlin**. The business operates in **America/Chicago**. This
+  matters for Fix 2.
+- MySQL server is America/Chicago; the mysql2 driver is configured `timezone: '-05:00'` and
+  corrected at runtime by `adjustDateForMySQL()` in `scripts/import/utils.js` (see
+  `memory/TIMEZONE_ISSUE.md`). Don't "fix" that part — it already works.
+- Orders/PO/products imports are incremental by default (`INCREMENTAL_UPDATE !== 'false'`); a full
+  orders sync = run with `INCREMENTAL_UPDATE=false` (5-year window).
+- Existing rebuild tooling: `scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (rebuilds
+  `daily_product_snapshots` from `orders`/`receivings`). The full-pipeline order after data fixes:
+  re-import → rebuild snapshots → `node scripts/calculate-metrics-new.js`.
+- Precedent: `scripts/metrics-new/migrations/002_fix_discount_double_counting.sql` documents the
+  procedure used last time a discount formula changed. Follow the same pattern (migration doc +
+  code fix + full re-import + rebuild).
+
+---
+
+## P0 — Data correctness (do both, then ONE re-import + rebuild)
+
+### Fix 1: Item-level promo discounts dropped (~$26K / 30 days ≈ 10% of product revenue) [verified-live]
+
+**File:** `scripts/import/orders.js` — `order_totals` CTE (~lines 604-623) and the discount fetch in
+`processDiscountsBatch` (~lines 379-383).
+
+**Problem.** The discount applied to each PG `orders` row is:
+prorated `summary_discount_subtotal` + item-level promo discounts. The item-level part is gated:
+
+```sql
+SUM(CASE WHEN COALESCE(md.discount_amount_subtotal, 0) > 0 THEN id.amount ELSE 0 END)
+```
+
+In the PHP source (`/Users/matt/Dev/acot/website/website/lib/neworder.class.php`):
+- `order_items.prod_price` is the **pre-promo** price; `summary_subtotal = Σ prod_price·qty` (line ~3087).
+- Item-level promo discounts live in `order_discount_items` with `which = 2`; they are applied to the
+  order total via `summary_discount += amount + products_disc_sum` (line ~6567) — i.e. they are **not**
+  part of `discount_amount_subtotal` and **not** baked into `prod_price`.
+- Live data (90 days): of 10,010 type-10 promo discounts, **8,070 have item rows but only 8 have
+  `discount_amount_subtotal > 0`** — the gate zeroes essentially all item-level promo discounts.
+- Live impact (30 days): **$25,989 dropped** across 2,021 orders, vs only $13,574 captured via the
+  prorated subtotal component. Order discount components, 30d: total $54,957 = $13,574 subtotal +
+  $15,395 shipping + ~$25,989 item-level. (Shipping discounts correctly excluded from product revenue.)
+
+**Consequence.** `orders.discount` understated → `net_revenue`, `profit_30d`, `margin_30d` overstated
+by ~10% of revenue; `discounts_30d` / `discount_rate_30d` ~3x understated. Flows into daily snapshots,
+product/brand/vendor/category metrics, and dashboards.
+
+**Fix.**
+1. In `processDiscountsBatch`, fetch only real item discounts:
+   `SELECT order_id, pid, discount_id, amount FROM order_discount_items WHERE order_id IN (?) AND which = 2`.
+   (`which=1` rows store prices of free promo-added items; `which=3` are usage records — neither is a
+   discount amount.)
+2. In the `order_totals` CTE, remove the gate — sum `id.amount` unconditionally:
+   `SUM(COALESCE(id.amount, 0)) AS promo_discount_sum` (drop the join/CASE on `temp_main_discounts`;
+   `temp_main_discounts` becomes unused and can be removed entirely along with its insert loop).
+3. Sanity guard (optional, recommended): clamp final per-row discount to `price * quantity`.
+
+**Verification.** After a FULL orders re-import, for a recent 30-day window PG should satisfy:
+`SUM(discount)` ≈ MySQL `Σ summary_discount_subtotal` + `Σ order_discount_items.amount (which=2)`
+over the same orders (± rounding from proration). Spot-check an order with a type-10 promo:
+discount on the affected pid ≈ the `which=2` amount. Re-run migration 002's verification query too
+(pids 624756, 614513) to confirm no regression of the prior fix.
+
+### Fix 2: Daily snapshots bucket sales by Europe/Berlin days, not business days [verified-live]
+
+**Files:** `scripts/metrics-new/update_daily_snapshots.sql` (SalesData join `o.date::date = _target_date`
+~line 138; gap-fill and stale-detection aggregates at lines ~47-83);
+`scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (same pattern — check & fix);
+`scripts/metrics-new/update_product_metrics.sql` (`HistoricalDates` `MIN(o.date)::date` etc., lines ~131-147).
+
+**Problem.** `orders.date` is `timestamptz`; `::date` casts in the server TZ (**Europe/Berlin**,
+verified via `SHOW timezone`). Berlin is 7h ahead of Central (6h in the few weeks each year when
+US and EU DST dates differ), so every order placed after ~5 PM Central lands on the **next**
+snapshot day. This shifts a large evening slice of daily sales
+forward one day; skews `yesterday_sales`, day-of-week patterns (the forecast engine's DOW
+multipliers, daily-grain forecast accuracy — see `FORECAST_FIX_PLAN.md`), and is inconsistent with
+`stock_snapshots`, whose dates come from a Central-time MySQL cron.
+
+**Fix.** Bucket all order/receiving dates in business time. Replace every `o.date::date` /
+`received_date::date` used for *day bucketing* in the two snapshot SQL files with:
+
+```sql
+(o.date AT TIME ZONE 'America/Chicago')::date
+```
+
+Apply consistently in: SalesData, ReceivingData, the gap-fill date lists, the stale-detection
+aggregates (they must match SalesData or every day looks permanently stale), and the rebuild script.
+`HistoricalDates` in update_product_metrics (first/last sold dates) should match too.
+Add an index to keep the per-day loop fast, e.g.
+`CREATE INDEX ON orders ( ((date AT TIME ZONE 'America/Chicago')::date) );` and equivalent on
+`receivings(received_date)`; check `EXPLAIN` on the SalesData query afterward.
+
+Note: `receivings.received_date` came from MySQL DATETIME (Central literal) inserted as timestamptz —
+it was interpreted in the *session* TZ at insert. Before converting, spot-check a few receivings
+against MySQL to confirm which TZ the stored instants actually represent; the conversion expression
+must yield the Central calendar day MySQL shows. Same check for `orders.date` (it originates from
+`_order.date_placed`, a TIMESTAMP column, so it should be a correct instant — `AT TIME ZONE
+'America/Chicago'` is right for it).
+
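A minimal sketch of the day shift described above, assuming fixed summer offsets (CDT = UTC-5, CEST = UTC+2) in place of the real zone rules so it runs without a tz database. `bucket_day` and the sample order time are illustrative, not part of the codebase.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed summer offsets stand in for America/Chicago and Europe/Berlin.
CENTRAL_SUMMER = timezone(timedelta(hours=-5), "CDT")
BERLIN_SUMMER = timezone(timedelta(hours=2), "CEST")


def bucket_day(instant, tz):
    """Calendar day of an aware instant as seen in tz, like (ts AT TIME ZONE tz)::date."""
    return instant.astimezone(tz).date()


# Hypothetical order placed at 18:30 Central on 2026-06-09.
order_instant = datetime(2026, 6, 9, 18, 30, tzinfo=CENTRAL_SUMMER)

berlin_day = bucket_day(order_instant, BERLIN_SUMMER)    # what a bare ::date yields today
central_day = bucket_day(order_instant, CENTRAL_SUMMER)  # what the fixed expression yields

print(berlin_day.isoformat(), central_day.isoformat())
```

The same instant lands on 2026-06-10 in Berlin and 2026-06-09 in Central, which is the one-day forward shift the fix removes.
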
+**Verification.** Pick 2-3 recent days; compare per-day `units_sold` totals in
+`daily_product_snapshots` against MySQL
+`SELECT date_placed_onlydate, SUM(qty_ordered) ... WHERE order_status >= 20 GROUP BY 1`
+(MySQL stores Central days). They should now match closely (small diffs from canceled-status timing).
+
+### P0 execution order (single pass)
+
+1. Land Fix 1 (orders.js) and Fix 2 (both snapshot SQL files + product-metrics date CTE).
+2. Full orders re-import: `INCREMENTAL_UPDATE=false node scripts/import-from-prod.js` (or at minimum
+   the orders step) — run on the server, it's long.
+3. Rebuild snapshots: `psql -f scripts/metrics-new/backfill/rebuild_daily_snapshots.sql` (after
+   confirming it contains the TZ fix). The hourly job's 90-day self-heal will NOT fix history beyond
+   90 days by itself; the explicit rebuild is required.
+4. `node scripts/calculate-metrics-new.js`.
+5. Expect dashboards to show: margins down ~8-10 points (real), daily sales curves shifted, DOW
+   profile changed. Tell the user before/after numbers.
+
+---
+
+## P1 — Wrong or drifting numbers, fix soon
+
+### Fix 3: Vendor avg lead time computed over a near-cartesian join
+
+**File:** `scripts/metrics-new/calculate_vendor_metrics.sql`, `VendorPOAggregates` (lines ~62-83).
+
+**Problem.** Joins each done-PO line to **every** receiving of the same (pid, supplier) after the PO
+date: a product received 10 times contributes 10 ever-growing lead times → an overstated vendor
+lead time that is weighted toward busy products. The per-product version in
+`update_periodic_metrics.sql` (lines 27-48) is correct (MIN receiving per PO within 180 days, then
+average).
+
+**Fix.** Reuse the periodic shape, aggregated to vendor:
+
+```sql
+WITH po_first_receiving AS (
+    SELECT po.vendor, po.po_id, po.pid, po.date::date AS po_date,
+           MIN(r.received_date::date) AS first_receive_date
+    FROM purchase_orders po
+    JOIN receivings r ON r.pid = po.pid AND r.supplier_id = po.supplier_id
+        AND r.received_date >= po.date
+        AND r.received_date <= po.date + INTERVAL '180 days'
+    WHERE po.status = 'done' AND po.date >= CURRENT_DATE - INTERVAL '1 year'
+      AND po.vendor IS NOT NULL AND po.vendor <> ''
+    GROUP BY po.vendor, po.po_id, po.pid, po.date
+)
+SELECT vendor, COUNT(DISTINCT po_id) AS po_count_365d,
+       ROUND(AVG(GREATEST(1, first_receive_date - po_date)))::int AS avg_lead_time_days_hist
+FROM po_first_receiving GROUP BY vendor
+```
+
+**Verification.** For a few vendors compare old vs new values; new should be materially lower and
+roughly match `AVG(product_metrics.avg_lead_time_days)` for that vendor's products.
+
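To illustrate why the join shape matters, here is a small sketch with made-up dates: one PO and three later receivings of the same product. The function names are hypothetical; the 180-day window and the 1-day floor mirror the SQL above.

```python
from datetime import date
from statistics import mean

# Hypothetical data: one done PO and three later receivings of the same (pid, supplier).
po_date = date(2026, 1, 1)
receivings = [date(2026, 1, 11), date(2026, 2, 10), date(2026, 3, 12)]


def lead_time_all_receivings(po_date, receivings):
    """Old shape: every later receiving contributes its own 'lead time'."""
    return mean((r - po_date).days for r in receivings if r >= po_date)


def lead_time_first_receiving(po_date, receivings, window_days=180):
    """Fixed shape: only the first receiving inside the window counts, floored at 1 day."""
    in_window = [r for r in receivings if 0 <= (r - po_date).days <= window_days]
    if not in_window:
        return None  # never received inside the window, so excluded from the average
    return max(1, (min(in_window) - po_date).days)


old_value = lead_time_all_receivings(po_date, receivings)
new_value = lead_time_first_receiving(po_date, receivings)
print(old_value, new_value)
```

With these dates the old shape averages 10, 40, and 70 days, while the fixed shape keeps only the first receiving.
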
+### Fix 4: Deleted order items & combined orders never reconciled in PG [verified-live]
+
+**File:** `scripts/import/orders.js`.
+
+**Problem.** The orders import upserts but never deletes:
+- Items removed from an order in MySQL (`DELETE FROM order_items ...` happens, e.g.
+  neworder.class.php ~line 6500 for unpicked promo items, plus staff edits) leave stale rows in PG
+  forever. May 2026 check: PG has 49,841 item rows vs MySQL 49,377 (+0.9%) — and PG should be ≤
+  MySQL.
+- Combining orders (`combine_orders`, neworder.class.php ~line 11946) sets the source orders to
+  status 16 AND **zeroes `date_placed`**, then copies all items to a NEW order. Because the import
+  query filters `o.date_placed >= …`, a combined source order can never be re-fetched, so its stale
+  'placed' rows would double-count with the new merged order. Currently latent (last combine
+  2024-07, predating current PG data; verified no stale rows exist today), but it will silently
+  corrupt data the day combining is used again.
+
+**Fix.** Two parts, both inside the orders import after the upsert phase:
+1. **Item-set reconciliation** for re-imported orders: the import already knows the set of changed
+   `orderIds` and inserted their current items into `temp_order_items`. Mirror the PO import's
+   pattern (`purchase-orders.js` lines ~683-694):
+   ```sql
+   DELETE FROM orders o
+   WHERE o.order_number = ANY($1)           -- orders fetched this run
+     AND NOT EXISTS (SELECT 1 FROM temp_order_items t
+                     WHERE t.order_id = o.order_number AND t.pid = o.pid);
+   ```
+2. **Combined/cancelled sweep** that does NOT depend on `date_placed`: each run, fetch from MySQL
+   `SELECT order_id, order_status FROM _order WHERE order_status IN (15,16) AND stamp > ?`
+   (no date_placed filter) and update matching PG rows' `status`/`canceled`
+   ('combined' rows are then excluded from metrics — see Fix 5). Cheap (small result set).
+
+**Verification.** Re-run the May-2026 row-count comparison (MySQL vs PG for one month) after one full
+run; counts should converge (PG ≤ MySQL, diff explained by TZ window edges only).
+
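The item-set reconciliation in part 1 amounts to a set difference scoped to the orders fetched in the current run. A minimal sketch, with a hypothetical helper name and made-up keys:

```python
def stale_item_keys(pg_rows, fetched_order_ids, current_items):
    """(order_id, pid) keys held in PG for a re-fetched order but no longer in the source."""
    fetched = set(fetched_order_ids)
    current = set(current_items)
    return sorted(key for key in pg_rows if key[0] in fetched and key not in current)


# Hypothetical state: order 100 lost pid 2 in MySQL; order 102 was not fetched this run.
pg_rows = {(100, 1), (100, 2), (101, 5), (102, 9)}
fetched_order_ids = [100, 101]
current_items = [(100, 1), (101, 5)]

to_delete = stale_item_keys(pg_rows, fetched_order_ids, current_items)
print(to_delete)
```

Only rows belonging to orders fetched this run are candidates, so an order that was not re-imported is never touched.
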
+### Fix 5: 'combined' orders are counted as sales
+
+**Files:** `scripts/metrics-new/update_daily_snapshots.sql` (status filters, lines ~77, 120-134),
+`update_product_metrics.sql` (`HistoricalDates` line ~145, `LifetimeRevenue` line ~249),
+`backfill/rebuild_daily_snapshots.sql`.
+
+**Problem.** Sales filters exclude only `('canceled', 'returned')`. Status 16 'combined' = "merged
+into another order" — the new order carries the same items, so counting both double-counts. 826
+combined orders exist in MySQL; today none are in PG (see Fix 4), but once Fix 4's sweep starts
+marking rows 'combined', the metrics filters must exclude them.
+
+**Fix.** Change every `NOT IN ('canceled', 'returned')` in the metrics SQL to
+`NOT IN ('canceled', 'returned', 'combined')`. Grep for the pattern in `scripts/metrics-new/` and
+`src/routes/` (dashboard endpoints replicate these filters — see CLAUDE.md analytics-filters note).
+
+### Fix 6: Incremental sync watermark race (silent permanent misses)
+
+**Files:** `scripts/import/orders.js` (~772), `products.js` (~934), `purchase-orders.js` (~833).
+
+**Problem.** `sync_status.last_sync_timestamp` is set to `NOW()` *after* the import finishes. Any
+MySQL row modified between the source query and that write is below the new watermark but was never
+fetched → permanently skipped (until a full sync or the row changes again). Long imports widen the
+window; PG/MySQL clock skew adds to it.
+
+**Fix.** Capture the watermark **before** the source query and write that value:
+```js
+const [[{ now: sourceNow }]] = await prodConnection.query('SELECT NOW() as now');
+// ... do the import ...
+await localConnection.query(
+  `INSERT INTO sync_status ... VALUES ('orders', $1) ON CONFLICT ... SET last_sync_timestamp = $1`,
+  [sourceNow]);
+```
+Using MySQL's own clock also eliminates cross-server skew. Note `sourceNow` comes back through the
+mysql2 driver TZ conversion — verify round-tripping with `adjustDateForMySQL` produces a correct
+comparison value, or store `UTC_TIMESTAMP()` and compare against `CONVERT_TZ`-normalized stamps.
+Overlap (re-importing rows changed during the run) is harmless — everything is upserted.
+
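The race can be reproduced with a toy simulation using integer clock ticks. Everything here (function names, tick values) is hypothetical; it only models "the source query sees rows stamped after the last watermark and up to query time".

```python
def run_incremental(rows, last_watermark, query_time, finish_time, watermark_before_query):
    """Simulate one incremental run over rows = {row_id: modified_at}.

    The source query sees modifications stamped in (last_watermark, query_time].
    Returns (fetched_ids, new_watermark).
    """
    fetched = {rid for rid, ts in rows.items() if last_watermark < ts <= query_time}
    new_watermark = query_time if watermark_before_query else finish_time
    return fetched, new_watermark


def two_runs(rows, watermark_before_query):
    """Run 1 queries at t=10 and finishes at t=15; run 2 queries at t=20."""
    seen_first, watermark = run_incremental(rows, 0, 10, 15, watermark_before_query)
    seen_second, _ = run_incremental(rows, watermark, 20, 25, watermark_before_query)
    return seen_first | seen_second


# Row "b" is modified at t=12: after run 1's query, before run 1 writes its watermark.
rows = {"a": 5, "b": 12}
missed_with_old_watermark = set(rows) - two_runs(rows, watermark_before_query=False)
missed_with_new_watermark = set(rows) - two_runs(rows, watermark_before_query=True)
print(missed_with_old_watermark, missed_with_new_watermark)
```

Writing the finish time as the watermark skips row "b" for good; writing the pre-query time picks it up on the next run.
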
+### Fix 7: Stockout days / service level / fill rate / avg stock built on activity-only snapshots
+
+**Files:** `scripts/metrics-new/update_product_metrics.sql` — `SnapshotAggregates`
+(`stockout_days_30d`, `avg_stock_*_30d`, lines ~177-189), `ServiceLevels` (lines ~304-323),
+plus `calculate_sales_velocity` usage.
+
+**Problem.** `daily_product_snapshots` only has rows on days with sales/receivings. So:
+- A product that is out of stock (and therefore sells nothing) gets **no row** → `stockout_days_30d`
+  ≈ 0 exactly when stockouts matter → `calculate_sales_velocity(sales, stockout_days)`'s adjustment
+  is inert → velocity and replenishment understated for constrained products.
+- `service_level_30d` divides stockout days by COUNT(activity days), not 30.
+- `avg_stock_units_30d` / `avg_stock_cost_30d` average only activity days (biased toward in-stock
+  days) → GMROI / stockturn / sell-through denominators biased.
+- `fill_rate_30d`'s `units_sold * 0.2` lost-sales heuristic is arbitrary — fine to keep, but document.
+
+**Fix.** Derive stock-presence metrics from `stock_snapshots` (full daily coverage from MySQL
+`snap_product_value`, imported by `stock-snapshots.js`) instead of `daily_product_snapshots`:
+```sql
+StockCoverage AS (
+  SELECT pid,
+         COUNT(*) FILTER (WHERE stock_quantity <= 0) AS stockout_days_30d,
+         AVG(stock_quantity)  AS avg_stock_units_30d,
+         AVG(stock_value)     AS avg_stock_cost_30d
+  FROM stock_snapshots
+  WHERE snapshot_date >= _current_date - INTERVAL '29 days'
+  GROUP BY pid
+)
+```
+Treat products absent from `stock_snapshots` for a day as unknown (NULL), not in-stock. Keep
+`daily_product_snapshots` for sales/revenue aggregates. `service_level_30d` denominator becomes the
+count of covered days. Note `stock_snapshots` has no `eod_stock_retail`; keep retail/gross averages
+on the old source or compute as `stock_quantity * current price` explicitly.
+
+**Verification.** Pick products that had a known stockout period; `stockout_days_30d` should now
+be > 0 and `sales_velocity_daily` should rise accordingly.
+
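A small sketch of the coverage bias, using made-up stock series. `None` stands for a day with no snapshot, treated as unknown as recommended above; the helper names are illustrative.

```python
def stockout_days(daily_stock):
    """Count covered days with stock <= 0; None (no snapshot) is neither in nor out of stock."""
    return sum(1 for qty in daily_stock if qty is not None and qty <= 0)


def avg_stock(daily_stock):
    """Average stock over covered days only; None when no day is covered."""
    covered = [qty for qty in daily_stock if qty is not None]
    return sum(covered) / len(covered) if covered else None


# Hypothetical 30-day window: 20 in-stock days at 6 units, then 10 stocked-out days.
full_coverage = [6] * 20 + [0] * 10
# Activity-only view: rows exist only on the 8 days with sales, all inside the in-stock span.
activity_only = [6] * 8

print(stockout_days(activity_only), stockout_days(full_coverage))
print(avg_stock(activity_only), avg_stock(full_coverage))
```

The activity-only series reports zero stockout days and a higher average stock, which is the bias that moving to `stock_snapshots` is meant to remove.
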
+---
+
+## P2 — Definition / robustness improvements
+
+### Fix 8: Returns don't reduce COGS; LifetimeRevenue ignores returns
+`update_daily_snapshots.sql` SalesData: COGS accrues only on `quantity > 0` rows; return rows
+(negative qty — 15,875 rows live) subtract revenue but never COGS → margin understated in
+return-heavy periods. Add a returns-COGS term mirroring the sales-COGS COALESCE chain
+(`SUM(... WHEN quantity < 0 THEN cost * ABS(quantity))`) and subtract it in `cogs` (or store
+`returns_cogs` separately and use `cogs - returns_cogs` in profit). Also `LifetimeRevenue` in
+`update_product_metrics.sql` (line ~242) filters `quantity > 0` — include negative-qty rows so
+lifetime revenue nets out returns (drop the quantity filter; `price*quantity` is already signed,
+but check the `- discount` term sign for return rows).
+
+### Fix 9: return_rate_30d definition
+`update_product_metrics.sql` line ~468: `returns / (sales + returns)` → industry standard is
+`returns / sales`. Change denominator to `NULLIF(sa.sales_30d, 0)`.
+
+### Fix 10: GMROI not annualized
+Line ~466: `profit_30d / avg_stock_cost_30d` is a monthly GMROI (~1/12 of the conventional annual
+figure, benchmark ≥ 2-3). Either annualize (`* 12.17`) or rename the column/label "monthly".
+Decision for Matt; annualizing is recommended for comparability. Frontend displays must be checked
+either way.
+
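For reference, the annualization arithmetic as a sketch with hypothetical inputs; the factor 365/30 is where the `12.17` above comes from.

```python
def gmroi_monthly(profit_30d, avg_stock_cost_30d):
    """30-day gross profit per unit of average inventory cost; None when there is no stock cost."""
    if not avg_stock_cost_30d:
        return None
    return profit_30d / avg_stock_cost_30d


def gmroi_annualized(profit_30d, avg_stock_cost_30d):
    """Scale the 30-day figure to a year (365 / 30 ≈ 12.17)."""
    monthly = gmroi_monthly(profit_30d, avg_stock_cost_30d)
    return None if monthly is None else monthly * 365 / 30


# Hypothetical product: $2,000 profit in 30 days on $10,000 average stock at cost.
monthly = gmroi_monthly(2000, 10000)
annual = gmroi_annualized(2000, 10000)
print(monthly, round(annual, 2))
```
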
+### Fix 11: get_weighted_avg_cost is a lifetime WAC
+`db/functions.sql` (~line 81, deployed identically): averages ALL receivings ≤ date — decade-old
+costs weigh equally. Recommended: window to recent receivings, e.g. last 365 days falling back to
+lifetime when none. Used as fallback COGS when `o.costeach` is NULL, so impact is modest but real
+for long-lived SKUs. Apply with `CREATE OR REPLACE FUNCTION` in `db/functions.sql` AND on the live DB.
+
+### Fix 12: exclude_from_forecast removes products from product_metrics entirely
+`update_product_metrics.sql` line ~627 (`WHERE s.exclude_forecast IS FALSE OR ... IS NULL`): the
+flag's name implies forecast-only, but excluded products get NO metrics row → vanish from brand/
+vendor/category rollups and dashboards. Fix: always emit the row; instead NULL the
+forecast/replenishment columns when excluded (wrap those expressions in
+`CASE WHEN s.exclude_forecast THEN NULL ELSE ... END`).
+
+### Fix 13: Incremental products import misses category-only changes
+`products.js` incremental WHERE (~lines 433-440) keys on `p.stamp`, `ci.stamp`, price/b2b dates —
+`product_category_index` changes don't bump any of those → PG `product_categories` goes stale. Also
+the `needs_update` comparison (~lines 604-625) doesn't compare `categories`, so even refetched rows
+skip the category rewrite. Fix both: add `t.categories IS NOT DISTINCT FROM p.categories` to the
+needs_update comparison (note: `products.categories` is the GROUP_CONCAT string — confirm PG column
+holds the same representation), and add a cheap full-sweep (e.g. weekly, or compare
+`COUNT(*) GROUP BY pid` hashes) OR include `EXISTS (SELECT 1 FROM product_category_index pci WHERE
+pci.pid = p.pid AND pci.stamp > ?)` in the incremental WHERE if that table has a stamp column —
+verify schema first (`DESCRIBE product_category_index`).
+
+### Fix 14: PO/receivings OFFSET pagination over a moving filter
+`purchase-orders.js` (~lines 275-298, 447-470): `LIMIT/OFFSET` with a `date_updated > ?` predicate;
+concurrent updates shift rows between pages → silent skips. Fix: keyset pagination —
+`WHERE ... AND p.po_id > ? ORDER BY p.po_id LIMIT 500`, carrying the last seen po_id (drop OFFSET).
+Same for receivings on `receiving_id`.
+
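A minimal keyset-pagination sketch against an in-memory SQLite table. The table `po` and its columns are stand-ins for the real MySQL source, and the loop assumes positive, unique `po_id` values.

```python
import sqlite3


def fetch_all_keyset(conn, since, page_size=500):
    """Page through changed POs by po_id instead of OFFSET; returns every po_id seen."""
    seen, last_id = [], 0  # assumes po_id values are positive
    while True:
        page = conn.execute(
            "SELECT po_id FROM po WHERE date_updated > ? AND po_id > ? "
            "ORDER BY po_id LIMIT ?",
            (since, last_id, page_size),
        ).fetchall()
        if not page:
            return seen
        seen.extend(row[0] for row in page)
        last_id = page[-1][0]  # carry the last seen key into the next page


# Hypothetical table with seven changed POs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE po (po_id INTEGER PRIMARY KEY, date_updated INTEGER)")
conn.executemany("INSERT INTO po VALUES (?, ?)", [(i, 100 + i) for i in range(1, 8)])

all_ids = fetch_all_keyset(conn, since=100, page_size=3)
print(all_ids)
```

Because each page starts strictly after the last key seen, rows whose `date_updated` changes mid-run cannot push later rows across a page boundary the way they can with OFFSET.
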
+### Fix 15: Status map gaps and unsafe defaults
+- `orders.js` orderStatusMap lacks 45 (`payment_pending`) and 67 (`remote_send`) → imported as
+  numeric strings. Add both (mirror in `migrations/001_map_order_statuses.sql` as a follow-up update
+  for existing rows).
+- `purchase-orders.js` `poStatusMap[po.status] || 'created'` (line ~335): an unknown *cancel-like*
+  code would be treated as an open PO and inflate on-order FIFO. Default to a sentinel like
+  `'unknown_<code>'` instead, and make the FIFO/on-order CTEs in `update_product_metrics.sql` treat
+  only the known-open statuses as open (they already whitelist open statuses — so the sentinel is
+  safe there; just ensure nothing treats unknown as 'created'). Same for receivingStatusMap.
+
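The sentinel default can be sketched as follows. The numeric codes are hypothetical (the real `poStatusMap` lives in `purchase-orders.js`); the open-status names are the ones the forecast engine's open-PO query uses.

```python
# Hypothetical numeric codes, for illustration only.
PO_STATUS_MAP = {
    0: "canceled",
    1: "created",
    10: "electronically_sent",
    11: "ordered",
    30: "receiving_started",
    50: "done",
}
OPEN_STATUSES = {"created", "ordered", "electronically_sent", "receiving_started"}


def map_po_status(code):
    """Map a source status code; unknown codes get a sentinel instead of 'created'."""
    return PO_STATUS_MAP.get(code, f"unknown_{code}")


def is_open(status):
    """Only whitelisted statuses count as open, so sentinels never inflate on-order."""
    return status in OPEN_STATUSES


print(map_po_status(99), is_open(map_po_status(99)), is_open(map_po_status(1)))
```
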
+### Fix 16: Transactions issued through the pool wrapper land on arbitrary connections
+`categories.js` (lines ~17-152) and `daily-deals.js` (~27-130) call `query('BEGIN')` /
+`query('COMMIT')` on the wrapper, which checks out a client per call — BEGIN/work/COMMIT are not
+guaranteed to share a connection (works only by pool-LIFO accident). The categories
+`DISABLE TRIGGER` rides on this too. Fix: use the wrapper's `beginTransaction()/commit()/rollback()`
+(see `utils.js` lines 121-148) exactly as orders.js does. In categories.js also move the
+post-COMMIT `ENABLE TRIGGER` inside the transaction (DISABLE/ENABLE both inside), or drop the
+trigger toggling entirely if the trigger isn't actually problematic anymore.
+
+### Fix 17: stock-snapshots import swallows batch errors → permanent holes
+`stock-snapshots.js` (~lines 153-155): a failed batch is logged and skipped, but the next
+incremental starts at `MAX(snapshot_date)` — the hole is never revisited. Fix: rethrow (fail the
+step) or collect failed date ranges and retry once, then fail if still failing. Also line ~168:
+`calculateRate(processedRows, startTime)` — arguments reversed (signature is
+`calculateRate(startTime, current)`, see `metrics-new/utils/progress.js:70`).
+
+### Fix 18: Metrics cancellation targets an application_name that's never set
+`calculate-metrics-new.js` line ~180 cancels backends `WHERE application_name =
+'node-metrics-calculator'`, but the Pool config never sets it → cancellation no-ops (the 30-min
+`statement_timeout` is the only real guard). Fix: add `application_name: 'node-metrics-calculator'`
+to both dbConfig branches.
+
+### Fix 19: Aggregate-table change-detection lists miss cost-only changes
+`calculate_brand_metrics.sql` / `calculate_vendor_metrics.sql` / `calculate_category_metrics.sql`
+ON CONFLICT WHERE lists don't include `profit_30d`/`cogs_30d` — a cost revision with unchanged
+sales/revenue leaves stale rows (product_metrics has a 1-day staleness net; rollups don't). Add
+`... OR x.profit_30d IS DISTINCT FROM EXCLUDED.profit_30d OR x.cogs_30d IS DISTINCT FROM
+EXCLUDED.cogs_30d` to each, or add a `last_calculated < NOW() - INTERVAL '1 day'` net like
+product_metrics line ~707.
+
+### Fix 20: Snapshot stale-detection only compares unit counts
+`update_daily_snapshots.sql` lines ~57-85: detects mismatches in `units_sold`/`units_received` only;
+price/discount/costeach corrections older than the 2-day recheck are never repaired. Add a
+revenue comparison to the stale check: compare `SUM(net_revenue)` per day against the equivalent
+recomputed from `orders` (ROUND both to 2dp to avoid float-noise churn).
+
+### Fix 21: Category metrics positive-only revenue asymmetry
+`calculate_category_metrics.sql` (lines ~27-36, 64-73): revenue summed only when `> 0` while
+cogs/profit use COALESCE-all → margin numerator/denominator from different populations, and
+inconsistent with brand/vendor (plain COALESCE). Change the revenue/sales CASEs to
+`COALESCE(pm.revenue_7d, 0)` etc., matching brand_metrics.
+
+### Fix 22 (decision needed): Demand-pattern & seasonality definitions
+- `classify_demand_pattern` (db/functions.sql): CV thresholds 0.2/0.5 + avg<1/day. Industry standard
+  is Syntetos-Boylan: ADI ≥ 1.32 and CV² ≥ 0.49 quadrants (smooth/erratic/intermittent/lumpy).
+  Today everything classifies sporadic/lumpy. If adopting SB: ADI = 30 / COUNT(days with sales),
+  CV² computed on nonzero-demand sizes. Changes the vocabulary consumed by the forecast engine
+  (`scripts/forecast/forecast_engine.py` reads `demand_pattern`) — coordinate before changing.
+- SeasonalityAnalysis (`update_product_metrics.sql` ~360): `month_avg = AVG(units_sold)` over rows
+  with sales only → intensity, not volume. Use monthly totals (SUM, with zero months counted) /
+  overall monthly average for the index.
+- Safety stock: currently static config units; `sales_std_dev_30d` exists but is unused. Optional
+  upgrade: `safety = z * σ_d * sqrt(lead_time)` with z from a service-level setting.
+
+These change user-facing semantics — confirm with Matt before implementing.
+
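If the Syntetos-Boylan scheme is adopted, the classification and the optional safety-stock formula could look roughly like this. It is a sketch under assumptions: the function names are made up, the population standard deviation is one possible choice for CV², and a series with no sales is left unclassified.

```python
from math import sqrt
from statistics import mean, pstdev

ADI_CUTOFF = 1.32
CV2_CUTOFF = 0.49


def classify_sb(daily_units):
    """Classify a daily demand series as smooth / erratic / intermittent / lumpy."""
    nonzero = [u for u in daily_units if u > 0]
    if not nonzero:
        return None  # no demand in the window, nothing to classify
    adi = len(daily_units) / len(nonzero)           # e.g. 30 / COUNT(days with sales)
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2    # computed on nonzero demand sizes only
    if adi < ADI_CUTOFF:
        return "smooth" if cv2 < CV2_CUTOFF else "erratic"
    return "intermittent" if cv2 < CV2_CUTOFF else "lumpy"


def safety_stock(z, sigma_daily, lead_time_days):
    """safety = z * sigma_d * sqrt(lead_time), with z taken from a service-level setting."""
    return z * sigma_daily * sqrt(lead_time_days)


# Hypothetical 30-day series.
steady = [1] * 30
rare_small = [0] * 27 + [1, 1, 1]
rare_spiky = [0] * 27 + [1, 1, 10]
daily_spiky = [1, 1, 10] * 10

print(classify_sb(steady), classify_sb(rare_small),
      classify_sb(rare_spiky), classify_sb(daily_spiky))
print(round(safety_stock(1.65, 2.0, 16), 2))
```

The four sample series fall into the four quadrants, which makes the sketch a quick check on any SQL port of the same thresholds.
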
+---
+
+## Verified non-issues (no action, or cleanup only)
+
+- **`costeach` fallback `price * 0.5`** (orders.js line ~615): fires on **2.1%** of item rows
+  (729/34,833, last 30d, live-verified). Accepted by Matt: 50% margin is a fair estimate for these
+  products. No action needed.
+- **Missing-product order skips**: zero occurrences — MySQL has no orphan order_items (1-year check),
+  PG products is a superset of MySQL products (687,579 vs 687,576), last 7 import runs all logged
+  `totalSkipped: 0`. Cleanup only: remove the unused `importMissingProducts` import line at
+  `orders.js:2` (the function itself stays in products.js — harmless utility).
+- **Status 30 'cancelled_old'** passing the `order_status >= 20` filter used for `total_sold`: zero
+  rows live in `_order`, so it is safe.
+- **Duplicate (order_id, pid) order items**: none exist in MySQL — the upsert PK is safe.
+- **base_discount** in orders.js: computed/stored in temp table but unused since migration 002 —
+  remove the column from temp table + queries for clarity (no behavior change).
+- **`full-update.js` `runScript`**: try/catch around `console.log` is dead code; per-step
+  `status:'complete'` messages could confuse a UI parser. Cosmetic only — tidy if touching the file.
+
+## Suggested implementation order
+
+| Step | Fixes | Re-import/rebuild needed |
+|---|---|---|
+| 1 | Fix 1 + Fix 2 (+ Fix 5 filters, Fix 8/9 while editing the same SQL) | FULL orders re-import → snapshot rebuild → metrics (once) |
+| 2 | Fix 4 + Fix 6 (orders.js reconciliation + watermarks; POs/products watermarks too) | no |
+| 3 | Fix 3, Fix 7 (metrics SQL only) | metrics run |
+| 4 | Fix 13-21 (robustness batch) | no |
+| 5 | Fix 10-12, Fix 22 after Matt's sign-off (definition changes) | metrics run |
+
+After step 1, expect: margin_30d down ~8-10 points, discounts_30d ~3x up, daily curves shifted to
+correct business days. Communicate before/after so the change isn't mistaken for a data incident.
+
+## Reference: verification snippets used in the review
+
+```sql
+-- MySQL: item-level discounts dropped by the gate (30d)
+SELECT COUNT(DISTINCT o.order_id), ROUND(SUM(odi.amount),2)
+FROM order_discount_items odi
+JOIN order_discounts od ON od.order_id=odi.order_id AND od.discount_id=odi.discount_id
+JOIN _order o ON o.order_id=odi.order_id
+WHERE odi.which=2 AND o.date_placed >= DATE_SUB(CURDATE(), INTERVAL 30 DAY)
+  AND o.order_status >= 20 AND COALESCE(od.discount_amount_subtotal,0)=0;
+-- → 2,021 orders / $25,989 (2026-06-10)
+
+-- MySQL: costeach fallback frequency (30d)
+SELECT COUNT(*),
+  SUM(CASE WHEN NOT EXISTS (SELECT 1 FROM order_costs oc WHERE oc.orderid=oi.order_id
+            AND oc.pid=oi.prod_pid AND oc.pending=0)
+        AND NOT EXISTS (SELECT 1 FROM product_inventory pi WHERE pi.pid=oi.prod_pid)
+      THEN 1 ELSE 0 END)
+FROM order_items oi JOIN _order o ON o.order_id=oi.order_id
+WHERE o.order_status >= 20 AND o.date_placed >= DATE_SUB(CURDATE(), INTERVAL 30 DAY);
+-- → 729 / 34,833 = 2.1% (2026-06-10)
+
+-- PG: timezone check
+SHOW timezone;  -- Europe/Berlin (2026-06-10)
+
+-- Row drift, May 2026: MySQL 49,377 items / PG 49,841 (+0.9%)
+```
@@ -76,7 +76,9 @@ $function$;

 -- =============================================================================
 -- get_weighted_avg_cost: Weighted average cost from receivings up to a given date.
-- Uses all non-canceled receivings (no row limit) weighted by quantity.
+-- Prefers receivings from the 365 days before p_date so decade-old costs don't
+-- weigh equally with recent ones; falls back to the lifetime average when the
+-- product had no receivings in that window.
 -- =============================================================================
 CREATE OR REPLACE FUNCTION public.get_weighted_avg_cost(
    p_pid bigint,
@@ -97,8 +99,21 @@ BEGIN
    FROM receivings
    WHERE pid = p_pid
        AND received_date <= p_date
+        AND received_date > p_date - INTERVAL '365 days'
        AND status != 'canceled';

+    IF weighted_cost IS NULL THEN
+        SELECT
+            CASE
+                WHEN SUM(qty_each) > 0 THEN SUM(cost_each * qty_each) / SUM(qty_each)
+                ELSE NULL
+            END INTO weighted_cost
+        FROM receivings
+        WHERE pid = p_pid
+            AND received_date <= p_date
+            AND status != 'canceled';
+    END IF;
+
    RETURN weighted_cost;
 END;
 $function$;
@@ -76,6 +76,8 @@ if (process.env.DATABASE_URL && typeof process.env.DATABASE_URL === 'string') {
    dbConfig = {
        connectionString: process.env.DATABASE_URL,
        ssl: process.env.DB_SSL === 'true' ? { rejectUnauthorized: false } : false,
+        // Required by cancelCalculation(): pg_cancel_backend targets this name
+        application_name: 'node-metrics-calculator',
        // Add performance optimizations
        max: 10, // connection pool max size
        idleTimeoutMillis: 30000,
@@ -93,6 +95,8 @@ if (process.env.DATABASE_URL && typeof process.env.DATABASE_URL === 'string') {
        database: process.env.DB_NAME,
        port: process.env.DB_PORT || 5432,
        ssl: process.env.DB_SSL === 'true',
+        // Required by cancelCalculation(): pg_cancel_backend targets this name
+        application_name: 'node-metrics-calculator',
        // Add performance optimizations
        max: 10, // connection pool max size
        idleTimeoutMillis: 30000,
@@ -634,6 +634,52 @@ def forecast_from_curve(curve_params, scale_factor, age_days, horizon_days):
    return np.array(forecasts)


+def forecast_preorder(curve_params, scale_factor, days_until_arrival,
+                      preorder_daily_rate, horizon_days):
+    """
+    Piecewise pre-order forecast: a flat observed pre-order trickle until the
+    product is expected to arrive, then the scaled launch curve from age 0.
+
+    The launch curve was fit on POST-receipt order history, so running it from
+    today (while the product is still weeks from arriving) front-loads full
+    first-week launch volume that hasn't happened yet — the main driver of the
+    ~2.15x preorder over-forecast. Instead we forecast the slow pre-order rate
+    up to the arrival date, then start the curve's day 0 on that date.
+    See FORECAST_FIX_PLAN F4.
+
+    Args:
+        curve_params: (amplitude, decay_rate, baseline, ...) weekly curve
+        scale_factor: per-product multiplier for the post-arrival curve envelope
+        days_until_arrival: calendar days from today until expected arrival
+        preorder_daily_rate: observed pre-order units/day (trickle)
+        horizon_days: forecast horizon length
+
+    Returns:
+        array of daily forecast values of length horizon_days
+    """
+    amplitude, decay_rate, baseline = curve_params[:3]
+    forecasts = np.zeros(horizon_days)
+
+    # Clamp the arrival offset into the horizon
+    dua = int(max(0, min(days_until_arrival, horizon_days)))
+
+    # Pre-arrival segment: flat pre-order trickle, capped at the curve's scaled
+    # week-0 daily value (a pre-order day shouldn't out-sell the launch peak).
+    if dua > 0:
+        week0_daily = (amplitude / 7.0) * scale_factor + (baseline / 7.0)
+        pre_rate = preorder_daily_rate
+        if week0_daily > 0:
+            pre_rate = min(pre_rate, week0_daily)
+        forecasts[:dua] = max(0.0, pre_rate)
+
+    # Post-arrival segment: scaled launch curve, curve day 0 = arrival date.
+    if dua < horizon_days:
+        curve_part = forecast_from_curve(curve_params, scale_factor, 0, horizon_days - dua)
+        forecasts[dua:] = curve_part
+
+    return forecasts
+
+
 # ---------------------------------------------------------------------------
 # Batch data loading (eliminates N+1 per-product queries)
 # ---------------------------------------------------------------------------
@@ -651,9 +697,11 @@ def batch_load_product_data(conn, products):
    data = {
        'preorder_sales': {},
        'preorder_days': {},
+        'preorder_arrival_days': {},
        'launch_sales': {},
        'decay_velocity': {},
        'mature_history': {},
+        'dormant_rate': {},
    }

    # Pre-order sales: orders placed BEFORE first received date
@@ -677,6 +725,39 @@ def batch_load_product_data(conn, products):
            data['preorder_days'][int(row['pid'])] = float(row['preorder_days'])
        log.info(f"Batch loaded pre-order sales for {len(data['preorder_sales'])}/{len(preorder_pids)} preorder products")

+        # Expected arrival per pre-order product, to time the launch curve.
+        # Prefer the soonest FUTURE expected_date on an open PO; if the only open
+        # PO has a past expected_date assume 7 days; if there's no open PO at all
+        # assume 14 days. See FORECAST_FIX_PLAN F4.
+        arrival_sql = """
+        SELECT pid,
+            MIN(expected_date) FILTER (
+                WHERE expected_date IS NOT NULL AND expected_date >= CURRENT_DATE
+            ) AS future_arrival
+        FROM purchase_orders
+        WHERE pid = ANY(%s)
+          AND status IN ('created', 'ordered', 'electronically_sent', 'receiving_started')
+        GROUP BY pid
+        """
+        adf = execute_query(conn, arrival_sql, [preorder_pids])
+        today = date.today()
+        for _, row in adf.iterrows():
+            pid = int(row['pid'])
+            fa = row['future_arrival']
+            if pd.notna(fa):
+                fa_date = pd.Timestamp(fa).date()
+                data['preorder_arrival_days'][pid] = max(0, (fa_date - today).days)
+            else:
+                data['preorder_arrival_days'][pid] = 7  # open PO, but no future expected_date (past or NULL)
+        no_po = 0
+        for pid in preorder_pids:
+            if int(pid) not in data['preorder_arrival_days']:
+                data['preorder_arrival_days'][int(pid)] = 14  # no open PO at all
+                no_po += 1
+        log.info(f"Batch loaded preorder arrival for "
+                 f"{len(data['preorder_arrival_days']) - no_po}/{len(preorder_pids)} via open POs, "
+                 f"{no_po} defaulted to 14d")
+
    # Launch sales: first 14 days after first received
    launch_pids = products[products['phase'] == 'launch']['pid'].tolist()
    if launch_pids:
@@ -694,15 +775,23 @@ def batch_load_product_data(conn, products):
            data['launch_sales'][int(row['pid'])] = float(row['total_sold'])
        log.info(f"Batch loaded launch sales for {len(data['launch_sales'])}/{len(launch_pids)} launch products")

-    # Decay recent velocity: average daily sales over last 30 days
+    # Decay recent velocity: TRUE calendar-daily average over the last 30 days.
+    # We divide the summed units by calendar days (clipped to the product's age),
+    # NOT by the number of snapshot rows. Snapshots are sparse and mostly land on
+    # sold-days, so AVG(units_sold) averages over sold-days only and inflated the
+    # decay rate ~4x (measured 1.353 vs true 0.332 units/day). See FORECAST_FIX_PLAN F1.
    decay_pids = products[products['phase'] == 'decay']['pid'].tolist()
    if decay_pids:
        sql = """
-        SELECT dps.pid, AVG(COALESCE(dps.units_sold, 0)) AS avg_daily
+        SELECT dps.pid,
+            SUM(COALESCE(dps.units_sold, 0))::float
+              / GREATEST(LEAST(30, (CURRENT_DATE - pm.date_first_received::date)), 1) AS avg_daily
        FROM daily_product_snapshots dps
+        JOIN product_metrics pm ON pm.pid = dps.pid
        WHERE dps.pid = ANY(%s)
          AND dps.snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
-        GROUP BY dps.pid
+          AND dps.snapshot_date >= pm.date_first_received::date
+        GROUP BY dps.pid, pm.date_first_received
        """
        df = execute_query(conn, sql, [decay_pids])
        for _, row in df.iterrows():
@@ -724,6 +813,25 @@ def batch_load_product_data(conn, products):
            data['mature_history'][int(pid)] = group.copy()
        log.info(f"Batch loaded history for {len(data['mature_history'])}/{len(mature_pids)} mature products")

+    # Dormant trailing order rate: dormant products forecast 0 by default, but
+    # ~11K of them still sell (restocks, promos, long-tail) — ~11% of all demand
+    # currently forecast as a hard zero. Load a trailing-180-day daily order rate
+    # so the dormant branch can carry a small positive rate. See FORECAST_FIX_PLAN F5.
+    dormant_pids = products[products['phase'] == 'dormant']['pid'].tolist()
+    if dormant_pids:
+        sql = """
+        SELECT o.pid, SUM(o.quantity) / 180.0 AS rate
+        FROM orders o
+        WHERE o.pid = ANY(%s)
+          AND o.canceled IS DISTINCT FROM TRUE
+          AND o.date >= CURRENT_DATE - INTERVAL '180 days'
+        GROUP BY o.pid
+        """
+        df = execute_query(conn, sql, [dormant_pids])
+        for _, row in df.iterrows():
+            data['dormant_rate'][int(row['pid'])] = float(row['rate'])
+        log.info(f"Batch loaded dormant order rate for {len(data['dormant_rate'])}/{len(dormant_pids)} dormant products")
+
    return data


@@ -829,11 +937,20 @@ def forecast_mature(product, history_df):
        # Not enough data — flat velocity
        return np.full(FORECAST_HORIZON_DAYS, velocity)

-    # Fill date gaps with 0 sales (days where product had no snapshot = no sales)
+    # Reindex over the FULL calendar window ending yesterday, not just the span
+    # between the first and last snapshot. resample() only covers first→last
+    # snapshot, so leading/trailing quiet periods are absent and the Holt level
+    # is fitted only on the product's busy span (can run ~4x too high). An
+    # explicit reindex fills every quiet calendar day with 0. (pid, snapshot_date)
+    # is unique so there is no duplicate-index risk; do NOT use combine_first
+    # (it keeps zeros over real data). See FORECAST_FIX_PLAN F2.
    hist = history_df.copy()
    hist['snapshot_date'] = pd.to_datetime(hist['snapshot_date'])
-    hist = hist.set_index('snapshot_date').resample('D').sum().fillna(0)
-    series = hist['units_sold'].values.astype(float)
+    hist = hist.set_index('snapshot_date')['units_sold']
+    full_index = pd.date_range(
+        end=pd.Timestamp(date.today() - timedelta(days=1)),
+        periods=EXP_SMOOTHING_WINDOW, freq='D')
+    series = hist.reindex(full_index, fill_value=0.0).values.astype(float)

    # Need at least 2 non-zero values for smoothing
    if np.count_nonzero(series) < 2:
@@ -956,9 +1073,24 @@ def generate_all_forecasts(conn, curves_df, dow_indices, monthly_indices=None,
    today = date.today()
    forecast_dates = [today + timedelta(days=i) for i in range(FORECAST_HORIZON_DAYS)]

-    # Pre-compute DOW and seasonal multipliers for each forecast date
+    # Pre-compute DOW and seasonal multipliers for each forecast date.
+    # DOW multipliers stay ABSOLUTE — every calibration is a multi-week average
+    # and therefore DOW-neutral, so reshaping by absolute DOW indices is correct.
+    # Seasonal indices must be applied RELATIVE to the calibration period:
+    # each per-product calibration (decay velocity, mature Holt level, launch /
+    # preorder scale) is fitted on raw recent actuals that already embed the
+    # current month's seasonal level. Multiplying by the absolute target-month
+    # index double-counts seasonality (~25% over-forecast at the May→June sale
+    # transition, worse near November). Divide by the trailing-30-day average
+    # index so only the seasonal *change* from calibration to target applies.
+    # See FORECAST_FIX_PLAN F3.
    dow_multipliers = [dow_indices.get(d.isoweekday(), 1.0) for d in forecast_dates]
-    seasonal_multipliers = [monthly_indices.get(d.month, 1.0) for d in forecast_dates]
+    trailing = [today - timedelta(days=i) for i in range(1, 31)]
+    calibration_index = float(np.mean([monthly_indices.get(d.month, 1.0) for d in trailing]))
+    seasonal_multipliers = [
+        monthly_indices.get(d.month, 1.0) / max(calibration_index, 0.1)
+        for d in forecast_dates
+    ]

    # TRUNCATE before streaming writes
    with conn.cursor() as cur:
@@ -1002,9 +1134,33 @@ def generate_all_forecasts(conn, curves_df, dow_indices, monthly_indices=None,
        try:
            curve_info = get_curve_for_product(product, curves_df)

-            if phase in ('preorder', 'launch'):
+            if phase == 'preorder':
                if curve_info:
-                    scale = compute_scale_factor(phase, product, curve_info, batch_data)
+                    scale = compute_scale_factor('preorder', product, curve_info, batch_data)
+                    # Time the launch curve to expected arrival instead of
+                    # running it from today (F4). Pre-arrival days carry the
+                    # observed pre-order trickle rate.
+                    days_until_arrival = batch_data['preorder_arrival_days'].get(pid, 14)
+                    preorder_units = batch_data['preorder_sales'].get(pid, 0)
+                    preorder_days = batch_data['preorder_days'].get(pid, 1)
+                    preorder_daily_rate = preorder_units / max(preorder_days, 1)
+                    forecasts = forecast_preorder(
+                        curve_info, scale, days_until_arrival,
+                        preorder_daily_rate, FORECAST_HORIZON_DAYS)
+                    method = 'lifecycle_curve'
+                else:
+                    # No reliable curve — fall back to velocity if available
+                    velocity = product.get('sales_velocity_daily') or 0
+                    if velocity > 0:
+                        forecasts = np.full(FORECAST_HORIZON_DAYS, velocity)
+                        method = 'velocity'
+                    else:
+                        forecasts = forecast_dormant()
+                        method = 'zero'
+
+            elif phase == 'launch':
+                if curve_info:
+                    scale = compute_scale_factor('launch', product, curve_info, batch_data)
                    forecasts = forecast_from_curve(curve_info, scale, age, FORECAST_HORIZON_DAYS)
                    method = 'lifecycle_curve'
                else:
@@ -1038,8 +1194,16 @@ def generate_all_forecasts(conn, curves_df, dow_indices, monthly_indices=None,
                method = 'velocity'

            else:  # dormant
-                forecasts = forecast_dormant()
-                method = 'zero'
+                # Carry a small positive rate for dormant products that still
+                # trickle sales (restocks/promos/long-tail); only truly dead
+                # products stay at zero. See FORECAST_FIX_PLAN F5.
+                rate = batch_data['dormant_rate'].get(pid, 0)
+                if rate > 0:
+                    forecasts = np.full(FORECAST_HORIZON_DAYS, rate)
+                    method = 'velocity'
+                else:
+                    forecasts = forecast_dormant()
+                    method = 'zero'

            # Confidence interval: use accuracy-calibrated margins per phase
            base_margin = accuracy_margins.get(phase, 0.5)
@@ -1108,6 +1272,8 @@ def archive_forecasts(conn, run_id):
        """)
        cur.execute("CREATE INDEX IF NOT EXISTS idx_pfh_date ON product_forecasts_history(forecast_date)")
        cur.execute("CREATE INDEX IF NOT EXISTS idx_pfh_pid_date ON product_forecasts_history(pid, forecast_date)")
+        # Naive-baseline column for forecast value-added (FVA). See FORECAST_FIX_PLAN F8.
+        cur.execute("ALTER TABLE product_forecasts_history ADD COLUMN IF NOT EXISTS naive_units NUMERIC(10,2)")

        # Find the previous completed run (whose forecasts are still in product_forecasts)
        cur.execute("""
@@ -1124,15 +1290,27 @@ def archive_forecasts(conn, run_id):

        prev_run_id = prev_run[0]

-        # Archive only past-date forecasts (where actuals now exist)
+        # Archive only past-date forecasts (where actuals now exist). Attach the naive
+        # baseline (flat daily average over the 28 days before archive time, so it can
+        # overlap the archived dates) for forecast value-added. See FORECAST_FIX_PLAN F8.
        cur.execute("""
            INSERT INTO product_forecasts_history
                (run_id, pid, forecast_date, forecast_units, forecast_revenue,
-                 lifecycle_phase, forecast_method, confidence_lower, confidence_upper, generated_at)
-            SELECT %s, pid, forecast_date, forecast_units, forecast_revenue,
-                lifecycle_phase, forecast_method, confidence_lower, confidence_upper, generated_at
-            FROM product_forecasts
-            WHERE forecast_date < CURRENT_DATE
+                 lifecycle_phase, forecast_method, confidence_lower, confidence_upper,
+                 generated_at, naive_units)
+            SELECT %s, pf.pid, pf.forecast_date, pf.forecast_units, pf.forecast_revenue,
+                pf.lifecycle_phase, pf.forecast_method, pf.confidence_lower, pf.confidence_upper,
+                pf.generated_at, COALESCE(nv.naive_daily, 0)
+            FROM product_forecasts pf
+            LEFT JOIN (
+                SELECT o.pid, SUM(o.quantity) / 28.0 AS naive_daily
+                FROM orders o
+                WHERE o.canceled IS DISTINCT FROM TRUE
+                  AND o.date >= CURRENT_DATE - INTERVAL '28 days'
+                  AND o.date < CURRENT_DATE
+                GROUP BY o.pid
+            ) nv ON nv.pid = pf.pid
+            WHERE pf.forecast_date < CURRENT_DATE
            ON CONFLICT (run_id, pid, forecast_date) DO NOTHING
        """, (prev_run_id,))

@@ -1154,6 +1332,48 @@ def archive_forecasts(conn, run_id):
        return archived


+def archive_future_leads(conn, run_id):
+    """
+    Archive a sampled set of FUTURE-lead forecasts from the just-generated
+    product_forecasts, attributed to the current run.
+
+    The past-date archive in archive_forecasts() only ever captures the 1-day
+    slice that just elapsed, so every accuracy sample lands in the '1-7d' lead
+    bucket and the 15/30/60/90-day forecasts that purchasing actually rides on
+    are never validated. Here we snapshot the 7/14/30/60/89-day-ahead leads
+    (non-dormant) so that, once each date passes, compute_accuracy() can score
+    them in their lead bucket. The naive baseline is attached the same way as in
+    the past-date path. Future-dated rows survive the 90-day prune until their
+    own date passes. See FORECAST_FIX_PLAN F7.
+    """
+    with conn.cursor() as cur:
+        cur.execute("""
+            INSERT INTO product_forecasts_history
+                (run_id, pid, forecast_date, forecast_units, forecast_revenue,
+                 lifecycle_phase, forecast_method, confidence_lower, confidence_upper,
+                 generated_at, naive_units)
+            SELECT %s, pf.pid, pf.forecast_date, pf.forecast_units, pf.forecast_revenue,
+                pf.lifecycle_phase, pf.forecast_method, pf.confidence_lower, pf.confidence_upper,
+                pf.generated_at, COALESCE(nv.naive_daily, 0)
+            FROM product_forecasts pf
+            LEFT JOIN (
+                SELECT o.pid, SUM(o.quantity) / 28.0 AS naive_daily
+                FROM orders o
+                WHERE o.canceled IS DISTINCT FROM TRUE
+                  AND o.date >= CURRENT_DATE - INTERVAL '28 days'
+                  AND o.date < CURRENT_DATE
+                GROUP BY o.pid
+            ) nv ON nv.pid = pf.pid
+            WHERE pf.lifecycle_phase != 'dormant'
+              AND pf.forecast_date - CURRENT_DATE IN (7, 14, 30, 60, 89)
+            ON CONFLICT (run_id, pid, forecast_date) DO NOTHING
+        """, (run_id,))
+        archived = cur.rowcount
+    conn.commit()
+    log.info(f"Archived {archived} future-lead forecast rows (7/14/30/60/89d) for run {run_id}")
+    return archived
+
+
 def compute_accuracy(conn, run_id):
    """
    Compute forecast accuracy metrics from archived history vs. actual sales.
@@ -1162,11 +1382,18 @@ def compute_accuracy(conn, run_id):
    (pid, forecast_date = snapshot_date) to compare forecasted vs. actual units.

    Stores results in forecast_accuracy table, broken down by:
-      - overall: single aggregate row
+      - overall: two rows — 'all' (non-dormant) and 'all_incl_dormant' (F5)
+      - overall_weekly: per-product weekly-grain WMAPE — the informative headline
+        for intermittent demand (daily grain has a ~190% floor) (F9)
      - by_phase: per lifecycle phase
-      - by_lead_time: bucketed by how far ahead the forecast was
+      - by_lead_time: bucketed by how far ahead the forecast was — long-lead
+        buckets populate as the future-lead archives mature (F7)
      - by_method: per forecast method
      - daily: per forecast_date (for trend charts)
+
+    Every dimension also stores naive_wmape (flat trailing-28d baseline) and
+    fva = 1 - wmape/naive_wmape, so the engine can be judged as value-over-naive
+    (F8). Only realized dates (forecast_date < CURRENT_DATE) are scored.
    """
    with conn.cursor() as cur:
        # Ensure accuracy table exists
@@ -1186,6 +1413,10 @@ def compute_accuracy(conn, run_id):
                PRIMARY KEY (run_id, metric_type, dimension_value)
            )
        """)
+        # Naive-baseline WMAPE and forecast value-added (FVA = 1 - wmape/naive_wmape).
+        # See FORECAST_FIX_PLAN F8.
+        cur.execute("ALTER TABLE forecast_accuracy ADD COLUMN IF NOT EXISTS naive_wmape NUMERIC(10,4)")
+        cur.execute("ALTER TABLE forecast_accuracy ADD COLUMN IF NOT EXISTS fva NUMERIC(10,4)")
        conn.commit()

        # Check if we have any history to analyze
@@ -1195,124 +1426,199 @@ def compute_accuracy(conn, run_id):
            log.info("No forecast history available for accuracy computation")
            return

-        # For each (pid, forecast_date) pair, keep only the most recent run's
-        # forecast row. This prevents double-counting when multiple runs have
-        # archived forecasts for the same product×date combination.
-        accuracy_cte = """
-        WITH ranked_history AS (
+        # Base CTEs (FORECAST_FIX_PLAN F7):
+        #  - Only score realized dates (forecast_date < CURRENT_DATE); future-lead
+        #    archives are excluded until their date passes.
+        #  - short_lead*: lead 0-6 deduped per (pid, forecast_date) — preserves the
+        #    meaning of the existing headline metrics. short_lead_eval keeps the
+        #    raw snapshot grid (incl. zero-zero days) for complete-week detection;
+        #    `accuracy` drops zero-zero days for daily-grain metrics.
+        #  - lead_dedup/lead_accuracy: deduped per (pid, forecast_date, lead_bucket)
+        #    so each long-lead bucket gets its own sample (the by_lead_time table).
+        base_cte = """
+        WITH ranked_all AS (
            SELECT
-                pfh.*,
+                pfh.pid, pfh.forecast_date, pfh.forecast_units, pfh.naive_units,
+                pfh.lifecycle_phase, pfh.forecast_method,
                fr.started_at,
-                ROW_NUMBER() OVER (
-                    PARTITION BY pfh.pid, pfh.forecast_date
-                    ORDER BY fr.started_at DESC
-                ) AS rn
+                (pfh.forecast_date - fr.started_at::date) AS lead_days,
+                CASE
+                    WHEN (pfh.forecast_date - fr.started_at::date) BETWEEN 0 AND 6 THEN '1-7d'
+                    WHEN (pfh.forecast_date - fr.started_at::date) BETWEEN 7 AND 13 THEN '8-14d'
+                    WHEN (pfh.forecast_date - fr.started_at::date) BETWEEN 14 AND 29 THEN '15-30d'
+                    WHEN (pfh.forecast_date - fr.started_at::date) BETWEEN 30 AND 59 THEN '31-60d'
+                    ELSE '61-90d'
+                END AS lead_bucket
            FROM product_forecasts_history pfh
            JOIN forecast_runs fr ON fr.id = pfh.run_id
+            WHERE pfh.forecast_date < CURRENT_DATE
+        ),
+        short_lead AS (
+            SELECT *,
+                ROW_NUMBER() OVER (
+                    PARTITION BY pid, forecast_date ORDER BY started_at DESC
+                ) AS rn
+            FROM ranked_all
+            WHERE lead_days BETWEEN 0 AND 6
+        ),
+        short_lead_eval AS (
+            SELECT sl.pid, sl.lifecycle_phase, sl.forecast_method, sl.forecast_date,
+                sl.forecast_units, sl.naive_units,
+                COALESCE(dps.units_sold, 0) AS actual_units,
+                (sl.forecast_units - COALESCE(dps.units_sold, 0)) AS error,
+                ABS(sl.forecast_units - COALESCE(dps.units_sold, 0)) AS abs_error
+            FROM short_lead sl
+            LEFT JOIN daily_product_snapshots dps
+                ON dps.pid = sl.pid AND dps.snapshot_date = sl.forecast_date
+            WHERE sl.rn = 1
        ),
        accuracy AS (
-            SELECT
-                rh.lifecycle_phase,
-                rh.forecast_method,
-                rh.forecast_date,
-                (rh.forecast_date - rh.started_at::date) AS lead_days,
-                rh.forecast_units,
+            SELECT * FROM short_lead_eval
+            WHERE NOT (forecast_units = 0 AND actual_units = 0)
+        ),
+        lead_dedup AS (
+            SELECT *,
+                ROW_NUMBER() OVER (
+                    PARTITION BY pid, forecast_date, lead_bucket ORDER BY started_at DESC
+                ) AS rn
+            FROM ranked_all
+        ),
+        lead_accuracy AS (
+            SELECT ld.lead_bucket, ld.forecast_units, ld.naive_units,
                COALESCE(dps.units_sold, 0) AS actual_units,
-                (rh.forecast_units - COALESCE(dps.units_sold, 0)) AS error,
-                ABS(rh.forecast_units - COALESCE(dps.units_sold, 0)) AS abs_error
-            FROM ranked_history rh
+                (ld.forecast_units - COALESCE(dps.units_sold, 0)) AS error,
+                ABS(ld.forecast_units - COALESCE(dps.units_sold, 0)) AS abs_error
+            FROM lead_dedup ld
            LEFT JOIN daily_product_snapshots dps
-                ON dps.pid = rh.pid AND dps.snapshot_date = rh.forecast_date
-            WHERE rh.rn = 1
-              AND NOT (rh.forecast_units = 0 AND COALESCE(dps.units_sold, 0) = 0)
+                ON dps.pid = ld.pid AND dps.snapshot_date = ld.forecast_date
+            WHERE ld.rn = 1
+              AND ld.lifecycle_phase != 'dormant'
+              AND NOT (ld.forecast_units = 0 AND COALESCE(dps.units_sold, 0) = 0)
        )
        """

-        # Compute and insert metrics for each dimension
-        dimensions = {
-            'overall': "SELECT 'all' AS dim",
-            'by_phase': "SELECT DISTINCT lifecycle_phase AS dim FROM accuracy",
-            'by_lead_time': """
-                SELECT DISTINCT
-                    CASE
-                        WHEN lead_days BETWEEN 0 AND 6 THEN '1-7d'
-                        WHEN lead_days BETWEEN 7 AND 13 THEN '8-14d'
-                        WHEN lead_days BETWEEN 14 AND 29 THEN '15-30d'
-                        WHEN lead_days BETWEEN 30 AND 59 THEN '31-60d'
-                        ELSE '61-90d'
-                    END AS dim
-                FROM accuracy
-            """,
-            'by_method': "SELECT DISTINCT forecast_method AS dim FROM accuracy",
-            'daily': "SELECT DISTINCT forecast_date::text AS dim FROM accuracy",
-        }
-
-        filter_clauses = {
-            'overall': "lifecycle_phase != 'dormant'",
-            'by_phase': "lifecycle_phase = dims.dim",
-            'by_lead_time': """
-                CASE
-                    WHEN lead_days BETWEEN 0 AND 6 THEN '1-7d'
-                    WHEN lead_days BETWEEN 7 AND 13 THEN '8-14d'
-                    WHEN lead_days BETWEEN 14 AND 29 THEN '15-30d'
-                    WHEN lead_days BETWEEN 30 AND 59 THEN '31-60d'
-                    ELSE '61-90d'
-                END = dims.dim
-            """,
-            'by_method': "forecast_method = dims.dim",
-            'daily': "forecast_date::text = dims.dim",
-        }
-
-        total_inserted = 0
-
-        for metric_type, dim_query in dimensions.items():
-            filter_clause = filter_clauses[metric_type]
-
-            sql = f"""
-            {accuracy_cte},
-            dims AS ({dim_query})
+        # Daily-grain aggregate over a source CTE aliased `a`, computing the
+        # engine WMAPE plus the naive-baseline WMAPE (NULL-safe: rows archived
+        # before F8 have naive_units NULL and are excluded from the naive sums).
+        def daily_agg(dim_expr, source, where=None, group_by=None):
+            where_sql = f"WHERE {where}" if where else ""
+            group_sql = f"GROUP BY {group_by}" if group_by else ""
+            return f"""
            SELECT
-                dims.dim,
+                {dim_expr} AS dim,
                COUNT(*) AS sample_size,
                COALESCE(SUM(a.actual_units), 0) AS total_actual,
                COALESCE(SUM(a.forecast_units), 0) AS total_forecast,
                AVG(a.abs_error) AS mae,
                CASE WHEN SUM(a.actual_units) > 0
-                    THEN SUM(a.abs_error) / SUM(a.actual_units)
-                    ELSE NULL END AS wmape,
+                    THEN SUM(a.abs_error) / SUM(a.actual_units) ELSE NULL END AS wmape,
                AVG(a.error) AS bias,
-                SQRT(AVG(POWER(a.error, 2))) AS rmse
-            FROM dims
-            CROSS JOIN accuracy a
-            WHERE {filter_clause}
-            GROUP BY dims.dim
+                SQRT(AVG(POWER(a.error, 2))) AS rmse,
+                CASE WHEN SUM(a.actual_units) FILTER (WHERE a.naive_units IS NOT NULL) > 0
+                    THEN SUM(ABS(a.naive_units - a.actual_units)) FILTER (WHERE a.naive_units IS NOT NULL)
+                         / SUM(a.actual_units) FILTER (WHERE a.naive_units IS NOT NULL)
+                    ELSE NULL END AS naive_wmape
+            FROM {source} a
+            {where_sql}
+            {group_sql}
            """

-            cur.execute(sql)
-            rows = cur.fetchall()
+        insert_sql = """
+            INSERT INTO forecast_accuracy
+                (run_id, metric_type, dimension_value, sample_size,
+                 total_actual_units, total_forecast_units, mae, wmape, bias, rmse,
+                 naive_wmape, fva)
+            VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
+            ON CONFLICT (run_id, metric_type, dimension_value)
+            DO UPDATE SET
+                sample_size = EXCLUDED.sample_size,
+                total_actual_units = EXCLUDED.total_actual_units,
+                total_forecast_units = EXCLUDED.total_forecast_units,
+                mae = EXCLUDED.mae, wmape = EXCLUDED.wmape,
+                bias = EXCLUDED.bias, rmse = EXCLUDED.rmse,
+                naive_wmape = EXCLUDED.naive_wmape, fva = EXCLUDED.fva,
+                computed_at = NOW()
+        """

-            for row in rows:
-                dim_val, sample_size, total_actual, total_forecast, mae, wmape, bias, rmse = row
-                cur.execute("""
-                    INSERT INTO forecast_accuracy
-                        (run_id, metric_type, dimension_value, sample_size,
-                         total_actual_units, total_forecast_units, mae, wmape, bias, rmse)
-                    VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
-                    ON CONFLICT (run_id, metric_type, dimension_value)
-                    DO UPDATE SET
-                        sample_size = EXCLUDED.sample_size,
-                        total_actual_units = EXCLUDED.total_actual_units,
-                        total_forecast_units = EXCLUDED.total_forecast_units,
-                        mae = EXCLUDED.mae, wmape = EXCLUDED.wmape,
-                        bias = EXCLUDED.bias, rmse = EXCLUDED.rmse,
-                        computed_at = NOW()
-                """, (run_id, metric_type, dim_val, sample_size,
-                      float(total_actual), float(total_forecast),
-                      float(mae) if mae is not None else None,
-                      float(wmape) if wmape is not None else None,
-                      float(bias) if bias is not None else None,
-                      float(rmse) if rmse is not None else None))
-                total_inserted += 1
+        def _f(x):
+            return float(x) if x is not None else None
+
+        def run_and_insert(metric_type, sql):
+            cur.execute(base_cte + sql)
+            n = 0
+            for row in cur.fetchall():
+                (dim_val, sample_size, total_actual, total_forecast,
+                 mae, wmape, bias, rmse, naive_wmape) = row
+                fva = None
+                if wmape is not None and naive_wmape is not None and float(naive_wmape) > 0:
+                    fva = 1.0 - float(wmape) / float(naive_wmape)
+                cur.execute(insert_sql, (
+                    run_id, metric_type, dim_val, sample_size,
+                    _f(total_actual), _f(total_forecast), _f(mae), _f(wmape),
+                    _f(bias), _f(rmse), _f(naive_wmape), _f(fva)))
+                n += 1
+            return n
+
+        total_inserted = 0
+
+        # overall: two rows — 'all' (non-dormant, the headline) and
+        # 'all_incl_dormant' (everything, so the ~11% dormant demand stops being
+        # invisible). Both are short-lead (lead 0-6). F5.
+        overall_source = """(
+                SELECT a.*, 'all'::text AS dim FROM accuracy a WHERE a.lifecycle_phase != 'dormant'
+                UNION ALL
+                SELECT a.*, 'all_incl_dormant'::text AS dim FROM accuracy a
+            )"""
+        total_inserted += run_and_insert('overall',
+            daily_agg('a.dim', overall_source, group_by='a.dim'))
+
+        # by_phase / by_method / daily — short-lead daily-grain over `accuracy`.
+        total_inserted += run_and_insert('by_phase',
+            daily_agg('a.lifecycle_phase', 'accuracy', group_by='a.lifecycle_phase'))
+        total_inserted += run_and_insert('by_method',
+            daily_agg('a.forecast_method', 'accuracy', group_by='a.forecast_method'))
+        total_inserted += run_and_insert('daily',
+            daily_agg('a.forecast_date::text', 'accuracy',
+                      where="a.lifecycle_phase != 'dormant'", group_by='a.forecast_date'))
+
+        # by_lead_time — one sample per (pid, date, lead bucket) over `lead_accuracy`.
+        # Buckets beyond '1-7d' populate as the future-lead archives (F7) mature.
+        total_inserted += run_and_insert('by_lead_time',
+            daily_agg('a.lead_bucket', 'lead_accuracy', group_by='a.lead_bucket'))
+
+        # overall_weekly — the informative headline for intermittent retail demand.
+        # Aggregate the short-lead rows to (pid, complete week), then WMAPE over
+        # pid-weeks. Daily-grain WMAPE has a ~190% floor on this catalog; coarser
+        # grain (~109% measured at 21-day-per-product) responds to real improvement. F9.
+        weekly_sql = """,
+        weekly AS (
+            SELECT pid, date_trunc('week', forecast_date) AS wk,
+                SUM(forecast_units) AS fc_week,
+                SUM(actual_units) AS act_week,
+                SUM(naive_units) AS naive_week,
+                bool_and(naive_units IS NOT NULL) AS naive_complete
+            FROM short_lead_eval
+            WHERE lifecycle_phase != 'dormant'
+            GROUP BY pid, date_trunc('week', forecast_date)
+            HAVING COUNT(*) = 7
+        )
+        SELECT 'all'::text AS dim,
+            COUNT(*) AS sample_size,
+            COALESCE(SUM(act_week), 0) AS total_actual,
+            COALESCE(SUM(fc_week), 0) AS total_forecast,
+            AVG(ABS(fc_week - act_week)) AS mae,
+            CASE WHEN SUM(act_week) > 0
+                THEN SUM(ABS(fc_week - act_week)) / SUM(act_week) ELSE NULL END AS wmape,
+            AVG(fc_week - act_week) AS bias,
+            SQRT(AVG(POWER(fc_week - act_week, 2))) AS rmse,
+            CASE WHEN SUM(act_week) FILTER (WHERE naive_complete) > 0
+                THEN SUM(ABS(naive_week - act_week)) FILTER (WHERE naive_complete)
+                     / SUM(act_week) FILTER (WHERE naive_complete)
+                ELSE NULL END AS naive_wmape
+        FROM weekly
+        WHERE NOT (fc_week = 0 AND act_week = 0)
+        """
+        total_inserted += run_and_insert('overall_weekly', weekly_sql)

        conn.commit()

@@ -1562,6 +1868,10 @@ def main():
            conn, curves_df, dow_indices, monthly_indices, accuracy_margins
        )

+        # Phase 4b: Snapshot sampled future-lead forecasts (7/14/30/60/89d) from
+        # the fresh run so long-lead accuracy populates once those dates pass (F7).
+        archive_future_leads(conn, run_id)
+
        duration = time.time() - start_time

        # Record run completion (include DOW indices in metadata)
@@ -1,6 +1,12 @@
 const path = require('path');
+const fs = require('fs');
 const { spawn } = require('child_process');

+// Maintenance switch: `touch .pause-auto-update` in inventory-server/ to make the
+// recurring full-update a no-op (e.g. during a long manual full re-import or a
+// snapshot rebuild). Remove the file to resume.
+const PAUSE_FILE = path.join(__dirname, '..', '.pause-auto-update');
+
 function outputProgress(data) {
    if (!data.status) {
        data = {
@@ -22,12 +28,8 @@ function runScript(scriptPath) {
        child.stdout.on('data', (data) => {
            const lines = data.toString().split('\n');
            lines.filter(line => line.trim()).forEach(line => {
-                try {
-                    console.log(line); // Pass through the JSON output
-                    output += line + '\n';
-                } catch (e) {
-                    console.log(line); // If not JSON, just log it directly
-                }
+                console.log(line); // Pass through the (usually JSON) output
+                output += line + '\n';
            });
        });

@@ -50,6 +52,14 @@ function runScript(scriptPath) {
 }

 async function fullUpdate() {
+    if (fs.existsSync(PAUSE_FILE)) {
+        outputProgress({
+            status: 'complete',
+            operation: 'Full update skipped',
+            message: `Auto-update is paused (${PAUSE_FILE} exists) — remove the file to resume`
+        });
+        return;
+    }
    try {
        // Step 1: Import from Production
        outputProgress({
@@ -13,10 +13,14 @@ async function importCategories(prodConnection, localConnection) {
  let skippedCategories = [];

  try {
-    // Start a single transaction for the entire import
-    await localConnection.query('BEGIN');
-    
-    // Temporarily disable the trigger that's causing problems
+    // Start a single transaction for the entire import.
+    // Must use the wrapper's beginTransaction() (dedicated client) — query('BEGIN')
+    // checks out a client per call, so BEGIN/work/COMMIT would not be guaranteed
+    // to share a connection.
+    await localConnection.beginTransaction();
+
+    // Temporarily disable the trigger that's causing problems.
+    // ALTER TABLE ... DISABLE TRIGGER is transactional: a rollback restores it.
    await localConnection.query('ALTER TABLE categories DISABLE TRIGGER update_categories_updated_at');

    // Process each type in order with its own savepoint
@@ -148,8 +152,11 @@ async function importCategories(prodConnection, localConnection) {
      }
    }

+    // Re-enable the trigger INSIDE the transaction so disable/enable are atomic
+    await localConnection.query('ALTER TABLE categories ENABLE TRIGGER update_categories_updated_at');
+
    // Commit the entire transaction - we'll do this even if we have skipped categories
-    await localConnection.query('COMMIT');
+    await localConnection.commit();

    // Update sync status
    await localConnection.query(`
@@ -158,9 +165,6 @@ async function importCategories(prodConnection, localConnection) {
      ON CONFLICT (table_name) DO UPDATE SET
        last_sync_timestamp = NOW()
    `);
-    
-    // Re-enable the trigger
-    await localConnection.query('ALTER TABLE categories ENABLE TRIGGER update_categories_updated_at');

    outputProgress({
      status: "complete",
@@ -187,12 +191,10 @@ async function importCategories(prodConnection, localConnection) {
  } catch (error) {
    console.error("Error importing categories:", error);
    
-    // Only rollback if we haven't committed yet
+    // Only roll back if we haven't committed yet. The rollback also restores the
+    // trigger state (DISABLE TRIGGER was inside the transaction).
    try {
-      await localConnection.query('ROLLBACK');
-      
-      // Make sure we re-enable the trigger even if there was an error
-      await localConnection.query('ALTER TABLE categories ENABLE TRIGGER update_categories_updated_at');
+      await localConnection.rollback();
    } catch (rollbackError) {
      console.error("Error during rollback:", rollbackError);
    }
@@ -24,7 +24,8 @@ async function importDailyDeals(prodConnection, localConnection) {
  const startTime = Date.now();

  try {
-    await localConnection.query('BEGIN');
+    // Wrapper's beginTransaction() pins a dedicated client; query('BEGIN') would not.
+    await localConnection.beginTransaction();

    // Fetch recent daily deals from production (MySQL 5.7, no CTEs)
    // Join product_current_prices to get the actual deal price
@@ -127,7 +128,7 @@ async function importDailyDeals(prodConnection, localConnection) {
        last_sync_timestamp = NOW()
    `);

-    await localConnection.query('COMMIT');
+    await localConnection.commit();

    outputProgress({
      status: "complete",
@@ -149,7 +150,7 @@ async function importDailyDeals(prodConnection, localConnection) {
    console.error("Error importing daily deals:", error);

    try {
-      await localConnection.query('ROLLBACK');
+      await localConnection.rollback();
    } catch (rollbackError) {
      console.error("Error during rollback:", rollbackError);
    }
@@ -1,5 +1,4 @@
 const { outputProgress, formatElapsedTime, estimateRemaining, calculateRate } = require('../metrics-new/utils/progress');
-const { importMissingProducts, setupTemporaryTables, cleanupTemporaryTables, materializeCalculations } = require('./products');

 /**
 * Imports orders from a production MySQL database to a local PostgreSQL database.
@@ -28,6 +27,7 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
    22: 'placed_incomplete',
    30: 'canceled',
    40: 'awaiting_payment',
+    45: 'payment_pending',
    50: 'awaiting_products',
    55: 'shipping_later',
    56: 'shipping_together',
@@ -35,6 +35,7 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
    61: 'flagged',
    62: 'fix_before_pick',
    65: 'manual_picking',
+    67: 'remote_send',
    70: 'in_pt',
    80: 'picked',
    90: 'awaiting_shipment',
@@ -65,6 +66,12 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =

    console.log('Orders: Using last sync time:', lastSyncTime, '(adjusted:', mysqlSyncTime, ')');

+    // Capture the next watermark from MySQL's own clock BEFORE querying any data.
+    // Rows modified while the import runs stay above this watermark for the next
+    // incremental run (overlap re-imports are harmless upserts); writing NOW()
+    // after the import finishes would permanently skip them.
+    const [[{ source_now: sourceNow }]] = await prodConnection.query('SELECT NOW() as source_now');
+
    // First get count of order items - Keep MySQL compatible for production
    const [[{ total }]] = await prodConnection.query(`
      SELECT COUNT(*) as total
@@ -100,7 +107,6 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
        COALESCE(NULLIF(TRIM(oi.prod_itemnumber), ''), 'NO-SKU') as SKU,
        oi.prod_price as price,
        oi.qty_ordered as quantity,
-        COALESCE(oi.prod_price_reg - oi.prod_price, 0) as base_discount,
        oi.stamp as last_modified
      FROM order_items oi
      JOIN _order o ON oi.order_id = o.order_id
@@ -131,10 +137,8 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
      await localConnection.query(`
        DROP TABLE IF EXISTS temp_order_items;
        DROP TABLE IF EXISTS temp_order_meta;
-        DROP TABLE IF EXISTS temp_order_discounts;
        DROP TABLE IF EXISTS temp_order_taxes;
        DROP TABLE IF EXISTS temp_order_costs;
-        DROP TABLE IF EXISTS temp_main_discounts;
        DROP TABLE IF EXISTS temp_item_discounts;

        CREATE TEMP TABLE temp_order_items (
@@ -143,7 +147,6 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
          sku TEXT NOT NULL,
          price NUMERIC(14, 4) NOT NULL,
          quantity INTEGER NOT NULL,
-          base_discount NUMERIC(14, 4) DEFAULT 0,
          PRIMARY KEY (order_id, pid)
        );

@@ -160,20 +163,6 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
          PRIMARY KEY (order_id)
        );

-        CREATE TEMP TABLE temp_order_discounts (
-          order_id INTEGER NOT NULL,
-          pid INTEGER NOT NULL,
-          discount NUMERIC(14, 4) NOT NULL,
-          PRIMARY KEY (order_id, pid)
-        );
-
-        CREATE TEMP TABLE temp_main_discounts (
-          order_id INTEGER NOT NULL,
-          discount_id INTEGER NOT NULL,
-          discount_amount_subtotal NUMERIC(14, 4) DEFAULT 0.0000,
-          PRIMARY KEY (order_id, discount_id)
-        );
-
        CREATE TEMP TABLE temp_item_discounts (
          order_id INTEGER NOT NULL,
          pid INTEGER NOT NULL,
@@ -198,10 +187,8 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =

        CREATE INDEX idx_temp_order_items_pid ON temp_order_items(pid);
        CREATE INDEX idx_temp_order_meta_order_id ON temp_order_meta(order_id);
-        CREATE INDEX idx_temp_order_discounts_order_pid ON temp_order_discounts(order_id, pid);
        CREATE INDEX idx_temp_order_taxes_order_pid ON temp_order_taxes(order_id, pid);
        CREATE INDEX idx_temp_order_costs_order_pid ON temp_order_costs(order_id, pid);
-        CREATE INDEX idx_temp_main_discounts_discount_id ON temp_main_discounts(discount_id);
        CREATE INDEX idx_temp_item_discounts_order_pid ON temp_item_discounts(order_id, pid);
        CREATE INDEX idx_temp_item_discounts_discount_id ON temp_item_discounts(discount_id);
      `);
@@ -216,21 +203,20 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
      await localConnection.beginTransaction();
      try {
        const batch = orderItems.slice(i, Math.min(i + 5000, orderItems.length));
-        const placeholders = batch.map((_, idx) => 
-          `($${idx * 6 + 1}, $${idx * 6 + 2}, $${idx * 6 + 3}, $${idx * 6 + 4}, $${idx * 6 + 5}, $${idx * 6 + 6})`
+        const placeholders = batch.map((_, idx) =>
+          `($${idx * 5 + 1}, $${idx * 5 + 2}, $${idx * 5 + 3}, $${idx * 5 + 4}, $${idx * 5 + 5})`
        ).join(",");
        const values = batch.flatMap(item => [
-          item.order_id, item.prod_pid, item.SKU, item.price, item.quantity, item.base_discount
+          item.order_id, item.prod_pid, item.SKU, item.price, item.quantity
        ]);

        await localConnection.query(`
-          INSERT INTO temp_order_items (order_id, pid, sku, price, quantity, base_discount)
+          INSERT INTO temp_order_items (order_id, pid, sku, price, quantity)
          VALUES ${placeholders}
          ON CONFLICT (order_id, pid) DO UPDATE SET
            sku = EXCLUDED.sku,
            price = EXCLUDED.price,
-            quantity = EXCLUDED.quantity,
-            base_discount = EXCLUDED.base_discount
+            quantity = EXCLUDED.quantity
        `, values);

        await localConnection.commit();
@@ -337,49 +323,15 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
    };

    const processDiscountsBatch = async (batchIds) => {
-      // First, load main discount records
-      const [mainDiscounts] = await prodConnection.query(`
-        SELECT order_id, discount_id, discount_amount_subtotal
-        FROM order_discounts
-        WHERE order_id IN (?)
-      `, [batchIds]);
-
-      if (mainDiscounts.length > 0) {
-        await localConnection.beginTransaction();
-        try {
-          for (let j = 0; j < mainDiscounts.length; j += PG_BATCH_SIZE) {
-            const subBatch = mainDiscounts.slice(j, j + PG_BATCH_SIZE);
-            if (subBatch.length === 0) continue;
-
-            const placeholders = subBatch.map((_, idx) => 
-              `($${idx * 3 + 1}, $${idx * 3 + 2}, $${idx * 3 + 3})`
-            ).join(",");
-            
-            const values = subBatch.flatMap(d => [
-              d.order_id,
-              d.discount_id,
-              d.discount_amount_subtotal || 0
-            ]);
-
-            await localConnection.query(`
-              INSERT INTO temp_main_discounts (order_id, discount_id, discount_amount_subtotal)
-              VALUES ${placeholders}
-              ON CONFLICT (order_id, discount_id) DO UPDATE SET
-                discount_amount_subtotal = EXCLUDED.discount_amount_subtotal
-            `, values);
-          }
-          await localConnection.commit();
-        } catch (error) {
-          await localConnection.rollback();
-          throw error;
-        }
-      }
-
-      // Then, load item discount records
+      // Load item-level discount records. Only which = 2 rows are real per-item
+      // discount amounts; which = 1 rows store the price of free promo-added
+      // items and which = 3 rows are usage records (neither is a discount).
+      // These amounts are NOT included in summary_discount_subtotal, so they
+      // must be added on top of the prorated subtotal discount unconditionally.
      const [discounts] = await prodConnection.query(`
        SELECT order_id, pid, discount_id, amount
        FROM order_discount_items
-        WHERE order_id IN (?)
+        WHERE order_id IN (?) AND which = 2
      `, [batchIds]);

      if (discounts.length === 0) return;
@@ -418,16 +370,6 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
          `, values);
        }

-        // Create aggregated view with a simpler, safer query that avoids duplicates
-        await localConnection.query(`
-          TRUNCATE temp_order_discounts;
-          
-          INSERT INTO temp_order_discounts (order_id, pid, discount)
-          SELECT order_id, pid, SUM(amount) as discount
-          FROM temp_item_discounts
-          GROUP BY order_id, pid
-        `);
-
        await localConnection.commit();
      } catch (error) {
        await localConnection.rollback();
@@ -603,42 +545,54 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
        try {
          const [orders] = await localConnection.query(`
            WITH order_totals AS (
-              SELECT 
+              SELECT
                oi.order_id,
                oi.pid,
-                -- Instead of using ARRAY_AGG which can cause duplicate issues, use SUM with a CASE
-                SUM(CASE 
-                  WHEN COALESCE(md.discount_amount_subtotal, 0) > 0 THEN id.amount 
-                  ELSE 0 
-                END) as promo_discount_sum,
+                -- Item-level promo discounts (which = 2 rows). These live outside
+                -- summary_discount_subtotal, so they are summed unconditionally.
+                SUM(COALESCE(id.amount, 0)) as promo_discount_sum,
                COALESCE(ot.tax, 0) as total_tax,
                COALESCE(oc.costeach, pc.cost_price, oi.price * 0.5) as costeach
              FROM temp_order_items oi
              LEFT JOIN temp_item_discounts id ON oi.order_id = id.order_id AND oi.pid = id.pid
-              LEFT JOIN temp_main_discounts md ON id.order_id = md.order_id AND id.discount_id = md.discount_id
              LEFT JOIN temp_order_taxes ot ON oi.order_id = ot.order_id AND oi.pid = ot.pid
              LEFT JOIN temp_order_costs oc ON oi.order_id = oc.order_id AND oi.pid = oc.pid
              LEFT JOIN temp_product_costs pc ON oi.pid = pc.pid
              WHERE oi.order_id = ANY($1)
              GROUP BY oi.order_id, oi.pid, ot.tax, oc.costeach, pc.cost_price
            )
-            SELECT 
+            SELECT
              oi.order_id as order_number,
              oi.pid::bigint as pid,
              oi.sku,
              om.date,
              oi.price,
              oi.quantity,
+              -- Discount = prorated order-level subtotal discount + item-level promo
+              -- discounts, capped at price * quantity so a sale line never goes below zero.
              (
-                -- Prorated Points Discount (e.g. loyalty points applied at order level)
-                CASE
-                  WHEN om.summary_discount_subtotal > 0 AND om.summary_subtotal > 0 THEN
-                    COALESCE(ROUND((om.summary_discount_subtotal * (oi.price * oi.quantity)) / NULLIF(om.summary_subtotal, 0), 4), 0)
-                  ELSE 0
+                CASE WHEN oi.quantity > 0 THEN
+                  LEAST(
+                    (
+                      CASE
+                        WHEN om.summary_discount_subtotal > 0 AND om.summary_subtotal > 0 THEN
+                          COALESCE(ROUND((om.summary_discount_subtotal * (oi.price * oi.quantity)) / NULLIF(om.summary_subtotal, 0), 4), 0)
+                        ELSE 0
+                      END
+                      + COALESCE(ot.promo_discount_sum, 0)
+                    ),
+                    oi.price * oi.quantity
+                  )
+                ELSE
+                  (
+                    CASE
+                      WHEN om.summary_discount_subtotal > 0 AND om.summary_subtotal > 0 THEN
+                        COALESCE(ROUND((om.summary_discount_subtotal * (oi.price * oi.quantity)) / NULLIF(om.summary_subtotal, 0), 4), 0)
+                      ELSE 0
+                    END
+                    + COALESCE(ot.promo_discount_sum, 0)
+                  )
                END
-                +
-                -- Specific Item-Level Promo Discount (coupon codes, etc.)
-                COALESCE(ot.promo_discount_sum, 0)
              )::NUMERIC(14, 4) as discount,
              COALESCE(ot.total_tax, 0)::NUMERIC(14, 4) as tax,
              false as tax_included,
@@ -765,34 +719,83 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
      }
    }

-    // Start a transaction for updating sync status and dropping temp tables
+    // Reconciliation 2 prep: fetch canceled (15) / combined (16) orders from MySQL
+    // WITHOUT a date_placed filter — combine_orders zeroes date_placed on the source
+    // orders, so the main item query can never re-fetch them. Done before opening
+    // the PG transaction so we don't hold it across a MySQL round-trip.
+    const [statusSweepRows] = await prodConnection.query(`
+      SELECT order_id, order_status
+      FROM _order
+      WHERE order_status IN (15, 16)
+      ${incrementalUpdate ? 'AND stamp > ?' : ''}
+    `, incrementalUpdate ? [mysqlSyncTime] : []);
+
+    let staleItemsDeleted = 0;
+    let sweepUpdated = 0;
+
+    // Final transaction: reconcile deletions, sweep statuses, update sync status, drop temps
    await localConnection.beginTransaction();
    try {
-      // Update sync status
+      // Reconciliation 1: delete PG item rows that no longer exist in MySQL for the
+      // orders fetched this run. temp_order_items holds the complete current item
+      // set of every fetched order (staff edits and unpicked promo items DELETE
+      // order_items rows in MySQL, which an upsert-only import never removes).
+      const [reconcileResult] = await localConnection.query(`
+        DELETE FROM orders o
+        USING (SELECT DISTINCT order_id FROM temp_order_items) fetched
+        WHERE o.order_number = fetched.order_id::text -- orders.order_number is TEXT
+          AND NOT EXISTS (
+            SELECT 1 FROM temp_order_items t
+            WHERE t.order_id = fetched.order_id AND t.pid = o.pid
+          )
+      `);
+      staleItemsDeleted = reconcileResult.rowCount || 0;
+
+      // Reconciliation 2: mark canceled/combined orders. 'combined' source orders were
+      // merged into a new order that carries the same items — counting both would
+      // double-count, so they also get canceled = true (routes filter on canceled).
+      for (const [code, statusText] of [[15, 'canceled'], [16, 'combined']]) {
+        const ids = statusSweepRows.filter(r => r.order_status === code).map(r => r.order_id);
+        for (let i = 0; i < ids.length; i += 5000) {
+          const chunk = ids.slice(i, i + 5000);
+          const [sweepResult] = await localConnection.query(`
+            UPDATE orders
+            SET status = $1, canceled = true
+            WHERE order_number = ANY($2::text[])
+              AND (status IS DISTINCT FROM $1 OR canceled IS DISTINCT FROM true)
+          `, [statusText, chunk.map(String)]);
+          sweepUpdated += sweepResult.rowCount || 0;
+        }
+      }
+
+      // Update sync status with the watermark captured from MySQL BEFORE the
+      // source queries ran (see sourceNow above).
      await localConnection.query(`
        INSERT INTO sync_status (table_name, last_sync_timestamp)
-        VALUES ('orders', NOW())
+        VALUES ('orders', $1)
        ON CONFLICT (table_name) DO UPDATE SET
-          last_sync_timestamp = NOW()
-      `);
-      
+          last_sync_timestamp = $1
+      `, [sourceNow]);
+
      // Cleanup temporary tables
      await localConnection.query(`
        DROP TABLE IF EXISTS temp_order_items;
        DROP TABLE IF EXISTS temp_order_meta;
-        DROP TABLE IF EXISTS temp_order_discounts;
        DROP TABLE IF EXISTS temp_order_taxes;
        DROP TABLE IF EXISTS temp_order_costs;
-        DROP TABLE IF EXISTS temp_main_discounts;
        DROP TABLE IF EXISTS temp_item_discounts;
        DROP TABLE IF EXISTS temp_product_costs;
      `);
-      
+
      // Commit final transaction
      await localConnection.commit();
    } catch (error) {
      await localConnection.rollback();
-      throw error; 
+      throw error;
+    }
+
+    if (staleItemsDeleted > 0 || sweepUpdated > 0) {
+      console.log(`Orders: reconciliation removed ${staleItemsDeleted} stale item rows, swept ${sweepUpdated} canceled/combined rows`);
    }

    return {
@@ -800,6 +803,8 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
      totalImported: Math.floor(importedCount) || 0,
      recordsAdded: parseInt(recordsAdded) || 0,
      recordsUpdated: parseInt(recordsUpdated) || 0,
+      recordsDeleted: staleItemsDeleted,
+      statusSweepUpdated: sweepUpdated,
      totalSkipped: skippedOrders.size || 0,
      missingProducts: missingProducts.size || 0,
      totalProcessed: orderItems.length,  // Total order items in source
@@ -622,6 +622,7 @@ async function materializeCalculations(prodConnection, localConnection, incremen
      AND t.total_sold IS NOT DISTINCT FROM p.total_sold
      AND t.date_online IS NOT DISTINCT FROM p.date_online
      AND t.shop_score IS NOT DISTINCT FROM p.shop_score
+      AND t.categories IS NOT DISTINCT FROM p.categories
  `);
  
  // Get count of products that need updating
@@ -662,6 +663,11 @@ async function importProducts(prodConnection, localConnection, incrementalUpdate
      }
    }

+    // Capture the next watermark from MySQL's own clock BEFORE querying any data.
+    // Rows modified while the import runs stay above this watermark for the next
+    // incremental run (overlap re-imports are harmless upserts).
+    const [[{ source_now: sourceNow }]] = await prodConnection.query('SELECT NOW() as source_now');
+
    // Start a transaction to ensure temporary tables persist
    await localConnection.beginTransaction();

@@ -927,16 +933,22 @@ async function importProducts(prodConnection, localConnection, incrementalUpdate
      // legacy PHP backend will stamp onto the PO line item.
      await syncSupplierCosts(prodConnection, localConnection);

+      // Sync category assignments for ALL products. product_category_index has no
+      // stamp column, so category-only changes never bump any of the incremental
+      // WHERE timestamps — without this pass PG categories go permanently stale.
+      await syncProductCategories(prodConnection, localConnection);
+
      // Commit the transaction
      await localConnection.commit();

-      // Update sync status
+      // Update sync status with the watermark captured from MySQL BEFORE the
+      // source queries ran (see sourceNow above).
      await localConnection.query(`
        INSERT INTO sync_status (table_name, last_sync_timestamp)
-        VALUES ('products', NOW())
+        VALUES ('products', $1)
        ON CONFLICT (table_name) DO UPDATE SET
-          last_sync_timestamp = NOW()
-      `);
+          last_sync_timestamp = $1
+      `, [sourceNow]);

      return {
        status: 'complete',
@@ -1028,11 +1040,126 @@ async function syncSupplierCosts(prodConnection, localConnection) {
  return { updated };
 }

+// Full category-assignment sweep. The incremental product import keys on
+// p.stamp / ci.stamp / price / b2b dates — none of which change when a product
+// is recategorized in product_category_index (the table has no stamp column).
+// This pass compares the canonical GROUP_CONCAT representation against
+// products.categories and rewrites product_categories only for changed pids.
+// Must run inside the caller's transaction (uses ON COMMIT DROP temp table).
+async function syncProductCategories(prodConnection, localConnection) {
+  outputProgress({
+    status: "running",
+    operation: "Products import",
+    message: "Syncing category assignments"
+  });
+
+  // Same expression as the main import query so representations compare equal.
+  // Assumes GROUP_CONCAT(DISTINCT int) emits sorted values (not guaranteed without ORDER BY).
+  const [rows] = await prodConnection.query(`
+    SELECT
+      p.pid,
+      GROUP_CONCAT(DISTINCT CASE
+        WHEN pc.cat_id IS NOT NULL
+        AND pc.type IN (10, 20, 11, 21, 12, 13)
+        AND pci.cat_id NOT IN (16, 17)
+        THEN pci.cat_id
+      END) as category_ids
+    FROM products p
+    LEFT JOIN product_category_index pci ON p.pid = pci.pid
+    LEFT JOIN product_categories pc ON pci.cat_id = pc.cat_id
+    GROUP BY p.pid
+  `);
+
+  if (!rows || rows.length === 0) {
+    return { updated: 0 };
+  }
+
+  await localConnection.query(`
+    CREATE TEMP TABLE temp_category_sync (
+      pid BIGINT PRIMARY KEY,
+      categories TEXT
+    ) ON COMMIT DROP
+  `);
+
+  const CHUNK = 5000;
+  for (let i = 0; i < rows.length; i += CHUNK) {
+    const batch = rows.slice(i, i + CHUNK);
+    const pids = batch.map(r => r.pid);
+    const cats = batch.map(r => r.category_ids);
+    await localConnection.query(
+      `INSERT INTO temp_category_sync (pid, categories)
+       SELECT * FROM UNNEST($1::bigint[], $2::text[])
+       ON CONFLICT (pid) DO NOTHING`,
+      [pids, cats]
+    );
+  }
+
+  // Which existing products actually changed?
+  const [changed] = await localConnection.query(`
+    SELECT t.pid, t.categories
+    FROM temp_category_sync t
+    JOIN products p ON p.pid = t.pid
+    WHERE t.categories IS DISTINCT FROM p.categories
+  `);
+
+  if (changed.rows.length === 0) {
+    return { updated: 0 };
+  }
+
+  await localConnection.query(`
+    UPDATE products p
+    SET categories = t.categories
+    FROM temp_category_sync t
+    WHERE p.pid = t.pid
+      AND t.categories IS DISTINCT FROM p.categories
+  `);
+
+  // Rewrite the relationship rows for changed products only
+  const REL_CHUNK = 1000;
+  for (let i = 0; i < changed.rows.length; i += REL_CHUNK) {
+    const batch = changed.rows.slice(i, i + REL_CHUNK);
+    const pids = batch.map(r => r.pid);
+
+    await localConnection.query(
+      'DELETE FROM product_categories WHERE pid = ANY($1)',
+      [pids]
+    );
+
+    const relPids = [];
+    const relCats = [];
+    for (const row of batch) {
+      if (!row.categories) continue;
+      for (const catId of row.categories.split(',')) {
+        if (catId && catId.trim()) {
+          relPids.push(row.pid);
+          relCats.push(parseInt(catId.trim(), 10));
+        }
+      }
+    }
+    if (relPids.length > 0) {
+      await localConnection.query(`
+        INSERT INTO product_categories (pid, cat_id)
+        SELECT * FROM UNNEST($1::bigint[], $2::int[])
+        ON CONFLICT (pid, cat_id) DO NOTHING
+      `, [relPids, relCats]);
+    }
+  }
+
+  outputProgress({
+    status: "running",
+    operation: "Products import",
+    message: `Category assignments updated for ${changed.rows.length} products`
+  });
+
+  return { updated: changed.rows.length };
+}
+
 module.exports = {
  importProducts,
  importMissingProducts,
  setupTemporaryTables,
  cleanupTemporaryTables,
  materializeCalculations,
-  syncSupplierCosts
+  syncSupplierCosts,
+  syncProductCategories
 };
@@ -72,6 +72,11 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental

    console.log('Purchase Orders: Using last sync time:', lastSyncTime, '(adjusted:', mysqlSyncTime, ')');

+    // Capture the next watermark from MySQL's own clock BEFORE querying any data.
+    // Rows modified while the import runs stay above this watermark for the next
+    // incremental run (overlap re-imports are harmless upserts).
+    const [[{ source_now: sourceNow }]] = await prodConnection.query('SELECT NOW() as source_now');
+
    // Create temp tables for processing
    await localConnection.query(`
      DROP TABLE IF EXISTS temp_purchase_orders;
@@ -267,13 +272,16 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
    if (totalPOs === 0) {
      console.log('No purchase orders to process, skipping PO import step');
    } else {
-      // Fetch and process POs in batches
-      let offset = 0;
+      // Fetch and process POs in batches using keyset pagination on po_id.
+      // LIMIT/OFFSET over a date_updated predicate silently skips rows when
+      // concurrent updates shift rows between pages.
+      let processedPOCount = 0;
+      let lastPoId = 0;
      let allPOsProcessed = false;
-      
+
      while (!allPOsProcessed) {
        const [poList] = await prodConnection.query(`
-          SELECT 
+          SELECT
            p.po_id,
            p.supplier_id,
            s.companyname AS vendor,
@@ -286,21 +294,23 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
          FROM po p
          LEFT JOIN suppliers s ON p.supplier_id = s.supplierid
          WHERE p.date_created >= DATE_SUB(CURRENT_DATE, INTERVAL ${yearInterval} YEAR)
+            AND p.po_id > ?
            ${incrementalUpdate ? `
              AND (
-                p.date_updated > ? 
-                OR p.date_ordered > ? 
+                p.date_updated > ?
+                OR p.date_ordered > ?
                OR p.date_estin > ?
              )
            ` : ''}
          ORDER BY p.po_id
-          LIMIT ${PO_BATCH_SIZE} OFFSET ${offset}
-        `, incrementalUpdate ? [mysqlSyncTime, mysqlSyncTime, mysqlSyncTime] : []);
-        
+          LIMIT ${PO_BATCH_SIZE}
+        `, incrementalUpdate ? [lastPoId, mysqlSyncTime, mysqlSyncTime, mysqlSyncTime] : [lastPoId]);
+
        if (poList.length === 0) {
          allPOsProcessed = true;
          break;
        }
+        lastPoId = poList[poList.length - 1].po_id;
        
        // Get products for these POs
        const poIds = poList.map(po => po.po_id);
@@ -332,7 +342,11 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
            vendor: po.vendor || 'Unknown Vendor',
            date: validateDate(po.date_ordered) || validateDate(po.date_created),
            expected_date: validateDate(po.date_estin),
-            status: poStatusMap[po.status] || 'created',
+            // Unknown codes get a sentinel rather than 'created': defaulting an
+            // unknown cancel-like code to an OPEN status would inflate on-order
+            // FIFO (the metrics CTEs whitelist known-open statuses, so a sentinel
+            // is simply ignored there).
+            status: poStatusMap[po.status] || `unknown_${po.status}`,
            notes: po.notes || '',
            long_note: po.long_note || '',
            ordered: product.qty_each,
@@ -393,20 +407,20 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
          `, values);
        }
        
-        offset += poList.length;
+        processedPOCount += poList.length;
        totalProcessed += completePOs.length;
-        
+
        outputProgress({
          status: "running",
          operation: "Purchase orders import",
-          message: `Processed ${offset} of ${totalPOs} purchase orders (${totalProcessed} line items)`,
-          current: offset,
+          message: `Processed ${processedPOCount} of ${totalPOs} purchase orders (${totalProcessed} line items)`,
+          current: processedPOCount,
          total: totalPOs,
          elapsed: formatElapsedTime(startTime),
-          remaining: estimateRemaining(startTime, offset, totalPOs),
-          rate: calculateRate(startTime, offset)
+          remaining: estimateRemaining(startTime, processedPOCount, totalPOs),
+          rate: calculateRate(startTime, processedPOCount)
        });
-        
+
        if (poList.length < PO_BATCH_SIZE) {
          allPOsProcessed = true;
        }
@@ -439,13 +453,14 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
    if (totalReceivings === 0) {
      console.log('No receivings to process, skipping receivings import step');
    } else {
-      // Fetch and process receivings in batches
-      offset = 0; // Reset offset for receivings
+      // Fetch and process receivings in batches (keyset pagination, see POs above)
+      let processedReceivingCount = 0;
+      let lastReceivingId = 0;
      let allReceivingsProcessed = false;
-      
+
      while (!allReceivingsProcessed) {
        const [receivingList] = await prodConnection.query(`
-          SELECT 
+          SELECT
            r.receiving_id,
            r.supplier_id,
            r.status,
@@ -459,6 +474,7 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
            r.date_checked
          FROM receivings r
          WHERE r.date_created >= DATE_SUB(CURRENT_DATE, INTERVAL ${yearInterval} YEAR)
+            AND r.receiving_id > ?
            ${incrementalUpdate ? `
              AND (
                r.date_updated > ?
@@ -466,13 +482,14 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
              )
            ` : ''}
          ORDER BY r.receiving_id
-          LIMIT ${PO_BATCH_SIZE} OFFSET ${offset}
-        `, incrementalUpdate ? [mysqlSyncTime, mysqlSyncTime] : []);
-        
+          LIMIT ${PO_BATCH_SIZE}
+        `, incrementalUpdate ? [lastReceivingId, mysqlSyncTime, mysqlSyncTime] : [lastReceivingId]);
+
        if (receivingList.length === 0) {
          allReceivingsProcessed = true;
          break;
        }
+        lastReceivingId = receivingList[receivingList.length - 1].receiving_id;
        
        // Get products for these receivings
        const receivingIds = receivingList.map(r => r.receiving_id);
@@ -545,7 +562,8 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
            received_date: validateDate(product.received_date) || validateDate(product.receiving_created_date),
            receiving_created_date: validateDate(product.receiving_created_date),
            supplier_id: receiving.supplier_id,
-            status: receivingStatusMap[receiving.status] || 'created'
+            // Sentinel for unknown codes — see PO status mapping note above
+            status: receivingStatusMap[receiving.status] || `unknown_${receiving.status}`
          });
        }
        
@@ -600,18 +618,18 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
          `, values);
        }
        
-        offset += receivingList.length;
+        processedReceivingCount += receivingList.length;
        totalProcessed += completeReceivings.length;
-        
+
        outputProgress({
          status: "running",
          operation: "Purchase orders import",
-          message: `Processed ${offset} of ${totalReceivings} receivings (${totalProcessed} line items total)`,
-          current: offset,
+          message: `Processed ${processedReceivingCount} of ${totalReceivings} receivings (${totalProcessed} line items total)`,
+          current: processedReceivingCount,
          total: totalReceivings,
          elapsed: formatElapsedTime(startTime),
-          remaining: estimateRemaining(startTime, offset, totalReceivings),
-          rate: calculateRate(startTime, offset)
+          remaining: estimateRemaining(startTime, processedReceivingCount, totalReceivings),
+          rate: calculateRate(startTime, processedReceivingCount)
        });
        
        if (receivingList.length < PO_BATCH_SIZE) {
@@ -829,13 +847,14 @@ async function importPurchaseOrders(prodConnection, localConnection, incremental
    receivingRecordsAdded = receivingsResult.rows.filter(r => r.inserted).length;
    receivingRecordsUpdated = receivingsResult.rows.filter(r => !r.inserted).length;

-    // Update sync status
+    // Update sync status with the watermark captured from MySQL BEFORE the
+    // source queries ran (see sourceNow above).
    await localConnection.query(`
      INSERT INTO sync_status (table_name, last_sync_timestamp)
-      VALUES ('purchase_orders', NOW())
+      VALUES ('purchase_orders', $1)
      ON CONFLICT (table_name) DO UPDATE SET
-        last_sync_timestamp = NOW()
-    `);
+        last_sync_timestamp = $1
+    `, [sourceNow]);

    // Clean up temporary tables
    await localConnection.query(`
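
The receivings loop above replaces `LIMIT … OFFSET` with keyset pagination on `receiving_id`. The sketch below is a minimal illustration of that pattern, assuming an in-memory array in place of the MySQL table. `scanByOffset`, `scanByKeyset`, `deleteOnce`, and `makeTable` are hypothetical names for illustration, not functions from this repository. One commonly cited reason to prefer keyset paging is that rows changing between pages shift OFFSET windows; this section does not state whether that was the motivation here.

```javascript
// Hypothetical in-memory stand-in for the source table (ids 1..10).
const makeTable = () => Array.from({ length: 10 }, (_, i) => ({ receiving_id: i + 1 }));
const byId = (a, b) => a.receiving_id - b.receiving_id;

// Simulates a concurrent delete of one row, applied once after the first page.
function deleteOnce(id) {
  let done = false;
  return table => {
    if (done) return;
    done = true;
    const idx = table.findIndex(r => r.receiving_id === id);
    if (idx !== -1) table.splice(idx, 1);
  };
}

// OFFSET-style paging: position depends on how many rows precede the window.
function scanByOffset(table, batchSize, afterPage) {
  const seen = [];
  let offset = 0;
  while (true) {
    const page = [...table].sort(byId).slice(offset, offset + batchSize);
    if (page.length === 0) break;
    seen.push(...page.map(r => r.receiving_id));
    offset += page.length;
    afterPage(table);
  }
  return seen;
}

// Keyset paging: position depends only on the last id already processed.
function scanByKeyset(table, batchSize, afterPage) {
  const seen = [];
  let lastId = 0;
  while (true) {
    const page = table.filter(r => r.receiving_id > lastId).sort(byId).slice(0, batchSize);
    if (page.length === 0) break;
    seen.push(...page.map(r => r.receiving_id));
    lastId = page[page.length - 1].receiving_id;
    afterPage(table);
    if (page.length < batchSize) break;
  }
  return seen;
}

const offsetSeen = scanByOffset(makeTable(), 4, deleteOnce(1));
const keysetSeen = scanByKeyset(makeTable(), 4, deleteOnce(1));
console.log(offsetSeen.includes(5)); // false: row 5 slid into the window already read
console.log(keysetSeen.includes(5)); // true
```

In this toy run the OFFSET scan silently skips row 5 after row 1 disappears, while the keyset scan still visits every remaining row.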
@@ -151,7 +151,10 @@ async function importStockSnapshots(prodConnection, localConnection, incremental

        recordsAdded += batch.length;
      } catch (err) {
+        // Fail the step: the next incremental starts at MAX(snapshot_date), so a
+        // swallowed batch error would leave a permanent hole that is never revisited.
        console.error(`Error inserting batch at offset ${i} (date range ending ${currentDate}):`, err.message);
+        throw err;
      }
    }

@@ -165,7 +168,7 @@ async function importStockSnapshots(prodConnection, localConnection, incremental
      current: processedRows,
      total: totalRows,
      elapsed: formatElapsedTime(startTime),
-      rate: calculateRate(processedRows, startTime)
+      rate: calculateRate(startTime, processedRows)
    });
  }

@@ -10,7 +10,7 @@ DECLARE
    _date DATE;
    _count INT;
    _total_records INT := 0;
-    _begin_date DATE := (SELECT MIN(date)::date FROM orders WHERE date >= '2020-01-01'); -- Starting point: captures all historical order data
+    _begin_date DATE := (SELECT MIN((date AT TIME ZONE 'America/Chicago'))::date FROM orders WHERE date >= '2020-01-01'); -- Starting point: captures all historical order data (business days, Central time)
    _end_date DATE := CURRENT_DATE;
 BEGIN
    RAISE NOTICE 'Beginning daily snapshots rebuild from % to %. Starting at %', _begin_date, _end_date, _start_time;
@@ -32,26 +32,34 @@ BEGIN
                p.sku,
                -- Count orders to ensure we only include products with real activity
                COUNT(o.id) as order_count,
-                -- Aggregate Sales (Quantity > 0, Status not Canceled/Returned)
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.quantity ELSE 0 END), 0) AS units_sold,
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.price * o.quantity ELSE 0 END), 0.00) AS gross_revenue_unadjusted,
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.discount ELSE 0 END), 0.00) AS discounts,
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN
+                -- Aggregate Sales (Quantity > 0, Status not Canceled/Returned/Combined)
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN o.quantity ELSE 0 END), 0) AS units_sold,
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN o.price * o.quantity ELSE 0 END), 0.00) AS gross_revenue_unadjusted,
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN o.discount ELSE 0 END), 0.00) AS discounts,
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN
                    COALESCE(
                        o.costeach,
-                        get_weighted_avg_cost(p.pid, o.date::date),
+                        get_weighted_avg_cost(p.pid, (o.date AT TIME ZONE 'America/Chicago')::date),
                        p.cost_price
                    ) * o.quantity
                ELSE 0 END), 0.00) AS cogs,
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN p.regular_price * o.quantity ELSE 0 END), 0.00) AS gross_regular_revenue,
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN p.regular_price * o.quantity ELSE 0 END), 0.00) AS gross_regular_revenue,

                -- Aggregate Returns (Quantity < 0 or Status = Returned)
                COALESCE(SUM(CASE WHEN o.quantity < 0 OR COALESCE(o.status, 'pending') = 'returned' THEN ABS(o.quantity) ELSE 0 END), 0) AS units_returned,
-                COALESCE(SUM(CASE WHEN o.quantity < 0 OR COALESCE(o.status, 'pending') = 'returned' THEN o.price * ABS(o.quantity) ELSE 0 END), 0.00) AS returns_revenue
+                COALESCE(SUM(CASE WHEN o.quantity < 0 OR COALESCE(o.status, 'pending') = 'returned' THEN o.price * ABS(o.quantity) ELSE 0 END), 0.00) AS returns_revenue,
+                -- Returns COGS: cost of returned goods offsets sales COGS
+                COALESCE(SUM(CASE WHEN o.quantity < 0 OR COALESCE(o.status, 'pending') = 'returned' THEN
+                    COALESCE(
+                        o.costeach,
+                        get_weighted_avg_cost(p.pid, (o.date AT TIME ZONE 'America/Chicago')::date),
+                        p.cost_price
+                    ) * ABS(o.quantity)
+                ELSE 0 END), 0.00) AS returns_cogs
            FROM public.products p
            LEFT JOIN public.orders o
                ON p.pid = o.pid
-               AND o.date::date = _date
+               AND (o.date AT TIME ZONE 'America/Chicago')::date = _date -- business day (Central)
            GROUP BY p.pid, p.sku
            HAVING COUNT(o.id) > 0 -- Only include products with actual orders for this date
        ),
@@ -65,7 +73,7 @@ BEGIN
                -- Calculate received cost for this day
                SUM(r.qty_each * r.cost_each) AS cost_received
            FROM public.receivings r
-            WHERE r.received_date::date = _date
+            WHERE (r.received_date AT TIME ZONE 'America/Chicago')::date = _date
            GROUP BY r.pid
            HAVING COUNT(DISTINCT r.receiving_id) > 0 OR SUM(r.qty_each) > 0
        ),
@@ -120,9 +128,9 @@ BEGIN
            COALESCE(sd.discounts, 0.00),
            COALESCE(sd.returns_revenue, 0.00),
            COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00) AS net_revenue,
-            COALESCE(sd.cogs, 0.00),
+            COALESCE(sd.cogs, 0.00) - COALESCE(sd.returns_cogs, 0.00) AS cogs, -- net of returned goods' cost
            COALESCE(sd.gross_regular_revenue, 0.00),
-            (COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00)) - COALESCE(sd.cogs, 0.00) AS profit,
+            (COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00)) - (COALESCE(sd.cogs, 0.00) - COALESCE(sd.returns_cogs, 0.00)) AS profit,
            -- Receiving metrics
            COALESCE(rd.units_received, 0),
            COALESCE(rd.cost_received, 0.00),
@@ -123,7 +123,10 @@ BEGIN
        brand_metrics.current_stock_units IS DISTINCT FROM EXCLUDED.current_stock_units OR
        brand_metrics.sales_30d IS DISTINCT FROM EXCLUDED.sales_30d OR
        brand_metrics.revenue_30d IS DISTINCT FROM EXCLUDED.revenue_30d OR
-        brand_metrics.lifetime_sales IS DISTINCT FROM EXCLUDED.lifetime_sales;
+        brand_metrics.lifetime_sales IS DISTINCT FROM EXCLUDED.lifetime_sales OR
+        -- Cost revisions can change profit/cogs with unchanged sales/revenue
+        brand_metrics.profit_30d IS DISTINCT FROM EXCLUDED.profit_30d OR
+        brand_metrics.cogs_30d IS DISTINCT FROM EXCLUDED.cogs_30d;

    -- Update calculate_status
    INSERT INTO public.calculate_status (module_name, last_calculation_timestamp)
@@ -23,17 +23,19 @@ BEGIN
            SUM(pm.current_stock) AS current_stock_units,
            SUM(pm.current_stock_cost) AS current_stock_cost,
            SUM(pm.current_stock_retail) AS current_stock_retail,
-            -- Sales metrics with proper filtering
+            -- Sales metrics: revenue uses plain COALESCE (matching brand/vendor). The old
+            -- positive-only revenue filter, with cogs/profit summing every row, put the
+            -- margin numerator and denominator on different row populations.
            SUM(CASE WHEN pm.sales_7d > 0 THEN pm.sales_7d ELSE 0 END) AS sales_7d,
-            SUM(CASE WHEN pm.revenue_7d > 0 THEN pm.revenue_7d ELSE 0 END) AS revenue_7d,
+            SUM(COALESCE(pm.revenue_7d, 0)) AS revenue_7d,
            SUM(CASE WHEN pm.sales_30d > 0 THEN pm.sales_30d ELSE 0 END) AS sales_30d,
-            SUM(CASE WHEN pm.revenue_30d > 0 THEN pm.revenue_30d ELSE 0 END) AS revenue_30d,
+            SUM(COALESCE(pm.revenue_30d, 0)) AS revenue_30d,
            SUM(COALESCE(pm.cogs_30d, 0)) AS cogs_30d,
            SUM(COALESCE(pm.profit_30d, 0)) AS profit_30d,
            SUM(CASE WHEN pm.sales_365d > 0 THEN pm.sales_365d ELSE 0 END) AS sales_365d,
-            SUM(CASE WHEN pm.revenue_365d > 0 THEN pm.revenue_365d ELSE 0 END) AS revenue_365d,
+            SUM(COALESCE(pm.revenue_365d, 0)) AS revenue_365d,
            SUM(CASE WHEN pm.lifetime_sales > 0 THEN pm.lifetime_sales ELSE 0 END) AS lifetime_sales,
-            SUM(CASE WHEN pm.lifetime_revenue > 0 THEN pm.lifetime_revenue ELSE 0 END) AS lifetime_revenue
+            SUM(COALESCE(pm.lifetime_revenue, 0)) AS lifetime_revenue
        FROM public.product_categories pc
        JOIN public.product_metrics pm ON pc.pid = pm.pid
        GROUP BY pc.cat_id
@@ -62,15 +64,15 @@ BEGIN
            SUM(pm.current_stock_cost) AS current_stock_cost,
            SUM(pm.current_stock_retail) AS current_stock_retail,
            SUM(CASE WHEN pm.sales_7d > 0 THEN pm.sales_7d ELSE 0 END) AS sales_7d,
-            SUM(CASE WHEN pm.revenue_7d > 0 THEN pm.revenue_7d ELSE 0 END) AS revenue_7d,
+            SUM(COALESCE(pm.revenue_7d, 0)) AS revenue_7d,
            SUM(CASE WHEN pm.sales_30d > 0 THEN pm.sales_30d ELSE 0 END) AS sales_30d,
-            SUM(CASE WHEN pm.revenue_30d > 0 THEN pm.revenue_30d ELSE 0 END) AS revenue_30d,
+            SUM(COALESCE(pm.revenue_30d, 0)) AS revenue_30d,
            SUM(COALESCE(pm.cogs_30d, 0)) AS cogs_30d,
            SUM(COALESCE(pm.profit_30d, 0)) AS profit_30d,
            SUM(CASE WHEN pm.sales_365d > 0 THEN pm.sales_365d ELSE 0 END) AS sales_365d,
-            SUM(CASE WHEN pm.revenue_365d > 0 THEN pm.revenue_365d ELSE 0 END) AS revenue_365d,
+            SUM(COALESCE(pm.revenue_365d, 0)) AS revenue_365d,
            SUM(CASE WHEN pm.lifetime_sales > 0 THEN pm.lifetime_sales ELSE 0 END) AS lifetime_sales,
-            SUM(CASE WHEN pm.lifetime_revenue > 0 THEN pm.lifetime_revenue ELSE 0 END) AS lifetime_revenue
+            SUM(COALESCE(pm.lifetime_revenue, 0)) AS lifetime_revenue
        FROM CategoryProducts cp
        JOIN public.product_metrics pm ON cp.pid = pm.pid
        GROUP BY cp.ancestor_cat_id
@@ -200,7 +202,10 @@ BEGIN
        category_metrics.revenue_30d IS DISTINCT FROM EXCLUDED.revenue_30d OR
        category_metrics.lifetime_sales IS DISTINCT FROM EXCLUDED.lifetime_sales OR
        category_metrics.direct_product_count IS DISTINCT FROM EXCLUDED.direct_product_count OR
-        category_metrics.direct_sales_30d IS DISTINCT FROM EXCLUDED.direct_sales_30d;
+        category_metrics.direct_sales_30d IS DISTINCT FROM EXCLUDED.direct_sales_30d OR
+        -- Cost revisions can change profit/cogs with unchanged sales/revenue
+        category_metrics.profit_30d IS DISTINCT FROM EXCLUDED.profit_30d OR
+        category_metrics.cogs_30d IS DISTINCT FROM EXCLUDED.cogs_30d;

    -- Update calculate_status
    INSERT INTO public.calculate_status (module_name, last_calculation_timestamp)
@@ -60,26 +60,31 @@ BEGIN
        GROUP BY p.vendor
    ),
    VendorPOAggregates AS (
-        -- Aggregate PO related stats including lead time calculated from POs to receivings
+        -- Lead time per PO line = days to its FIRST receiving from the same supplier
+        -- (within 180 days), then averaged per vendor. Joining each PO line to EVERY
+        -- later receiving overstated lead time and weighted it toward busy products.
+        -- Same shape as the per-product calc in update_periodic_metrics.sql.
        SELECT
-            po.vendor,
-            COUNT(DISTINCT po.po_id) AS po_count_365d,
-            -- Calculate lead time by averaging the days between PO date and receiving date
-            AVG(GREATEST(1, CASE 
-                WHEN r.received_date IS NOT NULL AND po.date IS NOT NULL 
-                THEN (r.received_date::date - po.date::date) 
-                ELSE NULL 
-            END))::int AS avg_lead_time_days_hist -- Avg lead time from HISTORICAL received POs
-        FROM public.purchase_orders po
-        -- Join to receivings table to find when items were received
-        LEFT JOIN public.receivings r ON r.pid = po.pid AND r.supplier_id = po.supplier_id
-        WHERE po.vendor IS NOT NULL AND po.vendor <> ''
-          AND po.date >= CURRENT_DATE - INTERVAL '1 year' -- Look at POs created in the last year
-          AND po.status = 'done' -- Only calculate lead time on completed POs
-          AND r.received_date IS NOT NULL
-          AND po.date IS NOT NULL
-          AND r.received_date >= po.date
-        GROUP BY po.vendor
+            vendor,
+            COUNT(DISTINCT po_id) AS po_count_365d,
+            ROUND(AVG(GREATEST(1, first_receive_date - po_date)))::int AS avg_lead_time_days_hist
+        FROM (
+            SELECT
+                po.vendor,
+                po.po_id,
+                po.pid,
+                po.date::date AS po_date,
+                MIN(r.received_date::date) AS first_receive_date
+            FROM public.purchase_orders po
+            JOIN public.receivings r ON r.pid = po.pid AND r.supplier_id = po.supplier_id
+                AND r.received_date >= po.date
+                AND r.received_date <= po.date + INTERVAL '180 days'
+            WHERE po.status = 'done'
+              AND po.date >= CURRENT_DATE - INTERVAL '1 year'
+              AND po.vendor IS NOT NULL AND po.vendor <> ''
+            GROUP BY po.vendor, po.po_id, po.pid, po.date
+        ) po_first_receiving
+        GROUP BY vendor
    ),
    AllVendors AS (
        -- Ensure all vendors from products table are included
@@ -154,7 +159,11 @@ BEGIN
        vendor_metrics.on_order_units IS DISTINCT FROM EXCLUDED.on_order_units OR
        vendor_metrics.sales_30d IS DISTINCT FROM EXCLUDED.sales_30d OR
        vendor_metrics.revenue_30d IS DISTINCT FROM EXCLUDED.revenue_30d OR
-        vendor_metrics.lifetime_sales IS DISTINCT FROM EXCLUDED.lifetime_sales;
+        vendor_metrics.lifetime_sales IS DISTINCT FROM EXCLUDED.lifetime_sales OR
+        -- Cost revisions can change profit/cogs with unchanged sales/revenue
+        vendor_metrics.profit_30d IS DISTINCT FROM EXCLUDED.profit_30d OR
+        vendor_metrics.cogs_30d IS DISTINCT FROM EXCLUDED.cogs_30d OR
+        vendor_metrics.avg_lead_time_days IS DISTINCT FROM EXCLUDED.avg_lead_time_days;

    -- Update calculate_status
    INSERT INTO public.calculate_status (module_name, last_calculation_timestamp)
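
The `VendorPOAggregates` comment above says that joining each PO line to every later receiving overstated lead time. The sketch below works through that claim on made-up numbers. The dates and the helper names `leadTimeAllReceivings` and `leadTimeFirstReceiving` are assumptions for illustration. It is a JavaScript analogue of the two query shapes, not the SQL itself.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;
// Whole days between two ISO dates (parsed as UTC, so no DST effects).
const daysBetween = (from, to) => Math.round((Date.parse(to) - Date.parse(from)) / DAY_MS);

// Made-up data: one PO line and three later receivings of the same pid/supplier.
const poDate = '2026-01-05';
const receivingDates = ['2026-01-19', '2026-03-02', '2026-04-20'];

// Old shape: average over every receiving on or after the PO date.
function leadTimeAllReceivings(po, received) {
  const gaps = received.map(d => daysBetween(po, d)).filter(g => g >= 0);
  if (gaps.length === 0) return null;
  return gaps.reduce((sum, g) => sum + Math.max(1, g), 0) / gaps.length;
}

// New shape: only the first receiving within the window, floored at 1 day.
function leadTimeFirstReceiving(po, received, windowDays = 180) {
  const gaps = received.map(d => daysBetween(po, d)).filter(g => g >= 0 && g <= windowDays);
  if (gaps.length === 0) return null;
  return Math.max(1, Math.min(...gaps));
}

const allAvg = leadTimeAllReceivings(poDate, receivingDates);
const firstOnly = leadTimeFirstReceiving(poDate, receivingDates);
console.log(allAvg.toFixed(1), firstOnly); // 58.3 14
```

With these numbers, the later restocks of the same product pull the all-receivings average to about 58 days, while the first receiving arrived after 14.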
@@ -0,0 +1,69 @@
+-- Migration 003: Item-level promo discounts + business-day (America/Chicago) bucketing
+-- (applied 2026-06-11, together with the IMPORT_METRICS_FIX_PLAN.md batch)
+--
+-- PROBLEM 1 — dropped item-level promo discounts (~$26K / 30 days):
+--   orders.js applied item-level discounts from order_discount_items only when the
+--   parent order_discounts row had discount_amount_subtotal > 0:
+--     SUM(CASE WHEN COALESCE(md.discount_amount_subtotal, 0) > 0 THEN id.amount ELSE 0 END)
+--   In the PHP source, item-level promo discounts (rows with which = 2) are applied to the order
+--   total SEPARATELY from summary_discount_subtotal, so the gate zeroed essentially all
+--   of them (90d live check: of 10,010 type-10 promos, 8,070 had item rows but only 8 had
+--   discount_amount_subtotal > 0). Net effect: orders.discount understated, net_revenue /
+--   profit_30d / margin_30d overstated by ~10% of revenue, discounts_30d ~3x understated.
+--
+--   FIX (orders.js): fetch only order_discount_items rows with which = 2 (which = 1 rows
+--   are prices of free promo-added items, which = 3 are usage records), sum them
+--   unconditionally, and clamp each sale line's total discount to price * quantity.
+--   temp_main_discounts / temp_order_discounts staging removed (unused after the fix).
+--
+-- PROBLEM 2 — Europe/Berlin day bucketing:
+--   orders.date is timestamptz and the PG server timezone is Europe/Berlin, so ::date
+--   casts shifted every order placed after ~5 PM Central onto the NEXT calendar day in
+--   daily_product_snapshots (and skewed yesterday_sales, DOW patterns, forecast accuracy).
+--
+--   FIX (update_daily_snapshots.sql, backfill/rebuild_daily_snapshots.sql,
+--   update_product_metrics.sql): every day-bucketing cast is now
+--     (ts AT TIME ZONE 'America/Chicago')::date
+--   Supporting expression indexes:
+--     CREATE INDEX idx_orders_date_chicago ON orders (((date AT TIME ZONE 'America/Chicago')::date));
+--     CREATE INDEX idx_receivings_received_chicago ON receivings (((received_date AT TIME ZONE 'America/Chicago')::date));
+--
+-- ALSO IN THIS BATCH (same re-import/rebuild):
+--   * 'combined' order status (code 16) excluded from all sales aggregates, and a sweep
+--     in orders.js marks canceled/combined source orders (canceled = true) even though
+--     combine_orders zeroes date_placed (Fixes 4/5).
+--   * Returns now subtract COGS (returns_cogs) in daily snapshots (Fix 8).
+--   * return_rate_30d = returns / sales (Fix 9); gmroi_30d annualized ×12.17 (Fix 10).
+--   * stockout/avg-stock/service-level derived from stock_snapshots presence (Fix 7).
+--
+-- REQUIRED ACTION (cannot be fixed by SQL alone — discount values are baked into rows):
+--   1. Deploy updated orders.js + snapshot SQL files.
+--   2. Pause the recurring import:  touch inventory-server/.pause-auto-update
+--   3. FULL orders re-import:       INCREMENTAL_UPDATE=false node scripts/import-from-prod.js
+--   4. Rebuild snapshots:           psql -f scripts/metrics-new/backfill/rebuild_daily_snapshots.sql
+--   5. Recalculate metrics:         node scripts/calculate-metrics-new.js
+--   6. Resume:                      rm inventory-server/.pause-auto-update
+--
+-- EXPECTED AFTER RE-IMPORT: margin_30d down ~8-10 points (real, not a data incident),
+-- discounts_30d ~3x up, daily sales curves shifted onto correct business days.
+--
+-- VERIFICATION:
+-- (a) PG SUM(discount) over a 30-day window should approximate MySQL
+--     Σ summary_discount_subtotal (prorated) + Σ order_discount_items.amount (which=2)
+--     over the same orders.
+-- (b) Per-day units in daily_product_snapshots should match MySQL
+--     SELECT date_placed_onlydate, SUM(qty_ordered) FROM order_items JOIN _order ...
+--     WHERE order_status >= 20 GROUP BY 1   (MySQL stores Central days).
+-- (c) Migration 002 regression check (discount double-counting) still holds:
+SELECT
+    o.pid,
+    o.order_number,
+    o.price,
+    o.quantity,
+    o.discount,
+    (o.price * o.quantity - o.discount) as net_revenue
+FROM orders o
+WHERE o.pid IN (624756, 614513)
+ORDER BY o.date DESC
+LIMIT 10;
+-- Expected: discount 0 (or genuine promo amount) for regular sales; net close to gross.
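
PROBLEM 2 in the migration note above describes how a bare `::date` cast under a Europe/Berlin server timezone pushes evening Central orders onto the next day. The sketch below reproduces that shift in JavaScript with the standard `Intl.DateTimeFormat` API. The helper name `localDate` and the sample timestamps are assumptions for illustration. It mirrors the idea of `(ts AT TIME ZONE 'America/Chicago')::date` but is not the PostgreSQL implementation.

```javascript
// Returns the calendar date (YYYY-MM-DD) of an instant as seen in an IANA time zone.
function localDate(instant, timeZone) {
  const parts = new Intl.DateTimeFormat('en-US', {
    timeZone,
    year: 'numeric',
    month: '2-digit',
    day: '2-digit',
  }).formatToParts(instant);
  const get = type => parts.find(p => p.type === type).value;
  return `${get('year')}-${get('month')}-${get('day')}`;
}

// Sample order placed at 6:30 PM Central (CDT), which is 1:30 AM the next day in Berlin (CEST).
const eveningOrder = new Date('2026-06-10T23:30:00Z');
// Sample order placed at noon Central, which is 7:00 PM the same day in Berlin.
const noonOrder = new Date('2026-06-10T17:00:00Z');

const eveningChicago = localDate(eveningOrder, 'America/Chicago');
const eveningBerlin = localDate(eveningOrder, 'Europe/Berlin');
const noonChicago = localDate(noonOrder, 'America/Chicago');
const noonBerlin = localDate(noonOrder, 'Europe/Berlin');

console.log(eveningChicago, eveningBerlin); // 2026-06-10 2026-06-11
console.log(noonChicago, noonBerlin);       // 2026-06-10 2026-06-10
```

Only the evening order changes day between the two zones, which matches the note's description of orders after about 5 PM Central landing on the next calendar day.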
@@ -0,0 +1,9 @@
+-- Migration 004: Map order status codes 45 and 67 to text
+--
+-- Follow-up to 001_map_order_statuses.sql: the orders.js orderStatusMap lacked
+-- codes 45 (payment_pending) and 67 (remote_send), so any such orders were imported
+-- as numeric strings '45' / '67'. orders.js now maps them; this updates any
+-- existing rows (a full re-import also fixes them — safe to run either way).
+
+UPDATE orders SET status = 'payment_pending' WHERE status = '45';
+UPDATE orders SET status = 'remote_send'     WHERE status = '67';
@@ -39,50 +39,68 @@ BEGIN
    --   2. Stale detection: existing snapshots where aggregates don't match source data
    --      (catches backfilled imports that arrived after snapshot was calculated)
    --   3. Recent recheck: last N days always reprocessed (picks up new orders, corrections)
+    -- NOTE: all order/receiving timestamps are bucketed into business days using
+    -- America/Chicago. The PG server timezone is Europe/Berlin, so a bare ::date
+    -- cast would shift every evening order onto the next day.
    FOR _target_date IN
        SELECT d FROM (
            -- Gap fill: find dates with activity but missing snapshots
            SELECT activity_dates.d
            FROM (
-                SELECT DISTINCT date::date AS d FROM public.orders
-                WHERE date::date >= _backfill_start AND date::date < CURRENT_DATE - _recent_recheck_days
+                SELECT DISTINCT (date AT TIME ZONE 'America/Chicago')::date AS d FROM public.orders
+                WHERE (date AT TIME ZONE 'America/Chicago')::date >= _backfill_start
+                  AND (date AT TIME ZONE 'America/Chicago')::date < CURRENT_DATE - _recent_recheck_days
                UNION
-                SELECT DISTINCT received_date::date AS d FROM public.receivings
-                WHERE received_date::date >= _backfill_start AND received_date::date < CURRENT_DATE - _recent_recheck_days
+                SELECT DISTINCT (received_date AT TIME ZONE 'America/Chicago')::date AS d FROM public.receivings
+                WHERE (received_date AT TIME ZONE 'America/Chicago')::date >= _backfill_start
+                  AND (received_date AT TIME ZONE 'America/Chicago')::date < CURRENT_DATE - _recent_recheck_days
            ) activity_dates
            WHERE NOT EXISTS (
                SELECT 1 FROM public.daily_product_snapshots dps WHERE dps.snapshot_date = activity_dates.d
            )
            UNION
            -- Stale detection: compare snapshot aggregates against source tables
+            -- (must bucket identically to SalesData/ReceivingData or every day
+            -- looks permanently stale)
            SELECT snap_agg.snapshot_date AS d
            FROM (
                SELECT snapshot_date,
                       COALESCE(SUM(units_received), 0)::bigint AS snap_received,
-                       COALESCE(SUM(units_sold), 0)::bigint AS snap_sold
+                       COALESCE(SUM(units_sold), 0)::bigint AS snap_sold,
+                       ROUND(COALESCE(SUM(net_revenue), 0), 2) AS snap_net_revenue
                FROM public.daily_product_snapshots
                WHERE snapshot_date >= _backfill_start
                  AND snapshot_date < CURRENT_DATE - _recent_recheck_days
                GROUP BY snapshot_date
            ) snap_agg
            LEFT JOIN (
-                SELECT received_date::date AS d, SUM(qty_each)::bigint AS actual_received
+                SELECT (received_date AT TIME ZONE 'America/Chicago')::date AS d, SUM(qty_each)::bigint AS actual_received
                FROM public.receivings
-                WHERE received_date::date >= _backfill_start
-                  AND received_date::date < CURRENT_DATE - _recent_recheck_days
-                GROUP BY received_date::date
+                WHERE (received_date AT TIME ZONE 'America/Chicago')::date >= _backfill_start
+                  AND (received_date AT TIME ZONE 'America/Chicago')::date < CURRENT_DATE - _recent_recheck_days
+                GROUP BY 1
            ) recv_agg ON snap_agg.snapshot_date = recv_agg.d
            LEFT JOIN (
-                SELECT date::date AS d,
-                       SUM(CASE WHEN quantity > 0 AND COALESCE(status, 'pending') NOT IN ('canceled', 'returned')
-                                THEN quantity ELSE 0 END)::bigint AS actual_sold
+                SELECT (date AT TIME ZONE 'America/Chicago')::date AS d,
+                       SUM(CASE WHEN quantity > 0 AND COALESCE(status, 'pending') NOT IN ('canceled', 'returned', 'combined')
+                                THEN quantity ELSE 0 END)::bigint AS actual_sold,
+                       -- Mirrors SalesData's net_revenue (gross - discounts - returns)
+                       -- so price/discount corrections older than the recheck window
+                       -- get repaired, not just unit-count changes.
+                       ROUND(
+                           SUM(CASE WHEN quantity > 0 AND COALESCE(status, 'pending') NOT IN ('canceled', 'returned', 'combined')
+                                    THEN price * quantity - discount ELSE 0 END)
+                           - SUM(CASE WHEN quantity < 0 OR COALESCE(status, 'pending') = 'returned'
+                                      THEN price * ABS(quantity) ELSE 0 END)
+                       , 2) AS actual_net_revenue
                FROM public.orders
-                WHERE date::date >= _backfill_start
-                  AND date::date < CURRENT_DATE - _recent_recheck_days
-                GROUP BY date::date
+                WHERE (date AT TIME ZONE 'America/Chicago')::date >= _backfill_start
+                  AND (date AT TIME ZONE 'America/Chicago')::date < CURRENT_DATE - _recent_recheck_days
+                GROUP BY 1
            ) orders_agg ON snap_agg.snapshot_date = orders_agg.d
            WHERE snap_agg.snap_received != COALESCE(recv_agg.actual_received, 0)
               OR snap_agg.snap_sold != COALESCE(orders_agg.actual_sold, 0)
+               OR snap_agg.snap_net_revenue != ROUND(COALESCE(orders_agg.actual_net_revenue, 0), 2)
            UNION
            -- Recent days: always reprocess
            SELECT d::date
@@ -116,26 +134,36 @@ BEGIN
                p.sku,
                -- Track number of orders to ensure we have real data
                COUNT(o.id) as order_count,
-                -- Aggregate Sales (Quantity > 0, Status not Canceled/Returned)
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.quantity ELSE 0 END), 0) AS units_sold,
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.price * o.quantity ELSE 0 END), 0.00) AS gross_revenue_unadjusted, -- Before discount
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.discount ELSE 0 END), 0.00) AS discounts,
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN 
+                -- Aggregate Sales (Quantity > 0, Status not Canceled/Returned/Combined)
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN o.quantity ELSE 0 END), 0) AS units_sold,
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN o.price * o.quantity ELSE 0 END), 0.00) AS gross_revenue_unadjusted, -- Before discount
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN o.discount ELSE 0 END), 0.00) AS discounts,
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN
                    COALESCE(
                        o.costeach,  -- First use order-specific cost if available
-                        get_weighted_avg_cost(p.pid, o.date::date),  -- Then use weighted average cost
+                        get_weighted_avg_cost(p.pid, (o.date AT TIME ZONE 'America/Chicago')::date),  -- Then use weighted average cost
                        p.cost_price  -- Final fallback to current cost
-                    ) * o.quantity 
+                    ) * o.quantity
                ELSE 0 END), 0.00) AS cogs,
-                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN p.regular_price * o.quantity ELSE 0 END), 0.00) AS gross_regular_revenue, -- Use current regular price for simplicity here
+                COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned', 'combined') THEN p.regular_price * o.quantity ELSE 0 END), 0.00) AS gross_regular_revenue, -- Use current regular price for simplicity here

                -- Aggregate Returns (Quantity < 0 or Status = Returned)
                COALESCE(SUM(CASE WHEN o.quantity < 0 OR COALESCE(o.status, 'pending') = 'returned' THEN ABS(o.quantity) ELSE 0 END), 0) AS units_returned,
-                COALESCE(SUM(CASE WHEN o.quantity < 0 OR COALESCE(o.status, 'pending') = 'returned' THEN o.price * ABS(o.quantity) ELSE 0 END), 0.00) AS returns_revenue
+                COALESCE(SUM(CASE WHEN o.quantity < 0 OR COALESCE(o.status, 'pending') = 'returned' THEN o.price * ABS(o.quantity) ELSE 0 END), 0.00) AS returns_revenue,
+                -- Returns COGS: returned goods come back into stock, so their cost
+                -- offsets the sales COGS for the day (margin would otherwise be
+                -- understated in return-heavy periods).
+                COALESCE(SUM(CASE WHEN o.quantity < 0 OR COALESCE(o.status, 'pending') = 'returned' THEN
+                    COALESCE(
+                        o.costeach,
+                        get_weighted_avg_cost(p.pid, (o.date AT TIME ZONE 'America/Chicago')::date),
+                        p.cost_price
+                    ) * ABS(o.quantity)
+                ELSE 0 END), 0.00) AS returns_cogs
            FROM public.products p -- Only products with orders on the target date (INNER JOIN below)
            JOIN public.orders o -- Changed to INNER JOIN to only process products with orders
                ON p.pid = o.pid
-               AND o.date::date = _target_date -- Cast to date to ensure compatibility regardless of original type
+               AND (o.date AT TIME ZONE 'America/Chicago')::date = _target_date -- Bucket by business day (Central)
            GROUP BY p.pid, p.sku
            -- No HAVING clause here - we always want to include all orders
        ),
@@ -149,7 +177,7 @@ BEGIN
                -- Calculate the cost received (qty * cost)
                SUM(r.qty_each * r.cost_each) AS cost_received
            FROM public.receivings r
-            WHERE r.received_date::date = _target_date
+            WHERE (r.received_date AT TIME ZONE 'America/Chicago')::date = _target_date
              -- Optional: Filter out canceled receivings if needed
              -- AND r.status <> 'canceled'
            GROUP BY r.pid
@@ -217,9 +245,9 @@ BEGIN
            COALESCE(sd.discounts, 0.00),
            COALESCE(sd.returns_revenue, 0.00),
            COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00) AS net_revenue,
-            COALESCE(sd.cogs, 0.00),
+            COALESCE(sd.cogs, 0.00) - COALESCE(sd.returns_cogs, 0.00) AS cogs, -- net of returned goods' cost
            COALESCE(sd.gross_regular_revenue, 0.00),
-            (COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00)) - COALESCE(sd.cogs, 0.00) AS profit,
+            (COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00)) - (COALESCE(sd.cogs, 0.00) - COALESCE(sd.returns_cogs, 0.00)) AS profit,
            -- Receiving Metrics (From ReceivingData)
            COALESCE(rd.units_received, 0),
            COALESCE(rd.cost_received, 0.00),
@@ -131,18 +131,19 @@ BEGIN
    HistoricalDates AS (
        -- Note: Calculating these MIN/MAX values hourly can be slow on large tables.
        -- Consider calculating periodically or storing on products if import can populate them.
+        -- Dates are bucketed in business time (America/Chicago) to match daily snapshots.
        SELECT
            p.pid,
-            MIN(o.date)::date AS date_first_sold,
-            MAX(o.date)::date AS max_order_date, -- Use MAX for potential recalc of date_last_sold
-            
+            MIN((o.date AT TIME ZONE 'America/Chicago'))::date AS date_first_sold,
+            MAX((o.date AT TIME ZONE 'America/Chicago'))::date AS max_order_date, -- Use MAX for potential recalc of date_last_sold
+
            -- For first received, use the new receivings table
-            MIN(r.received_date)::date AS date_first_received_calc,
-            
+            MIN((r.received_date AT TIME ZONE 'America/Chicago'))::date AS date_first_received_calc,
+
            -- For last received, use the new receivings table
-            MAX(r.received_date)::date AS date_last_received_calc
+            MAX((r.received_date AT TIME ZONE 'America/Chicago'))::date AS date_last_received_calc
        FROM public.products p
-        LEFT JOIN public.orders o ON p.pid = o.pid AND o.quantity > 0 AND o.status NOT IN ('canceled', 'returned')
+        LEFT JOIN public.orders o ON p.pid = o.pid AND o.quantity > 0 AND o.status NOT IN ('canceled', 'returned', 'combined')
        LEFT JOIN public.receivings r ON p.pid = r.pid
        GROUP BY p.pid
    ),
@@ -174,17 +175,19 @@ BEGIN
            SUM(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN discounts ELSE 0 END) AS discounts_30d,
            SUM(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN gross_revenue ELSE 0 END) AS gross_revenue_30d,
            SUM(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN gross_regular_revenue ELSE 0 END) AS gross_regular_revenue_30d,
-            SUM(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date AND stockout_flag THEN 1 ELSE 0 END) AS stockout_days_30d,
-            
+            -- NOTE: stockout days and avg stock units/cost now come from StockCoverage
+            -- (stock_snapshots has full daily coverage; these activity-only snapshots
+            -- only exist on days with sales/receivings, which made stockout_days ~0
+            -- exactly when stockouts mattered and biased stock averages upward).
+
            SUM(CASE WHEN snapshot_date >= _current_date - INTERVAL '364 days' AND snapshot_date <= _current_date THEN units_sold ELSE 0 END) AS sales_365d,
            SUM(CASE WHEN snapshot_date >= _current_date - INTERVAL '364 days' AND snapshot_date <= _current_date THEN net_revenue ELSE 0 END) AS revenue_365d,
            
            SUM(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN units_received ELSE 0 END) AS received_qty_30d,
            SUM(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN cost_received ELSE 0 END) AS received_cost_30d,

-            -- Averages for stock levels - only include dates within the specified period
-            AVG(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN eod_stock_quantity END) AS avg_stock_units_30d,
-            AVG(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN eod_stock_cost END) AS avg_stock_cost_30d,
+            -- Retail/gross stock averages stay on activity snapshots: stock_snapshots
+            -- has no eod_stock_retail equivalent (cost-only source table).
            AVG(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN eod_stock_retail END) AS avg_stock_retail_30d,
            AVG(CASE WHEN snapshot_date >= _current_date - INTERVAL '29 days' AND snapshot_date <= _current_date THEN eod_stock_gross END) AS avg_stock_gross_30d,

@@ -240,16 +243,89 @@ BEGIN
      LEFT JOIN public.settings_vendor sv ON p.vendor = sv.vendor
    ),
    LifetimeRevenue AS (
-        -- Calculate actual revenue from orders table
+        -- Calculate actual revenue from orders table. Negative-quantity rows
+        -- (returns) are included so lifetime revenue nets out returns;
+        -- price * quantity is already signed.
        SELECT
            o.pid,
            SUM(o.price * o.quantity - COALESCE(o.discount, 0)) AS lifetime_revenue_from_orders,
            SUM(o.quantity) AS lifetime_units_from_orders
        FROM public.orders o
-        WHERE o.status NOT IN ('canceled', 'returned')
-          AND o.quantity > 0
+        WHERE o.status NOT IN ('canceled', 'returned', 'combined')
        GROUP BY o.pid
    ),
+    -- Full-coverage stock presence from stock_snapshots (MySQL snap_product_value).
+    -- That source only writes rows for products WITH stock on hand, so a product
+    -- missing from a day the cron ran was out of stock that day. Days before the
+    -- product was created are not counted against it.
+    StockCoverage AS (
+        SELECT
+            pid,
+            eligible_days_30d,
+            days_in_stock_30d,
+            CASE WHEN eligible_days_30d > 0
+                 THEN GREATEST(0, eligible_days_30d - days_in_stock_30d)
+            END AS stockout_days_30d,
+            -- Absent days count as zero stock (the old activity-only average was
+            -- biased toward in-stock days)
+            CASE WHEN eligible_days_30d > 0
+                 THEN sum_qty::numeric / eligible_days_30d
+            END AS avg_stock_units_30d,
+            CASE WHEN eligible_days_30d > 0
+                 THEN sum_value::numeric / eligible_days_30d
+            END AS avg_stock_cost_30d
+        FROM (
+            SELECT
+                p.pid,
+                LEAST(
+                    cal.covered_days,
+                    CASE WHEN p.created_at IS NULL THEN cal.covered_days
+                         ELSE GREATEST(0, (_current_date - GREATEST(p.created_at::date, _current_date - 29) + 1))
+                    END
+                ) AS eligible_days_30d,
+                COALESCE(pres.days_in_stock, 0) AS days_in_stock_30d,
+                COALESCE(pres.sum_qty, 0) AS sum_qty,
+                COALESCE(pres.sum_value, 0) AS sum_value
+            FROM public.products p
+            CROSS JOIN (
+                SELECT COUNT(DISTINCT snapshot_date) AS covered_days
+                FROM public.stock_snapshots
+                WHERE snapshot_date >= _current_date - INTERVAL '29 days'
+                  AND snapshot_date <= _current_date
+            ) cal
+            LEFT JOIN (
+                SELECT pid,
+                       COUNT(*) AS days_in_stock,
+                       SUM(stock_quantity) AS sum_qty,
+                       SUM(stock_value) AS sum_value
+                FROM public.stock_snapshots
+                WHERE snapshot_date >= _current_date - INTERVAL '29 days'
+                  AND snapshot_date <= _current_date
+                GROUP BY pid
+            ) pres ON pres.pid = p.pid
+        ) base
+    ),
+    -- Sales that happened on out-of-stock days (per the stock snapshot), for
+    -- lost-sales incidents and the fill-rate heuristic. Restricted to days the
+    -- stock cron actually ran so e.g. today's sales aren't misread as stockouts.
+    SalesDayStock AS (
+        SELECT
+            dps.pid,
+            SUM(dps.units_sold) AS units_sold_covered,
+            COUNT(*) FILTER (WHERE dps.units_sold > 0 AND ss.pid IS NULL) AS lost_sales_incidents_30d,
+            SUM(CASE WHEN ss.pid IS NULL THEN dps.units_sold ELSE 0 END) AS units_sold_on_stockout_days
+        FROM public.daily_product_snapshots dps
+        JOIN (
+            SELECT DISTINCT snapshot_date FROM public.stock_snapshots
+            WHERE snapshot_date >= _current_date - INTERVAL '29 days'
+              AND snapshot_date <= _current_date
+        ) cal ON cal.snapshot_date = dps.snapshot_date
+        LEFT JOIN public.stock_snapshots ss
+            ON ss.pid = dps.pid AND ss.snapshot_date = dps.snapshot_date
+        WHERE dps.snapshot_date >= _current_date - INTERVAL '29 days'
+          AND dps.snapshot_date <= _current_date
+        GROUP BY dps.pid
+    ),
    PreviousPeriodMetrics AS (
        -- Calculate metrics for previous 30-day period for growth comparison
        SELECT
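Review note: the `StockCoverage` arithmetic is easy to misread, so here is a small per-product sketch of the same rules in JavaScript. It assumes dates arrive as `YYYY-MM-DD` strings and that `snapshotRows` holds one entry per day the product appears in `stock_snapshots` within the window; `stockCoverage` and `daysBetween` are illustrative names only.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days from a to b, both 'YYYY-MM-DD', computed in UTC.
function daysBetween(a, b) {
  return Math.round((Date.parse(`${b}T00:00:00Z`) - Date.parse(`${a}T00:00:00Z`)) / MS_PER_DAY);
}

// Mirrors the StockCoverage rules for one product over the trailing 30 days.
// coveredDays: distinct days the stock cron wrote any snapshot in the window.
// snapshotRows: [{ qty }] for each day this product had stock on hand.
function stockCoverage({ currentDate, createdAt, coveredDays, snapshotRows }) {
  // Days before the product was created are not counted against it.
  const sinceCreation = createdAt == null
    ? coveredDays
    : Math.max(0, Math.min(daysBetween(createdAt, currentDate), 29) + 1);
  const eligibleDays = Math.min(coveredDays, sinceCreation);
  const daysInStock = snapshotRows.length;
  if (eligibleDays <= 0) {
    return { eligibleDays, daysInStock, stockoutDays: null, avgStockUnits: null };
  }
  const sumQty = snapshotRows.reduce((acc, row) => acc + row.qty, 0);
  return {
    eligibleDays,
    daysInStock,
    // A covered day with no row for the product is a stockout day.
    stockoutDays: Math.max(0, eligibleDays - daysInStock),
    // Absent days contribute zero stock to the average.
    avgStockUnits: sumQty / eligibleDays,
  };
}

const sample = stockCoverage({
  currentDate: '2026-06-10',
  createdAt: '2026-06-01',
  coveredDays: 30,
  snapshotRows: Array.from({ length: 6 }, () => ({ qty: 10 })),
});
console.log(sample);
```

In the sample, a product created 9 days before the run date has 10 eligible days; with 6 in-stock rows it gets 4 stockout days and an average of 6 units, rather than the 10 units an in-stock-days-only average would report.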
@@ -302,24 +378,43 @@ BEGIN
        GROUP BY pid
    ),
    ServiceLevels AS (
-        -- Calculate service level and fill rate metrics
+        -- Service level and fill rate built on full-coverage stock data
+        -- (StockCoverage / SalesDayStock) instead of activity-only snapshots.
        SELECT
-            pid,
-            COUNT(*) FILTER (WHERE stockout_flag = true) AS stockout_incidents_30d,
-            COUNT(*) FILTER (WHERE stockout_flag = true AND units_sold > 0) AS lost_sales_incidents_30d,
-            -- Service level: percentage of days without stockouts
-            (1.0 - (COUNT(*) FILTER (WHERE stockout_flag = true)::NUMERIC / NULLIF(COUNT(*), 0))) * 100 AS service_level_30d,
-            -- Fill rate: units sold / (units sold + potential lost sales)
-            CASE 
-                WHEN SUM(units_sold) > 0 THEN
-                    (SUM(units_sold)::NUMERIC / 
-                     (SUM(units_sold) + SUM(CASE WHEN stockout_flag THEN units_sold * 0.2 ELSE 0 END))) * 100
+            sc.pid,
+            sc.stockout_days_30d AS stockout_incidents_30d,
+            sds.lost_sales_incidents_30d,
+            -- Service level: percentage of covered days the product was in stock
+            CASE WHEN sc.eligible_days_30d > 0 THEN
+                (1.0 - (sc.stockout_days_30d::NUMERIC / sc.eligible_days_30d)) * 100
+            END AS service_level_30d,
+            -- Fill rate: units sold / (units sold + potential lost sales).
+            -- The 0.2 lost-sales factor is an arbitrary heuristic: each unit sold on
+            -- an out-of-stock day is assumed to represent 20% additional missed demand.
+            CASE
+                WHEN COALESCE(sds.units_sold_covered, 0) > 0 THEN
+                    (sds.units_sold_covered::NUMERIC /
+                     (sds.units_sold_covered + COALESCE(sds.units_sold_on_stockout_days, 0) * 0.2)) * 100
                ELSE NULL
            END AS fill_rate_30d
-        FROM public.daily_product_snapshots
-        WHERE snapshot_date >= _current_date - INTERVAL '29 days' 
-          AND snapshot_date <= _current_date
-        GROUP BY pid
+        FROM StockCoverage sc
+        LEFT JOIN SalesDayStock sds ON sds.pid = sc.pid
+    ),
+    ProductVelocity AS (
+        -- Single source for sales velocity so every replenishment/cover column stays
+        -- consistent. NULL when the product is excluded from forecasting: excluded
+        -- products now still get a product_metrics row (they used to be filtered out
+        -- entirely and vanished from brand/vendor/category rollups), but their
+        -- forecast-derived columns go NULL / zero.
+        SELECT
+            ci.pid,
+            CASE WHEN COALESCE(s.exclude_forecast, FALSE) THEN NULL
+                 ELSE calculate_sales_velocity(sa.sales_30d::int, COALESCE(sc.stockout_days_30d, 0)::int)
+            END AS daily
+        FROM CurrentInfo ci
+        LEFT JOIN SnapshotAggregates sa ON ci.pid = sa.pid
+        LEFT JOIN StockCoverage sc ON ci.pid = sc.pid
+        LEFT JOIN Settings s ON ci.pid = s.pid
    ),
    SeasonalityAnalysis AS (
        -- Set-based seasonality detection (replaces per-product function calls)
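Review note: a quick numeric sketch of the new `ServiceLevels` formulas, to make the effect of the 0.2 heuristic concrete. The field names mirror the CTE columns; the function name and the sample numbers are made up for illustration.

```javascript
// Heuristic copied from the diff: each unit sold on an out-of-stock day is
// assumed to stand for 20% additional missed demand. It is not a measured value.
const LOST_SALES_FACTOR = 0.2;

function serviceLevels({ eligibleDays, stockoutDays, unitsSoldCovered, unitsSoldOnStockoutDays }) {
  // Service level: share of eligible days the product was in stock.
  const serviceLevel = eligibleDays > 0
    ? (1 - stockoutDays / eligibleDays) * 100
    : null;
  // Fill rate: units sold / (units sold + estimated lost sales).
  const fillRate = unitsSoldCovered > 0
    ? (unitsSoldCovered / (unitsSoldCovered + unitsSoldOnStockoutDays * LOST_SALES_FACTOR)) * 100
    : null;
  return { serviceLevel, fillRate };
}

// 6 stockout days out of 30 eligible; 25 of 100 units sold on stockout days.
console.log(serviceLevels({
  eligibleDays: 30,
  stockoutDays: 6,
  unitsSoldCovered: 100,
  unitsSoldOnStockoutDays: 25,
}));
```

For these inputs the service level is 80% and the fill rate is roughly 95.2%. Because lost demand is estimated only from units that did sell on stockout days, a product that sold nothing while out of stock still reports a 100% fill rate, which is worth keeping in mind when reading the column.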
@@ -424,8 +519,8 @@ BEGIN
        END AS age_days,
        sa.sales_7d, sa.revenue_7d, sa.sales_14d, sa.revenue_14d, sa.sales_30d, sa.revenue_30d, sa.cogs_30d, sa.profit_30d,
        sa.returns_units_30d, sa.returns_revenue_30d, sa.discounts_30d, sa.gross_revenue_30d, sa.gross_regular_revenue_30d,
-        sa.stockout_days_30d, sa.sales_365d, sa.revenue_365d,
-        sa.avg_stock_units_30d, sa.avg_stock_cost_30d, sa.avg_stock_retail_30d, sa.avg_stock_gross_30d,
+        sc.stockout_days_30d, sa.sales_365d, sa.revenue_365d,
+        sc.avg_stock_units_30d, sc.avg_stock_cost_30d, sa.avg_stock_retail_30d, sa.avg_stock_gross_30d,
        sa.received_qty_30d, sa.received_cost_30d,
        -- Use total_sold from products table as the source of truth for lifetime sales
        -- This includes all historical data from the production database
@@ -463,66 +558,68 @@ BEGIN
        sa.sales_30d AS avg_sales_per_month_30d, -- Using 30d sales as proxy for month
        (sa.profit_30d / NULLIF(sa.revenue_30d, 0)) * 100 AS margin_30d,
        (sa.profit_30d / NULLIF(sa.cogs_30d, 0)) * 100 AS markup_30d,
-        sa.profit_30d / NULLIF(sa.avg_stock_cost_30d, 0) AS gmroi_30d,
-        sa.sales_30d / NULLIF(sa.avg_stock_units_30d, 0) AS stockturn_30d,
-        (sa.returns_units_30d / NULLIF(sa.sales_30d + sa.returns_units_30d, 0)) * 100 AS return_rate_30d,
+        -- Annualized GMROI (30-day profit extrapolated to a year: × 365/30).
+        -- Conventional benchmark for healthy retail is ≥ 2-3 on this scale.
+        (sa.profit_30d / NULLIF(sc.avg_stock_cost_30d, 0)) * 12.17 AS gmroi_30d,
+        sa.sales_30d / NULLIF(sc.avg_stock_units_30d, 0) AS stockturn_30d,
+        -- Industry-standard definition: returns / sales (not returns / (sales+returns))
+        (sa.returns_units_30d / NULLIF(sa.sales_30d, 0)) * 100 AS return_rate_30d,
        (sa.discounts_30d / NULLIF(sa.gross_revenue_30d, 0)) * 100 AS discount_rate_30d,
-        (sa.stockout_days_30d / 30.0) * 100 AS stockout_rate_30d,
+        (sc.stockout_days_30d::numeric / NULLIF(sc.eligible_days_30d, 0)) * 100 AS stockout_rate_30d,
        sa.gross_regular_revenue_30d - sa.gross_revenue_30d AS markdown_30d,
        ((sa.gross_regular_revenue_30d - sa.gross_revenue_30d) / NULLIF(sa.gross_regular_revenue_30d, 0)) * 100 AS markdown_rate_30d,
        -- Sell-through rate: Industry standard is Units Sold / (Beginning Inventory + Units Received)
        -- Uses actual snapshot from 30 days ago as beginning stock, falls back to avg_stock_units_30d
        (sa.sales_30d / NULLIF(
-            COALESCE(bs.beginning_stock_30d, sa.avg_stock_units_30d::int, 0) + sa.received_qty_30d,
+            COALESCE(bs.beginning_stock_30d, sc.avg_stock_units_30d::int, 0) + sa.received_qty_30d,
            0
        )) * 100 AS sell_through_30d,

-        -- Forecasting intermediate values
-        -- Use the calculate_sales_velocity function instead of repetitive calculation
-        calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) AS sales_velocity_daily,
+        -- Forecasting intermediate values (ProductVelocity; NULL when excluded from forecast)
+        vel.daily AS sales_velocity_daily,
        s.effective_lead_time AS config_lead_time,
        s.effective_days_of_stock AS config_days_of_stock,
        s.effective_safety_stock AS config_safety_stock,
        (s.effective_lead_time + s.effective_days_of_stock) AS planning_period_days,
        
-        calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time AS lead_time_forecast_units,
+        vel.daily * s.effective_lead_time AS lead_time_forecast_units,
        
-        calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock AS days_of_stock_forecast_units,
+        vel.daily * s.effective_days_of_stock AS days_of_stock_forecast_units,
        
-        calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * (s.effective_lead_time + s.effective_days_of_stock) AS planning_period_forecast_units,
+        vel.daily * (s.effective_lead_time + s.effective_days_of_stock) AS planning_period_forecast_units,
        
-        (ci.current_stock + COALESCE(ooi.on_order_qty, 0) - (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time)) AS lead_time_closing_stock,
+        (ci.current_stock + COALESCE(ooi.on_order_qty, 0) - (vel.daily * s.effective_lead_time)) AS lead_time_closing_stock,
        
-        ((ci.current_stock + COALESCE(ooi.on_order_qty, 0) - (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time))) - (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock) AS days_of_stock_closing_stock,
+        ((ci.current_stock + COALESCE(ooi.on_order_qty, 0) - (vel.daily * s.effective_lead_time))) - (vel.daily * s.effective_days_of_stock) AS days_of_stock_closing_stock,
        
-        ((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0) AS replenishment_needed_raw,
+        ((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0) AS replenishment_needed_raw,

        -- Final Forecasting / Replenishment Metrics
-        CEILING(GREATEST(0, (((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int AS replenishment_units,
-        (CEILING(GREATEST(0, (((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int) * ci.current_effective_cost AS replenishment_cost,
-        (CEILING(GREATEST(0, (((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int) * ci.current_price AS replenishment_retail,
-        (CEILING(GREATEST(0, (((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int) * (ci.current_price - ci.current_effective_cost) AS replenishment_profit,
+        CEILING(GREATEST(0, (((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int AS replenishment_units,
+        (CEILING(GREATEST(0, (((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int) * ci.current_effective_cost AS replenishment_cost,
+        (CEILING(GREATEST(0, (((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int) * ci.current_price AS replenishment_retail,
+        (CEILING(GREATEST(0, (((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int) * (ci.current_price - ci.current_effective_cost) AS replenishment_profit,

        -- To Order (Apply MOQ/UOM logic here if needed, otherwise equals replenishment)
-        CEILING(GREATEST(0, (((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int AS to_order_units,
+        CEILING(GREATEST(0, (((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)) + s.effective_safety_stock - ci.current_stock - COALESCE(ooi.on_order_qty, 0))))::int AS to_order_units,

-        GREATEST(0, - (ci.current_stock + COALESCE(ooi.on_order_qty, 0) - (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time))) AS forecast_lost_sales_units,
-        GREATEST(0, - (ci.current_stock + COALESCE(ooi.on_order_qty, 0) - (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time))) * ci.current_price AS forecast_lost_revenue,
+        GREATEST(0, - (ci.current_stock + COALESCE(ooi.on_order_qty, 0) - (vel.daily * s.effective_lead_time))) AS forecast_lost_sales_units,
+        GREATEST(0, - (ci.current_stock + COALESCE(ooi.on_order_qty, 0) - (vel.daily * s.effective_lead_time))) * ci.current_price AS forecast_lost_revenue,

-        ci.current_stock / NULLIF(calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int), 0) AS stock_cover_in_days,
-        COALESCE(ooi.on_order_qty, 0) / NULLIF(calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int), 0) AS po_cover_in_days,
-        (ci.current_stock + COALESCE(ooi.on_order_qty, 0)) / NULLIF(calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int), 0) AS sells_out_in_days,
+        ci.current_stock / NULLIF(vel.daily, 0) AS stock_cover_in_days,
+        COALESCE(ooi.on_order_qty, 0) / NULLIF(vel.daily, 0) AS po_cover_in_days,
+        (ci.current_stock + COALESCE(ooi.on_order_qty, 0)) / NULLIF(vel.daily, 0) AS sells_out_in_days,

        -- Replenish Date: Date when stock is projected to hit safety stock, minus lead time
        CASE
-            WHEN calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) > 0
-            THEN _current_date + FLOOR(GREATEST(0, ci.current_stock - s.effective_safety_stock) / calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int))::int - s.effective_lead_time
+            WHEN vel.daily > 0
+            THEN _current_date + FLOOR(GREATEST(0, ci.current_stock - s.effective_safety_stock) / vel.daily)::int - s.effective_lead_time
            ELSE NULL
        END AS replenish_date,

-        GREATEST(0, ci.current_stock - s.effective_safety_stock - ((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)))::int AS overstocked_units,
-        (GREATEST(0, ci.current_stock - s.effective_safety_stock - ((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)))) * ci.current_effective_cost AS overstocked_cost,
-        (GREATEST(0, ci.current_stock - s.effective_safety_stock - ((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock)))) * ci.current_price AS overstocked_retail,
+        GREATEST(0, ci.current_stock - s.effective_safety_stock - ((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)))::int AS overstocked_units,
+        (GREATEST(0, ci.current_stock - s.effective_safety_stock - ((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)))) * ci.current_effective_cost AS overstocked_cost,
+        (GREATEST(0, ci.current_stock - s.effective_safety_stock - ((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock)))) * ci.current_price AS overstocked_retail,

        -- Old Stock Flag
        (ci.created_at::date < _current_date - INTERVAL '60 day') AND
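Review note: since `vel.daily` can now be NULL for excluded products, it helps to spell out what the downstream expressions do with it. The JavaScript sketch below mirrors `replenishment_units` and `stock_cover_in_days` for a single product. It assumes the `effective_*` settings are non-null, and the function names are hypothetical.

```javascript
// Mirrors the replenishment_units expression for one product.
// velocityDaily is null when the product is excluded from forecasting.
function replenishmentUnits({ velocityDaily, leadTime, daysOfStock, safetyStock, currentStock, onOrderQty = 0 }) {
  if (velocityDaily == null) {
    // PostgreSQL's GREATEST ignores NULL arguments, so GREATEST(0, NULL) is 0
    // and the column comes out as 0 rather than NULL.
    return 0;
  }
  const raw = velocityDaily * (leadTime + daysOfStock) + safetyStock - currentStock - onOrderQty;
  return Math.ceil(Math.max(0, raw));
}

// Mirrors ci.current_stock / NULLIF(vel.daily, 0).
function stockCoverInDays(currentStock, velocityDaily) {
  // A zero or NULL velocity yields NULL cover instead of a division error.
  return velocityDaily ? currentStock / velocityDaily : null;
}

const product = { leadTime: 14, daysOfStock: 30, safetyStock: 5, currentStock: 20, onOrderQty: 10 };
console.log(replenishmentUnits({ ...product, velocityDaily: 1.5 }));  // forecast product
console.log(replenishmentUnits({ ...product, velocityDaily: null })); // excluded product
console.log(stockCoverInDays(30, 1.5), stockCoverInDays(30, null));
```

This matches the "NULL / zero" wording in the `ProductVelocity` comment: the cover columns go NULL while the `GREATEST(0, ...)`-wrapped quantities fall to zero.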
@@ -542,18 +639,18 @@ BEGIN
            ELSE
                CASE
                    -- Check for overstock first 
-                    WHEN GREATEST(0, ci.current_stock - s.effective_safety_stock - ((calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_lead_time) + (calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int) * s.effective_days_of_stock))) > 0 THEN 'Overstock'
+                    WHEN GREATEST(0, ci.current_stock - s.effective_safety_stock - ((vel.daily * s.effective_lead_time) + (vel.daily * s.effective_days_of_stock))) > 0 THEN 'Overstock'
                    
                    -- Check for Critical stock
                    WHEN ci.current_stock <= 0 OR 
-                         (ci.current_stock / NULLIF(calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int), 0)) <= 0 THEN 'Critical'
+                         (ci.current_stock / NULLIF(vel.daily, 0)) <= 0 THEN 'Critical'
                    
-                    WHEN (ci.current_stock / NULLIF(calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int), 0)) < (COALESCE(s.effective_lead_time, 30) * 0.5) THEN 'Critical'
+                    WHEN (ci.current_stock / NULLIF(vel.daily, 0)) < (COALESCE(s.effective_lead_time, 30) * 0.5) THEN 'Critical'
                    
                    -- Check for reorder soon
-                    WHEN ((ci.current_stock + COALESCE(ooi.on_order_qty, 0)) / NULLIF(calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int), 0)) < (COALESCE(s.effective_lead_time, 30) + 7) THEN 
+                    WHEN ((ci.current_stock + COALESCE(ooi.on_order_qty, 0)) / NULLIF(vel.daily, 0)) < (COALESCE(s.effective_lead_time, 30) + 7) THEN 
                         CASE
-                           WHEN (ci.current_stock / NULLIF(calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int), 0)) < (COALESCE(s.effective_lead_time, 30) * 0.5) THEN 'Critical'
+                           WHEN (ci.current_stock / NULLIF(vel.daily, 0)) < (COALESCE(s.effective_lead_time, 30) * 0.5) THEN 'Critical'
                           ELSE 'Reorder Soon'
                         END
                    
@@ -574,7 +671,7 @@ BEGIN
                          END) > 180 THEN 'At Risk'
                          
                    -- Very high stock cover is at risk too
-                    WHEN (ci.current_stock / NULLIF(calculate_sales_velocity(sa.sales_30d::int, sa.stockout_days_30d::int), 0)) > 365 THEN 'At Risk'
+                    WHEN (ci.current_stock / NULLIF(vel.daily, 0)) > 365 THEN 'At Risk'
                         
                    -- New products (less than 30 days old)
                    WHEN (CASE 
@@ -624,7 +721,11 @@ BEGIN
    LEFT JOIN ServiceLevels sl ON ci.pid = sl.pid
    LEFT JOIN BeginningStock bs ON ci.pid = bs.pid
    LEFT JOIN SeasonalityAnalysis season ON ci.pid = season.pid
-    WHERE s.exclude_forecast IS FALSE OR s.exclude_forecast IS NULL -- Exclude products explicitly marked
+    LEFT JOIN StockCoverage sc ON ci.pid = sc.pid
+    LEFT JOIN ProductVelocity vel ON ci.pid = vel.pid
+    -- NOTE: products with exclude_forecast set still get a metrics row (so they
+    -- appear in brand/vendor/category rollups); only their forecast-derived
+    -- columns are NULLed via ProductVelocity.

    ON CONFLICT (pid) DO UPDATE SET
        last_calculated = EXCLUDED.last_calculated,
@@ -463,7 +463,7 @@ router.get('/efficiency', async (req, res) => {
        SUM(revenue_30d) AS revenue_30d,
        CASE
          WHEN SUM(avg_stock_cost_30d) > 0
-          THEN (SUM(profit_30d) / SUM(avg_stock_cost_30d)) * 12
+          THEN (SUM(profit_30d) / SUM(avg_stock_cost_30d)) * 12.17
          ELSE 0
        END AS gmroi
      FROM product_metrics
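Review note: the multiplier change from 12 to 12.17 follows from annualizing a 30-day window as 365/30. A minimal sketch of the route's calculation, with an illustrative function name and sample figures:

```javascript
// 365 / 30 is about 12.17; the previous multiplier of 12 assumed 30-day months
// summing to a 360-day year. The SQL uses the rounded literal 12.17.
const ANNUALIZATION = 365 / 30;

// Mirrors the /efficiency route: returns 0 when there is no average stock cost.
function annualizedGmroi(profit30d, avgStockCost30d) {
  return avgStockCost30d > 0 ? (profit30d / avgStockCost30d) * ANNUALIZATION : 0;
}

console.log(annualizedGmroi(300, 1000).toFixed(2)); // 30-day profit of 300 on 1000 of stock
console.log(annualizedGmroi(300, 0));               // no stock cost on record
```

Note that the route falls back to 0 here, while the `gmroi_30d` column in `product_metrics` uses `NULLIF` and so yields NULL for the same case.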
@@ -357,6 +357,9 @@ router.get('/forecast/metrics', async (req, res) => {

            const active = parseInt(totals.active_products) || 1;
            const curveProducts = parseInt(totals.curve_products) || 0;
+            // NOTE: despite the name, this is "share of active products forecast via
+            // lifecycle curves" (curve coverage), NOT a statistical confidence. It only
+            // feeds a per-day tooltip field. See FORECAST_FIX_PLAN F9 (point 4).
            const confidenceLevel = parseFloat((curveProducts / active).toFixed(2));

            // Daily series from actual forecast
@@ -687,14 +690,29 @@ router.get('/forecast/accuracy', async (req, res) => {
        const { rows: metrics } = await executeQuery(`
            SELECT metric_type, dimension_value, sample_size,
                total_actual_units, total_forecast_units,
-                mae, wmape, bias, rmse
+                mae, wmape, bias, rmse, naive_wmape, fva
            FROM forecast_accuracy
            WHERE run_id = $1
            ORDER BY metric_type, dimension_value
        `, [latestRunId]);

+        // Shared shaping for an "overall"-style aggregate row (daily or weekly grain).
+        const shapeOverall = (m) => m ? {
+            sampleSize: parseInt(m.sample_size),
+            totalActual: parseFloat(m.total_actual_units) || 0,
+            totalForecast: parseFloat(m.total_forecast_units) || 0,
+            mae: m.mae != null ? parseFloat(parseFloat(m.mae).toFixed(4)) : null,
+            wmape: m.wmape != null ? parseFloat((parseFloat(m.wmape) * 100).toFixed(1)) : null,
+            bias: m.bias != null ? parseFloat(parseFloat(m.bias).toFixed(4)) : null,
+            rmse: m.rmse != null ? parseFloat(parseFloat(m.rmse).toFixed(4)) : null,
+            naiveWmape: m.naive_wmape != null ? parseFloat((parseFloat(m.naive_wmape) * 100).toFixed(1)) : null,
+            fva: m.fva != null ? parseFloat(parseFloat(m.fva).toFixed(3)) : null,
+        } : null;
+
        // Organize into response structure
-        const overall = metrics.find(m => m.metric_type === 'overall');
+        const overall = metrics.find(m => m.metric_type === 'overall' && m.dimension_value === 'all');
+        const overallInclDormant = metrics.find(m => m.metric_type === 'overall' && m.dimension_value === 'all_incl_dormant');
+        const overallWeekly = metrics.find(m => m.metric_type === 'overall_weekly');
        const byPhase = metrics
            .filter(m => m.metric_type === 'by_phase')
            .map(m => ({
@@ -706,6 +724,8 @@ router.get('/forecast/accuracy', async (req, res) => {
                wmape: m.wmape != null ? parseFloat((parseFloat(m.wmape) * 100).toFixed(1)) : null,
                bias: m.bias != null ? parseFloat(parseFloat(m.bias).toFixed(4)) : null,
                rmse: m.rmse != null ? parseFloat(parseFloat(m.rmse).toFixed(4)) : null,
+                naiveWmape: m.naive_wmape != null ? parseFloat((parseFloat(m.naive_wmape) * 100).toFixed(1)) : null,
+                fva: m.fva != null ? parseFloat(parseFloat(m.fva).toFixed(3)) : null,
            }))
            .sort((a, b) => (b.totalActual || 0) - (a.totalActual || 0));

@@ -763,6 +783,26 @@ router.get('/forecast/accuracy', async (req, res) => {
            sampleSize: parseInt(r.sample_size),
        }));

+        // Weekly-grain trend across runs (starts empty for old runs that predate
+        // the overall_weekly metric — that's expected, no backfill). F9.
+        const { rows: weeklyTrendRows } = await executeQuery(`
+            SELECT fr.finished_at::date AS run_date,
+                fa.wmape, fa.naive_wmape, fa.fva, fa.sample_size
+            FROM forecast_accuracy fa
+            JOIN forecast_runs fr ON fr.id = fa.run_id
+            WHERE fa.metric_type = 'overall_weekly'
+              AND fa.dimension_value = 'all'
+            ORDER BY fr.finished_at
+        `);
+
+        const accuracyTrendWeekly = weeklyTrendRows.map(r => ({
+            date: r.run_date instanceof Date ? r.run_date.toISOString().split('T')[0] : r.run_date,
+            wmape: r.wmape != null ? parseFloat((parseFloat(r.wmape) * 100).toFixed(1)) : null,
+            naiveWmape: r.naive_wmape != null ? parseFloat((parseFloat(r.naive_wmape) * 100).toFixed(1)) : null,
+            fva: r.fva != null ? parseFloat(parseFloat(r.fva).toFixed(3)) : null,
+            sampleSize: parseInt(r.sample_size, 10),
+        }));
+
        res.json({
            hasData: true,
            computedAt,
@@ -775,20 +815,15 @@ router.get('/forecast/accuracy', async (req, res) => {
                    ? historyInfo.latest_date.toISOString().split('T')[0]
                    : historyInfo.latest_date,
            },
-            overall: overall ? {
-                sampleSize: parseInt(overall.sample_size),
-                totalActual: parseFloat(overall.total_actual_units) || 0,
-                totalForecast: parseFloat(overall.total_forecast_units) || 0,
-                mae: overall.mae != null ? parseFloat(parseFloat(overall.mae).toFixed(4)) : null,
-                wmape: overall.wmape != null ? parseFloat((parseFloat(overall.wmape) * 100).toFixed(1)) : null,
-                bias: overall.bias != null ? parseFloat(parseFloat(overall.bias).toFixed(4)) : null,
-                rmse: overall.rmse != null ? parseFloat(parseFloat(overall.rmse).toFixed(4)) : null,
-            } : null,
+            overall: shapeOverall(overall),
+            overallInclDormant: shapeOverall(overallInclDormant),
+            overallWeekly: shapeOverall(overallWeekly),
            byPhase,
            byLeadTime,
            byMethod,
            dailyTrend,
            accuracyTrend,
+            accuracyTrendWeekly,
        });
    } catch (err) {
        console.error('Error fetching forecast accuracy:', err);
@@ -2,7 +2,7 @@ import { useQuery } from "@tanstack/react-query"
 import { apiFetch } from '@/utils/api';
 import { BarChart, Bar, ResponsiveContainer, XAxis, YAxis, Tooltip as RechartsTooltip, Cell, LineChart, Line } from "recharts"
 import config from "@/config"
-import { Target, TrendingDown, ArrowUpDown } from "lucide-react"
+import { Target, TrendingDown, ArrowUpDown, Swords } from "lucide-react"
 import { Tooltip as UITooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"
 import { PHASE_CONFIG } from "@/utils/lifecyclePhases"

@@ -14,6 +14,8 @@ interface OverallMetrics {
  wmape: number | null
  bias: number | null
  rmse: number | null
+  naiveWmape?: number | null
+  fva?: number | null
 }

 interface PhaseAccuracy {
@@ -25,6 +27,8 @@ interface PhaseAccuracy {
  wmape: number | null
  bias: number | null
  rmse: number | null
+  naiveWmape?: number | null
+  fva?: number | null
 }

 interface LeadTimeAccuracy {
@@ -51,11 +55,14 @@ interface AccuracyData {
  daysOfHistory?: number
  historyRange?: { from: string; to: string }
  overall?: OverallMetrics
+  overallInclDormant?: OverallMetrics
+  overallWeekly?: OverallMetrics
  byPhase?: PhaseAccuracy[]
  byLeadTime?: LeadTimeAccuracy[]
  byMethod?: { method: string; sampleSize: number; mae: number | null; wmape: number | null; bias: number | null }[]
  dailyTrend?: { date: string; mae: number | null; wmape: number | null; bias: number | null }[]
  accuracyTrend?: AccuracyTrendPoint[]
+  accuracyTrendWeekly?: { date: string; wmape: number | null; naiveWmape: number | null; fva: number | null; sampleSize: number }[]
 }

 function MetricSkeleton() {
@@ -74,12 +81,30 @@ function formatBias(bias: number | null): string {
 }

 function getAccuracyColor(wmape: number | null): string {
+  // Daily-grain thresholds (used for the by-phase / lead-time bars).
  if (wmape === null) return "text-muted-foreground"
  if (wmape <= 30) return "text-green-600"
  if (wmape <= 50) return "text-yellow-600"
  return "text-red-600"
 }

+function getWeeklyAccuracyColor(wmape: number | null): string {
+  // Weekly per-product grain has a much lower achievable floor than daily grain
+  // on this intermittent-demand catalog, so the headline uses its own thresholds.
+  if (wmape === null) return "text-muted-foreground"
+  if (wmape <= 60) return "text-green-600"
+  if (wmape <= 90) return "text-yellow-600"
+  return "text-red-600"
+}
+
+function formatSignedPct(ratio: number | null, digits = 0): string {
+  // ratio is a fraction (0.7 => +70%); null-safe.
+  if (ratio === null || ratio === undefined) return "N/A"
+  const pct = ratio * 100
+  const sign = pct > 0 ? "+" : ""
+  return `${sign}${pct.toFixed(digits)}%`
+}
+
 export function ForecastAccuracy() {
  const { data, error, isLoading } = useQuery<AccuracyData>({
    queryKey: ["forecast-accuracy"],
@@ -133,6 +158,24 @@ export function ForecastAccuracy() {
    sampleSize: lt.sampleSize,
  }))

+  // Headline prefers the weekly-grain WMAPE (informative); falls back to the
+  // daily-grain number until enough complete weeks of history exist.
+  const weeklyWmape = data?.overallWeekly?.wmape ?? null
+  const usingWeekly = weeklyWmape !== null
+  const headlineWmape = usingWeekly ? weeklyWmape : (data?.overall?.wmape ?? null)
+  const headlineColor = usingWeekly
+    ? getWeeklyAccuracyColor(headlineWmape)
+    : getAccuracyColor(headlineWmape)
+  // Net forecast-vs-actual ratio (e.g. +70% = over-forecasting), from the
+  // daily 'all' totals — far more legible than bias in raw units.
+  const totalFc = data?.overall?.totalForecast ?? 0
+  const totalAct = data?.overall?.totalActual ?? 0
+  const fcVsAct = totalAct > 0 ? (totalFc / totalAct - 1) : null
+  // Value over the naive baseline; read it from the same grain as the headline
+  // so the "vs naive" row never mixes weekly FVA with a daily-grain WMAPE.
+  const naiveSource = usingWeekly ? data?.overallWeekly : data?.overall
+  const naiveWmape = naiveSource?.naiveWmape ?? null
+  const fva = naiveSource?.fva ?? null
+
  return (
    <div>
      <h3 className="text-lg font-medium mb-3">Forecast Accuracy</h3>
@@ -148,10 +191,24 @@ export function ForecastAccuracy() {
              <div className="flex items-baseline justify-between">
                <div className="flex items-center gap-2">
                  <Target className="h-4 w-4 text-muted-foreground" />
-                  <p className="text-sm font-medium text-muted-foreground">WMAPE</p>
+                  <p className="text-sm font-medium text-muted-foreground">
+                    WMAPE <span className="text-[10px] opacity-70">({usingWeekly ? "weekly" : "daily"})</span>
+                  </p>
                </div>
-                <p className={`text-lg font-bold ${getAccuracyColor(data?.overall?.wmape ?? null)}`}>
-                  {formatWmape(data?.overall?.wmape ?? null)}
+                <p className={`text-lg font-bold ${headlineColor}`}>
+                  {formatWmape(headlineWmape)}
+                </p>
+              </div>
+              <div className="flex items-baseline justify-between">
+                <div className="flex items-center gap-2">
+                  <ArrowUpDown className="h-4 w-4 text-muted-foreground" />
+                  <p className="text-sm font-medium text-muted-foreground">Forecast vs actual</p>
+                </div>
+                <p className="text-lg font-bold">
+                  {formatSignedPct(fcVsAct)}
+                  <span className="text-xs font-normal text-muted-foreground ml-1">
+                    {(fcVsAct ?? 0) > 0 ? "over" : (fcVsAct ?? 0) < 0 ? "under" : ""}
+                  </span>
                </p>
              </div>
              <div className="flex items-baseline justify-between">
@@ -160,20 +217,24 @@ export function ForecastAccuracy() {
                  <p className="text-sm font-medium text-muted-foreground">MAE</p>
                </div>
                <p className="text-lg font-bold">
-                  {data?.overall?.mae !== null ? data?.overall?.mae?.toFixed(2) : "N/A"}
+                  {data?.overall?.mae != null ? data?.overall?.mae?.toFixed(2) : "N/A"}
                  <span className="text-xs font-normal text-muted-foreground ml-1">units</span>
                </p>
              </div>
              <div className="flex items-baseline justify-between">
                <div className="flex items-center gap-2">
-                  <ArrowUpDown className="h-4 w-4 text-muted-foreground" />
-                  <p className="text-sm font-medium text-muted-foreground">Bias</p>
+                  <Swords className="h-4 w-4 text-muted-foreground" />
+                  <p className="text-sm font-medium text-muted-foreground">vs naive</p>
                </div>
                <p className="text-lg font-bold">
-                  {formatBias(data?.overall?.bias ?? null)}
-                  <span className="text-xs font-normal text-muted-foreground ml-1">
-                    {(data?.overall?.bias ?? 0) > 0 ? "over" : (data?.overall?.bias ?? 0) < 0 ? "under" : ""}
+                  <span className={fva != null ? (fva > 0 ? "text-green-600" : "text-red-600") : "text-muted-foreground"}>
+                    {fva != null ? `${formatSignedPct(fva)} FVA` : "N/A"}
                  </span>
+                  {naiveWmape != null && (
+                    <span className="text-xs font-normal text-muted-foreground ml-1">
+                      naive {formatWmape(naiveWmape)}
+                    </span>
+                  )}
                </p>
              </div>
            </div>
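
The server-side shaping in the `/forecast/accuracy` hunk above can be illustrated in isolation. This is a minimal sketch, not the route's actual code: it assumes the database driver hands back numeric columns as strings (which is why the diff wraps everything in `parseFloat`), and the names `AccuracyRow`, `toPercent`, `toRounded`, and `shapeRow` are hypothetical helpers introduced only for this example.

```typescript
// Hypothetical row shape; assumes numeric columns arrive as strings.
interface AccuracyRow {
  sample_size: string;
  wmape: string | null;
  naive_wmape: string | null;
  fva: string | null;
}

// Fraction string ("0.75") -> percent rounded to `digits` decimals (75).
function toPercent(value: string | null, digits = 1): number | null {
  if (value == null) return null;
  return parseFloat((parseFloat(value) * 100).toFixed(digits));
}

// Numeric string -> number rounded to `digits` decimals, null-safe.
function toRounded(value: string | null, digits: number): number | null {
  if (value == null) return null;
  return parseFloat(parseFloat(value).toFixed(digits));
}

// Mirrors the idea of shapeOverall: a missing row becomes null,
// WMAPE-style fractions become percentages, FVA stays a fraction.
function shapeRow(row: AccuracyRow | undefined) {
  if (!row) return null;
  return {
    sampleSize: parseInt(row.sample_size, 10),
    wmape: toPercent(row.wmape),
    naiveWmape: toPercent(row.naive_wmape),
    fva: toRounded(row.fva, 3),
  };
}
```

Keeping `fva` as a fraction while converting the WMAPE columns to percentages matches what the diff does; the frontend's `formatSignedPct` is what later turns the fraction into a signed percentage.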
Author	SHA1	Message	Date
matt	069a44bd54	Import/calculations improvements	2026-06-11 19:32:20 -04:00
matt	3b2f51e6b8	Forecast improvements	2026-06-11 14:55:33 -04:00
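
For reviewers, the headline logic added to `ForecastAccuracy` can be sketched as a few pure functions. This is a hedged illustration rather than the component's code: `formatSignedPct` follows the diff, while `Metrics`, `pickHeadline`, and `forecastVsActual` are hypothetical names introduced here. The sample totals are the aggregate figures quoted in the fix plan (227,690 forecast vs 133,861 actual units).

```typescript
// Hypothetical, trimmed-down version of the OverallMetrics shape.
interface Metrics {
  wmape: number | null;
  totalForecast: number;
  totalActual: number;
}

// ratio is a fraction (0.7 => "+70%"); null-safe, as in the diff.
function formatSignedPct(ratio: number | null | undefined, digits = 0): string {
  if (ratio == null) return "N/A";
  const pct = ratio * 100;
  const sign = pct > 0 ? "+" : "";
  return `${sign}${pct.toFixed(digits)}%`;
}

// Prefer the weekly-grain WMAPE; fall back to daily when it is absent.
function pickHeadline(overall?: Metrics | null, weekly?: Metrics | null) {
  const weeklyWmape = weekly?.wmape ?? null;
  const usingWeekly = weeklyWmape !== null;
  return {
    grain: usingWeekly ? "weekly" : "daily",
    wmape: usingWeekly ? weeklyWmape : overall?.wmape ?? null,
  };
}

// Net forecast-vs-actual ratio; null when there were no actual units.
function forecastVsActual(totalForecast: number, totalActual: number): number | null {
  return totalActual > 0 ? totalForecast / totalActual - 1 : null;
}

const ratio = forecastVsActual(227690, 133861);
console.log(formatSignedPct(ratio)); // "+70%"
```

Returning `null` instead of dividing by zero lets the card render "N/A" for a run with no actual sales, instead of `Infinity%`.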