Forecast improvements
@@ -0,0 +1,343 @@
|
|||||||
|
# Forecast Accuracy Fix Plan
|
||||||
|
|
||||||
|
**Written:** 2026-06-10, from a code + live-data review of the forecasting pipeline.
|
||||||
|
**Goal:** eliminate the systematic ~1.7–2x over-forecast bias, recover demand the model currently ignores, and fix the accuracy measurement so improvements are visible and long-lead forecasts are validated.
|
||||||
|
|
||||||
|
Read this whole document before starting. Fixes are grouped into phases; each phase is independently deployable and has its own validation step. Line numbers are as of 2026-06-10 — re-locate by function name if the file has drifted.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. Diagnosis summary (measured 2026-06-10)
|
||||||
|
|
||||||
|
The dashboard headline is **202% WMAPE**. Decomposition of that number, all measured against `forecast_accuracy` run 129 and ad-hoc queries:
|
||||||
|
|
||||||
|
| Finding | Evidence |
|---|---|
| Daily-grain WMAPE has a ~190% *floor* for this catalog | Avg demand ≈ 0.11 units/product/day. A perfect rate forecast of intermittent demand scores ≈ 2e^−λ ≈ 190%. A trivial trailing-30d-average naive forecast scores **204%** on the same products/days; the engine scores 221% (slightly *worse than naive*). |
| Same forecasts at 21-day-per-product grain: **109%**; bias-corrected: **75%** | Half the headline is metric grain, most of the rest is bias. |
| Aggregate over-forecast **+70%** (227,690 forecast vs 133,861 actual units) | Portfolio daily ratio is 1.5–2.5x on most days. |
| Decay phase 2.47x over (fc 51,675 / act 20,915) | Root cause F1: velocity inflated **4.07x** (measured: 1.353 vs true 0.332 units/day) by averaging over sparse snapshot rows. |
| Preorder phase 2.15x over (fc 67,212 / act 31,189) | Root cause F4: launch curve applied at age=0 starting *today*, ignoring that the product hasn't arrived. |
| Mature phase 1.69x over (fc 57,857 / act 34,313) | Root causes F2 (history edge truncation) + F3 (seasonal double-count). |
| Dormant products sold **16,180 units** (~11% of demand) against zero forecasts | Root cause F5; also excluded from the headline metric, so invisible. |
| All 879,800 accuracy samples are in the **1–7d lead bucket** | Root cause F7: archiving design only ever saves yesterday's slice. 30–90d forecasts (what purchasing uses) are never validated. |
| Launch phase is healthy: WMAPE 100%, bias −6%, beats naive | The lifecycle-curve concept works; its calibration inputs are broken. Don't redesign it. |
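
The daily-grain floor in the first row can be checked numerically. The sketch below assumes daily demand per product is Poisson with rate λ (an assumption; the plan only says "intermittent demand") and computes the WMAPE a forecaster would get by predicting exactly λ every day. `poisson_wmape_floor` is a hypothetical helper, not part of the engine.

```python
import math


def poisson_wmape_floor(lam: float, max_k: int = 200) -> float:
    """WMAPE of forecasting the true rate `lam` when demand is Poisson(lam).

    Sums |k - lam| * P(X = k) directly and divides by the mean demand.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    pmf = math.exp(-lam)        # P(X = 0)
    mad = lam * pmf             # |0 - lam| * P(X = 0)
    for k in range(1, max_k + 1):
        pmf *= lam / k          # P(X = k) derived from P(X = k - 1)
        mad += abs(k - lam) * pmf
    return mad / lam


for lam in (0.05, 0.11, 0.5):
    # For lam < 1 the direct sum matches the closed form 2 * exp(-lam).
    print(lam, round(poisson_wmape_floor(lam), 3), round(2 * math.exp(-lam), 3))
```

Under this assumption the floor is roughly 1.79 at λ = 0.11 and roughly 1.90 at λ = 0.05, so the exact figure depends on which per-product rate is taken as typical; either way it sits well above 100%.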
|
||||||
|
|
||||||
|
**Key data fact** underlying several fixes: `daily_product_snapshots` is **activity-based and sparse** — only ~500–1,800 of ~38K products have a row on a given day. Verified: every pid-day with an order DOES have a snapshot row and units match (5,234/5,234 pid-days, 8,980 vs 8,984 units over 7 days). So *missing row = zero sales*, and any query that aggregates over only the rows that exist is averaging over sold-days.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Environment & operational notes
|
||||||
|
|
||||||
|
- **Files:** engine is `inventory-server/scripts/forecast/forecast_engine.py`; orchestrator `run_forecast.js` in the same dir; consumer endpoints in `inventory-server/src/routes/dashboard.js` (`/forecast/metrics` ~line 308, `/forecast/accuracy` ~line 647); overview UI in `inventory/src/components/overview/ForecastMetrics.tsx` and `ForecastAccuracy.tsx`.
|
||||||
|
- **Local `inventory-server/` is NFS-mounted to `/var/www/inventory/` on the netcup server.** Edits made locally appear on the server immediately — no copy step. Do NOT run bulk `grep`/`find`/`node --check` over `inventory-server/` locally (the mount hangs); `ssh netcup` and run them there.
|
||||||
|
- **Avoid the glob tool** for search in this repo; use bash (`grep`/`rg` via ssh for server-side trees).
|
||||||
|
- **Scheduling:** the engine runs daily at **09:30:01 server time** (runs table is conclusive), but the cron entry is NOT in matt's crontab, `/etc/cron.d`, or pm2. Likely root's crontab (`sudo crontab -l` to confirm). You do not need to touch the schedule for these fixes; just know a run fires at 09:30 daily and occasionally skips days (e.g. 2026-06-07/08).
|
||||||
|
- **Manual test runs:** `ssh netcup`, then `cd /var/www/inventory/scripts/forecast && node run_forecast.js`. Takes ~3.5–4 min. Safe to run any time: the engine TRUNCATEs and rebuilds `product_forecasts`, archives prior past-dated rows, and records a new `forecast_runs` row. Python deps live in the server venv (`venv/`); `run_forecast.js` handles env + venv automatically.
|
||||||
|
- **DB access for validation:** `ssh netcup`, then `PGPASSWORD=6D3GUkxuFgi2UghwgnUd psql -h localhost -U inventory_readonly -d inventory_db`. The engine itself connects with the write user via env vars loaded from `/var/www/inventory/.env` — schema changes should be made idempotently *inside the engine code* (the file already uses `CREATE TABLE IF NOT EXISTS` / `CREATE INDEX IF NOT EXISTS`; use `ALTER TABLE ... ADD COLUMN IF NOT EXISTS` the same way) so no manual migration is needed.
|
||||||
|
- **Python gotchas already handled in this file (don't regress):** numpy types must go through the registered psycopg2 adapters; `pd.Series.combine_first()` keeps zeros over real data — use `reindex(..., fill_value=0.0)`.
|
||||||
|
- Engine runtime budget: currently ~212–227s. Phases 1–2 shouldn't move it meaningfully; Phase 3's extra archiving adds one INSERT…SELECT. If runtime balloons past ~6 min, investigate before shipping.
|
||||||
|
- `--backfill` mode (`backfill_accuracy_data`) is an in-sample backtest using the *old* formulas. **Do not run it anymore**; there is enough real out-of-sample history. Updating it to match the new logic is optional/low priority (F11).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 1 — Bias bugs in the engine (no schema changes)
|
||||||
|
|
||||||
|
### F1. Decay velocity: stop averaging over sparse snapshot rows
|
||||||
|
|
||||||
|
**Where:** `forecast_engine.py`, `batch_load_product_data()`, the decay query (~lines 697–710).
|
||||||
|
|
||||||
|
**Problem:** `AVG(COALESCE(dps.units_sold, 0))` runs over only the snapshot rows that exist — mostly sold-days. Measured inflation on the current 975 decay products: **4.07x** (1.353 vs 0.332 true units/day). This feeds `compute_scale_factor()` for the decay phase and is the single largest bias source.
|
||||||
|
|
||||||
|
**Fix:** divide the sum by calendar days in the window, clipped to the product's age (decay products are 14–60 days old, so a 20-day-old product's window is 20 days, not 30):
|
||||||
|
|
||||||
|
```sql
SELECT dps.pid,
       SUM(COALESCE(dps.units_sold, 0))::float
         / GREATEST(LEAST(30, (CURRENT_DATE - pm.date_first_received::date)), 1) AS avg_daily
FROM daily_product_snapshots dps
JOIN product_metrics pm ON pm.pid = dps.pid
WHERE dps.pid = ANY(%s)
  AND dps.snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
  AND dps.snapshot_date >= pm.date_first_received::date
GROUP BY dps.pid, pm.date_first_received
```
|
||||||
|
|
||||||
|
No Python-side changes needed; `data['decay_velocity']` keeps the same shape. Products with zero snapshot rows in the window still get no entry → existing `scale = 1.0` fallback applies (acceptable: decay classification requires `sales_velocity_daily > 0`, so truly dead products don't reach this path).
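
To make the inflation mechanism concrete, here is a minimal sketch contrasting the two denominators on a made-up product. The function names and the sample data are hypothetical; only the arithmetic mirrors the old `AVG(...)` and the new calendar-day division.

```python
from typing import Mapping


def sparse_row_average(units_by_day: Mapping[int, float]) -> float:
    """Old behaviour: average over only the snapshot rows that exist."""
    if not units_by_day:
        return 0.0
    return sum(units_by_day.values()) / len(units_by_day)


def calendar_day_average(units_by_day: Mapping[int, float],
                         age_days: int, window: int = 30) -> float:
    """F1 behaviour: total units over calendar days, clipped to product age."""
    days = max(min(window, age_days), 1)
    return sum(units_by_day.values()) / days


# Hypothetical 20-day-old product with snapshot rows on 4 days only.
rows = {2: 1.0, 7: 2.0, 8: 1.0, 15: 2.0}
print(sparse_row_average(rows))                 # 1.5
print(calendar_day_average(rows, age_days=20))  # 0.3
```

In this toy case the sparse average is 5x the calendar-day rate, the same order of magnitude as the 4.07x measured on the live decay products.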
|
||||||
|
|
||||||
|
### F2. Mature history: reindex over the full calendar window
|
||||||
|
|
||||||
|
**Where:** `forecast_engine.py`, `forecast_mature()` (~lines 833–836).
|
||||||
|
|
||||||
|
**Problem:** `hist.set_index('snapshot_date').resample('D').sum()` only spans first-snapshot → last-snapshot. Interior gaps correctly become zeros, but **leading and trailing quiet periods are absent**, so the Holt level is fitted on the product's busy span. A marginal mature product whose activity clusters in 2 of the last 8 weeks gets a level ~4x too high.
|
||||||
|
|
||||||
|
**Fix:** replace the resample with an explicit reindex over the full `EXP_SMOOTHING_WINDOW` ending yesterday:
|
||||||
|
|
||||||
|
```python
hist = history_df.copy()
hist['snapshot_date'] = pd.to_datetime(hist['snapshot_date'])
hist = hist.set_index('snapshot_date')['units_sold']
full_index = pd.date_range(
    end=pd.Timestamp(date.today() - timedelta(days=1)),
    periods=EXP_SMOOTHING_WINDOW, freq='D')
series = hist.reindex(full_index, fill_value=0.0).values.astype(float)
```
|
||||||
|
|
||||||
|
Notes: (pid, snapshot_date) is unique in `daily_product_snapshots`, so no duplicate-index risk. `observed_mean` and the `cap` recompute over the full window automatically (intended — the cap gets correspondingly tighter). Mature products are by definition >60 days old, so the 60-day window never predates first receipt. Do NOT use `combine_first` (see gotchas above).
|
||||||
|
|
||||||
|
### F3. Stop double-applying the monthly seasonal index
|
||||||
|
|
||||||
|
**Where:** `forecast_engine.py`, `generate_all_forecasts()` — the `seasonal_multipliers` pre-compute (~lines 959–961) and application (~line 1050).
|
||||||
|
|
||||||
|
**Problem:** every per-product calibration (decay velocity, mature Holt level, launch first-week scale, preorder rate, slow-mover velocity) is fitted on *raw recent actuals*, which already embed the current month's seasonal level. The forecast then multiplies by the **absolute** monthly index of the target date. Example from the live indices (`forecast_runs.phase_counts` for run 129): May = 1.224 (sale month), June = 0.982. Early-June forecasts were calibrated on May-sale-inflated velocities and barely discounted — a structural ~25% over-forecast at that transition, and it'll be worse around November (1.316).
|
||||||
|
|
||||||
|
**Fix:** apply the seasonal index *relative to the calibration period*. Compute a calibration index as the average monthly index over the trailing 30 calendar days (robust at month boundaries), then divide:
|
||||||
|
|
||||||
|
```python
today = date.today()
trailing = [today - timedelta(days=i) for i in range(1, 31)]
calibration_index = float(np.mean([monthly_indices.get(d.month, 1.0) for d in trailing]))
seasonal_multipliers = [
    monthly_indices.get(d.month, 1.0) / max(calibration_index, 0.1)
    for d in forecast_dates
]
```
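
As a worked example of the relative index, the following standard-library sketch plugs in the May and June values quoted above and a run date of 2026-06-10. The helper name is hypothetical, and months other than May and June default to 1.0 here purely for illustration.

```python
from datetime import date, timedelta
from typing import Mapping

# Indices quoted above for run 129; all other months default to 1.0 in this sketch.
monthly_indices = {5: 1.224, 6: 0.982}


def relative_seasonal_multiplier(target: date, today: date,
                                 indices: Mapping[int, float]) -> float:
    """Target month's index divided by the mean index of the trailing 30 days."""
    trailing = [today - timedelta(days=i) for i in range(1, 31)]
    calibration = sum(indices.get(d.month, 1.0) for d in trailing) / len(trailing)
    return indices.get(target.month, 1.0) / max(calibration, 0.1)


today = date(2026, 6, 10)
mult = relative_seasonal_multiplier(date(2026, 6, 15), today, monthly_indices)
print(round(mult, 3))
```

The trailing window covers 21 May days and 9 June days, so the calibration index is about 1.151 and a mid-June forecast is scaled by roughly 0.85 instead of the absolute 0.982.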
|
||||||
|
|
||||||
|
Leave the DOW multipliers absolute — every calibration is a multi-week average and therefore DOW-neutral, so reshaping by absolute DOW indices is correct.
|
||||||
|
|
||||||
|
**Optional sub-fix (same area, low priority):** the monthly indices are computed from a single trailing 365-day window, so each month appears once and YoY growth contaminates "seasonality". A cheap improvement is widening `SEASONAL_LOOKBACK_DAYS` to 730 and averaging the two observations of each month. Do this only after the main fixes are validated.
|
||||||
|
|
||||||
|
### Phase 1 validation
|
||||||
|
|
||||||
|
Deploy (edit locally; NFS propagates), run the engine manually once, wait for 3–5 daily cycles, then:
|
||||||
|
|
||||||
|
```sql
-- Portfolio ratio per day (target: drifts from ~2.0 toward 0.8–1.3)
WITH ranked AS (
  SELECT pfh.pid, pfh.forecast_date, pfh.forecast_units, pfh.lifecycle_phase,
         ROW_NUMBER() OVER (PARTITION BY pfh.pid, pfh.forecast_date ORDER BY fr.started_at DESC) rn
  FROM product_forecasts_history pfh
  JOIN forecast_runs fr ON fr.id = pfh.run_id
  WHERE pfh.forecast_date >= CURRENT_DATE - 7)
SELECT r.forecast_date, round(SUM(r.forecast_units),0) AS fc,
       SUM(COALESCE(dps.units_sold,0)) AS act,
       round(SUM(r.forecast_units)/NULLIF(SUM(COALESCE(dps.units_sold,0)),0),2) AS ratio
FROM ranked r
LEFT JOIN daily_product_snapshots dps ON dps.pid = r.pid AND dps.snapshot_date = r.forecast_date
WHERE r.rn = 1 AND r.lifecycle_phase != 'dormant'
GROUP BY 1 ORDER BY 1;
```
|
||||||
|
|
||||||
|
Also check `forecast_accuracy` `by_phase` rows for the newest run: decay bias should fall from +0.35 toward ~0, mature from +0.17 toward ~0. (Accuracy lags ~1 day behind each fix since it evaluates yesterday's forecasts.)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 2 — Demand the model currently ignores or mistimes
|
||||||
|
|
||||||
|
### F4. Preorder: forecast the preorder rate until arrival, launch curve after
|
||||||
|
|
||||||
|
**Where:** `forecast_engine.py` — `batch_load_product_data()` (add arrival dates), `generate_all_forecasts()` preorder branch (~lines 1005–1009), and `forecast_from_curve()` (or a small wrapper).
|
||||||
|
|
||||||
|
**Problem:** preorder products run the launch curve from `age=0` starting **today**, i.e. full first-week launch sales while the product is still weeks from arriving. Actual preorder-period sales are a much slower trickle.
|
||||||
|
|
||||||
|
**Fix:**
|
||||||
|
|
||||||
|
1. Batch-load each preorder product's expected arrival from `purchase_orders` (line-item grain: it has `pid` and `expected_date` directly). Open statuses verified against live data: `created`, `ordered`, `electronically_sent`, `receiving_started` (~705 open line items currently have a future `expected_date`):
|
||||||
|
|
||||||
|
```sql
SELECT pid, MIN(expected_date) AS expected_arrival
FROM purchase_orders
WHERE pid = ANY(%s)
  AND status IN ('created', 'ordered', 'electronically_sent', 'receiving_started')
  AND expected_date IS NOT NULL
  AND expected_date >= CURRENT_DATE
GROUP BY pid
```
|
||||||
|
|
||||||
|
Fallbacks, in order: (a) an open PO with a *past* `expected_date` → assume arrival in 7 days; (b) no PO at all → arrival in 14 days (and log a counter of how many hit this default).
|
||||||
|
|
||||||
|
2. In the preorder branch, build the daily array piecewise. Let `days_until_arrival = (expected_arrival - today).days`:
|
||||||
|
- Days `0 .. days_until_arrival-1`: flat observed preorder daily rate = `preorder_sales[pid] / max(preorder_days[pid], 1)` (both already batch-loaded), clamped to ≤ the curve's scaled week-0 daily value.
|
||||||
|
- Days `days_until_arrival .. horizon`: `forecast_from_curve(curve_info, scale, age_days=0, ...)` shifted so the curve's day 0 lands on the arrival date (i.e. pass `horizon_days - days_until_arrival` and offset into the output array).
|
||||||
|
- Keep the existing `compute_scale_factor('preorder', ...)` for the post-arrival curve; the pre-arrival segment doesn't use it.
|
||||||
|
|
||||||
|
This is consistent with how the reference curves were built: historical preorder units were recorded on their **order dates** (pre-arrival), so week-0 of the fitted curves reflects post-receipt orders, not the backlog.
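
The fallback order from step 1 can be sketched as a small helper. The function name, the constant names, and the `source` labels are hypothetical; how the engine actually stores arrival offsets is not specified here.

```python
from collections import Counter
from datetime import date
from typing import Optional, Tuple

PAST_DUE_FALLBACK_DAYS = 7   # open PO whose expected_date has already passed
NO_PO_FALLBACK_DAYS = 14     # no open PO at all


def resolve_days_until_arrival(today: date,
                               future_expected: Optional[date],
                               has_past_due_open_po: bool) -> Tuple[int, str]:
    """Return (days_until_arrival, source) following the F4 fallback order."""
    if future_expected is not None and future_expected >= today:
        return (future_expected - today).days, "po"
    if has_past_due_open_po:
        return PAST_DUE_FALLBACK_DAYS, "past_due_po"
    return NO_PO_FALLBACK_DAYS, "default"


# Count how often each source is used so the 14-day default can be logged.
sources = Counter()
today = date(2026, 6, 10)
for expected, past_due in [(date(2026, 6, 24), False), (None, True), (None, False)]:
    days, source = resolve_days_until_arrival(today, expected, past_due)
    sources[source] += 1
print(dict(sources))
```

Returning the source alongside the day count keeps the "how many hit the default" counter a by-product of the lookup rather than a separate pass.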
|
||||||
|
|
||||||
|
### F5. Dormant products: small positive rate instead of hard zero, and count them
|
||||||
|
|
||||||
|
**Where:** `forecast_engine.py` — `generate_all_forecasts()` dormant branch (~lines 1040–1042), `batch_load_product_data()`, and `compute_accuracy()`.
|
||||||
|
|
||||||
|
**Problem:** all ~28K dormant products are forecast at exactly 0, yet they sold 16,180 units in the eval window (~11% of all demand) — restocks, promos, long-tail. Worse, dormant is *excluded* from the headline accuracy filter, so this miss is invisible.
|
||||||
|
|
||||||
|
**Fix (cheap version, do this now):**
|
||||||
|
|
||||||
|
1. Batch-load a trailing-180-day order rate for dormant products (11,362 of them have ≥1 sale in 180d — verified):
|
||||||
|
|
||||||
|
```sql
SELECT o.pid, SUM(o.quantity) / 180.0 AS rate
FROM orders o
WHERE o.pid = ANY(%s)
  AND o.canceled IS DISTINCT FROM TRUE
  AND o.date >= CURRENT_DATE - INTERVAL '180 days'
GROUP BY o.pid
```
|
||||||
|
|
||||||
|
2. Dormant branch: if the product has a rate > 0, forecast it flat with `method = 'velocity'`; else keep zeros with `method = 'zero'`. Apply the same DOW/seasonal multipliers as everything else (automatic — they're applied after the branch).
|
||||||
|
3. In `compute_accuracy()`, add a second overall row: `metric_type='overall', dimension_value='all_incl_dormant'` with no dormant filter (keep the existing `'all'` row unchanged for trend continuity). One extra entry in the `dimensions`/`filter_clauses` dicts.
|
||||||
|
|
||||||
|
**Upgrade path (optional, Phase 4):** replace flat rates for `slow_mover` + dormant-with-sales with TSB (Teunter–Syntetos–Babai), the standard intermittent-demand method with obsolescence handling. Per product over a daily series `d_t` (build it from snapshots the F2 way — full calendar reindex):
|
||||||
|
|
||||||
|
```
if d_t > 0:  p_t = p_{t-1} + β·(1 − p_{t-1});  z_t = z_{t-1} + α·(d_t − z_{t-1})
else:        p_t = p_{t-1}·(1 − β);            z_t = z_{t-1}
forecast = p_T · z_T   (flat across horizon)
```
|
||||||
|
|
||||||
|
Start with α=0.1, β=0.05, initialize p = (nonzero days / total days), z = mean of nonzero demands. Scope: slow_mover (~6K) + dormant with 180d sales (~11K); series built from up to 180 days of snapshots (the rows are sparse, so the volume stays manageable). Only do this after Phase 3 measurement exists to prove it beats the flat rates.
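
A minimal Python sketch of the TSB recursion above, for a single product's dense daily series. The function name is hypothetical, and initializing p and z from the same series that is then smoothed is a simplifying assumption; a separate warm-up window would be an equally valid reading of the spec.

```python
from typing import Sequence


def tsb_forecast(demand: Sequence[float], alpha: float = 0.1, beta: float = 0.05) -> float:
    """Flat per-day TSB forecast p_T * z_T for a dense daily demand series."""
    nonzero = [d for d in demand if d > 0]
    if not nonzero:
        return 0.0                         # never sold in the window
    p = len(nonzero) / len(demand)         # initial demand probability
    z = sum(nonzero) / len(nonzero)        # initial demand size
    for d in demand:
        if d > 0:
            p += beta * (1.0 - p)          # probability moves toward 1
            z += alpha * (d - z)           # size smoothed on demand days only
        else:
            p *= (1.0 - beta)              # probability decays on zero days
    return p * z


# A product that stops selling sees its forecast decay toward zero.
print(tsb_forecast([1, 1, 0, 0, 0, 0]) < tsb_forecast([1, 1]))
```

The decay of p on zero-demand days is what gives TSB its obsolescence handling: a flat trailing rate keeps forecasting old sales until they age out of the window, while TSB shrinks with every quiet day.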
|
||||||
|
|
||||||
|
### Phase 2 validation
|
||||||
|
|
||||||
|
After 3–5 cycles: preorder `by_phase` bias should drop from +0.85 toward < +0.3; the new `all_incl_dormant` row should appear and its `total_actual_units` minus `'all'`'s should be largely *covered* rather than all-miss (dormant `bias` rising from −1.36 toward ~−0.3 or better).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 3 — Fix the measurement (schema + engine + API + UI)
|
||||||
|
|
||||||
|
> Without this phase you cannot see whether Phases 1–2 worked except by ad-hoc SQL, the lead-time chart stays a single bucket forever, and the dashboard keeps displaying a number with a 190% floor in red.
|
||||||
|
|
||||||
|
### F7. Archive long-lead forecasts so 15/30/60/90d accuracy exists
|
||||||
|
|
||||||
|
**Where:** `forecast_engine.py` — `archive_forecasts()` (~lines 1086–1154), `compute_accuracy()` CTE (~lines 1201–1228).
|
||||||
|
|
||||||
|
**Problem:** the current design archives only *past-dated* rows of the previous run before truncation. With daily runs, that's only ever the 1-day-ahead slice — all 879,800 accuracy samples sit in the '1-7d' bucket and the longer buckets in the UI chart can never populate. Purchasing decisions ride on 30–60d forecasts that are never validated.
|
||||||
|
|
||||||
|
**Fix:**
|
||||||
|
|
||||||
|
1. Keep the existing past-date archiving exactly as is (it provides dense short-lead coverage).
|
||||||
|
2. After `generate_all_forecasts()` completes, additionally archive a **sampled set of future leads** from the new run, non-dormant only, attributed to the *current* run id (correct attribution, unlike the past-date path which attributes to the previous run):
|
||||||
|
|
||||||
|
```sql
INSERT INTO product_forecasts_history
  (run_id, pid, forecast_date, forecast_units, forecast_revenue,
   lifecycle_phase, forecast_method, confidence_lower, confidence_upper, generated_at)
SELECT %(run_id)s, pid, forecast_date, forecast_units, forecast_revenue,
       lifecycle_phase, forecast_method, confidence_lower, confidence_upper, generated_at
FROM product_forecasts
WHERE lifecycle_phase != 'dormant'
  AND forecast_date - CURRENT_DATE IN (7, 14, 30, 60, 89)
ON CONFLICT (run_id, pid, forecast_date) DO NOTHING
```
|
||||||
|
|
||||||
|
Volume: ~10K non-dormant products × 5 leads ≈ 50K rows/day; the existing 90-day prune (`forecast_date < CURRENT_DATE - 90`) bounds steady state at a few million rows. Note future-dated rows survive until their date passes + 90 days — that's intended.
|
||||||
|
|
||||||
|
3. **CRITICAL companion change** in `compute_accuracy()`: the accuracy CTE must now exclude not-yet-realized rows, or future-dated archives get scored against actual=0:
|
||||||
|
|
||||||
|
```sql
FROM product_forecasts_history pfh
JOIN forecast_runs fr ON fr.id = pfh.run_id
WHERE pfh.forecast_date < CURRENT_DATE   -- ADD THIS
```
|
||||||
|
|
||||||
|
4. **Dedup semantics change.** Today's `ROW_NUMBER() OVER (PARTITION BY pid, forecast_date ORDER BY started_at DESC)` keeps only the latest (= shortest-lead) row per pid/date, which would silently discard all the new long-lead rows. Restructure:
|
||||||
|
- Compute `lead_days = forecast_date - started_at::date` and the lead bucket *inside* `ranked_history`.
|
||||||
|
- For `by_lead_time`: dedup `PARTITION BY pid, forecast_date, lead_bucket` (one sample per pid/date/bucket, latest run wins within a bucket).
|
||||||
|
- For everything else (`overall`, `by_phase`, `by_method`, `daily`, and the new weekly metric below): restrict to `lead_days BETWEEN 0 AND 6` and keep the existing per-(pid, date) dedup. This preserves the current meaning of the headline metrics (short-lead) while the lead-time table becomes real.
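
The per-bucket dedup can be illustrated outside SQL. In the sketch below the bucket labels come from this plan, but the 0-based cut points are an assumption, chosen so that the sampled leads 7/14/30/60 land in '8-14d'/'15-30d'/'31-60d'/'61-90d'; confirm them against the existing bucket expression in `compute_accuracy()`. The function names are hypothetical.

```python
from datetime import date, datetime

# Assumed 0-based boundaries; the plan names the labels, not the cut points.
LEAD_BUCKETS = [(0, 6, "1-7d"), (7, 13, "8-14d"), (14, 29, "15-30d"),
                (30, 59, "31-60d"), (60, 89, "61-90d")]


def lead_bucket(forecast_date: date, started_at: datetime):
    """Bucket label for lead_days = forecast_date - started_at::date, or None."""
    lead_days = (forecast_date - started_at.date()).days
    for lo, hi, label in LEAD_BUCKETS:
        if lo <= lead_days <= hi:
            return label
    return None


def dedup_by_lead_bucket(rows):
    """Keep the latest run per (pid, forecast_date, bucket).

    rows: iterable of (pid, forecast_date, started_at, forecast_units).
    """
    best = {}
    for pid, fdate, started_at, units in rows:
        bucket = lead_bucket(fdate, started_at)
        if bucket is None:
            continue
        key = (pid, fdate, bucket)
        if key not in best or started_at > best[key][0]:
            best[key] = (started_at, units)
    return {key: units for key, (_, units) in best.items()}


rows = [
    (1, date(2026, 7, 10), datetime(2026, 6, 10, 9, 30), 4.0),  # lead 30
    (1, date(2026, 7, 10), datetime(2026, 7, 9, 9, 30), 2.0),   # lead 1
    (1, date(2026, 7, 10), datetime(2026, 7, 8, 9, 30), 3.0),   # lead 2, older run
]
print(dedup_by_lead_bucket(rows))
```

With the bucket in the partition key, the long-lead row and the short-lead row for the same pid and date both survive, while the older of the two short-lead rows is dropped.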
|
||||||
|
|
||||||
|
### F8. Track a naive baseline (forecast value-added)
|
||||||
|
|
||||||
|
**Where:** `archive_forecasts()` (both INSERT paths), `compute_accuracy()`, `forecast_accuracy` schema, `/forecast/accuracy` endpoint.
|
||||||
|
|
||||||
|
**Problem:** the engine currently *loses* to a trailing-average naive forecast (221% vs 204% daily WMAPE) and nothing on the dashboard would ever reveal that. Every accuracy improvement should be judged as value-over-naive.
|
||||||
|
|
||||||
|
**Fix:**
|
||||||
|
|
||||||
|
1. Schema (idempotent, in the ensure blocks): `ALTER TABLE product_forecasts_history ADD COLUMN IF NOT EXISTS naive_units NUMERIC(10,2);` and `ALTER TABLE forecast_accuracy ADD COLUMN IF NOT EXISTS naive_wmape NUMERIC(10,4), ADD COLUMN IF NOT EXISTS fva NUMERIC(10,4);`
|
||||||
|
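2. Populate `naive_units` during both archive INSERTs via a join: naive = flat trailing-28-day average daily units as of archive time (28 days = DOW-balanced; same value at every lead, which is exactly what a naive baseline means). For the future-lead path this uses only information available at generation. The past-date path archives the previous run's rows a day later, so there the 28-day window should end before the archived `forecast_date` rather than at `CURRENT_DATE`; otherwise the naive average includes the actuals of the day being scored: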
|
||||||
|
|
||||||
|
```sql
LEFT JOIN (
  SELECT o.pid, SUM(o.quantity) / 28.0 AS naive_daily
  FROM orders o
  WHERE o.canceled IS DISTINCT FROM TRUE
    AND o.date >= CURRENT_DATE - INTERVAL '28 days' AND o.date < CURRENT_DATE
  GROUP BY o.pid
) nv ON nv.pid = pf.pid
-- select COALESCE(nv.naive_daily, 0) AS naive_units
```
|
||||||
|
|
||||||
|
3. In `compute_accuracy()`, add to each dimension's aggregate: `SUM(ABS(naive_units - actual_units)) / NULLIF(SUM(actual_units),0) AS naive_wmape` and store `fva = 1 - wmape / naive_wmape` (NULL-safe). Rows archived before this change have `naive_units` NULL — treat NULL as excluded (`FILTER (WHERE naive_units IS NOT NULL)` on the naive sums) rather than as zero.
|
||||||
|
4. Endpoint: include `naiveWmape` and `fva` in the `overall` (and per-phase) payload of `/dashboard/forecast/accuracy` in `dashboard.js`.
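
The FVA arithmetic from step 3 can be sketched in a few lines. The function name and row layout are hypothetical, and restricting the naive denominator to rows that have `naive_units` is an assumption about what "NULL as excluded" should mean.

```python
from typing import Optional, Sequence, Tuple

# (forecast_units, actual_units, naive_units); naive_units is None for old rows
Row = Tuple[float, float, Optional[float]]


def wmape_and_fva(rows: Sequence[Row]):
    """Return (wmape, naive_wmape, fva); any of them may be None."""
    total_actual = sum(a for _, a, _ in rows)
    wmape = (sum(abs(f - a) for f, a, _ in rows) / total_actual) if total_actual else None

    # Rows archived before naive_units existed are excluded, not treated as zero.
    with_naive = [(a, n) for _, a, n in rows if n is not None]
    naive_actual = sum(a for a, _ in with_naive)
    naive_wmape = (sum(abs(n - a) for a, n in with_naive) / naive_actual) if naive_actual else None

    fva = None
    if wmape is not None and naive_wmape:
        fva = 1.0 - wmape / naive_wmape
    return wmape, naive_wmape, fva


print(wmape_and_fva([(2.0, 1.0, 1.5), (0.0, 1.0, 0.5)]))
```

In this toy input the engine's WMAPE is 1.0 against a naive WMAPE of 0.5, so FVA is −1.0: a negative value is the signal that the engine is losing to the baseline.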
|
||||||
|
|
||||||
|
### F9. Weekly-grain headline metric + bias as a percentage
|
||||||
|
|
||||||
|
**Where:** `compute_accuracy()`, `/forecast/accuracy` endpoint, `ForecastAccuracy.tsx`.
|
||||||
|
|
||||||
|
**Problem:** daily-grain WMAPE on this catalog has a ~190% floor — as a headline it's noise. The informative numbers are (a) weekly-per-product WMAPE (currently ~109%, target ~70–85% post-fix) and (b) aggregate bias, which the UI currently renders as `+0.108 units` — indistinguishable from zero while the reality is +70%.
|
||||||
|
|
||||||
|
**Fix:**
|
||||||
|
|
||||||
|
1. New metric in `compute_accuracy()`: `metric_type='overall_weekly', dimension_value='all'`. Definition: using the short-lead deduped rows (lead ≤ 6, non-dormant), aggregate per `(pid, date_trunc('week', forecast_date))` keeping only complete weeks (`COUNT(*) = 7`), then `WMAPE = SUM(ABS(fc_week − act_week)) / SUM(act_week)`, excluding pid-weeks where both are 0. Store sample_size = number of pid-weeks. Compute `naive_wmape`/`fva` the same way from `naive_units`.
|
||||||
|
2. Endpoint: expose as `overallWeekly`; also add a weekly variant to the `accuracyTrend` query (`metric_type='overall_weekly'`). The trend will start empty (old runs lack the row) — that's fine; don't backfill.
|
||||||
|
3. `ForecastAccuracy.tsx`:
|
||||||
|
- Headline WMAPE → `overallWeekly.wmape`, labeled "WMAPE (weekly)". Keep daily WMAPE available in a tooltip if desired.
|
||||||
|
- Color thresholds for weekly grain: green ≤ 60, yellow ≤ 90, red above (tunable; document that they're calibrated for intermittent retail demand).
|
||||||
|
- Replace the bias row: show `(totalForecast / totalActual − 1)` as a signed percentage labeled "Forecast vs actual" (both totals already arrive in `overall`). Keep MAE.
|
||||||
|
- Add a "vs naive" line: naive weekly WMAPE and FVA. FVA > 0 = engine adds value.
|
||||||
|
- The lead-time chart needs no code change — buckets will populate as F7 rows mature (7d lead evaluable after 7 days, 30d after 30, etc.).
|
||||||
|
4. `confidenceLevel` in `/forecast/metrics` (`dashboard.js` ~line 360) is "share of products forecast via lifecycle curves", not confidence. It only feeds a per-day tooltip field — rename the JSON field to `curveCoverage` and update the one consumer in `ForecastMetrics.tsx`, or leave it and add a comment; low priority.
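
The weekly metric definition from step 1 can be prototyped in Python before it is written as SQL. The function name and row layout below are hypothetical; weeks start on Monday, which is what PostgreSQL's `date_trunc('week', ...)` returns.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_wmape(rows):
    """Weekly-per-product WMAPE over complete weeks.

    rows: iterable of (pid, forecast_date, forecast_units, actual_units),
    one row per pid-day, already short-lead deduped and non-dormant.
    Returns (wmape or None, number of pid-weeks used).
    """
    weeks = defaultdict(lambda: [0, 0.0, 0.0])   # (pid, week_start) -> [n_days, fc, act]
    for pid, d, fc, act in rows:
        week_start = d - timedelta(days=d.weekday())   # Monday of that week
        cell = weeks[(pid, week_start)]
        cell[0] += 1
        cell[1] += fc
        cell[2] += act
    # Keep complete weeks only, and drop pid-weeks where both totals are zero.
    used = [(fc, act) for n, fc, act in weeks.values()
            if n == 7 and not (fc == 0 and act == 0)]
    total_act = sum(act for _, act in used)
    if not total_act:
        return None, len(used)
    return sum(abs(fc - act) for fc, act in used) / total_act, len(used)


monday = date(2026, 6, 1) - timedelta(days=date(2026, 6, 1).weekday())
rows = [(1, monday + timedelta(days=i), 0.2, 1.0 if i == 3 else 0.0) for i in range(7)]
rows += [(2, monday + timedelta(days=i), 0.5, 0.0) for i in range(3)]  # incomplete week
print(weekly_wmape(rows))
```

The same forecast of 0.2 units/day against one sale in the week scores 40% at weekly grain, whereas at daily grain it would be penalized on every zero day, which is the grain effect the diagnosis describes.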
|
||||||
|
|
||||||
|
### Phase 3 validation
|
||||||
|
|
||||||
|
- Next run after deploy: `forecast_accuracy` contains `overall_weekly` and `fva` values; `/dashboard/forecast/accuracy` returns them; the overview popover renders weekly WMAPE, bias %, and the naive comparison.
|
||||||
|
- After 7/14/30 days: `by_lead_time` rows appear for '8-14d', '15-30d', '31-60d' buckets respectively (61-90d after ~60 days).
|
||||||
|
- Confirm engine runtime still < ~5 min and `product_forecasts_history` growth ≈ 50–70K rows/day.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Phase 4 — Optional / after the above is proven
|
||||||
|
|
||||||
|
- **F6. TSB for slow movers + dormant** (spec in F5). Gate on Phase 3 measurement: ship only if weekly FVA improves on those phases.
|
||||||
|
- **F10. Confidence-margin source:** `load_accuracy_margins()` feeds daily-grain per-phase WMAPE (clamped to 1.0) into the intervals, so every interval is ±100% — uninformative. Once `overall_weekly` exists, add per-phase weekly rows (`by_phase_weekly`) and source margins from those instead.
|
||||||
|
- **F11.** Update or delete `backfill_accuracy_data()` (it encodes the old formulas). Until then, just don't run `--backfill`.
|
||||||
|
- **F12.** `compute_dow_indices()` weights by revenue but the multipliers are applied to units — switch `SUM(o.price * o.quantity)` to `SUM(o.quantity)`. Tiny effect.
|
||||||
|
- **F13.** Longer term: for reorder decisions the right target is P(lead-time demand > stock), not a point forecast. Evaluate quantile (pinball) loss at lead-time horizons using the existing confidence-interval columns. Design separately.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Success criteria
|
||||||
|
|
||||||
|
1. Rolling-14-day portfolio forecast/actual ratio within **0.8–1.25** (currently 1.5–2.5).
|
||||||
|
2. Weekly-grain WMAPE ≤ **90%** and **FVA > 0** (engine beats naive) sustained for 2+ weeks.
|
||||||
|
3. Decay/preorder/mature per-phase bias within ±0.1 units/day (currently +0.35 / +0.85 / +0.17).
|
||||||
|
4. `all_incl_dormant` actuals covered: dormant bias better than −0.4 (currently −1.36, i.e. 100% miss).
|
||||||
|
5. Lead-time buckets through 31–60d populated with ≥10K samples each within ~6 weeks.
|
||||||
|
6. Launch phase stays healthy (bias within ±0.15, WMAPE not degraded) — regression guard for F3/F4 changes.
|
||||||
|
|
||||||
|
## 5. Re-measurement appendix
|
||||||
|
|
||||||
|
The naive-vs-engine comparison used in the diagnosis (rerun any time; adjust dates):
|
||||||
|
|
||||||
|
```sql
WITH ranked AS (
  SELECT pfh.pid, pfh.forecast_date, pfh.forecast_units, pfh.lifecycle_phase,
         ROW_NUMBER() OVER (PARTITION BY pfh.pid, pfh.forecast_date ORDER BY fr.started_at DESC) rn
  FROM product_forecasts_history pfh
  JOIN forecast_runs fr ON fr.id = pfh.run_id
  WHERE pfh.forecast_date BETWEEN CURRENT_DATE - 9 AND CURRENT_DATE - 1),
eng AS (SELECT * FROM ranked WHERE rn = 1 AND lifecycle_phase != 'dormant'),
naive AS (
  SELECT o.pid, SUM(o.quantity)/30.0 AS naive_daily FROM orders o
  WHERE o.canceled IS DISTINCT FROM TRUE
    AND o.date >= CURRENT_DATE - 39 AND o.date < CURRENT_DATE - 9
  GROUP BY o.pid)
SELECT e.lifecycle_phase, COUNT(*) AS n, SUM(COALESCE(dps.units_sold,0)) AS actual,
       round(SUM(e.forecast_units),0) AS engine_fc, round(SUM(COALESCE(nv.naive_daily,0)),0) AS naive_fc,
       round(SUM(ABS(e.forecast_units - COALESCE(dps.units_sold,0)))/NULLIF(SUM(COALESCE(dps.units_sold,0)),0),2) AS engine_wmape,
       round(SUM(ABS(COALESCE(nv.naive_daily,0) - COALESCE(dps.units_sold,0)))/NULLIF(SUM(COALESCE(dps.units_sold,0)),0),2) AS naive_wmape
FROM eng e
LEFT JOIN naive nv ON nv.pid = e.pid
LEFT JOIN daily_product_snapshots dps ON dps.pid = e.pid AND dps.snapshot_date = e.forecast_date
GROUP BY ROLLUP(e.lifecycle_phase) ORDER BY 1;
```
|
||||||
|
|
||||||
|
Baseline numbers to beat (June 1–9, 2026): engine 221% / naive 204% daily WMAPE; engine_fc/actual = 1.82; per-phase table in §1.
|
||||||
Binary file not shown.
@@ -634,6 +634,52 @@ def forecast_from_curve(curve_params, scale_factor, age_days, horizon_days):
     return np.array(forecasts)


+def forecast_preorder(curve_params, scale_factor, days_until_arrival,
+                      preorder_daily_rate, horizon_days):
+    """
+    Piecewise pre-order forecast: a flat observed pre-order trickle until the
+    product is expected to arrive, then the scaled launch curve from age 0.
+
+    The launch curve was fit on POST-receipt order history, so running it from
+    today (while the product is still weeks from arriving) front-loads full
+    first-week launch volume that hasn't happened yet — the main driver of the
+    ~2.15x preorder over-forecast. Instead we forecast the slow pre-order rate
+    up to the arrival date, then start the curve's day 0 on that date.
+    See FORECAST_FIX_PLAN F4.
+
+    Args:
+        curve_params: (amplitude, decay_rate, baseline, ...) weekly curve
+        scale_factor: per-product multiplier for the post-arrival curve envelope
+        days_until_arrival: calendar days from today until expected arrival
+        preorder_daily_rate: observed pre-order units/day (trickle)
+        horizon_days: forecast horizon length
+
+    Returns:
+        array of daily forecast values of length horizon_days
+    """
+    amplitude, decay_rate, baseline = curve_params[:3]
+    forecasts = np.zeros(horizon_days)
+
+    # Clamp the arrival offset into the horizon
+    dua = int(max(0, min(days_until_arrival, horizon_days)))
+
+    # Pre-arrival segment: flat pre-order trickle, capped at the curve's scaled
+    # week-0 daily value (a pre-order day shouldn't out-sell the launch peak).
+    if dua > 0:
+        week0_daily = (amplitude / 7.0) * scale_factor + (baseline / 7.0)
+        pre_rate = preorder_daily_rate
+        if week0_daily > 0:
+            pre_rate = min(pre_rate, week0_daily)
+        forecasts[:dua] = max(0.0, pre_rate)
+
+    # Post-arrival segment: scaled launch curve, curve day 0 = arrival date.
+    if dua < horizon_days:
+        curve_part = forecast_from_curve(curve_params, scale_factor, 0, horizon_days - dua)
+        forecasts[dua:] = curve_part
+
+    return forecasts
+
+
 # ---------------------------------------------------------------------------
 # Batch data loading (eliminates N+1 per-product queries)
 # ---------------------------------------------------------------------------
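
Review note (illustrative, not part of the commit): the shape `forecast_preorder()` produces can be sketched without NumPy. Everything below is an assumption for demonstration: `toy_curve_daily` is a hypothetical stand-in for `forecast_from_curve()`, whose real body is not shown in this diff, and the parameter values are made up.

```python
import math


def toy_curve_daily(amplitude, decay_rate, baseline, scale, n_days):
    """Hypothetical stand-in for forecast_from_curve(): a weekly exponential
    decay curve, spread evenly over the 7 days of each week."""
    out = []
    for day in range(n_days):
        week = day // 7
        weekly = amplitude * math.exp(-decay_rate * week) * scale + baseline
        out.append(weekly / 7.0)
    return out


def piecewise_preorder(amplitude, decay_rate, baseline, scale,
                       days_until_arrival, preorder_daily_rate, horizon_days):
    """Flat pre-order trickle until arrival, then the curve from its day 0."""
    # Clamp the arrival offset into the horizon.
    dua = int(max(0, min(days_until_arrival, horizon_days)))
    # Cap the trickle at the curve's scaled week-0 daily value.
    week0_daily = (amplitude / 7.0) * scale + (baseline / 7.0)
    pre_rate = preorder_daily_rate
    if week0_daily > 0:
        pre_rate = min(pre_rate, week0_daily)
    pre_rate = max(0.0, pre_rate)
    post = toy_curve_daily(amplitude, decay_rate, baseline, scale, horizon_days - dua)
    return [pre_rate] * dua + post


# Arrival in 3 days, 10-day horizon: three trickle days, then launch volume.
example = piecewise_preorder(14.0, 0.5, 7.0, 1.0, 3, 0.5, 10)
print(example)
```

With an arrival date beyond the horizon, the whole forecast stays at the capped trickle rate, which is the behaviour the clamp on `dua` is there to guarantee.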
@@ -651,9 +697,11 @@ def batch_load_product_data(conn, products):
     data = {
         'preorder_sales': {},
         'preorder_days': {},
+        'preorder_arrival_days': {},
         'launch_sales': {},
         'decay_velocity': {},
         'mature_history': {},
+        'dormant_rate': {},
     }

     # Pre-order sales: orders placed BEFORE first received date
@@ -677,6 +725,39 @@ def batch_load_product_data(conn, products):
             data['preorder_days'][int(row['pid'])] = float(row['preorder_days'])
         log.info(f"Batch loaded pre-order sales for {len(data['preorder_sales'])}/{len(preorder_pids)} preorder products")

+        # Expected arrival per pre-order product, to time the launch curve.
+        # Prefer the soonest FUTURE expected_date on an open PO; if the only open
+        # PO has a past expected_date assume 7 days; if there's no open PO at all
+        # assume 14 days. See FORECAST_FIX_PLAN F4.
+        arrival_sql = """
+            SELECT pid,
+                   MIN(expected_date) FILTER (
+                       WHERE expected_date IS NOT NULL AND expected_date >= CURRENT_DATE
+                   ) AS future_arrival
+            FROM purchase_orders
+            WHERE pid = ANY(%s)
+              AND status IN ('created', 'ordered', 'electronically_sent', 'receiving_started')
+            GROUP BY pid
+        """
+        adf = execute_query(conn, arrival_sql, [preorder_pids])
+        today = date.today()
+        for _, row in adf.iterrows():
+            pid = int(row['pid'])
+            fa = row['future_arrival']
+            if pd.notna(fa):
+                fa_date = pd.Timestamp(fa).date()
+                data['preorder_arrival_days'][pid] = max(0, (fa_date - today).days)
+            else:
+                data['preorder_arrival_days'][pid] = 7  # open PO, expected_date already past
+        no_po = 0
+        for pid in preorder_pids:
+            if int(pid) not in data['preorder_arrival_days']:
+                data['preorder_arrival_days'][int(pid)] = 14  # no open PO at all
+                no_po += 1
+        log.info(f"Batch loaded preorder arrival for "
+                 f"{len(data['preorder_arrival_days']) - no_po}/{len(preorder_pids)} via open POs, "
+                 f"{no_po} defaulted to 14d")
+
     # Launch sales: first 14 days after first received
     launch_pids = products[products['phase'] == 'launch']['pid'].tolist()
     if launch_pids:
@@ -694,15 +775,23 @@ def batch_load_product_data(conn, products):
             data['launch_sales'][int(row['pid'])] = float(row['total_sold'])
         log.info(f"Batch loaded launch sales for {len(data['launch_sales'])}/{len(launch_pids)} launch products")

-    # Decay recent velocity: average daily sales over last 30 days
+    # Decay recent velocity: TRUE calendar-daily average over the last 30 days.
+    # We divide the summed units by calendar days (clipped to the product's age),
+    # NOT by the number of snapshot rows. Snapshots are sparse and mostly land on
+    # sold-days, so AVG(units_sold) averages over sold-days only and inflated the
+    # decay rate ~4x (measured 1.353 vs true 0.332 units/day). See FORECAST_FIX_PLAN F1.
     decay_pids = products[products['phase'] == 'decay']['pid'].tolist()
     if decay_pids:
         sql = """
-            SELECT dps.pid, AVG(COALESCE(dps.units_sold, 0)) AS avg_daily
+            SELECT dps.pid,
+                   SUM(COALESCE(dps.units_sold, 0))::float
+                   / GREATEST(LEAST(30, (CURRENT_DATE - pm.date_first_received::date)), 1) AS avg_daily
             FROM daily_product_snapshots dps
+            JOIN product_metrics pm ON pm.pid = dps.pid
             WHERE dps.pid = ANY(%s)
               AND dps.snapshot_date >= CURRENT_DATE - INTERVAL '30 days'
-            GROUP BY dps.pid
+              AND dps.snapshot_date >= pm.date_first_received::date
+            GROUP BY dps.pid, pm.date_first_received
         """
         df = execute_query(conn, sql, [decay_pids])
         for _, row in df.iterrows():
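
Review note (illustrative, not part of the commit): the difference between the old and new divisor is easy to see on a toy product. The snapshot rows below are invented, and both helper names are hypothetical; the divisor logic mirrors `GREATEST(LEAST(30, age), 1)` from the new query.

```python
def sold_day_average(rows):
    """Old behaviour: mean over the snapshot rows that exist (what SQL AVG does)."""
    if not rows:
        return 0.0
    return sum(units for _, units in rows) / len(rows)


def calendar_day_average(rows, age_days, window=30):
    """New behaviour: total units divided by calendar days in the window,
    clipped to the product's age, with a floor of one day."""
    divisor = max(min(window, age_days), 1)
    return sum(units for _, units in rows) / divisor


# Hypothetical sparse snapshots: (days_ago, units_sold). Rows exist only on
# the three days the product actually sold.
snapshots = [(2, 2), (11, 1), (25, 1)]

old_rate = sold_day_average(snapshots)
new_rate = calendar_day_average(snapshots, age_days=200)
print(f"sold-day average: {old_rate:.3f}, calendar-day average: {new_rate:.3f}")
```

On this made-up product the sold-day average is ten times the calendar-day rate; the ~4x figure in the comment above is the measured portfolio-level equivalent.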
@@ -724,6 +813,25 @@ def batch_load_product_data(conn, products):
             data['mature_history'][int(pid)] = group.copy()
         log.info(f"Batch loaded history for {len(data['mature_history'])}/{len(mature_pids)} mature products")

+    # Dormant trailing order rate: dormant products forecast 0 by default, but
+    # ~11K of them still sell (restocks, promos, long-tail) — ~11% of all demand
+    # currently forecast as a hard zero. Load a trailing-180-day daily order rate
+    # so the dormant branch can carry a small positive rate. See FORECAST_FIX_PLAN F5.
+    dormant_pids = products[products['phase'] == 'dormant']['pid'].tolist()
+    if dormant_pids:
+        sql = """
+            SELECT o.pid, SUM(o.quantity) / 180.0 AS rate
+            FROM orders o
+            WHERE o.pid = ANY(%s)
+              AND o.canceled IS DISTINCT FROM TRUE
+              AND o.date >= CURRENT_DATE - INTERVAL '180 days'
+            GROUP BY o.pid
+        """
+        df = execute_query(conn, sql, [dormant_pids])
+        for _, row in df.iterrows():
+            data['dormant_rate'][int(row['pid'])] = float(row['rate'])
+        log.info(f"Batch loaded dormant order rate for {len(data['dormant_rate'])}/{len(dormant_pids)} dormant products")
+
     return data


@@ -829,11 +937,20 @@ def forecast_mature(product, history_df):
         # Not enough data — flat velocity
         return np.full(FORECAST_HORIZON_DAYS, velocity)

-    # Fill date gaps with 0 sales (days where product had no snapshot = no sales)
+    # Reindex over the FULL calendar window ending yesterday, not just the span
+    # between the first and last snapshot. resample() only covers first→last
+    # snapshot, so leading/trailing quiet periods are absent and the Holt level
+    # is fitted only on the product's busy span (can run ~4x too high). An
+    # explicit reindex fills every quiet calendar day with 0. (pid, snapshot_date)
+    # is unique so there is no duplicate-index risk; do NOT use combine_first
+    # (it keeps zeros over real data). See FORECAST_FIX_PLAN F2.
     hist = history_df.copy()
     hist['snapshot_date'] = pd.to_datetime(hist['snapshot_date'])
-    hist = hist.set_index('snapshot_date').resample('D').sum().fillna(0)
-    series = hist['units_sold'].values.astype(float)
+    hist = hist.set_index('snapshot_date')['units_sold']
+    full_index = pd.date_range(
+        end=pd.Timestamp(date.today() - timedelta(days=1)),
+        periods=EXP_SMOOTHING_WINDOW, freq='D')
+    series = hist.reindex(full_index, fill_value=0.0).values.astype(float)

     # Need at least 2 non-zero values for smoothing
     if np.count_nonzero(series) < 2:
@@ -956,9 +1073,24 @@ def generate_all_forecasts(conn, curves_df, dow_indices, monthly_indices=None,
     today = date.today()
     forecast_dates = [today + timedelta(days=i) for i in range(FORECAST_HORIZON_DAYS)]

-    # Pre-compute DOW and seasonal multipliers for each forecast date
+    # Pre-compute DOW and seasonal multipliers for each forecast date.
+    # DOW multipliers stay ABSOLUTE — every calibration is a multi-week average
+    # and therefore DOW-neutral, so reshaping by absolute DOW indices is correct.
+    # Seasonal indices must be applied RELATIVE to the calibration period:
+    # each per-product calibration (decay velocity, mature Holt level, launch /
+    # preorder scale) is fitted on raw recent actuals that already embed the
+    # current month's seasonal level. Multiplying by the absolute target-month
+    # index double-counts seasonality (~25% over-forecast at the May→June sale
+    # transition, worse near November). Divide by the trailing-30-day average
+    # index so only the seasonal *change* from calibration to target applies.
+    # See FORECAST_FIX_PLAN F3.
     dow_multipliers = [dow_indices.get(d.isoweekday(), 1.0) for d in forecast_dates]
-    seasonal_multipliers = [monthly_indices.get(d.month, 1.0) for d in forecast_dates]
+    trailing = [today - timedelta(days=i) for i in range(1, 31)]
+    calibration_index = float(np.mean([monthly_indices.get(d.month, 1.0) for d in trailing]))
+    seasonal_multipliers = [
+        monthly_indices.get(d.month, 1.0) / max(calibration_index, 0.1)
+        for d in forecast_dates
+    ]

     # TRUNCATE before streaming writes
     with conn.cursor() as cur:
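
Review note (illustrative, not part of the commit): a worked instance of the relative seasonal index. The monthly index values are hypothetical (May 25% above average, June average), and `relative_seasonal_multipliers` is a name invented for this sketch; the arithmetic follows the trailing-30-day normalisation in the hunk above.

```python
from datetime import date, timedelta


def relative_seasonal_multipliers(monthly_indices, today, horizon_days, calib_days=30):
    """Return (calibration_index, multipliers), where each multiplier is the
    target-month index divided by the average index over the trailing window."""
    trailing = [today - timedelta(days=i) for i in range(1, calib_days + 1)]
    calibration_index = sum(monthly_indices.get(d.month, 1.0) for d in trailing) / len(trailing)
    denom = max(calibration_index, 0.1)  # same guard against a near-zero index
    forecast_dates = [today + timedelta(days=i) for i in range(horizon_days)]
    multipliers = [monthly_indices.get(d.month, 1.0) / denom for d in forecast_dates]
    return calibration_index, multipliers


# Hypothetical indices for the May -> June transition.
indices = {5: 1.25, 6: 1.0}
calib, mults = relative_seasonal_multipliers(indices, date(2026, 6, 10), horizon_days=3)

# The trailing window covers May 11-31 (21 days) and June 1-9 (9 days).
print(f"calibration index: {calib:.3f}, first multiplier: {mults[0]:.3f}")
```

Under these assumed indices the June forecast is scaled down by roughly 15% rather than left at 1.0, because most of the calibration window still sat in the higher May level.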
@@ -1002,9 +1134,33 @@ def generate_all_forecasts(conn, curves_df, dow_indices, monthly_indices=None,
         try:
             curve_info = get_curve_for_product(product, curves_df)

-            if phase in ('preorder', 'launch'):
+            if phase == 'preorder':
                 if curve_info:
-                    scale = compute_scale_factor(phase, product, curve_info, batch_data)
+                    scale = compute_scale_factor('preorder', product, curve_info, batch_data)
+                    # Time the launch curve to expected arrival instead of
+                    # running it from today (F4). Pre-arrival days carry the
+                    # observed pre-order trickle rate.
+                    days_until_arrival = batch_data['preorder_arrival_days'].get(pid, 14)
+                    preorder_units = batch_data['preorder_sales'].get(pid, 0)
+                    preorder_days = batch_data['preorder_days'].get(pid, 1)
+                    preorder_daily_rate = preorder_units / max(preorder_days, 1)
+                    forecasts = forecast_preorder(
+                        curve_info, scale, days_until_arrival,
+                        preorder_daily_rate, FORECAST_HORIZON_DAYS)
+                    method = 'lifecycle_curve'
+                else:
+                    # No reliable curve — fall back to velocity if available
+                    velocity = product.get('sales_velocity_daily') or 0
+                    if velocity > 0:
+                        forecasts = np.full(FORECAST_HORIZON_DAYS, velocity)
+                        method = 'velocity'
+                    else:
+                        forecasts = forecast_dormant()
+                        method = 'zero'
+
+            elif phase == 'launch':
+                if curve_info:
+                    scale = compute_scale_factor('launch', product, curve_info, batch_data)
                     forecasts = forecast_from_curve(curve_info, scale, age, FORECAST_HORIZON_DAYS)
                     method = 'lifecycle_curve'
                 else:
@@ -1038,6 +1194,14 @@ def generate_all_forecasts(conn, curves_df, dow_indices, monthly_indices=None,
                     method = 'velocity'

             else: # dormant
+                # Carry a small positive rate for dormant products that still
+                # trickle sales (restocks/promos/long-tail); only truly dead
+                # products stay at zero. See FORECAST_FIX_PLAN F5.
+                rate = batch_data['dormant_rate'].get(pid, 0)
+                if rate > 0:
+                    forecasts = np.full(FORECAST_HORIZON_DAYS, rate)
+                    method = 'velocity'
+                else:
                     forecasts = forecast_dormant()
                     method = 'zero'

@@ -1108,6 +1272,8 @@ def archive_forecasts(conn, run_id):
         """)
         cur.execute("CREATE INDEX IF NOT EXISTS idx_pfh_date ON product_forecasts_history(forecast_date)")
         cur.execute("CREATE INDEX IF NOT EXISTS idx_pfh_pid_date ON product_forecasts_history(pid, forecast_date)")
+        # Naive-baseline column for forecast value-added (FVA). See FORECAST_FIX_PLAN F8.
+        cur.execute("ALTER TABLE product_forecasts_history ADD COLUMN IF NOT EXISTS naive_units NUMERIC(10,2)")

         # Find the previous completed run (whose forecasts are still in product_forecasts)
         cur.execute("""
@@ -1124,15 +1290,27 @@ def archive_forecasts(conn, run_id):

         prev_run_id = prev_run[0]

-        # Archive only past-date forecasts (where actuals now exist)
+        # Archive only past-date forecasts (where actuals now exist). Attach the
+        # naive baseline (flat trailing-28-day daily average) at the same time so
+        # forecast value-added can be measured. See FORECAST_FIX_PLAN F8.
         cur.execute("""
             INSERT INTO product_forecasts_history
                 (run_id, pid, forecast_date, forecast_units, forecast_revenue,
-                 lifecycle_phase, forecast_method, confidence_lower, confidence_upper, generated_at)
-            SELECT %s, pid, forecast_date, forecast_units, forecast_revenue,
-                   lifecycle_phase, forecast_method, confidence_lower, confidence_upper, generated_at
-            FROM product_forecasts
-            WHERE forecast_date < CURRENT_DATE
+                 lifecycle_phase, forecast_method, confidence_lower, confidence_upper,
+                 generated_at, naive_units)
+            SELECT %s, pf.pid, pf.forecast_date, pf.forecast_units, pf.forecast_revenue,
+                   pf.lifecycle_phase, pf.forecast_method, pf.confidence_lower, pf.confidence_upper,
+                   pf.generated_at, COALESCE(nv.naive_daily, 0)
+            FROM product_forecasts pf
+            LEFT JOIN (
+                SELECT o.pid, SUM(o.quantity) / 28.0 AS naive_daily
+                FROM orders o
+                WHERE o.canceled IS DISTINCT FROM TRUE
+                  AND o.date >= CURRENT_DATE - INTERVAL '28 days'
+                  AND o.date < CURRENT_DATE
+                GROUP BY o.pid
+            ) nv ON nv.pid = pf.pid
+            WHERE pf.forecast_date < CURRENT_DATE
             ON CONFLICT (run_id, pid, forecast_date) DO NOTHING
         """, (prev_run_id,))

@@ -1154,6 +1332,48 @@ def archive_forecasts(conn, run_id):
     return archived


+def archive_future_leads(conn, run_id):
+    """
+    Archive a sampled set of FUTURE-lead forecasts from the just-generated
+    product_forecasts, attributed to the current run.
+
+    The past-date archive in archive_forecasts() only ever captures the 1-day
+    slice that just elapsed, so every accuracy sample lands in the '1-7d' lead
+    bucket and the 15/30/60/90-day forecasts that purchasing actually rides on
+    are never validated. Here we snapshot the 7/14/30/60/89-day-ahead leads
+    (non-dormant) so that, once each date passes, compute_accuracy() can score
+    them in their lead bucket. The naive baseline is attached the same way as in
+    the past-date path. Future-dated rows survive the 90-day prune until their
+    own date passes. See FORECAST_FIX_PLAN F7.
+    """
+    with conn.cursor() as cur:
+        cur.execute("""
+            INSERT INTO product_forecasts_history
+                (run_id, pid, forecast_date, forecast_units, forecast_revenue,
+                 lifecycle_phase, forecast_method, confidence_lower, confidence_upper,
+                 generated_at, naive_units)
+            SELECT %s, pf.pid, pf.forecast_date, pf.forecast_units, pf.forecast_revenue,
+                   pf.lifecycle_phase, pf.forecast_method, pf.confidence_lower, pf.confidence_upper,
+                   pf.generated_at, COALESCE(nv.naive_daily, 0)
+            FROM product_forecasts pf
+            LEFT JOIN (
+                SELECT o.pid, SUM(o.quantity) / 28.0 AS naive_daily
+                FROM orders o
+                WHERE o.canceled IS DISTINCT FROM TRUE
+                  AND o.date >= CURRENT_DATE - INTERVAL '28 days'
+                  AND o.date < CURRENT_DATE
+                GROUP BY o.pid
+            ) nv ON nv.pid = pf.pid
+            WHERE pf.lifecycle_phase != 'dormant'
+              AND pf.forecast_date - CURRENT_DATE IN (7, 14, 30, 60, 89)
+            ON CONFLICT (run_id, pid, forecast_date) DO NOTHING
+        """, (run_id,))
+        archived = cur.rowcount
+        conn.commit()
+    log.info(f"Archived {archived} future-lead forecast rows (7/14/30/60/89d) for run {run_id}")
+    return archived
+
+
 def compute_accuracy(conn, run_id):
     """
     Compute forecast accuracy metrics from archived history vs. actual sales.
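
Review note (illustrative, not part of the commit): it can help to check which lead bucket each sampled lead will be scored in. The function below is a hypothetical Python mirror of the `lead_bucket` CASE expression this commit adds to `compute_accuracy()`. It assumes the run's `started_at` date equals the date the archive ran, so the stored lead equals the sampled offset.

```python
def lead_bucket(lead_days: int) -> str:
    """Python mirror of the SQL CASE that assigns lead_bucket."""
    if 0 <= lead_days <= 6:
        return '1-7d'
    if 7 <= lead_days <= 13:
        return '8-14d'
    if 14 <= lead_days <= 29:
        return '15-30d'
    if 30 <= lead_days <= 59:
        return '31-60d'
    return '61-90d'  # ELSE branch: every remaining value


# Leads sampled by archive_future_leads().
SAMPLED_LEADS = (7, 14, 30, 60, 89)
buckets = {lead: lead_bucket(lead) for lead in SAMPLED_LEADS}
print(buckets)
```

Each sampled lead sits at the lower edge of its bucket, and the 60-day and 89-day samples share the '61-90d' bucket, so that bucket receives two samples per product and run while the others receive one.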
@@ -1162,11 +1382,18 @@ def compute_accuracy(conn, run_id):
     (pid, forecast_date = snapshot_date) to compare forecasted vs. actual units.

     Stores results in forecast_accuracy table, broken down by:
-    - overall: single aggregate row
+    - overall: two rows — 'all' (non-dormant) and 'all_incl_dormant' (F5)
+    - overall_weekly: per-product weekly-grain WMAPE — the informative headline
+      for intermittent demand (daily grain has a ~190% floor) (F9)
     - by_phase: per lifecycle phase
-    - by_lead_time: bucketed by how far ahead the forecast was
+    - by_lead_time: bucketed by how far ahead the forecast was — long-lead
+      buckets populate as the future-lead archives mature (F7)
     - by_method: per forecast method
     - daily: per forecast_date (for trend charts)
+
+    Every dimension also stores naive_wmape (flat trailing-28d baseline) and
+    fva = 1 - wmape/naive_wmape, so the engine can be judged as value-over-naive
+    (F8). Only realized dates (forecast_date < CURRENT_DATE) are scored.
     """
     with conn.cursor() as cur:
         # Ensure accuracy table exists
@@ -1186,6 +1413,10 @@ def compute_accuracy(conn, run_id):
                 PRIMARY KEY (run_id, metric_type, dimension_value)
             )
         """)
+        # Naive-baseline WMAPE and forecast value-added (FVA = 1 - wmape/naive_wmape).
+        # See FORECAST_FIX_PLAN F8.
+        cur.execute("ALTER TABLE forecast_accuracy ADD COLUMN IF NOT EXISTS naive_wmape NUMERIC(10,4)")
+        cur.execute("ALTER TABLE forecast_accuracy ADD COLUMN IF NOT EXISTS fva NUMERIC(10,4)")
         conn.commit()

         # Check if we have any history to analyze
@@ -1195,109 +1426,109 @@ def compute_accuracy(conn, run_id):
             log.info("No forecast history available for accuracy computation")
             return

-        # For each (pid, forecast_date) pair, keep only the most recent run's
-        # forecast row. This prevents double-counting when multiple runs have
-        # archived forecasts for the same product×date combination.
-        accuracy_cte = """
-            WITH ranked_history AS (
+        # Base CTEs (FORECAST_FIX_PLAN F7):
+        # - Only score realized dates (forecast_date < CURRENT_DATE); future-lead
+        # archives are excluded until their date passes.
+        # - short_lead*: lead 0-6 deduped per (pid, forecast_date) — preserves the
+        # meaning of the existing headline metrics. short_lead_eval keeps the
+        # raw snapshot grid (incl. zero-zero days) for complete-week detection;
+        # `accuracy` drops zero-zero days for daily-grain metrics.
+        # - lead_dedup/lead_accuracy: deduped per (pid, forecast_date, lead_bucket)
+        # so each long-lead bucket gets its own sample (the by_lead_time table).
+        base_cte = """
+            WITH ranked_all AS (
                 SELECT
-                    pfh.*,
+                    pfh.pid, pfh.forecast_date, pfh.forecast_units, pfh.naive_units,
+                    pfh.lifecycle_phase, pfh.forecast_method,
                     fr.started_at,
-                    ROW_NUMBER() OVER (
-                        PARTITION BY pfh.pid, pfh.forecast_date
-                        ORDER BY fr.started_at DESC
-                    ) AS rn
+                    (pfh.forecast_date - fr.started_at::date) AS lead_days,
+                    CASE
+                        WHEN (pfh.forecast_date - fr.started_at::date) BETWEEN 0 AND 6 THEN '1-7d'
+                        WHEN (pfh.forecast_date - fr.started_at::date) BETWEEN 7 AND 13 THEN '8-14d'
+                        WHEN (pfh.forecast_date - fr.started_at::date) BETWEEN 14 AND 29 THEN '15-30d'
+                        WHEN (pfh.forecast_date - fr.started_at::date) BETWEEN 30 AND 59 THEN '31-60d'
+                        ELSE '61-90d'
+                    END AS lead_bucket
                 FROM product_forecasts_history pfh
                 JOIN forecast_runs fr ON fr.id = pfh.run_id
+                WHERE pfh.forecast_date < CURRENT_DATE
+            ),
+            short_lead AS (
+                SELECT *,
+                    ROW_NUMBER() OVER (
+                        PARTITION BY pid, forecast_date ORDER BY started_at DESC
+                    ) AS rn
+                FROM ranked_all
+                WHERE lead_days BETWEEN 0 AND 6
+            ),
+            short_lead_eval AS (
+                SELECT sl.pid, sl.lifecycle_phase, sl.forecast_method, sl.forecast_date,
+                    sl.forecast_units, sl.naive_units,
+                    COALESCE(dps.units_sold, 0) AS actual_units,
+                    (sl.forecast_units - COALESCE(dps.units_sold, 0)) AS error,
+                    ABS(sl.forecast_units - COALESCE(dps.units_sold, 0)) AS abs_error
+                FROM short_lead sl
+                LEFT JOIN daily_product_snapshots dps
+                    ON dps.pid = sl.pid AND dps.snapshot_date = sl.forecast_date
+                WHERE sl.rn = 1
             ),
             accuracy AS (
-                SELECT
-                    rh.lifecycle_phase,
-                    rh.forecast_method,
-                    rh.forecast_date,
-                    (rh.forecast_date - rh.started_at::date) AS lead_days,
-                    rh.forecast_units,
+                SELECT * FROM short_lead_eval
+                WHERE NOT (forecast_units = 0 AND actual_units = 0)
+            ),
+            lead_dedup AS (
+                SELECT *,
+                    ROW_NUMBER() OVER (
+                        PARTITION BY pid, forecast_date, lead_bucket ORDER BY started_at DESC
+                    ) AS rn
+                FROM ranked_all
+            ),
+            lead_accuracy AS (
+                SELECT ld.lead_bucket, ld.forecast_units, ld.naive_units,
                     COALESCE(dps.units_sold, 0) AS actual_units,
-                    (rh.forecast_units - COALESCE(dps.units_sold, 0)) AS error,
-                    ABS(rh.forecast_units - COALESCE(dps.units_sold, 0)) AS abs_error
-                FROM ranked_history rh
+                    (ld.forecast_units - COALESCE(dps.units_sold, 0)) AS error,
+                    ABS(ld.forecast_units - COALESCE(dps.units_sold, 0)) AS abs_error
+                FROM lead_dedup ld
                 LEFT JOIN daily_product_snapshots dps
-                    ON dps.pid = rh.pid AND dps.snapshot_date = rh.forecast_date
-                WHERE rh.rn = 1
-                  AND NOT (rh.forecast_units = 0 AND COALESCE(dps.units_sold, 0) = 0)
+                    ON dps.pid = ld.pid AND dps.snapshot_date = ld.forecast_date
+                WHERE ld.rn = 1
+                  AND ld.lifecycle_phase != 'dormant'
+                  AND NOT (ld.forecast_units = 0 AND COALESCE(dps.units_sold, 0) = 0)
             )
         """

-        # Compute and insert metrics for each dimension
-        dimensions = {
-            'overall': "SELECT 'all' AS dim",
-            'by_phase': "SELECT DISTINCT lifecycle_phase AS dim FROM accuracy",
-            'by_lead_time': """
-                SELECT DISTINCT
-                    CASE
-                        WHEN lead_days BETWEEN 0 AND 6 THEN '1-7d'
-                        WHEN lead_days BETWEEN 7 AND 13 THEN '8-14d'
-                        WHEN lead_days BETWEEN 14 AND 29 THEN '15-30d'
-                        WHEN lead_days BETWEEN 30 AND 59 THEN '31-60d'
-                        ELSE '61-90d'
-                    END AS dim
-                FROM accuracy
-            """,
-            'by_method': "SELECT DISTINCT forecast_method AS dim FROM accuracy",
-            'daily': "SELECT DISTINCT forecast_date::text AS dim FROM accuracy",
-        }
-
-        filter_clauses = {
-            'overall': "lifecycle_phase != 'dormant'",
-            'by_phase': "lifecycle_phase = dims.dim",
-            'by_lead_time': """
-                CASE
-                    WHEN lead_days BETWEEN 0 AND 6 THEN '1-7d'
-                    WHEN lead_days BETWEEN 7 AND 13 THEN '8-14d'
-                    WHEN lead_days BETWEEN 14 AND 29 THEN '15-30d'
-                    WHEN lead_days BETWEEN 30 AND 59 THEN '31-60d'
-                    ELSE '61-90d'
-                END = dims.dim
-            """,
-            'by_method': "forecast_method = dims.dim",
-            'daily': "forecast_date::text = dims.dim",
-        }
-
-        total_inserted = 0
-
-        for metric_type, dim_query in dimensions.items():
-            filter_clause = filter_clauses[metric_type]
-
-            sql = f"""
-                {accuracy_cte},
-                dims AS ({dim_query})
+        # Daily-grain aggregate over a source CTE aliased `a`, computing the
+        # engine WMAPE plus the naive-baseline WMAPE (NULL-safe: rows archived
+        # before F8 have naive_units NULL and are excluded from the naive sums).
+        def daily_agg(dim_expr, source, where=None, group_by=None):
+            where_sql = f"WHERE {where}" if where else ""
+            group_sql = f"GROUP BY {group_by}" if group_by else ""
+            return f"""
                 SELECT
-                    dims.dim,
+                    {dim_expr} AS dim,
                     COUNT(*) AS sample_size,
                     COALESCE(SUM(a.actual_units), 0) AS total_actual,
                     COALESCE(SUM(a.forecast_units), 0) AS total_forecast,
                     AVG(a.abs_error) AS mae,
                     CASE WHEN SUM(a.actual_units) > 0
-                        THEN SUM(a.abs_error) / SUM(a.actual_units)
-                        ELSE NULL END AS wmape,
+                        THEN SUM(a.abs_error) / SUM(a.actual_units) ELSE NULL END AS wmape,
                     AVG(a.error) AS bias,
-                    SQRT(AVG(POWER(a.error, 2))) AS rmse
-                FROM dims
-                CROSS JOIN accuracy a
-                WHERE {filter_clause}
-                GROUP BY dims.dim
+                    SQRT(AVG(POWER(a.error, 2))) AS rmse,
+                    CASE WHEN SUM(a.actual_units) FILTER (WHERE a.naive_units IS NOT NULL) > 0
+                        THEN SUM(ABS(a.naive_units - a.actual_units)) FILTER (WHERE a.naive_units IS NOT NULL)
+                            / SUM(a.actual_units) FILTER (WHERE a.naive_units IS NOT NULL)
+                        ELSE NULL END AS naive_wmape
+                FROM {source} a
+                {where_sql}
+                {group_sql}
             """

-            cur.execute(sql)
-            rows = cur.fetchall()
-
-            for row in rows:
-                dim_val, sample_size, total_actual, total_forecast, mae, wmape, bias, rmse = row
-                cur.execute("""
+        insert_sql = """
             INSERT INTO forecast_accuracy
                 (run_id, metric_type, dimension_value, sample_size,
-                 total_actual_units, total_forecast_units, mae, wmape, bias, rmse)
-            VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
+                 total_actual_units, total_forecast_units, mae, wmape, bias, rmse,
+                 naive_wmape, fva)
+            VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)
             ON CONFLICT (run_id, metric_type, dimension_value)
             DO UPDATE SET
                 sample_size = EXCLUDED.sample_size,
@@ -1305,14 +1536,89 @@ def compute_accuracy(conn, run_id):
                 total_forecast_units = EXCLUDED.total_forecast_units,
                 mae = EXCLUDED.mae, wmape = EXCLUDED.wmape,
                 bias = EXCLUDED.bias, rmse = EXCLUDED.rmse,
+                naive_wmape = EXCLUDED.naive_wmape, fva = EXCLUDED.fva,
                 computed_at = NOW()
-                """, (run_id, metric_type, dim_val, sample_size,
-                      float(total_actual), float(total_forecast),
-                      float(mae) if mae is not None else None,
-                      float(wmape) if wmape is not None else None,
-                      float(bias) if bias is not None else None,
-                      float(rmse) if rmse is not None else None))
-                total_inserted += 1
+        """
+
+        def _f(x):
+            return float(x) if x is not None else None
+
+        def run_and_insert(metric_type, sql):
+            cur.execute(base_cte + sql)
+            n = 0
+            for row in cur.fetchall():
+                (dim_val, sample_size, total_actual, total_forecast,
+                 mae, wmape, bias, rmse, naive_wmape) = row
+                fva = None
+                if wmape is not None and naive_wmape is not None and float(naive_wmape) > 0:
+                    fva = 1.0 - float(wmape) / float(naive_wmape)
+                cur.execute(insert_sql, (
+                    run_id, metric_type, dim_val, sample_size,
+                    _f(total_actual), _f(total_forecast), _f(mae), _f(wmape),
+                    _f(bias), _f(rmse), _f(naive_wmape), _f(fva)))
+                n += 1
+            return n
+
+        total_inserted = 0
+
+        # overall: two rows — 'all' (non-dormant, the headline) and
+        # 'all_incl_dormant' (everything, so the ~11% dormant demand stops being
+        # invisible). Both are short-lead (lead 0-6). F5.
+        overall_source = """(
+            SELECT a.*, 'all'::text AS dim FROM accuracy a WHERE a.lifecycle_phase != 'dormant'
+            UNION ALL
+            SELECT a.*, 'all_incl_dormant'::text AS dim FROM accuracy a
+        )"""
+        total_inserted += run_and_insert('overall',
+            daily_agg('a.dim', overall_source, group_by='a.dim'))
+
+        # by_phase / by_method / daily — short-lead daily-grain over `accuracy`.
+        total_inserted += run_and_insert('by_phase',
+            daily_agg('a.lifecycle_phase', 'accuracy', group_by='a.lifecycle_phase'))
+        total_inserted += run_and_insert('by_method',
+            daily_agg('a.forecast_method', 'accuracy', group_by='a.forecast_method'))
+        total_inserted += run_and_insert('daily',
+            daily_agg('a.forecast_date::text', 'accuracy',
+                      where="a.lifecycle_phase != 'dormant'", group_by='a.forecast_date'))
+
+        # by_lead_time — one sample per (pid, date, lead bucket) over `lead_accuracy`.
+        # Buckets beyond '1-7d' populate as the future-lead archives (F7) mature.
+        total_inserted += run_and_insert('by_lead_time',
+            daily_agg('a.lead_bucket', 'lead_accuracy', group_by='a.lead_bucket'))
+
+        # overall_weekly — the informative headline for intermittent retail demand.
+        # Aggregate the short-lead rows to (pid, complete week), then WMAPE over
|
||||||
|
# pid-weeks. Daily-grain WMAPE has a ~190% floor on this catalog; per-product
|
||||||
|
# grain over a longer window (~109% at 21 days) responds to real improvement. F9.
|
||||||
|
weekly_sql = """,
|
||||||
|
weekly AS (
|
||||||
|
SELECT pid, date_trunc('week', forecast_date) AS wk,
|
||||||
|
SUM(forecast_units) AS fc_week,
|
||||||
|
SUM(actual_units) AS act_week,
|
||||||
|
SUM(naive_units) AS naive_week,
|
||||||
|
bool_and(naive_units IS NOT NULL) AS naive_complete
|
||||||
|
FROM short_lead_eval
|
||||||
|
WHERE lifecycle_phase != 'dormant'
|
||||||
|
GROUP BY pid, date_trunc('week', forecast_date)
|
||||||
|
HAVING COUNT(*) = 7
|
||||||
|
)
|
||||||
|
SELECT 'all'::text AS dim,
|
||||||
|
COUNT(*) AS sample_size,
|
||||||
|
COALESCE(SUM(act_week), 0) AS total_actual,
|
||||||
|
COALESCE(SUM(fc_week), 0) AS total_forecast,
|
||||||
|
AVG(ABS(fc_week - act_week)) AS mae,
|
||||||
|
CASE WHEN SUM(act_week) > 0
|
||||||
|
THEN SUM(ABS(fc_week - act_week)) / SUM(act_week) ELSE NULL END AS wmape,
|
||||||
|
AVG(fc_week - act_week) AS bias,
|
||||||
|
SQRT(AVG(POWER(fc_week - act_week, 2))) AS rmse,
|
||||||
|
CASE WHEN SUM(act_week) FILTER (WHERE naive_complete) > 0
|
||||||
|
THEN SUM(ABS(naive_week - act_week)) FILTER (WHERE naive_complete)
|
||||||
|
/ SUM(act_week) FILTER (WHERE naive_complete)
|
||||||
|
ELSE NULL END AS naive_wmape
|
||||||
|
FROM weekly
|
||||||
|
WHERE NOT (fc_week = 0 AND act_week = 0)
|
||||||
|
"""
|
||||||
|
total_inserted += run_and_insert('overall_weekly', weekly_sql)
|
||||||
|
|
||||||
conn.commit()
|
conn.commit()
|
||||||
|
|
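Review note: the weekly aggregation and the FVA calculation in this hunk can be checked by hand with a small in-memory sketch. This is a minimal illustration, not the pipeline's code. It assumes one row per (pid, day), leaves out the dormant-phase filter, and the names `week_start` and `weekly_wmape_and_fva` are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Monday of d's week, mirroring PostgreSQL date_trunc('week', ...)."""
    return d - timedelta(days=d.weekday())


def weekly_wmape_and_fva(rows):
    """rows: iterable of (pid, day, forecast_units, actual_units, naive_units).

    Returns (wmape, naive_wmape, fva) as fractions, or None where undefined.
    """
    weeks = defaultdict(list)
    for pid, day, fc, act, naive in rows:
        weeks[(pid, week_start(day))].append((fc, act, naive))

    abs_err = act_total = 0.0
    naive_err = naive_act = 0.0
    for days in weeks.values():
        if len(days) != 7:  # complete weeks only (HAVING COUNT(*) = 7)
            continue
        fc_week = sum(fc for fc, _, _ in days)
        act_week = sum(act for _, act, _ in days)
        if fc_week == 0 and act_week == 0:  # skip empty pid-weeks
            continue
        abs_err += abs(fc_week - act_week)
        act_total += act_week
        # The naive baseline only counts weeks where every day has a value.
        if all(n is not None for _, _, n in days):
            naive_week = sum(n for _, _, n in days)
            naive_err += abs(naive_week - act_week)
            naive_act += act_week

    wmape = abs_err / act_total if act_total > 0 else None
    naive_wmape = naive_err / naive_act if naive_act > 0 else None
    fva = None
    if wmape is not None and naive_wmape is not None and naive_wmape > 0:
        fva = 1.0 - wmape / naive_wmape  # positive means better than naive
    return wmape, naive_wmape, fva


# Made-up data: one complete week for pid 1, one incomplete week for pid 2.
monday = week_start(date(2026, 6, 10))
actuals = [0, 0, 3, 0, 0, 0, 1]
rows = [(1, monday + timedelta(days=i), 1.0, actuals[i], 0.5) for i in range(7)]
rows += [(2, monday + timedelta(days=i), 2.0, 1, 0.5) for i in range(3)]

wmape, naive_wmape, fva = weekly_wmape_and_fva(rows)
print(wmape, naive_wmape, fva)
```

In this toy data the forecast totals 7 units against 4 actual, so the weekly WMAPE is 0.75. The naive baseline totals 3.5, so its WMAPE is 0.125 and FVA is negative. The incomplete week for pid 2 does not contribute.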
||||||
@@ -1562,6 +1868,10 @@ def main():
|
|||||||
conn, curves_df, dow_indices, monthly_indices, accuracy_margins
|
conn, curves_df, dow_indices, monthly_indices, accuracy_margins
|
||||||
)
|
)
|
||||||
|
|
||||||
|
# Phase 4b: Snapshot sampled future-lead forecasts (7/14/30/60/89d) from
|
||||||
|
# the fresh run so long-lead accuracy populates once those dates pass (F7).
|
||||||
|
archive_future_leads(conn, run_id)
|
||||||
|
|
||||||
duration = time.time() - start_time
|
duration = time.time() - start_time
|
||||||
|
|
||||||
# Record run completion (include DOW indices in metadata)
|
# Record run completion (include DOW indices in metadata)
|
||||||
|
|||||||
@@ -357,6 +357,9 @@ router.get('/forecast/metrics', async (req, res) => {
|
|||||||
|
|
||||||
const active = parseInt(totals.active_products) || 1;
|
const active = parseInt(totals.active_products) || 1;
|
||||||
const curveProducts = parseInt(totals.curve_products) || 0;
|
const curveProducts = parseInt(totals.curve_products) || 0;
|
||||||
|
// NOTE: despite the name, this is "share of active products forecast via
|
||||||
|
// lifecycle curves" (curve coverage), NOT a statistical confidence. It only
|
||||||
|
// feeds a per-day tooltip field. See FORECAST_FIX_PLAN F9 (point 4).
|
||||||
const confidenceLevel = parseFloat((curveProducts / active).toFixed(2));
|
const confidenceLevel = parseFloat((curveProducts / active).toFixed(2));
|
||||||
|
|
||||||
// Daily series from actual forecast
|
// Daily series from actual forecast
|
||||||
@@ -687,14 +690,29 @@ router.get('/forecast/accuracy', async (req, res) => {
|
|||||||
const { rows: metrics } = await executeQuery(`
|
const { rows: metrics } = await executeQuery(`
|
||||||
SELECT metric_type, dimension_value, sample_size,
|
SELECT metric_type, dimension_value, sample_size,
|
||||||
total_actual_units, total_forecast_units,
|
total_actual_units, total_forecast_units,
|
||||||
mae, wmape, bias, rmse
|
mae, wmape, bias, rmse, naive_wmape, fva
|
||||||
FROM forecast_accuracy
|
FROM forecast_accuracy
|
||||||
WHERE run_id = $1
|
WHERE run_id = $1
|
||||||
ORDER BY metric_type, dimension_value
|
ORDER BY metric_type, dimension_value
|
||||||
`, [latestRunId]);
|
`, [latestRunId]);
|
||||||
|
|
||||||
|
// Shared shaping for an "overall"-style aggregate row (daily or weekly grain).
|
||||||
|
const shapeOverall = (m) => m ? {
|
||||||
|
sampleSize: parseInt(m.sample_size),
|
||||||
|
totalActual: parseFloat(m.total_actual_units) || 0,
|
||||||
|
totalForecast: parseFloat(m.total_forecast_units) || 0,
|
||||||
|
mae: m.mae != null ? parseFloat(parseFloat(m.mae).toFixed(4)) : null,
|
||||||
|
wmape: m.wmape != null ? parseFloat((parseFloat(m.wmape) * 100).toFixed(1)) : null,
|
||||||
|
bias: m.bias != null ? parseFloat(parseFloat(m.bias).toFixed(4)) : null,
|
||||||
|
rmse: m.rmse != null ? parseFloat(parseFloat(m.rmse).toFixed(4)) : null,
|
||||||
|
naiveWmape: m.naive_wmape != null ? parseFloat((parseFloat(m.naive_wmape) * 100).toFixed(1)) : null,
|
||||||
|
fva: m.fva != null ? parseFloat(parseFloat(m.fva).toFixed(3)) : null,
|
||||||
|
} : null;
|
||||||
|
|
||||||
// Organize into response structure
|
// Organize into response structure
|
||||||
const overall = metrics.find(m => m.metric_type === 'overall');
|
const overall = metrics.find(m => m.metric_type === 'overall' && m.dimension_value === 'all');
|
||||||
|
const overallInclDormant = metrics.find(m => m.metric_type === 'overall' && m.dimension_value === 'all_incl_dormant');
|
||||||
|
const overallWeekly = metrics.find(m => m.metric_type === 'overall_weekly');
|
||||||
const byPhase = metrics
|
const byPhase = metrics
|
||||||
.filter(m => m.metric_type === 'by_phase')
|
.filter(m => m.metric_type === 'by_phase')
|
||||||
.map(m => ({
|
.map(m => ({
|
||||||
@@ -706,6 +724,8 @@ router.get('/forecast/accuracy', async (req, res) => {
|
|||||||
wmape: m.wmape != null ? parseFloat((parseFloat(m.wmape) * 100).toFixed(1)) : null,
|
wmape: m.wmape != null ? parseFloat((parseFloat(m.wmape) * 100).toFixed(1)) : null,
|
||||||
bias: m.bias != null ? parseFloat(parseFloat(m.bias).toFixed(4)) : null,
|
bias: m.bias != null ? parseFloat(parseFloat(m.bias).toFixed(4)) : null,
|
||||||
rmse: m.rmse != null ? parseFloat(parseFloat(m.rmse).toFixed(4)) : null,
|
rmse: m.rmse != null ? parseFloat(parseFloat(m.rmse).toFixed(4)) : null,
|
||||||
|
naiveWmape: m.naive_wmape != null ? parseFloat((parseFloat(m.naive_wmape) * 100).toFixed(1)) : null,
|
||||||
|
fva: m.fva != null ? parseFloat(parseFloat(m.fva).toFixed(3)) : null,
|
||||||
}))
|
}))
|
||||||
.sort((a, b) => (b.totalActual || 0) - (a.totalActual || 0));
|
.sort((a, b) => (b.totalActual || 0) - (a.totalActual || 0));
|
||||||
|
|
||||||
@@ -763,6 +783,26 @@ router.get('/forecast/accuracy', async (req, res) => {
|
|||||||
sampleSize: parseInt(r.sample_size),
|
sampleSize: parseInt(r.sample_size),
|
||||||
}));
|
}));
|
||||||
|
|
||||||
|
// Weekly-grain trend across runs (starts empty for old runs that predate
|
||||||
|
// the overall_weekly metric — that's expected, no backfill). F9.
|
||||||
|
const { rows: weeklyTrendRows } = await executeQuery(`
|
||||||
|
SELECT fr.finished_at::date AS run_date,
|
||||||
|
fa.wmape, fa.naive_wmape, fa.fva, fa.sample_size
|
||||||
|
FROM forecast_accuracy fa
|
||||||
|
JOIN forecast_runs fr ON fr.id = fa.run_id
|
||||||
|
WHERE fa.metric_type = 'overall_weekly'
|
||||||
|
AND fa.dimension_value = 'all'
|
||||||
|
ORDER BY fr.finished_at
|
||||||
|
`);
|
||||||
|
|
||||||
|
const accuracyTrendWeekly = weeklyTrendRows.map(r => ({
|
||||||
|
date: r.run_date instanceof Date ? r.run_date.toISOString().split('T')[0] : r.run_date,
|
||||||
|
wmape: r.wmape != null ? parseFloat((parseFloat(r.wmape) * 100).toFixed(1)) : null,
|
||||||
|
naiveWmape: r.naive_wmape != null ? parseFloat((parseFloat(r.naive_wmape) * 100).toFixed(1)) : null,
|
||||||
|
fva: r.fva != null ? parseFloat(parseFloat(r.fva).toFixed(3)) : null,
|
||||||
|
sampleSize: parseInt(r.sample_size),
|
||||||
|
}));
|
||||||
|
|
||||||
res.json({
|
res.json({
|
||||||
hasData: true,
|
hasData: true,
|
||||||
computedAt,
|
computedAt,
|
||||||
@@ -775,20 +815,15 @@ router.get('/forecast/accuracy', async (req, res) => {
|
|||||||
? historyInfo.latest_date.toISOString().split('T')[0]
|
? historyInfo.latest_date.toISOString().split('T')[0]
|
||||||
: historyInfo.latest_date,
|
: historyInfo.latest_date,
|
||||||
},
|
},
|
||||||
overall: overall ? {
|
overall: shapeOverall(overall),
|
||||||
sampleSize: parseInt(overall.sample_size),
|
overallInclDormant: shapeOverall(overallInclDormant),
|
||||||
totalActual: parseFloat(overall.total_actual_units) || 0,
|
overallWeekly: shapeOverall(overallWeekly),
|
||||||
totalForecast: parseFloat(overall.total_forecast_units) || 0,
|
|
||||||
mae: overall.mae != null ? parseFloat(parseFloat(overall.mae).toFixed(4)) : null,
|
|
||||||
wmape: overall.wmape != null ? parseFloat((parseFloat(overall.wmape) * 100).toFixed(1)) : null,
|
|
||||||
bias: overall.bias != null ? parseFloat(parseFloat(overall.bias).toFixed(4)) : null,
|
|
||||||
rmse: overall.rmse != null ? parseFloat(parseFloat(overall.rmse).toFixed(4)) : null,
|
|
||||||
} : null,
|
|
||||||
byPhase,
|
byPhase,
|
||||||
byLeadTime,
|
byLeadTime,
|
||||||
byMethod,
|
byMethod,
|
||||||
dailyTrend,
|
dailyTrend,
|
||||||
accuracyTrend,
|
accuracyTrend,
|
||||||
|
accuracyTrendWeekly,
|
||||||
});
|
});
|
||||||
} catch (err) {
|
} catch (err) {
|
||||||
console.error('Error fetching forecast accuracy:', err);
|
console.error('Error fetching forecast accuracy:', err);
|
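Review note on units: the route converts stored WMAPE fractions to percentages but leaves `fva` as a fraction, which the UI then multiplies by 100 when formatting. The Python analogue below is only a sketch of that convention. `to_percent` and `shape_overall` are hypothetical names, the input values are the daily-grain figures quoted in the plan's diagnosis (221% engine vs 204% naive), and it assumes numeric columns arrive as strings.

```python
def to_percent(fraction, digits=1):
    """Stored fraction (e.g. 2.21) -> display percent (221.0); None stays None."""
    return None if fraction is None else round(float(fraction) * 100, digits)


def shape_overall(row):
    """Illustrative analogue of the route's shaping; `row` is a dict or None."""
    if row is None:
        return None
    fva = row.get("fva")
    return {
        "wmape": to_percent(row.get("wmape")),
        "naiveWmape": to_percent(row.get("naive_wmape")),
        # FVA stays a fraction here; the UI turns it into a signed percent.
        "fva": None if fva is None else round(float(fva), 3),
    }


shaped = shape_overall({"wmape": "2.21", "naive_wmape": "2.04", "fva": "-0.083333"})
print(shaped)
```

A missing row shapes to `None`, which lets the front end fall back from the weekly headline to the daily one.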
||||||
|
|||||||
@@ -2,7 +2,7 @@ import { useQuery } from "@tanstack/react-query"
|
|||||||
import { apiFetch } from '@/utils/api';
|
import { apiFetch } from '@/utils/api';
|
||||||
import { BarChart, Bar, ResponsiveContainer, XAxis, YAxis, Tooltip as RechartsTooltip, Cell, LineChart, Line } from "recharts"
|
import { BarChart, Bar, ResponsiveContainer, XAxis, YAxis, Tooltip as RechartsTooltip, Cell, LineChart, Line } from "recharts"
|
||||||
import config from "@/config"
|
import config from "@/config"
|
||||||
import { Target, TrendingDown, ArrowUpDown } from "lucide-react"
|
import { Target, TrendingDown, ArrowUpDown, Swords } from "lucide-react"
|
||||||
import { Tooltip as UITooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"
|
import { Tooltip as UITooltip, TooltipContent, TooltipProvider, TooltipTrigger } from "@/components/ui/tooltip"
|
||||||
import { PHASE_CONFIG } from "@/utils/lifecyclePhases"
|
import { PHASE_CONFIG } from "@/utils/lifecyclePhases"
|
||||||
|
|
||||||
@@ -14,6 +14,8 @@ interface OverallMetrics {
|
|||||||
wmape: number | null
|
wmape: number | null
|
||||||
bias: number | null
|
bias: number | null
|
||||||
rmse: number | null
|
rmse: number | null
|
||||||
|
naiveWmape?: number | null
|
||||||
|
fva?: number | null
|
||||||
}
|
}
|
||||||
|
|
||||||
interface PhaseAccuracy {
|
interface PhaseAccuracy {
|
||||||
@@ -25,6 +27,8 @@ interface PhaseAccuracy {
|
|||||||
wmape: number | null
|
wmape: number | null
|
||||||
bias: number | null
|
bias: number | null
|
||||||
rmse: number | null
|
rmse: number | null
|
||||||
|
naiveWmape?: number | null
|
||||||
|
fva?: number | null
|
||||||
}
|
}
|
||||||
|
|
||||||
interface LeadTimeAccuracy {
|
interface LeadTimeAccuracy {
|
||||||
@@ -51,11 +55,14 @@ interface AccuracyData {
|
|||||||
daysOfHistory?: number
|
daysOfHistory?: number
|
||||||
historyRange?: { from: string; to: string }
|
historyRange?: { from: string; to: string }
|
||||||
overall?: OverallMetrics
|
overall?: OverallMetrics
|
||||||
|
overallInclDormant?: OverallMetrics
|
||||||
|
overallWeekly?: OverallMetrics
|
||||||
byPhase?: PhaseAccuracy[]
|
byPhase?: PhaseAccuracy[]
|
||||||
byLeadTime?: LeadTimeAccuracy[]
|
byLeadTime?: LeadTimeAccuracy[]
|
||||||
byMethod?: { method: string; sampleSize: number; mae: number | null; wmape: number | null; bias: number | null }[]
|
byMethod?: { method: string; sampleSize: number; mae: number | null; wmape: number | null; bias: number | null }[]
|
||||||
dailyTrend?: { date: string; mae: number | null; wmape: number | null; bias: number | null }[]
|
dailyTrend?: { date: string; mae: number | null; wmape: number | null; bias: number | null }[]
|
||||||
accuracyTrend?: AccuracyTrendPoint[]
|
accuracyTrend?: AccuracyTrendPoint[]
|
||||||
|
accuracyTrendWeekly?: { date: string; wmape: number | null; naiveWmape: number | null; fva: number | null; sampleSize: number }[]
|
||||||
}
|
}
|
||||||
|
|
||||||
function MetricSkeleton() {
|
function MetricSkeleton() {
|
||||||
@@ -74,12 +81,30 @@ function formatBias(bias: number | null): string {
|
|||||||
}
|
}
|
||||||
|
|
||||||
function getAccuracyColor(wmape: number | null): string {
|
function getAccuracyColor(wmape: number | null): string {
|
||||||
|
// Daily-grain thresholds (used for the by-phase / lead-time bars).
|
||||||
if (wmape === null) return "text-muted-foreground"
|
if (wmape === null) return "text-muted-foreground"
|
||||||
if (wmape <= 30) return "text-green-600"
|
if (wmape <= 30) return "text-green-600"
|
||||||
if (wmape <= 50) return "text-yellow-600"
|
if (wmape <= 50) return "text-yellow-600"
|
||||||
return "text-red-600"
|
return "text-red-600"
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function getWeeklyAccuracyColor(wmape: number | null): string {
|
||||||
|
// Weekly per-product grain has a much lower achievable floor than daily grain
|
||||||
|
// on this intermittent-demand catalog, so the headline uses its own thresholds.
|
||||||
|
if (wmape === null) return "text-muted-foreground"
|
||||||
|
if (wmape <= 60) return "text-green-600"
|
||||||
|
if (wmape <= 90) return "text-yellow-600"
|
||||||
|
return "text-red-600"
|
||||||
|
}
|
||||||
|
|
||||||
|
function formatSignedPct(ratio: number | null, digits = 0): string {
|
||||||
|
// ratio is a fraction (0.7 => +70%); null-safe.
|
||||||
|
if (ratio === null || ratio === undefined) return "N/A"
|
||||||
|
const pct = ratio * 100
|
||||||
|
const sign = pct > 0 ? "+" : ""
|
||||||
|
return `${sign}${pct.toFixed(digits)}%`
|
||||||
|
}
|
||||||
|
|
||||||
export function ForecastAccuracy() {
|
export function ForecastAccuracy() {
|
||||||
const { data, error, isLoading } = useQuery<AccuracyData>({
|
const { data, error, isLoading } = useQuery<AccuracyData>({
|
||||||
queryKey: ["forecast-accuracy"],
|
queryKey: ["forecast-accuracy"],
|
||||||
@@ -133,6 +158,24 @@ export function ForecastAccuracy() {
|
|||||||
sampleSize: lt.sampleSize,
|
sampleSize: lt.sampleSize,
|
||||||
}))
|
}))
|
||||||
|
|
||||||
|
// Headline prefers the weekly-grain WMAPE (informative); falls back to the
|
||||||
|
// daily-grain number until enough complete weeks of history exist.
|
||||||
|
const weeklyWmape = data?.overallWeekly?.wmape ?? null
|
||||||
|
const usingWeekly = weeklyWmape !== null
|
||||||
|
const headlineWmape = usingWeekly ? weeklyWmape : (data?.overall?.wmape ?? null)
|
||||||
|
const headlineColor = usingWeekly
|
||||||
|
? getWeeklyAccuracyColor(headlineWmape)
|
||||||
|
: getAccuracyColor(headlineWmape)
|
||||||
|
// Net forecast-vs-actual ratio (e.g. +70% = over-forecasting), from the
|
||||||
|
// daily 'all' totals — far more legible than bias in raw units.
|
||||||
|
const totalFc = data?.overall?.totalForecast ?? 0
|
||||||
|
const totalAct = data?.overall?.totalActual ?? 0
|
||||||
|
const fcVsAct = totalAct > 0 ? (totalFc / totalAct - 1) : null
|
||||||
|
// Value over the naive baseline; prefer weekly grain to match the headline.
|
||||||
|
const naiveSource = usingWeekly ? data?.overallWeekly : data?.overall
|
||||||
|
const naiveWmape = naiveSource?.naiveWmape ?? null
|
||||||
|
const fva = naiveSource?.fva ?? null
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<div>
|
<div>
|
||||||
<h3 className="text-lg font-medium mb-3">Forecast Accuracy</h3>
|
<h3 className="text-lg font-medium mb-3">Forecast Accuracy</h3>
|
||||||
@@ -148,10 +191,24 @@ export function ForecastAccuracy() {
|
|||||||
<div className="flex items-baseline justify-between">
|
<div className="flex items-baseline justify-between">
|
||||||
<div className="flex items-center gap-2">
|
<div className="flex items-center gap-2">
|
||||||
<Target className="h-4 w-4 text-muted-foreground" />
|
<Target className="h-4 w-4 text-muted-foreground" />
|
||||||
<p className="text-sm font-medium text-muted-foreground">WMAPE</p>
|
<p className="text-sm font-medium text-muted-foreground">
|
||||||
|
WMAPE <span className="text-[10px] opacity-70">({usingWeekly ? "weekly" : "daily"})</span>
|
||||||
|
</p>
|
||||||
</div>
|
</div>
|
||||||
<p className={`text-lg font-bold ${getAccuracyColor(data?.overall?.wmape ?? null)}`}>
|
<p className={`text-lg font-bold ${headlineColor}`}>
|
||||||
{formatWmape(data?.overall?.wmape ?? null)}
|
{formatWmape(headlineWmape)}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
<div className="flex items-baseline justify-between">
|
||||||
|
<div className="flex items-center gap-2">
|
||||||
|
<ArrowUpDown className="h-4 w-4 text-muted-foreground" />
|
||||||
|
<p className="text-sm font-medium text-muted-foreground">Forecast vs actual</p>
|
||||||
|
</div>
|
||||||
|
<p className="text-lg font-bold">
|
||||||
|
{formatSignedPct(fcVsAct)}
|
||||||
|
<span className="text-xs font-normal text-muted-foreground ml-1">
|
||||||
|
{(fcVsAct ?? 0) > 0 ? "over" : (fcVsAct ?? 0) < 0 ? "under" : ""}
|
||||||
|
</span>
|
||||||
</p>
|
</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="flex items-baseline justify-between">
|
<div className="flex items-baseline justify-between">
|
||||||
@@ -160,20 +217,24 @@ export function ForecastAccuracy() {
|
|||||||
<p className="text-sm font-medium text-muted-foreground">MAE</p>
|
<p className="text-sm font-medium text-muted-foreground">MAE</p>
|
||||||
</div>
|
</div>
|
||||||
<p className="text-lg font-bold">
|
<p className="text-lg font-bold">
|
||||||
{data?.overall?.mae !== null ? data?.overall?.mae?.toFixed(2) : "N/A"}
|
{data?.overall?.mae != null ? data?.overall?.mae?.toFixed(2) : "N/A"}
|
||||||
<span className="text-xs font-normal text-muted-foreground ml-1">units</span>
|
<span className="text-xs font-normal text-muted-foreground ml-1">units</span>
|
||||||
</p>
|
</p>
|
||||||
</div>
|
</div>
|
||||||
<div className="flex items-baseline justify-between">
|
<div className="flex items-baseline justify-between">
|
||||||
<div className="flex items-center gap-2">
|
<div className="flex items-center gap-2">
|
||||||
<ArrowUpDown className="h-4 w-4 text-muted-foreground" />
|
<Swords className="h-4 w-4 text-muted-foreground" />
|
||||||
<p className="text-sm font-medium text-muted-foreground">Bias</p>
|
<p className="text-sm font-medium text-muted-foreground">vs naive</p>
|
||||||
</div>
|
</div>
|
||||||
<p className="text-lg font-bold">
|
<p className="text-lg font-bold">
|
||||||
{formatBias(data?.overall?.bias ?? null)}
|
<span className={fva != null ? (fva > 0 ? "text-green-600" : "text-red-600") : "text-muted-foreground"}>
|
||||||
<span className="text-xs font-normal text-muted-foreground ml-1">
|
{fva != null ? `${formatSignedPct(fva)} FVA` : "N/A"}
|
||||||
{(data?.overall?.bias ?? 0) > 0 ? "over" : (data?.overall?.bias ?? 0) < 0 ? "under" : ""}
|
|
||||||
</span>
|
</span>
|
||||||
|
{naiveWmape != null && (
|
||||||
|
<span className="text-xs font-normal text-muted-foreground ml-1">
|
||||||
|
naive {formatWmape(naiveWmape)}
|
||||||
|
</span>
|
||||||
|
)}
|
||||||
</p>
|
</p>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|||||||