Import/metrics calc fixes

2026-02-08 22:44:57 -05:00
parent 12cc7a4639
commit 7c41a7f799
11 changed files with 828 additions and 55 deletions

METRICS_AUDIT.md Normal file

@@ -0,0 +1,276 @@
# Metrics Pipeline Audit Report
**Date:** 2026-02-08
**Scope:** All 6 SQL scripts in `inventory-server/scripts/metrics-new/`, import pipeline, custom functions, and post-calculation data verification.
---
## Executive Summary
The metrics pipeline is architecturally sound and the core calculations are mostly correct. The 30-day sales, revenue, replenishment, and aggregate metrics (brand/vendor/category) all cross-check accurately between the snapshots, product_metrics, and direct orders queries. However, several issues were found, ranging from **critical data bugs** to **design limitations**, that affect the accuracy of specific metrics.
**Issues found: 13** (3 Critical, 4 Medium, 6 Low/Informational)
---
## CRITICAL Issues
### C1. `net_revenue` in daily snapshots never subtracts returns ($35.6K affected)
**Location:** `update_daily_snapshots.sql`, line 181
**Symptom:** `net_revenue` is stored as `gross_revenue - discounts`, but should be `gross_revenue - discounts - returns_revenue`.
The SQL formula on line 181 appears correct:
```sql
COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00) AS net_revenue
```
However, actual data shows `net_revenue = gross_revenue - discounts` for ALL 3,252 snapshots that have returns. Total returns not subtracted: **$35,630.03** across 2,946 products. This may be caused by the `returns_revenue` in the SalesData CTE not properly flowing through to the INSERT, or by a prior version of the code that stored these values differently. The profit column (line 184) has the same issue: `(gross - discounts) - cogs` instead of `(gross - discounts - returns) - cogs`.
**Impact:** Net revenue and profit are overstated by the amount of returns. This cascades to all metrics derived from snapshots: `revenue_30d`, `profit_30d`, `margin_30d`, `avg_ros_30d`, and all brand/vendor/category aggregate revenue.
**Recommended fix:** Debug why the returns subtraction isn't taking effect. The formula in the SQL looks correct, so this may be a data-type issue or an execution path issue. After fixing, rebuild snapshots.
**Status:** Owner will resolve. Code formula is correct; snapshots need rebuilding after prior fix deployment.
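As a quick cross-check after the snapshots are rebuilt, the intended relationships can be spot-verified outside the database. A minimal Python sketch (function names are illustrative, not part of the pipeline) of the formulas from lines 181 and 184:

```python
def net_revenue(gross: float, discounts: float, returns_revenue: float) -> float:
    # Intended formula (update_daily_snapshots.sql line 181):
    # net = gross - discounts - returns
    return gross - discounts - returns_revenue

def profit(gross: float, discounts: float, returns_revenue: float, cogs: float) -> float:
    # Line 184: profit must also subtract returns before COGS
    return net_revenue(gross, discounts, returns_revenue) - cogs

# The observed bug stored net = gross - discounts, overstating net revenue
# by exactly the returns amount:
buggy_net = lambda gross, disc, ret: gross - disc
assert net_revenue(100.0, 10.0, 5.0) == 85.0
assert buggy_net(100.0, 10.0, 5.0) - net_revenue(100.0, 10.0, 5.0) == 5.0
```

Running this kind of assertion against a sample of rebuilt snapshot rows (gross, discounts, returns, stored net) would confirm the fix took effect.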
---
### C2. `eod_stock_quantity` uses CURRENT stock, not historical end-of-day stock
**Location:** `update_daily_snapshots.sql`, lines 123-132 (CurrentStock CTE)
**Symptom:** Every snapshot for a given product shows the same stock quantity regardless of the snapshot date.
The `CurrentStock` CTE simply reads `stock_quantity` from the `products` table:
```sql
SELECT pid, stock_quantity, ... FROM public.products
```
This means a snapshot from January 10 shows the SAME stock as today (February 8). Verified in data:
- Product 662561: stock = 36 on every date (Feb 1-7)
- Product 665397: stock = 25 on every date (Feb 1-7)
- All products checked show identical stock across all snapshot dates
**Impact:** All stock-derived metrics are inaccurate for historical analysis:
- `eod_stock_cost`, `eod_stock_retail`, `eod_stock_gross` (all wrong for past dates)
- `stockout_flag` (based on current stock, not historical)
- `stockout_days_30d` (undercounted since stockout_flag uses current stock)
- `avg_stock_units_30d`, `avg_stock_cost_30d` (no variance, just current stock repeated)
- `gmroi_30d`, `stockturn_30d` (based on avg_stock which is flat)
- `sell_through_30d` (denominator uses current stock assumption)
- `service_level_30d`, `fill_rate_30d`
**This is a known architectural limitation** noted in MEMORY.md. Fixing requires either:
1. Storing stock snapshots separately at end-of-day (ideally via a cron job that records stock before any changes)
2. Reconstructing historical stock from orders and receivings (complex but possible)
**Status: FIXED.** MySQL's `snap_product_value` table (daily EOD stock per product since 2012) is now imported into PostgreSQL `stock_snapshots` table via `scripts/import/stock-snapshots.js`. The `CurrentStock` CTE in `update_daily_snapshots.sql` now uses `LEFT JOIN stock_snapshots` for historical stock, falling back to `products.stock_quantity` when no historical data exists. Requires: run import, then rebuild daily snapshots.
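The fallback logic of the fix is simple to state: prefer the recorded end-of-day quantity for the snapshot date, and only fall back to current stock when no historical record exists. A hedged Python sketch of that resolution rule (the dict-based lookup stands in for the `LEFT JOIN stock_snapshots`):

```python
def eod_stock(history: dict, current_stock: int, snapshot_date: str) -> int:
    # Mirrors: LEFT JOIN stock_snapshots ... COALESCE to products.stock_quantity.
    # `history` maps date -> recorded end-of-day quantity for this product.
    qty = history.get(snapshot_date)
    return qty if qty is not None else current_stock

history = {"2026-01-10": 12, "2026-02-01": 36}
assert eod_stock(history, 36, "2026-01-10") == 12  # historical value used
assert eod_stock(history, 36, "2026-02-08") == 36  # no history -> current stock
```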
---
### C3. `ON CONFLICT DO UPDATE WHERE` check skips 91%+ of product_metrics updates
**Location:** `update_product_metrics.sql`, lines 558-574
**Symptom:** 623,205 of 681,912 products (91.4%) have `last_calculated` older than 1 day. 592,369 are over 30 days old. 914 products with active 30-day sales haven't been updated in over 7 days.
The upsert's `WHERE` clause only updates if specific fields changed:
```sql
WHERE product_metrics.current_stock IS DISTINCT FROM EXCLUDED.current_stock OR
product_metrics.current_price IS DISTINCT FROM EXCLUDED.current_price OR ...
```
Fields NOT checked include: `stockout_days_30d`, `margin_30d`, `gmroi_30d`, `demand_pattern`, `seasonality_index`, `sales_growth_*`, `service_level_30d`, and many others. If a product's stock, price, sales, and revenue haven't changed, the entire row is skipped even though growth metrics, variability, and other derived fields may need updating.
**Impact:** Most derived metrics (growth, demand patterns, seasonality) are stale for the majority of products. Products with steady sales but unchanged stock/price never get their growth metrics recalculated.
**Recommended fix:** Either:
1. Remove the `WHERE` clause entirely (accept the performance cost of writing all rows every run)
2. Add `last_calculated` age check: `OR product_metrics.last_calculated < NOW() - INTERVAL '7 days'`
3. Add the missing fields to the change-detection check
**Status: FIXED.** Added 12 derived fields to the `IS DISTINCT FROM` check (`profit_30d`, `cogs_30d`, `margin_30d`, `stockout_days_30d`, `sell_through_30d`, `sales_growth_30d_vs_prev`, `revenue_growth_30d_vs_prev`, `demand_pattern`, `seasonal_pattern`, `seasonality_index`, `service_level_30d`, `fill_rate_30d`) plus a time-based safety net: `OR product_metrics.last_calculated < NOW() - INTERVAL '1 day'`. This guarantees every row is refreshed at least daily.
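The combined predicate — "any checked field changed, OR the row is more than a day old" — can be sketched in Python to make the behavior concrete (field list and function name are illustrative, not the actual SQL column set):

```python
from datetime import datetime, timedelta

# Illustrative subset of the columns in the IS DISTINCT FROM check
CHECKED_FIELDS = ["current_stock", "current_price", "sales_30d",
                  "margin_30d", "stockout_days_30d", "demand_pattern"]

def should_update(existing: dict, incoming: dict, now: datetime) -> bool:
    # Update when any checked field differs...
    if any(existing.get(f) != incoming.get(f) for f in CHECKED_FIELDS):
        return True
    # ...or, as the time-based safety net, when the row is over a day old
    return existing["last_calculated"] < now - timedelta(days=1)

now = datetime(2026, 2, 8)
stale = {"current_stock": 5, "last_calculated": now - timedelta(days=3)}
assert should_update(stale, {"current_stock": 5}, now)        # stale -> refreshed
fresh = {"current_stock": 5, "last_calculated": now}
assert not should_update(fresh, {"current_stock": 5}, now)    # unchanged + fresh -> skipped
```

The safety net is what guarantees derived metrics (growth, seasonality) can never lag more than a day, even when every compared field is unchanged.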
---
## MEDIUM Issues
### M1. Demand variability calculated only over activity days, not full 30-day window
**Location:** `update_product_metrics.sql`, DemandVariability CTE (lines 206-223)
**Symptom:** Variance, std_dev, and CV are computed over only the days that appear in snapshots (activity days), not the full 30-day period including zero-sales days.
Example: Product 41141 (Mexican Poppy) sold 102 units in 30 days across only 3 snapshot days (1, 1, 100). The variance/CV is calculated over just those 3 data points instead of 30 (with 27 zero-sales days).
**Impact:**
- CV is computed on sparse data (3-10 points instead of 30), making it statistically unreliable
- Products with sporadic large orders appear less variable than they really are
- `demand_pattern` classification is affected (stable/variable/sporadic/lumpy)
**Recommended fix:** Join against a generated 30-day date series and COALESCE missing days to 0 units sold before computing variance/stddev/CV.
**Status: FIXED.** Rewrote `DemandVariability` CTE to use `generate_series()` for the full 30-day date range, `CROSS JOIN` with distinct PIDs from snapshots, and `LEFT JOIN` actual snapshot data with `COALESCE(dps.units_sold, 0)` for missing days. Variance/stddev/CV now computed over all 30 data points.
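The effect of zero-filling on the coefficient of variation is easy to demonstrate with the Product 41141 example (a minimal Python sketch using population standard deviation; the exact variance flavor in the SQL may differ):

```python
from statistics import mean, pstdev

def cv(series):
    # Coefficient of variation = stddev / mean (None when mean is zero)
    mu = mean(series)
    return pstdev(series) / mu if mu else None

# Product 41141: 102 units over 3 activity days (1, 1, 100)
activity_days = [1, 1, 100]
full_window = activity_days + [0] * 27  # zero-fill the other 27 of 30 days

# Zero-filling raises the CV sharply, correctly flagging the demand as
# sporadic/lumpy rather than merely "variable"
assert cv(full_window) > cv(activity_days)
```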
---
### M2. `costeach` fallback to `price * 0.5` affects 32.5% of recent orders
**Location:** `orders.js`, line 600 and 634
**Symptom:** When no cost record exists in `order_costs`, the import falls back to `price * 0.5`.
Data shows 9,839 of 30,266 recent orders (32.5%) use this fallback. Among these, 79 paid products end up with `costeach = 0` because their `price` is 0, so the fallback computes `0 * 0.5 = 0` even though the product has a real cost_price.
The daily snapshot has a second line of defense (using `get_weighted_avg_cost()` and then `p.cost_price`), but the orders table's `costeach` column itself contains inaccurate data for ~1/3 of orders.
**Impact:** COGS calculations at the order level are approximate for 1/3 of orders. The snapshot's fallback chain mitigates this somewhat, but any analytics using `orders.costeach` directly will be affected.
**Status: FIXED.** Added `products.cost_price` as intermediate fallback: `COALESCE(oc.costeach, p.cost_price, oi.price * 0.5)`. The products table join was added to both the `order_totals` CTE and the outer SELECT in `orders.js`. Requires a full orders re-import to apply retroactively.
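The fallback chain behaves like a three-step coalesce, sketched here in Python (the function name is illustrative; the real logic is the `COALESCE` expression in `orders.js`):

```python
def resolve_costeach(oc_costeach, product_cost_price, item_price):
    # Mirrors COALESCE(oc.costeach, p.cost_price, oi.price * 0.5):
    # prefer the recorded order cost, then the product's cost_price,
    # and only as a last resort the 50%-of-price estimate.
    if oc_costeach is not None:
        return oc_costeach
    if product_cost_price is not None:
        return product_cost_price
    return item_price * 0.5

assert resolve_costeach(3.10, 4.25, 9.99) == 3.10   # recorded cost wins
assert resolve_costeach(None, 4.25, 9.99) == 4.25   # new intermediate fallback
assert resolve_costeach(None, None, 9.99) == 4.995  # last-resort estimate
```

Note the intermediate fallback also resolves the $0-price case above: a product with `price = 0` but a real `cost_price` now gets that cost instead of `0`.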
---
### M3. `lifetime_sales` uses MySQL `total_sold` (status >= 20) but orders import uses status >= 15
**Location:** `products.js` line 200 vs `orders.js` line 69
**Symptom:** `total_sold` in the products table comes from MySQL with `order_status >= 20`, excluding status 15 (canceled) and 16 (combined). But the orders import fetches orders with `order_status >= 15`.
Verified in MySQL: For product 31286, `total_sold` (>=20) = 13,786 vs (>=15) = 13,905 (difference of 119 units).
**Impact:** `lifetime_sales` in product_metrics (sourced from `products.total_sold`) slightly understates compared to what the orders table contains. The `lifetime_revenue_quality` field correctly flags most as "estimated" since the orders table only covers ~5 years while `total_sold` is all-time. This is a minor inconsistency (< 1% difference).
**Status:** Accepted. < 1% difference, not worth the complexity of aligning thresholds.
---
### M4. `sell_through_30d` has 868 NULL values and 547 anomalous values for products with sales
**Location:** `update_product_metrics.sql`, lines 356-361
**Formula:** `(sales_30d / (current_stock + sales_30d + returns_units_30d - received_qty_30d)) * 100`
- 868 products with sales but NULL sell_through (denominator = 0, which happens when `current_stock + sales + returns - received = 0`, i.e. all stock came from receiving within the window and was sold)
- 259 products with sell_through > 100%
- 288 products with negative sell_through
**Impact:** Sell-through rate is unreliable for products with significant receiving activity in the same period. The formula tries to approximate "beginning inventory" but the approximation breaks when current stock ≠ actual beginning stock (which is always, per issue C2).
**Status:** Will improve once C2 fix (historical stock) is deployed and snapshots are rebuilt, since `current_stock` in the formula will then reflect actual beginning inventory.
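The failure modes follow directly from the formula: the denominator approximates beginning inventory, and it collapses to zero or goes negative when receivings dominate. A minimal Python sketch of the stored formula and its edge cases:

```python
def sell_through_30d(sales, current_stock, returns_units, received_qty):
    # (sales / (current_stock + sales + returns - received)) * 100,
    # where the denominator approximates beginning inventory.
    # Returns None (NULL) when the denominator is zero.
    denom = current_stock + sales + returns_units - received_qty
    return None if denom == 0 else sales / denom * 100

# All stock received and sold within the window -> denominator collapses to 0
assert sell_through_30d(50, 0, 0, 50) is None
# Receivings exceed implied beginning stock -> negative sell-through
assert sell_through_30d(10, 5, 0, 20) < 0
```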
---
## LOW / INFORMATIONAL Issues
### L1. Snapshots only cover ~1,167 products/day out of 681K
Only products with order or receiving activity on a given day get snapshots. This is by design (the `ProductsWithActivity` CTE on line 133 of `update_daily_snapshots.sql`), but it means:
- 560K+ products have zero snapshot history
- Stockout tracking is impossible for products with no sales (they can't appear in snapshots)
- The "avg_stock" metrics (avg_stock_units_30d, etc.) only average over activity days, not all 30 days
This is acceptable for storage efficiency but should be understood when interpreting metrics.
**Status:** Accepted (by design).
---
### L2. `detect_seasonal_pattern` function only compares current month to yearly average
The seasonality detection is simplistic: it compares current month's avg daily sales to yearly avg. This means:
- It can only detect if the CURRENT month is above average, not identify historical seasonal patterns
- Running in January vs July will give completely different results for the same product
- The "peak_season" field always shows the current month/quarter when seasonal (not the actual peak)
This is noted as a P5 (low priority) feature and is adequate for a first pass but should not be relied upon for demand planning.
**Status: FIXED.** Rewrote `detect_seasonal_pattern` function to compare monthly average sales across the full last 12 months. Uses CV across months + peak-to-average ratio for classification: `strong` (CV > 0.5, peak > 150%), `moderate` (CV > 0.3, peak > 120%), `none`. Peak season now identifies the actual highest-sales month. Requires at least 3 months of data. Saved in `db/functions.sql`.
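The classification thresholds of the rewritten function can be sketched compactly (a Python model of the decision logic only; the real function also computes the monthly CV and index from snapshot data):

```python
def classify_seasonality(monthly_cv, seasonality_index):
    # seasonality_index = peak month avg / overall avg * 100
    # Thresholds per the rewritten detect_seasonal_pattern:
    if monthly_cv > 0.5 and seasonality_index > 150:
        return "strong"
    if monthly_cv > 0.3 and seasonality_index > 120:
        return "moderate"
    return "none"

assert classify_seasonality(0.8, 220) == "strong"    # sharp peak month
assert classify_seasonality(0.4, 130) == "moderate"  # mild seasonal lift
assert classify_seasonality(0.1, 105) == "none"      # flat across months
```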
---
### L3. Free product with negative revenue in top sellers
Product 476848 ("Thank You, From ACOT!") shows 254 sales with -$1.00 revenue because one order applied a $1 discount to a $0 product. This is a data oddity, not a calculation bug. Could be addressed by excluding $0-price products from revenue metrics or by data cleanup.
**Status:** Accepted (data oddity, not a bug).
---
### L4. `landing_cost_price` is always NULL
`current_landing_cost_price` in product_metrics is mapped from `current_effective_cost` which is just `cost_price`. The `landing_cost_price` concept (cost + shipping + duties) is not implemented. The field exists but has no meaningful data.
**Status: FIXED.** Removed `landing_cost_price` from `db/schema.sql`, `current_landing_cost_price` from `db/metrics-schema-new.sql`, `update_product_metrics.sql`, and `backfill/populate_initial_product_metrics.sql`. Column should be dropped from the live database via `ALTER TABLE`.
---
### L5. Custom SQL functions not tracked in version control
All 6 custom functions (`calculate_sales_velocity`, `get_weighted_avg_cost`, `safe_divide`, `std_numeric`, `classify_demand_pattern`, `detect_seasonal_pattern`) and the `category_hierarchy` materialized view exist only in the database. They are not defined in any migration or schema file in the repository.
If the database needs to be recreated, these would be lost.
**Status: FIXED.** All 6 functions and the `category_hierarchy` materialized view definition saved to `inventory-server/db/functions.sql`. File is re-runnable via `psql -f functions.sql`.
---
### L6. `get_weighted_avg_cost` limited to last 10 receivings
The function uses `LIMIT 10` for performance, but this means the weighted average may not reflect the true cost for products with many small receivings, when the cost changed significantly before the 10 most recent receiving records.
**Status: FIXED.** Removed `LIMIT 10` from `get_weighted_avg_cost`. Data shows max receivings per product is 142 (p95 = 11, avg = 3), so performance impact is negligible. Updated definition in `db/functions.sql`.
---
## Verification Summary
### What's Working Correctly
| Check | Result |
|-------|--------|
| 30d sales: product_metrics vs orders vs snapshots | **MATCH** (verified top 10 sellers) |
| Replenishment formula: manual calc vs stored | **MATCH** (verified 10 products) |
| Brand metrics vs sum of product_metrics | **MATCH** (0 difference across all brands) |
| Order status mapping (numeric → text) | **CORRECT** (all statuses mapped, no numeric remain) |
| Cost price: PostgreSQL vs MySQL source | **MATCH** (within rounding, verified 5 products) |
| total_sold: PostgreSQL vs MySQL source | **MATCH** (verified 5 products) |
| Category rollups (rolled-up > direct for parents) | **CORRECT** |
| ABC classification distribution | **REASONABLE** (A: 8K, B: 12.5K, C: 113K) |
| Lead time calculation (PO → receiving) | **CORRECT** (verified examples) |
### Data Overview
| Metric | Value |
|--------|-------|
| Total products | 681,912 |
| Products in product_metrics | 681,912 (100%) |
| Products with 30d sales | 10,291 (1.5%) |
| Products with negative profit & revenue | 139 (mostly cost > price) |
| Products with negative stock | 0 |
| Snapshot date range | 2020-06-18 to 2026-02-08 |
| Avg products per snapshot day | 1,167 |
| Order date range | 2020-06-18 to 2026-02-08 |
| Total orders | 2,885,825 |
| 'returned' status orders | 0 (returns via negative quantity only) |
---
## Fix Status Summary
| Issue | Severity | Status | Deployment Action Needed |
|-------|----------|--------|--------------------------|
| C1 | Critical | Owner resolving | Rebuild daily snapshots |
| C2 | Critical | **FIXED** | Run import, rebuild daily snapshots |
| C3 | Critical | **FIXED** | Deploy updated `update_product_metrics.sql` |
| M1 | Medium | **FIXED** | Deploy updated `update_product_metrics.sql` |
| M2 | Medium | **FIXED** | Full orders re-import (`--full`) |
| M3 | Medium | Accepted | None |
| M4 | Medium | Pending C2 | Will improve after C2 deployment |
| L1 | Low | Accepted | None |
| L2 | Low | **FIXED** | Deploy `db/functions.sql` to database |
| L3 | Low | Accepted | None |
| L4 | Low | **FIXED** | `ALTER TABLE` to drop columns |
| L5 | Low | **FIXED** | None (file committed) |
| L6 | Low | **FIXED** | Deploy `db/functions.sql` to database |
### Deployment Steps
1. Deploy `db/functions.sql` to PostgreSQL: `psql -d inventory_db -f db/functions.sql` (L2, L6)
2. Run import (includes stock snapshots first load) (C2, M2)
3. Drop stale columns: `ALTER TABLE products DROP COLUMN IF EXISTS landing_cost_price; ALTER TABLE product_metrics DROP COLUMN IF EXISTS current_landing_cost_price;` (L4)
4. Rebuild daily snapshots (C1, C2)
5. Re-run metrics calculation (C3, M1 take effect automatically)


@@ -0,0 +1,241 @@
-- Custom PostgreSQL functions used by the metrics pipeline
-- These must exist in the database before running calculate-metrics-new.js
--
-- To install/update: psql -d inventory_db -f functions.sql
-- All functions use CREATE OR REPLACE so they are safe to re-run.
-- =============================================================================
-- safe_divide: Division helper that returns a default value instead of erroring
-- on NULL or zero denominators.
-- =============================================================================
CREATE OR REPLACE FUNCTION public.safe_divide(
numerator numeric,
denominator numeric,
default_value numeric DEFAULT NULL::numeric
)
RETURNS numeric
LANGUAGE plpgsql
IMMUTABLE
AS $function$
BEGIN
IF denominator IS NULL OR denominator = 0 THEN
RETURN default_value;
ELSE
RETURN numerator / denominator;
END IF;
END;
$function$;
-- =============================================================================
-- std_numeric: Standardized rounding helper for consistent numeric precision.
-- =============================================================================
CREATE OR REPLACE FUNCTION public.std_numeric(
value numeric,
precision_digits integer DEFAULT 2
)
RETURNS numeric
LANGUAGE plpgsql
IMMUTABLE
AS $function$
BEGIN
IF value IS NULL THEN
RETURN NULL;
ELSE
RETURN ROUND(value, precision_digits);
END IF;
END;
$function$;
-- =============================================================================
-- calculate_sales_velocity: Daily sales velocity adjusted for stockout days.
-- Ensures at least 14-day denominator for products with sales to avoid
-- inflated velocity from short windows.
-- =============================================================================
CREATE OR REPLACE FUNCTION public.calculate_sales_velocity(
sales_30d integer,
stockout_days_30d integer
)
RETURNS numeric
LANGUAGE plpgsql
IMMUTABLE
AS $function$
BEGIN
RETURN sales_30d /
NULLIF(
GREATEST(
30.0 - stockout_days_30d,
CASE
WHEN sales_30d > 0 THEN 14.0 -- If we have sales, ensure at least 14 days denominator
ELSE 30.0 -- If no sales, use full period
END
),
0
);
END;
$function$;
-- =============================================================================
-- get_weighted_avg_cost: Weighted average cost from receivings up to a given date.
-- Uses all non-canceled receivings (no row limit) weighted by quantity.
-- =============================================================================
CREATE OR REPLACE FUNCTION public.get_weighted_avg_cost(
p_pid bigint,
p_date date
)
RETURNS numeric
LANGUAGE plpgsql
STABLE
AS $function$
DECLARE
weighted_cost NUMERIC;
BEGIN
SELECT
CASE
WHEN SUM(qty_each) > 0 THEN SUM(cost_each * qty_each) / SUM(qty_each)
ELSE NULL
END INTO weighted_cost
FROM receivings
WHERE pid = p_pid
AND received_date <= p_date
AND status != 'canceled';
RETURN weighted_cost;
END;
$function$;
-- =============================================================================
-- classify_demand_pattern: Classifies demand based on average demand and
-- coefficient of variation (CV). Standard inventory classification:
-- zero: no demand
-- stable: CV <= 0.2 (predictable, easy to forecast)
-- variable: CV <= 0.5 (some variability, still forecastable)
-- sporadic: low volume + high CV (intermittent demand)
-- lumpy: high volume + high CV (unpredictable bursts)
-- =============================================================================
CREATE OR REPLACE FUNCTION public.classify_demand_pattern(
avg_demand numeric,
cv numeric
)
RETURNS character varying
LANGUAGE plpgsql
IMMUTABLE
AS $function$
BEGIN
IF avg_demand IS NULL OR cv IS NULL THEN
RETURN NULL;
ELSIF avg_demand = 0 THEN
RETURN 'zero';
ELSIF cv <= 0.2 THEN
RETURN 'stable';
ELSIF cv <= 0.5 THEN
RETURN 'variable';
ELSIF avg_demand < 1.0 THEN
RETURN 'sporadic';
ELSE
RETURN 'lumpy';
END IF;
END;
$function$;
-- =============================================================================
-- detect_seasonal_pattern: Detects seasonality by comparing monthly average
-- sales across the last 12 months. Uses coefficient of variation across months
-- and peak-to-average ratio to classify patterns.
--
-- Returns:
-- seasonal_pattern: 'none', 'moderate', or 'strong'
-- seasonality_index: peak month avg / overall avg * 100 (100 = no seasonality)
-- peak_season: name of peak month (e.g. 'January'), or NULL if none
-- =============================================================================
CREATE OR REPLACE FUNCTION public.detect_seasonal_pattern(p_pid bigint)
RETURNS TABLE(seasonal_pattern character varying, seasonality_index numeric, peak_season character varying)
LANGUAGE plpgsql
STABLE
AS $function$
DECLARE
v_monthly_cv NUMERIC;
v_max_month_avg NUMERIC;
v_overall_avg NUMERIC;
v_monthly_stddev NUMERIC;
v_peak_month_num INT;
v_data_months INT;
v_seasonality_index NUMERIC;
v_seasonal_pattern VARCHAR;
v_peak_season VARCHAR;
BEGIN
-- Gather monthly average sales over the last 12 months
SELECT
COUNT(*),
AVG(month_avg),
STDDEV(month_avg),
MAX(month_avg)
INTO v_data_months, v_overall_avg, v_monthly_stddev, v_max_month_avg
FROM (
SELECT EXTRACT(MONTH FROM snapshot_date) AS mo, AVG(units_sold) AS month_avg
FROM daily_product_snapshots
WHERE pid = p_pid AND snapshot_date >= CURRENT_DATE - INTERVAL '365 days'
GROUP BY EXTRACT(MONTH FROM snapshot_date)
) monthly;
-- Need at least 3 months of data for meaningful seasonality detection
IF v_data_months < 3 OR v_overall_avg IS NULL OR v_overall_avg = 0 THEN
RETURN QUERY SELECT 'none'::VARCHAR, 100::NUMERIC, NULL::VARCHAR;
RETURN;
END IF;
-- CV of monthly averages
v_monthly_cv := v_monthly_stddev / v_overall_avg;
-- Find peak month number
SELECT EXTRACT(MONTH FROM snapshot_date)::INT INTO v_peak_month_num
FROM daily_product_snapshots
WHERE pid = p_pid AND snapshot_date >= CURRENT_DATE - INTERVAL '365 days'
GROUP BY EXTRACT(MONTH FROM snapshot_date)
ORDER BY AVG(units_sold) DESC
LIMIT 1;
-- Seasonality index: peak month avg / overall avg * 100
v_seasonality_index := ROUND((v_max_month_avg / v_overall_avg * 100)::NUMERIC, 2);
IF v_monthly_cv > 0.5 AND v_seasonality_index > 150 THEN
v_seasonal_pattern := 'strong';
v_peak_season := TRIM(TO_CHAR(TO_DATE(v_peak_month_num::TEXT, 'MM'), 'Month'));
ELSIF v_monthly_cv > 0.3 AND v_seasonality_index > 120 THEN
v_seasonal_pattern := 'moderate';
v_peak_season := TRIM(TO_CHAR(TO_DATE(v_peak_month_num::TEXT, 'MM'), 'Month'));
ELSE
v_seasonal_pattern := 'none';
v_peak_season := NULL;
v_seasonality_index := 100;
END IF;
RETURN QUERY SELECT v_seasonal_pattern, v_seasonality_index, v_peak_season;
END;
$function$;
-- =============================================================================
-- category_hierarchy: Materialized view providing a recursive category tree
-- with ancestor paths for efficient rollup queries.
--
-- Refresh after category changes: REFRESH MATERIALIZED VIEW category_hierarchy;
-- =============================================================================
-- DROP MATERIALIZED VIEW IF EXISTS category_hierarchy;
-- CREATE MATERIALIZED VIEW category_hierarchy AS
-- WITH RECURSIVE cat_tree AS (
-- SELECT cat_id, name, type, parent_id,
-- cat_id AS root_id, 0 AS level, ARRAY[cat_id] AS path
-- FROM categories
-- WHERE parent_id IS NULL
-- UNION ALL
-- SELECT c.cat_id, c.name, c.type, c.parent_id,
-- ct.root_id, ct.level + 1, ct.path || c.cat_id
-- FROM categories c
-- JOIN cat_tree ct ON c.parent_id = ct.cat_id
-- )
-- SELECT cat_id, name, type, parent_id, root_id, level, path,
-- (SELECT array_agg(unnest ORDER BY unnest DESC)
-- FROM unnest(cat_tree.path) unnest
-- WHERE unnest <> cat_tree.cat_id) AS ancestor_ids
-- FROM cat_tree;
--
-- CREATE UNIQUE INDEX ON category_hierarchy (cat_id);


@@ -80,7 +80,6 @@ CREATE TABLE public.product_metrics (
 current_price NUMERIC(10, 2),
 current_regular_price NUMERIC(10, 2),
 current_cost_price NUMERIC(10, 4), -- Increased precision for cost
-current_landing_cost_price NUMERIC(10, 4), -- Increased precision for cost
 current_stock INT NOT NULL DEFAULT 0,
 current_stock_cost NUMERIC(14, 4) NOT NULL DEFAULT 0.00,
 current_stock_retail NUMERIC(14, 4) NOT NULL DEFAULT 0.00,
@@ -156,9 +155,9 @@ CREATE TABLE public.product_metrics (
 days_of_stock_closing_stock NUMERIC(10, 2), -- lead_time_closing_stock - days_of_stock_forecast_units
 replenishment_needed_raw NUMERIC(10, 2), -- planning_period_forecast_units + config_safety_stock - current_stock - on_order_qty
 replenishment_units INT, -- CEILING(GREATEST(0, replenishment_needed_raw))
-replenishment_cost NUMERIC(14, 4), -- replenishment_units * COALESCE(current_landing_cost_price, current_cost_price)
+replenishment_cost NUMERIC(14, 4), -- replenishment_units * current_cost_price
 replenishment_retail NUMERIC(14, 4), -- replenishment_units * current_price
-replenishment_profit NUMERIC(14, 4), -- replenishment_units * (current_price - COALESCE(current_landing_cost_price, current_cost_price))
+replenishment_profit NUMERIC(14, 4), -- replenishment_units * (current_price - current_cost_price)
 to_order_units INT, -- Apply MOQ/UOM logic to replenishment_units
 forecast_lost_sales_units NUMERIC(10, 2), -- GREATEST(0, -lead_time_closing_stock)
 forecast_lost_revenue NUMERIC(14, 4), -- forecast_lost_sales_units * current_price
@@ -167,7 +166,7 @@ CREATE TABLE public.product_metrics (
 sells_out_in_days NUMERIC(10, 1), -- (current_stock + on_order_qty) / sales_velocity_daily
 replenish_date DATE, -- Calc based on when stock hits safety stock minus lead time
 overstocked_units INT, -- GREATEST(0, current_stock - config_safety_stock - planning_period_forecast_units)
-overstocked_cost NUMERIC(14, 4), -- overstocked_units * COALESCE(current_landing_cost_price, current_cost_price)
+overstocked_cost NUMERIC(14, 4), -- overstocked_units * current_cost_price
 overstocked_retail NUMERIC(14, 4), -- overstocked_units * current_price
 is_old_stock BOOLEAN, -- Based on age, last sold, last received, on_order status


@@ -29,7 +29,6 @@ CREATE TABLE products (
 price NUMERIC(14, 4) NOT NULL,
 regular_price NUMERIC(14, 4) NOT NULL,
 cost_price NUMERIC(14, 4),
-landing_cost_price NUMERIC(14, 4),
 barcode TEXT,
 harmonized_tariff_code TEXT,
 updated_at TIMESTAMP WITH TIME ZONE,


@@ -7,6 +7,7 @@ const { importProducts } = require('./import/products');
const importOrders = require('./import/orders'); const importOrders = require('./import/orders');
const importPurchaseOrders = require('./import/purchase-orders'); const importPurchaseOrders = require('./import/purchase-orders');
const importDailyDeals = require('./import/daily-deals'); const importDailyDeals = require('./import/daily-deals');
const importStockSnapshots = require('./import/stock-snapshots');
dotenv.config({ path: path.join(__dirname, "../.env") }); dotenv.config({ path: path.join(__dirname, "../.env") });
@@ -16,6 +17,7 @@ const IMPORT_PRODUCTS = true;
const IMPORT_ORDERS = true; const IMPORT_ORDERS = true;
const IMPORT_PURCHASE_ORDERS = true; const IMPORT_PURCHASE_ORDERS = true;
const IMPORT_DAILY_DEALS = true; const IMPORT_DAILY_DEALS = true;
const IMPORT_STOCK_SNAPSHOTS = true;
// Add flag for incremental updates // Add flag for incremental updates
const INCREMENTAL_UPDATE = process.env.INCREMENTAL_UPDATE !== 'false'; // Default to true unless explicitly set to false const INCREMENTAL_UPDATE = process.env.INCREMENTAL_UPDATE !== 'false'; // Default to true unless explicitly set to false
@@ -81,7 +83,8 @@ async function main() {
IMPORT_PRODUCTS, IMPORT_PRODUCTS,
IMPORT_ORDERS, IMPORT_ORDERS,
IMPORT_PURCHASE_ORDERS, IMPORT_PURCHASE_ORDERS,
IMPORT_DAILY_DEALS IMPORT_DAILY_DEALS,
IMPORT_STOCK_SNAPSHOTS
].filter(Boolean).length; ].filter(Boolean).length;
try { try {
@@ -130,10 +133,11 @@ async function main() {
'products_enabled', $3::boolean, 'products_enabled', $3::boolean,
'orders_enabled', $4::boolean, 'orders_enabled', $4::boolean,
'purchase_orders_enabled', $5::boolean, 'purchase_orders_enabled', $5::boolean,
'daily_deals_enabled', $6::boolean 'daily_deals_enabled', $6::boolean,
'stock_snapshots_enabled', $7::boolean
) )
) RETURNING id ) RETURNING id
`, [INCREMENTAL_UPDATE, IMPORT_CATEGORIES, IMPORT_PRODUCTS, IMPORT_ORDERS, IMPORT_PURCHASE_ORDERS, IMPORT_DAILY_DEALS]); `, [INCREMENTAL_UPDATE, IMPORT_CATEGORIES, IMPORT_PRODUCTS, IMPORT_ORDERS, IMPORT_PURCHASE_ORDERS, IMPORT_DAILY_DEALS, IMPORT_STOCK_SNAPSHOTS]);
importHistoryId = historyResult.rows[0].id; importHistoryId = historyResult.rows[0].id;
} catch (error) { } catch (error) {
console.error("Error creating import history record:", error); console.error("Error creating import history record:", error);
@@ -151,7 +155,8 @@ async function main() {
 products: null,
 orders: null,
 purchaseOrders: null,
-dailyDeals: null
+dailyDeals: null,
+stockSnapshots: null
 };
 let totalRecordsAdded = 0;
@@ -257,6 +262,33 @@ async function main() {
 }
 }
+if (IMPORT_STOCK_SNAPSHOTS) {
+try {
+const stepStart = Date.now();
+results.stockSnapshots = await importStockSnapshots(prodConnection, localConnection, INCREMENTAL_UPDATE);
+stepTimings.stockSnapshots = Math.round((Date.now() - stepStart) / 1000);
+if (isImportCancelled) throw new Error("Import cancelled");
+completedSteps++;
+console.log('Stock snapshots import result:', results.stockSnapshots);
+if (results.stockSnapshots?.status === 'error') {
+console.error('Stock snapshots import had an error:', results.stockSnapshots.error);
+} else {
+totalRecordsAdded += parseInt(results.stockSnapshots?.recordsAdded || 0);
+totalRecordsUpdated += parseInt(results.stockSnapshots?.recordsUpdated || 0);
+}
+} catch (error) {
+console.error('Error during stock snapshots import:', error);
+results.stockSnapshots = {
+status: 'error',
+error: error.message,
+recordsAdded: 0,
+recordsUpdated: 0
+};
+}
+}
 const endTime = Date.now();
 const totalElapsedSeconds = Math.round((endTime - startTime) / 1000);
@@ -280,11 +312,13 @@ async function main() {
 'orders_result', COALESCE($11::jsonb, 'null'::jsonb),
 'purchase_orders_result', COALESCE($12::jsonb, 'null'::jsonb),
 'daily_deals_result', COALESCE($13::jsonb, 'null'::jsonb),
-'total_deleted', $14::integer,
-'total_skipped', $15::integer,
-'step_timings', $16::jsonb
+'stock_snapshots_enabled', $14::boolean,
+'stock_snapshots_result', COALESCE($15::jsonb, 'null'::jsonb),
+'total_deleted', $16::integer,
+'total_skipped', $17::integer,
+'step_timings', $18::jsonb
 )
-WHERE id = $17
+WHERE id = $19
 `, [
 totalElapsedSeconds,
 parseInt(totalRecordsAdded),
@@ -299,6 +333,8 @@ async function main() {
 JSON.stringify(results.orders),
 JSON.stringify(results.purchaseOrders),
 JSON.stringify(results.dailyDeals),
+IMPORT_STOCK_SNAPSHOTS,
+JSON.stringify(results.stockSnapshots),
 totalRecordsDeleted,
 totalRecordsSkipped,
 JSON.stringify(stepTimings),

View File

@@ -597,14 +597,15 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
 ELSE 0
 END) as promo_discount_sum,
 COALESCE(ot.tax, 0) as total_tax,
-COALESCE(oc.costeach, oi.price * 0.5) as costeach
+COALESCE(oc.costeach, p.cost_price, oi.price * 0.5) as costeach
 FROM temp_order_items oi
 LEFT JOIN temp_item_discounts id ON oi.order_id = id.order_id AND oi.pid = id.pid
 LEFT JOIN temp_main_discounts md ON id.order_id = md.order_id AND id.discount_id = md.discount_id
 LEFT JOIN temp_order_taxes ot ON oi.order_id = ot.order_id AND oi.pid = ot.pid
 LEFT JOIN temp_order_costs oc ON oi.order_id = oc.order_id AND oi.pid = oc.pid
+LEFT JOIN public.products p ON oi.pid = p.pid
 WHERE oi.order_id = ANY($1)
-GROUP BY oi.order_id, oi.pid, ot.tax, oc.costeach
+GROUP BY oi.order_id, oi.pid, ot.tax, oc.costeach, p.cost_price
 )
 SELECT
 oi.order_id as order_number,
@@ -631,10 +632,11 @@ async function importOrders(prodConnection, localConnection, incrementalUpdate =
 om.customer_name,
 om.status,
 om.canceled,
-COALESCE(ot.costeach, oi.price * 0.5)::NUMERIC(14, 4) as costeach
+COALESCE(ot.costeach, p.cost_price, oi.price * 0.5)::NUMERIC(14, 4) as costeach
 FROM temp_order_items oi
 JOIN temp_order_meta om ON oi.order_id = om.order_id
 LEFT JOIN order_totals ot ON oi.order_id = ot.order_id AND oi.pid = ot.pid
+LEFT JOIN public.products p ON oi.pid = p.pid
 WHERE oi.order_id = ANY($1)
 ORDER BY oi.order_id, oi.pid
 `, [subBatchIds]);
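The hunk above inserts `p.cost_price` as a middle rung in the unit-cost fallback chain, so a missing recorded order cost falls back to the product master's cost before the crude 50%-of-price estimate. A hypothetical sketch of that precedence (the helper name and values are illustrative, not part of the importer):

```javascript
// Sketch of the unit-cost fallback chain the COALESCE above expresses:
// recorded order cost -> product master cost (the fallback this commit adds)
// -> 50% of sale price as a last resort.
function resolveCostEach(orderCost, productCost, price) {
  if (orderCost != null) return orderCost;     // temp_order_costs.costeach
  if (productCost != null) return productCost; // products.cost_price
  return price * 0.5;                          // last-resort estimate
}

const withMasterCost = resolveCostEach(null, 4.25, 10.0); // 4.25
const estimated = resolveCostEach(null, null, 10.0);      // 5
```

Like SQL `COALESCE`, each rung is only consulted when every earlier one is null.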

View File

@@ -0,0 +1,184 @@
const { outputProgress, formatElapsedTime, calculateRate } = require('../metrics-new/utils/progress');
const BATCH_SIZE = 5000;
/**
* Imports daily stock snapshots from MySQL's snap_product_value table to PostgreSQL.
* This provides historical end-of-day stock quantities per product, dating back to 2012.
*
* MySQL source table: snap_product_value (date, pid, count, pending, value)
* - date: snapshot date (typically yesterday's date, recorded daily by cron)
* - pid: product ID
* - count: end-of-day stock quantity (sum of product_inventory.count)
* - pending: pending/on-order quantity
* - value: total inventory value at cost (sum of costeach * count)
*
* PostgreSQL target table: stock_snapshots (snapshot_date, pid, stock_quantity, pending_quantity, stock_value)
*
* @param {object} prodConnection - MySQL connection to production DB
* @param {object} localConnection - PostgreSQL connection wrapper
* @param {boolean} incrementalUpdate - If true, only fetch new snapshots since last import
* @returns {object} Import statistics
*/
async function importStockSnapshots(prodConnection, localConnection, incrementalUpdate = true) {
const startTime = Date.now();
outputProgress({
status: 'running',
operation: 'Stock snapshots import',
message: 'Starting stock snapshots import...',
current: 0,
total: 0,
elapsed: formatElapsedTime(startTime)
});
// Ensure target table exists
await localConnection.query(`
CREATE TABLE IF NOT EXISTS stock_snapshots (
snapshot_date DATE NOT NULL,
pid BIGINT NOT NULL,
stock_quantity INT NOT NULL DEFAULT 0,
pending_quantity INT NOT NULL DEFAULT 0,
stock_value NUMERIC(14, 4) NOT NULL DEFAULT 0,
PRIMARY KEY (snapshot_date, pid)
)
`);
// Create index for efficient lookups by pid
await localConnection.query(`
CREATE INDEX IF NOT EXISTS idx_stock_snapshots_pid ON stock_snapshots (pid)
`);
// Determine the start date for the import (queries below use date > startDate, i.e. exclusive)
let startDate = '2019-12-31'; // Default: imports snapshots from 2020-01-01 onward, matching the orders date range
if (incrementalUpdate) {
const [result] = await localConnection.query(`
SELECT MAX(snapshot_date)::text AS max_date FROM stock_snapshots
`);
if (result.rows[0]?.max_date) {
// Resume from the last imported date; the `date > ?` queries below make this bound exclusive
startDate = result.rows[0].max_date;
}
}
outputProgress({
status: 'running',
operation: 'Stock snapshots import',
message: `Fetching stock snapshots from MySQL since ${startDate}...`,
current: 0,
total: 0,
elapsed: formatElapsedTime(startTime)
});
// Count total rows to import
const [countResult] = await prodConnection.query(
`SELECT COUNT(*) AS total FROM snap_product_value WHERE date > ?`,
[startDate]
);
const totalRows = countResult[0].total;
if (totalRows === 0) {
outputProgress({
status: 'complete',
operation: 'Stock snapshots import',
message: 'No new stock snapshots to import',
current: 0,
total: 0,
elapsed: formatElapsedTime(startTime)
});
return { recordsAdded: 0, recordsUpdated: 0, status: 'complete' };
}
outputProgress({
status: 'running',
operation: 'Stock snapshots import',
message: `Found ${totalRows.toLocaleString()} stock snapshot rows to import`,
current: 0,
total: totalRows,
elapsed: formatElapsedTime(startTime)
});
// Process in batches using date-based pagination (more efficient than OFFSET)
let processedRows = 0;
let recordsAdded = 0;
let currentDate = startDate;
while (processedRows < totalRows) {
// Fetch a batch of dates
const [dateBatch] = await prodConnection.query(
`SELECT DISTINCT date FROM snap_product_value
WHERE date > ? ORDER BY date LIMIT 10`,
[currentDate]
);
if (dateBatch.length === 0) break;
const dates = dateBatch.map(r => r.date);
const lastDate = dates[dates.length - 1];
// Fetch all rows for these dates
const [rows] = await prodConnection.query(
`SELECT date, pid, count AS stock_quantity, pending AS pending_quantity, value AS stock_value
FROM snap_product_value
WHERE date > ? AND date <= ?
ORDER BY date, pid`,
[currentDate, lastDate]
);
if (rows.length === 0) break;
// Batch insert into PostgreSQL using UNNEST for efficiency
for (let i = 0; i < rows.length; i += BATCH_SIZE) {
const batch = rows.slice(i, i + BATCH_SIZE);
const batchDates = batch.map(r => r.date); // named to avoid shadowing the outer `dates`
const pids = batch.map(r => r.pid);
const quantities = batch.map(r => r.stock_quantity);
const pending = batch.map(r => r.pending_quantity);
const values = batch.map(r => r.stock_value);
await localConnection.query(`
INSERT INTO stock_snapshots (snapshot_date, pid, stock_quantity, pending_quantity, stock_value)
SELECT * FROM UNNEST(
$1::date[], $2::bigint[], $3::int[], $4::int[], $5::numeric[]
)
ON CONFLICT (snapshot_date, pid) DO UPDATE SET
stock_quantity = EXCLUDED.stock_quantity,
pending_quantity = EXCLUDED.pending_quantity,
stock_value = EXCLUDED.stock_value
`, [batchDates, pids, quantities, pending, values]);
// Note: every upserted row is counted as added; rows that hit ON CONFLICT are not tracked separately
recordsAdded += batch.length;
}
processedRows += rows.length;
currentDate = lastDate;
outputProgress({
status: 'running',
operation: 'Stock snapshots import',
message: `Imported ${processedRows.toLocaleString()} / ${totalRows.toLocaleString()} rows (through ${currentDate})`,
current: processedRows,
total: totalRows,
elapsed: formatElapsedTime(startTime),
rate: calculateRate(processedRows, startTime)
});
}
outputProgress({
status: 'complete',
operation: 'Stock snapshots import',
message: `Stock snapshots import complete: ${recordsAdded.toLocaleString()} rows`,
current: processedRows,
total: totalRows,
elapsed: formatElapsedTime(startTime)
});
return {
recordsAdded,
recordsUpdated: 0,
status: 'complete'
};
}
module.exports = importStockSnapshots;
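The loop above uses date-based keyset pagination: each batch resumes strictly after the last date seen, so earlier rows are never re-scanned the way an ever-growing `OFFSET` would force. A minimal in-memory sketch of the same pattern (the `table` array stands in for `snap_product_value`; all data is illustrative):

```javascript
// Keyset pagination sketch: the cursor is the last date consumed,
// and each fetch asks only for rows strictly after it.
const table = [
  { date: '2026-02-01', pid: 1 },
  { date: '2026-02-01', pid: 2 },
  { date: '2026-02-02', pid: 1 },
  { date: '2026-02-03', pid: 1 },
];

// Fetch rows for the next `dateLimit` distinct dates after `afterDate`
function fetchBatch(afterDate, dateLimit) {
  const dates = [...new Set(table.filter(r => r.date > afterDate).map(r => r.date))]
    .sort()
    .slice(0, dateLimit);
  if (dates.length === 0) return [];
  const lastDate = dates[dates.length - 1];
  return table.filter(r => r.date > afterDate && r.date <= lastDate);
}

let cursor = '2026-01-31'; // exclusive lower bound, like startDate above
const seen = [];
let batch;
while ((batch = fetchBatch(cursor, 2)).length > 0) {
  seen.push(...batch);
  cursor = batch[batch.length - 1].date; // advance the keyset cursor
}
```

ISO `YYYY-MM-DD` strings sort lexicographically in date order, which is what makes the string cursor comparison safe here.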

View File

@@ -214,7 +214,7 @@ BEGIN
 -- Final INSERT/UPDATE statement using all the prepared CTEs
 INSERT INTO public.product_metrics (
 pid, last_calculated, sku, title, brand, vendor, image_url, is_visible, is_replenishable,
-current_price, current_regular_price, current_cost_price, current_landing_cost_price,
+current_price, current_regular_price, current_cost_price,
 current_stock, current_stock_cost, current_stock_retail, current_stock_gross,
 on_order_qty, on_order_cost, on_order_retail, earliest_expected_date,
 date_created, date_first_received, date_last_received, date_first_sold, date_last_sold, age_days,
@@ -242,7 +242,7 @@ BEGIN
 SELECT
 -- Select columns in order, joining all CTEs by pid
 ci.pid, _start_time, ci.sku, ci.title, ci.brand, ci.vendor, ci.image_url, ci.is_visible, ci.replenishable,
-ci.current_price, ci.current_regular_price, ci.current_cost_price, ci.current_effective_cost,
+ci.current_price, ci.current_regular_price, ci.current_cost_price,
 ci.current_stock, (ci.current_stock * COALESCE(ci.current_effective_cost, 0.00))::numeric(12,2), (ci.current_stock * COALESCE(ci.current_price, 0.00))::numeric(12,2), (ci.current_stock * COALESCE(ci.current_regular_price, 0.00))::numeric(12,2),
 COALESCE(ooi.on_order_qty, 0), COALESCE(ooi.on_order_cost, 0.00)::numeric(12,2), (COALESCE(ooi.on_order_qty, 0) * COALESCE(ci.current_price, 0.00))::numeric(12,2), ooi.earliest_expected_date,
@@ -415,7 +415,7 @@ BEGIN
 -- *** IMPORTANT: List ALL columns here, ensuring order matches INSERT list ***
 -- Update ALL columns to ensure entire row is refreshed
 last_calculated = EXCLUDED.last_calculated, sku = EXCLUDED.sku, title = EXCLUDED.title, brand = EXCLUDED.brand, vendor = EXCLUDED.vendor, image_url = EXCLUDED.image_url, is_visible = EXCLUDED.is_visible, is_replenishable = EXCLUDED.is_replenishable,
-current_price = EXCLUDED.current_price, current_regular_price = EXCLUDED.current_regular_price, current_cost_price = EXCLUDED.current_cost_price, current_landing_cost_price = EXCLUDED.current_landing_cost_price,
+current_price = EXCLUDED.current_price, current_regular_price = EXCLUDED.current_regular_price, current_cost_price = EXCLUDED.current_cost_price,
 current_stock = EXCLUDED.current_stock, current_stock_cost = EXCLUDED.current_stock_cost, current_stock_retail = EXCLUDED.current_stock_retail, current_stock_gross = EXCLUDED.current_stock_gross,
 on_order_qty = EXCLUDED.on_order_qty, on_order_cost = EXCLUDED.on_order_cost, on_order_retail = EXCLUDED.on_order_retail, earliest_expected_date = EXCLUDED.earliest_expected_date,
 date_created = EXCLUDED.date_created, date_first_received = EXCLUDED.date_first_received, date_last_received = EXCLUDED.date_last_received, date_first_sold = EXCLUDED.date_first_sold, date_last_sold = EXCLUDED.date_last_sold, age_days = EXCLUDED.age_days,

View File

@@ -13,7 +13,7 @@ DECLARE
 _begin_date DATE := (SELECT MIN(date)::date FROM orders WHERE date >= '2020-01-01'); -- Starting point: captures all historical order data
 _end_date DATE := CURRENT_DATE;
 BEGIN
 RAISE NOTICE 'Beginning daily snapshots rebuild from % to %. Starting at %', _begin_date, _end_date, _start_time;
 -- First truncate the existing snapshots to ensure a clean slate
 TRUNCATE TABLE public.daily_product_snapshots;
@@ -36,7 +36,13 @@ BEGIN
 COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.quantity ELSE 0 END), 0) AS units_sold,
 COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.price * o.quantity ELSE 0 END), 0.00) AS gross_revenue_unadjusted,
 COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN o.discount ELSE 0 END), 0.00) AS discounts,
-COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN COALESCE(o.costeach, p.cost_price) * o.quantity ELSE 0 END), 0.00) AS cogs,
+COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN
+COALESCE(
+o.costeach,
+get_weighted_avg_cost(p.pid, o.date::date),
+p.cost_price
+) * o.quantity
+ELSE 0 END), 0.00) AS cogs,
 COALESCE(SUM(CASE WHEN o.quantity > 0 AND COALESCE(o.status, 'pending') NOT IN ('canceled', 'returned') THEN p.regular_price * o.quantity ELSE 0 END), 0.00) AS gross_regular_revenue,
 -- Aggregate Returns (Quantity < 0 or Status = Returned)
@@ -63,15 +69,17 @@ BEGIN
 GROUP BY r.pid
 HAVING COUNT(DISTINCT r.receiving_id) > 0 OR SUM(r.qty_each) > 0
 ),
--- Get stock quantities for the day - note this is approximate since we're using current products data
+-- Use historical stock from stock_snapshots when available,
+-- falling back to current stock from products table
 StockData AS (
 SELECT
 p.pid,
-p.stock_quantity,
-COALESCE(p.cost_price, 0.00) as effective_cost_price,
+COALESCE(ss.stock_quantity, p.stock_quantity) AS stock_quantity,
+COALESCE(ss.stock_value, p.stock_quantity * COALESCE(p.cost_price, 0.00)) AS stock_value,
 COALESCE(p.price, 0.00) as current_price,
 COALESCE(p.regular_price, 0.00) as current_regular_price
 FROM public.products p
+LEFT JOIN stock_snapshots ss ON p.pid = ss.pid AND ss.snapshot_date = _date
 )
 INSERT INTO public.daily_product_snapshots (
 snapshot_date,
@@ -99,9 +107,9 @@ BEGIN
 _date AS snapshot_date,
 COALESCE(sd.pid, rd.pid) AS pid,
 sd.sku,
--- Use current stock as approximation, since historical stock data may not be available
+-- Historical stock from stock_snapshots, falls back to current stock
 s.stock_quantity AS eod_stock_quantity,
-s.stock_quantity * s.effective_cost_price AS eod_stock_cost,
+s.stock_value AS eod_stock_cost,
 s.stock_quantity * s.current_price AS eod_stock_retail,
 s.stock_quantity * s.current_regular_price AS eod_stock_gross,
 (s.stock_quantity <= 0) AS stockout_flag,
@@ -114,7 +122,7 @@ BEGIN
 COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00) AS net_revenue,
 COALESCE(sd.cogs, 0.00),
 COALESCE(sd.gross_regular_revenue, 0.00),
-(COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00)) - COALESCE(sd.cogs, 0.00) AS profit,
+(COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00)) - COALESCE(sd.cogs, 0.00) AS profit,
 -- Receiving metrics
 COALESCE(rd.units_received, 0),
 COALESCE(rd.cost_received, 0.00),
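The profit change above is the C1 fix from the audit: returns are now subtracted before COGS, so profit is no longer overstated by the returns amount. A quick arithmetic sketch with made-up numbers:

```javascript
// Illustrative check of the corrected snapshot formulas:
//   net_revenue = gross - discounts - returns
//   profit      = net_revenue - cogs
// All figures are invented for the example.
const gross = 1000.0, discounts = 50.0, returns = 30.0, cogs = 400.0;

const netRevenue = gross - discounts - returns;  // 920
const profit = netRevenue - cogs;                // 520

// The pre-fix expression skipped returns, overstating profit by exactly `returns`:
const buggyProfit = (gross - discounts) - cogs;  // 550
```

The overstatement is always equal to the returns amount, which is why the audit could total it ($35,630.03) by summing `returns_revenue` over the affected snapshots.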

View File

@@ -121,14 +121,16 @@ BEGIN
 HAVING COUNT(DISTINCT r.receiving_id) > 0 OR SUM(r.qty_each) > 0
 ),
 CurrentStock AS (
--- Select current stock values directly from products table
+-- Use historical stock from stock_snapshots when available,
+-- falling back to current stock from products table
 SELECT
-pid,
-stock_quantity,
-COALESCE(cost_price, 0.00) as effective_cost_price,
-COALESCE(price, 0.00) as current_price,
-COALESCE(regular_price, 0.00) as current_regular_price
-FROM public.products
+p.pid,
+COALESCE(ss.stock_quantity, p.stock_quantity) AS stock_quantity,
+COALESCE(ss.stock_value, p.stock_quantity * COALESCE(p.cost_price, 0.00)) AS stock_value,
+COALESCE(p.price, 0.00) AS current_price,
+COALESCE(p.regular_price, 0.00) AS current_regular_price
+FROM public.products p
+LEFT JOIN stock_snapshots ss ON p.pid = ss.pid AND ss.snapshot_date = _target_date
 ),
 ProductsWithActivity AS (
 -- Quick pre-filter to only process products with activity
@@ -168,7 +170,7 @@ BEGIN
 COALESCE(sd.sku, p.sku) AS sku, -- Get SKU from sales data or products table
 -- Inventory Metrics (Using CurrentStock)
 cs.stock_quantity AS eod_stock_quantity,
-cs.stock_quantity * cs.effective_cost_price AS eod_stock_cost,
+cs.stock_value AS eod_stock_cost,
 cs.stock_quantity * cs.current_price AS eod_stock_retail,
 cs.stock_quantity * cs.current_regular_price AS eod_stock_gross,
 (cs.stock_quantity <= 0) AS stockout_flag,
@@ -181,7 +183,7 @@ BEGIN
 COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00) AS net_revenue,
 COALESCE(sd.cogs, 0.00),
 COALESCE(sd.gross_regular_revenue, 0.00),
-(COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00)) - COALESCE(sd.cogs, 0.00) AS profit, -- Basic profit: Net Revenue - COGS
+(COALESCE(sd.gross_revenue_unadjusted, 0.00) - COALESCE(sd.discounts, 0.00) - COALESCE(sd.returns_revenue, 0.00)) - COALESCE(sd.cogs, 0.00) AS profit,
 -- Receiving Metrics (From ReceivingData)
 COALESCE(rd.units_received, 0),
 COALESCE(rd.cost_received, 0.00),
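This commit also switches `eod_stock_cost` from quantity times the *current* cost price to the snapshot's stored `stock_value`, which was valued at cost on the snapshot date. A sketch of the fallback (field names mirror the SQL columns; the helper and data are illustrative):

```javascript
// Prefer the historical snapshot's stored value (cost at the time);
// only approximate from current cost when no snapshot row exists.
function eodStockCost(snapshot, product) {
  if (snapshot && snapshot.stock_value != null) return snapshot.stock_value;
  return product.stock_quantity * (product.cost_price ?? 0);
}

const historical = eodStockCost({ stock_value: 123.45 }, { stock_quantity: 10, cost_price: 9 }); // 123.45
const approximated = eodStockCost(null, { stock_quantity: 10, cost_price: 9 });                  // 90
```

The two can differ whenever cost prices have changed since the snapshot date, which is exactly the error the `stock_snapshots` table removes.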

View File

@@ -204,22 +204,33 @@ BEGIN
 GROUP BY pid
 ),
 DemandVariability AS (
--- Calculate variance and standard deviation of daily sales
+-- Calculate variance and standard deviation of daily sales over the full 30-day window
+-- including zero-sales days (not just activity days) for accurate variability metrics
 SELECT
-pid,
-COUNT(*) AS days_with_data,
-AVG(units_sold) AS avg_daily_sales,
-VARIANCE(units_sold) AS sales_variance,
-STDDEV(units_sold) AS sales_std_dev,
--- Coefficient of variation
-CASE
-WHEN AVG(units_sold) > 0 THEN STDDEV(units_sold) / AVG(units_sold)
+pd.pid,
+COUNT(dps.pid) AS days_with_data,
+AVG(COALESCE(dps.units_sold, 0)) AS avg_daily_sales,
+VARIANCE(COALESCE(dps.units_sold, 0)) AS sales_variance,
+STDDEV(COALESCE(dps.units_sold, 0)) AS sales_std_dev,
+CASE
+WHEN AVG(COALESCE(dps.units_sold, 0)) > 0
+THEN STDDEV(COALESCE(dps.units_sold, 0)) / AVG(COALESCE(dps.units_sold, 0))
 ELSE NULL
 END AS sales_cv
-FROM public.daily_product_snapshots
-WHERE snapshot_date >= _current_date - INTERVAL '29 days'
-AND snapshot_date <= _current_date
-GROUP BY pid
+FROM (
+SELECT DISTINCT pid
+FROM public.daily_product_snapshots
+WHERE snapshot_date >= _current_date - INTERVAL '29 days'
+AND snapshot_date <= _current_date
+) pd
+CROSS JOIN generate_series(
+(_current_date - INTERVAL '29 days')::date,
+_current_date,
+'1 day'::interval
+) AS d(day)
+LEFT JOIN public.daily_product_snapshots dps
+ON dps.pid = pd.pid AND dps.snapshot_date = d.day::date
+GROUP BY pd.pid
 ),
 ServiceLevels AS (
 -- Calculate service level and fill rate metrics
@@ -257,7 +268,7 @@ BEGIN
 barcode, harmonized_tariff_code, vendor_reference, notions_reference, line, subline, artist,
 moq, rating, reviews, weight, length, width, height, country_of_origin, location,
 baskets, notifies, preorder_count, notions_inv_count,
-current_price, current_regular_price, current_cost_price, current_landing_cost_price,
+current_price, current_regular_price, current_cost_price,
 current_stock, current_stock_cost, current_stock_retail, current_stock_gross,
 on_order_qty, on_order_cost, on_order_retail, earliest_expected_date,
 date_created, date_first_received, date_last_received, date_first_sold, date_last_sold, age_days,
@@ -295,7 +306,7 @@ BEGIN
 ci.barcode, ci.harmonized_tariff_code, ci.vendor_reference, ci.notions_reference, ci.line, ci.subline, ci.artist,
 ci.moq, ci.rating, ci.reviews, ci.weight, ci.length, ci.width, ci.height, ci.country_of_origin, ci.location,
 ci.baskets, ci.notifies, ci.preorder_count, ci.notions_inv_count,
-ci.current_price, ci.current_regular_price, ci.current_cost_price, ci.current_effective_cost,
+ci.current_price, ci.current_regular_price, ci.current_cost_price,
 ci.current_stock, ci.current_stock * ci.current_effective_cost, ci.current_stock * ci.current_price, ci.current_stock * ci.current_regular_price,
 COALESCE(ooi.on_order_qty, 0), COALESCE(ooi.on_order_cost, 0.00), COALESCE(ooi.on_order_qty, 0) * ci.current_price, ooi.earliest_expected_date,
 ci.created_at::date, COALESCE(ci.first_received::date, hd.date_first_received_calc), hd.date_last_received_calc, hd.date_first_sold, COALESCE(ci.date_last_sold, hd.max_order_date),
@@ -514,7 +525,7 @@ BEGIN
 barcode = EXCLUDED.barcode, harmonized_tariff_code = EXCLUDED.harmonized_tariff_code, vendor_reference = EXCLUDED.vendor_reference, notions_reference = EXCLUDED.notions_reference, line = EXCLUDED.line, subline = EXCLUDED.subline, artist = EXCLUDED.artist,
 moq = EXCLUDED.moq, rating = EXCLUDED.rating, reviews = EXCLUDED.reviews, weight = EXCLUDED.weight, length = EXCLUDED.length, width = EXCLUDED.width, height = EXCLUDED.height, country_of_origin = EXCLUDED.country_of_origin, location = EXCLUDED.location,
 baskets = EXCLUDED.baskets, notifies = EXCLUDED.notifies, preorder_count = EXCLUDED.preorder_count, notions_inv_count = EXCLUDED.notions_inv_count,
-current_price = EXCLUDED.current_price, current_regular_price = EXCLUDED.current_regular_price, current_cost_price = EXCLUDED.current_cost_price, current_landing_cost_price = EXCLUDED.current_landing_cost_price,
+current_price = EXCLUDED.current_price, current_regular_price = EXCLUDED.current_regular_price, current_cost_price = EXCLUDED.current_cost_price,
 current_stock = EXCLUDED.current_stock, current_stock_cost = EXCLUDED.current_stock_cost, current_stock_retail = EXCLUDED.current_stock_retail, current_stock_gross = EXCLUDED.current_stock_gross,
 on_order_qty = EXCLUDED.on_order_qty, on_order_cost = EXCLUDED.on_order_cost, on_order_retail = EXCLUDED.on_order_retail, earliest_expected_date = EXCLUDED.earliest_expected_date,
 date_created = EXCLUDED.date_created, date_first_received = EXCLUDED.date_first_received, date_last_received = EXCLUDED.date_last_received, date_first_sold = EXCLUDED.date_first_sold, date_last_sold = EXCLUDED.date_last_sold, age_days = EXCLUDED.age_days,
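The DemandVariability rewrite in this commit computes variability over all 30 calendar days, zero-filling days with no snapshot row, rather than only over days that happened to have sales. A minimal sketch (illustrative numbers, sample variance as with SQL's `VARIANCE`/`STDDEV`) of why that changes the coefficient of variation:

```javascript
// CV over activity days only vs. the full zero-filled window.
function stats(xs) {
  const mean = xs.reduce((a, b) => a + b, 0) / xs.length;
  // Sample variance (n - 1 denominator), matching SQL VARIANCE/STDDEV
  const variance = xs.reduce((a, b) => a + (b - mean) ** 2, 0) / (xs.length - 1);
  const sd = Math.sqrt(variance);
  return { mean, sd, cv: mean > 0 ? sd / mean : null };
}

const activityDays = [5, 5, 5];                             // only 3 days had sales rows
const fullWindow = [...activityDays, ...Array(27).fill(0)]; // 30-day zero-filled window

// Activity-only CV is 0 (demand looks perfectly steady);
// the zero-filled CV is large, reflecting 27 days of zero demand.
```

This is why the SQL cross-joins the product list against `generate_series` over the window and LEFT JOINs the snapshots back in.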
@@ -567,11 +578,26 @@ BEGIN
 product_metrics.replenishment_units IS DISTINCT FROM EXCLUDED.replenishment_units OR
 product_metrics.stock_cover_in_days IS DISTINCT FROM EXCLUDED.stock_cover_in_days OR
 product_metrics.yesterday_sales IS DISTINCT FROM EXCLUDED.yesterday_sales OR
--- Check a few other important fields that might change
 product_metrics.date_last_sold IS DISTINCT FROM EXCLUDED.date_last_sold OR
 product_metrics.earliest_expected_date IS DISTINCT FROM EXCLUDED.earliest_expected_date OR
 product_metrics.lifetime_sales IS DISTINCT FROM EXCLUDED.lifetime_sales OR
-product_metrics.lifetime_revenue_quality IS DISTINCT FROM EXCLUDED.lifetime_revenue_quality
+product_metrics.lifetime_revenue_quality IS DISTINCT FROM EXCLUDED.lifetime_revenue_quality OR
+-- Derived metrics that can change even when source fields don't
+product_metrics.profit_30d IS DISTINCT FROM EXCLUDED.profit_30d OR
+product_metrics.cogs_30d IS DISTINCT FROM EXCLUDED.cogs_30d OR
+product_metrics.margin_30d IS DISTINCT FROM EXCLUDED.margin_30d OR
+product_metrics.stockout_days_30d IS DISTINCT FROM EXCLUDED.stockout_days_30d OR
+product_metrics.sell_through_30d IS DISTINCT FROM EXCLUDED.sell_through_30d OR
+-- Growth and variability metrics
+product_metrics.sales_growth_30d_vs_prev IS DISTINCT FROM EXCLUDED.sales_growth_30d_vs_prev OR
+product_metrics.revenue_growth_30d_vs_prev IS DISTINCT FROM EXCLUDED.revenue_growth_30d_vs_prev OR
+product_metrics.demand_pattern IS DISTINCT FROM EXCLUDED.demand_pattern OR
+product_metrics.seasonal_pattern IS DISTINCT FROM EXCLUDED.seasonal_pattern OR
+product_metrics.seasonality_index IS DISTINCT FROM EXCLUDED.seasonality_index OR
+product_metrics.service_level_30d IS DISTINCT FROM EXCLUDED.service_level_30d OR
+product_metrics.fill_rate_30d IS DISTINCT FROM EXCLUDED.fill_rate_30d OR
+-- Time-based safety net: always update if more than 1 day stale
+product_metrics.last_calculated < NOW() - INTERVAL '1 day'
 ;
 -- Update the status table with the timestamp from the START of this run
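The expanded change-detection clause relies on `IS DISTINCT FROM`, which is null-safe (two NULLs compare as *not* distinct, unlike `<>`), plus a one-day staleness fallback so rows can never drift indefinitely. A JS sketch of that decision logic (helper names are hypothetical, not from the codebase):

```javascript
// Null-safe inequality, like SQL's IS DISTINCT FROM:
// NULL vs NULL -> not distinct; NULL vs anything else -> distinct.
function isDistinctFrom(a, b) {
  if (a === null && b === null) return false;
  if (a === null || b === null) return true;
  return a !== b;
}

// Update when any tracked field is distinct, or the row is > 1 day stale.
function shouldUpdate(oldRow, newRow, lastCalculatedMs, nowMs) {
  const fieldsChanged = Object.keys(newRow).some(k => isDistinctFrom(oldRow[k], newRow[k]));
  const stale = nowMs - lastCalculatedMs > 24 * 60 * 60 * 1000;
  return fieldsChanged || stale;
}
```

A plain `<>` comparison would return NULL (falsy) when either side is NULL, silently skipping updates for nullable metrics; `IS DISTINCT FROM` avoids that trap.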