Okay, I understand completely now. The core issue is that the previous approaches tried too hard to reconcile every receipt back to a specific PO line within the `purchase_orders` table structure, which doesn't reflect the reality where receipts can be independent events. Your downstream scripts, especially `daily_snapshots` and `product_metrics`, rely on having a complete picture of *all* receivings.

Let's pivot to a model that respects both distinct data streams: **Orders (Intent)** and **Receivings (Actuals)**.

**Proposed Solution: Separate `purchase_orders` and `receivings` Tables**

This is the cleanest way to model the reality you've described.

1. **`purchase_orders` Table:**
    * **Purpose:** Tracks the status and details of purchase *orders* placed. Represents the *intent* to receive goods.
    * **Key Columns:** `po_id`, `pid`, `ordered` (quantity ordered), `po_cost_price`, `date` (order/created date), `expected_date`, `status` (PO lifecycle: 'ordered', 'canceled', 'done'), `vendor`, `notes`, etc.
    * **Crucially:** This table *does not* need a `received` column or a `receiving_history` column derived from complex allocations. It focuses solely on the PO itself.
2. **`receivings` Table (New or Refined):**
    * **Purpose:** Tracks every single line item received, regardless of whether it was linked to a PO during the receiving process. Represents the *actual* goods that arrived.
    * **Key Columns:**
        * `receiving_id` (Identifier for the overall receiving document/batch)
        * `pid` (Product ID received)
        * `received_qty` (Quantity received for this specific line)
        * `cost_each` (Actual cost paid for this item on this receiving)
        * `received_date` (Actual date the item was received)
        * `received_by` (Employee ID/Name)
        * `source_po_id` (The `po_id` entered on the receiving screen, *nullable*.
          Stores the original link attempt, even if it was wrong or missing)
        * `source_receiving_status` (The status from the source `receivings` table: 'partial_received', 'full_received', 'paid', 'canceled')

**How the Import Script Changes:**

1. **Fetch POs:** Fetch data from `po` and `po_products`.
2. **Populate `purchase_orders`:**
    * Insert/Update rows into `purchase_orders` based directly on the fetched PO data.
    * Set `po_id`, `pid`, `ordered`, `po_cost_price`, `date` (`COALESCE(date_ordered, date_created)`), `expected_date`.
    * Set `status` by mapping the source `po.status` code directly ('ordered', 'canceled', 'done', etc.).
    * **No complex allocation needed here.**
3. **Fetch Receivings:** Fetch data from `receivings` and `receivings_products`.
4. **Populate `receivings`:**
    * For *every* line item fetched from `receivings_products`:
        * Perform necessary data validation (dates, numbers).
        * Insert a new row into `receivings` with all the relevant details (`receiving_id`, `pid`, `received_qty`, `cost_each`, `received_date`, `received_by`, `source_po_id`, `source_receiving_status`).
    * Use `ON CONFLICT (receiving_id, pid)` (or a similar unique key based on your source data) `DO UPDATE SET ...` for incremental updates if necessary, or simply delete and re-insert by `receiving_id` if performance allows.

**Impact on Downstream Scripts (and how to adapt):**

* **Initial Query (Active POs):**
    * `SELECT ... FROM purchase_orders po WHERE po.status NOT IN ('canceled', 'done', 'paid_equivalent?') AND po.date >= ...`
    * `active_pos`: `COUNT(DISTINCT po.po_id)` based on the filtered POs.
    * `overdue_pos`: Add `AND po.expected_date < CURRENT_DATE`.
    * `total_units`: `SUM(po.ordered)`. Represents total units *ordered* on active POs.
    * `total_cost`: `SUM(po.ordered * po.po_cost_price)`. Cost of units *ordered*.
    * `total_retail`: `SUM(po.ordered * pm.current_price)`. Retail value of units *ordered*.
    * **Result:** This query now cleanly reports on the status of *orders* placed, which seems closer to its original intent. The filter `po.receiving_status NOT IN ('partial_received', 'full_received', 'paid')` is replaced by `po.status NOT IN ('canceled', 'done', 'paid_equivalent?')`. The 90% received check is removed as `received` is not reliably tracked *on the PO* anymore.
* **`daily_product_snapshots`:**
    * **`SalesData` CTE:** No change needed.
    * **`ReceivingData` CTE:** **Must be changed.** Query the **`receivings`** table instead of `purchase_orders`.
        ```sql
        ReceivingData AS (
            SELECT
                rl.pid,
                COUNT(DISTINCT rl.receiving_id) AS receiving_doc_count,
                SUM(rl.received_qty) AS units_received,
                SUM(rl.received_qty * rl.cost_each) AS cost_received
            FROM public.receivings rl
            WHERE rl.received_date::date = _date
              -- Optional: Filter out canceled receivings if needed
              -- AND rl.source_receiving_status <> 'canceled'
            GROUP BY rl.pid
        ),
        ```
    * **Result:** This now accurately reflects *all* units received on a given day from the definitive source.
* **`update_product_metrics`:**
    * **`CurrentInfo` CTE:** No change needed (pulls from `products`).
    * **`OnOrderInfo` CTE:** Needs re-evaluation. How do you want to define "On Order"?
        * **Option A (Strict PO View):** `SUM(po.ordered)` from `purchase_orders po WHERE po.status NOT IN ('canceled', 'done', 'paid_equivalent?')`. This is quantity on *open orders*, ignoring fulfillment state. Simple, but might overestimate if items arrived unlinked.
        * **Option B (Approximate Fulfillment):** `SUM(po.ordered)` from open POs MINUS `SUM(rl.received_qty)` from `receivings rl` where `rl.source_po_id = po.po_id` (summing only directly linked receivings). Better, but still misses fulfillment via unlinked receivings.
        * **Option C (Heuristic):** `SUM(po.ordered)` from open POs MINUS `SUM(rl.received_qty)` from `receivings rl` where `rl.pid = po.pid` and `rl.received_date >= po.date`. This *tries* to account for unlinked receivings but is imprecise.
        * **Recommendation:** Start with **Option A** for simplicity, clearly labeling it "Quantity on Open POs". You might need a separate process or metric for a more nuanced view of expected vs. actual pipeline.
        ```sql
        -- Example for Option A
        OnOrderInfo AS (
            SELECT
                pid,
                SUM(ordered) AS on_order_qty,                  -- Total qty on open POs
                SUM(ordered * po_cost_price) AS on_order_cost  -- Cost of qty on open POs
            FROM public.purchase_orders
            WHERE status NOT IN ('canceled', 'done', 'paid_equivalent?') -- Define your open statuses
            GROUP BY pid
        ),
        ```
    * **`HistoricalDates` CTE:**
        * `date_first_sold`, `max_order_date`: No change (queries `orders`).
        * `date_first_received_calc`, `date_last_received_calc`: **Must be changed.** Query `MIN(rl.received_date)` and `MAX(rl.received_date)` from the **`receivings`** table grouped by `pid`.
    * **`SnapshotAggregates` CTE:**
        * `received_qty_30d`, `received_cost_30d`: These are calculated from `daily_product_snapshots`, which are now correctly sourced from `receivings`, so this part is fine.
    * **Forecasting Calculations:** Will use the chosen definition of `on_order_qty`. Be aware of the implications of Option A (potentially inflated if unlinked receivings fulfill orders).
    * **Result:** Metrics are calculated based on distinct order data and complete receiving data. The definition of "on order" needs careful consideration.

**Summary of this Approach:**

* **Pros:**
    * Accurately models distinct order and receiving events.
    * Provides a definitive source (`receivings`) for all received inventory.
    * Simplifies the `purchase_orders` table and its import logic.
    * Avoids complex/potentially inaccurate allocation logic for unlinked receivings within the main tables.
    * Avoids synthetic records.
    * Fixes downstream reporting (`daily_snapshots` receiving data).
* **Cons:**
    * Requires creating/managing the `receivings` table.
    * Requires modifying downstream queries (`ReceivingData`, `OnOrderInfo`, `HistoricalDates`).
    * Calculating a precise "net quantity still expected to arrive" (true on-order minus all relevant fulfillment) becomes more complex and may require specific business rules or heuristics outside the basic table structure if Option A for `OnOrderInfo` isn't sufficient.

This two-table approach (`purchase_orders` + `receivings`) seems the most robust and accurate way to handle your requirement for complete receiving records independent of potentially flawed PO linking. It directly addresses the shortcomings of the previous attempts.
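
As a starting point, here is a minimal sketch of the `receivings` table, the upsert from step 4 of the import changes, and Option B's linked-receipt calculation, assuming PostgreSQL. The column types, the `(receiving_id, pid)` key, the index name, the sample values, and the assumption that `purchase_orders` holds one row per `(po_id, pid)` are mine, not taken from your schema, so adjust them to match your source data.

```sql
-- Sketch only: types, key, and index name are assumptions; adjust to your schema.
CREATE TABLE IF NOT EXISTS public.receivings (
    receiving_id            BIGINT        NOT NULL,  -- receiving document/batch
    pid                     BIGINT        NOT NULL,  -- product received
    received_qty            INTEGER       NOT NULL,
    cost_each               NUMERIC(14,4) NOT NULL,
    received_date           TIMESTAMPTZ   NOT NULL,
    received_by             TEXT,
    source_po_id            TEXT,                    -- nullable; type should match purchase_orders.po_id
    source_receiving_status TEXT,
    PRIMARY KEY (receiving_id, pid)                  -- assumes one line per product per receiving
);

-- Intended to help per-product date lookups such as MIN/MAX(received_date).
CREATE INDEX IF NOT EXISTS receivings_pid_date_idx
    ON public.receivings (pid, received_date);

-- Upsert one receiving line (placeholder values for illustration only).
INSERT INTO public.receivings
    (receiving_id, pid, received_qty, cost_each, received_date,
     received_by, source_po_id, source_receiving_status)
VALUES
    (1001, 42, 12, 3.5000, '2024-01-15 10:30:00+00', 'jdoe', NULL, 'full_received')
ON CONFLICT (receiving_id, pid) DO UPDATE SET
    received_qty            = EXCLUDED.received_qty,
    cost_each               = EXCLUDED.cost_each,
    received_date           = EXCLUDED.received_date,
    received_by             = EXCLUDED.received_by,
    source_po_id            = EXCLUDED.source_po_id,
    source_receiving_status = EXCLUDED.source_receiving_status;

-- Option B: quantity on open POs minus directly linked receivings.
-- Assumes purchase_orders has one row per (po_id, pid); otherwise the
-- linked quantity would be subtracted once per duplicate row.
WITH LinkedReceipts AS (
    SELECT source_po_id, pid, SUM(received_qty) AS received_qty
    FROM public.receivings
    WHERE source_po_id IS NOT NULL
      AND source_receiving_status IS DISTINCT FROM 'canceled'  -- optional filter
    GROUP BY source_po_id, pid
)
SELECT
    po.pid,
    -- Clamp at zero so an over-received PO line does not go negative.
    SUM(GREATEST(po.ordered - COALESCE(lr.received_qty, 0), 0)) AS on_order_qty
FROM public.purchase_orders po
LEFT JOIN LinkedReceipts lr
       ON lr.source_po_id = po.po_id
      AND lr.pid = po.pid
WHERE po.status NOT IN ('canceled', 'done')  -- add your paid-equivalent status here
GROUP BY po.pid;
```

Aggregating the linked receipts before the join avoids double counting when one PO line is received across several documents. Whether to clamp at zero or let over-receipts offset other lines is a business rule you would need to decide.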