Okay, I understand completely now. The core issue is that the previous approaches tried too hard to reconcile every receipt back to a specific PO line within the purchase_orders table structure, which doesn't reflect the reality where receipts can be independent events. Your downstream scripts, especially daily_snapshots and product_metrics, rely on having a complete picture of all receivings.
Let's pivot to a model that respects both distinct data streams: Orders (Intent) and Receivings (Actuals).
Proposed Solution: Separate purchase_orders and receivings Tables
This is the cleanest way to model the reality you've described.
- purchase_orders table:
  - Purpose: Tracks the status and details of purchase orders placed. Represents the intent to receive goods.
  - Key columns: po_id, pid, ordered (quantity ordered), po_cost_price, date (order/created date), expected_date, status (PO lifecycle: 'ordered', 'canceled', 'done'), vendor, notes, etc.
  - Crucially: This table does not need a received column or a receiving_history column derived from complex allocations. It focuses solely on the PO itself.
- receivings table (new or refined):
  - Purpose: Tracks every single line item received, regardless of whether it was linked to a PO during the receiving process. Represents the actual goods that arrived.
  - Key columns:
    - receiving_id (identifier for the overall receiving document/batch)
    - pid (product ID received)
    - received_qty (quantity received for this specific line)
    - cost_each (actual cost paid for this item on this receiving)
    - received_date (actual date the item was received)
    - received_by (employee ID/name)
    - source_po_id (the po_id entered on the receiving screen, nullable; stores the original link attempt, even if it was wrong or missing)
    - source_receiving_status (the status from the source receivings table: 'partial_received', 'full_received', 'paid', 'canceled')
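To make the two-table layout concrete, here is a minimal sketch of the schema. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it can be executed as-is; your real tables live in PostgreSQL, where the types would differ. The column types, the (po_id, pid) and (receiving_id, pid) keys, and the sample rows are all assumptions for illustration, not something taken from your source system.

```python
import sqlite3

# Illustrative schema only: types and primary keys are assumptions.
SCHEMA = """
CREATE TABLE purchase_orders (
    po_id          TEXT    NOT NULL,
    pid            INTEGER NOT NULL,
    ordered        INTEGER NOT NULL,
    po_cost_price  REAL,
    date           TEXT,
    expected_date  TEXT,
    status         TEXT    NOT NULL,
    vendor         TEXT,
    notes          TEXT,
    PRIMARY KEY (po_id, pid)
);

CREATE TABLE receivings (
    receiving_id            TEXT    NOT NULL,
    pid                     INTEGER NOT NULL,
    received_qty            INTEGER NOT NULL,
    cost_each               REAL,
    received_date           TEXT    NOT NULL,
    received_by             TEXT,
    source_po_id            TEXT,  -- nullable, deliberately not a foreign key
    source_receiving_status TEXT,
    PRIMARY KEY (receiving_id, pid)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical sample rows: one receiving with no PO link, and one pointing
# at a PO that does not exist. Both are accepted because source_po_id is a
# plain nullable column rather than an enforced reference.
insert_sql = "INSERT INTO receivings VALUES (?, ?, ?, ?, ?, ?, ?, ?)"
conn.execute(insert_sql, ("R1", 101, 5, 2.50, "2024-01-10", "alice", None, "full_received"))
conn.execute(insert_sql, ("R2", 101, 3, 2.75, "2024-01-11", "bob", "PO-MISSING", "partial_received"))

unlinked = conn.execute(
    "SELECT COUNT(*) FROM receivings WHERE source_po_id IS NULL"
).fetchone()[0]
total_received = conn.execute(
    "SELECT SUM(received_qty) FROM receivings WHERE pid = 101"
).fetchone()[0]
```

Leaving source_po_id without a foreign key is the point of the design: a wrong or missing PO reference never blocks a receiving from being recorded.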
How the Import Script Changes:
- Fetch POs: Fetch data from po and po_products.
- Populate purchase_orders:
  - Insert/update rows into purchase_orders based directly on the fetched PO data.
  - Set po_id, pid, ordered, po_cost_price, date (COALESCE(date_ordered, date_created)), expected_date.
  - Set status by mapping the source po.status code directly ('ordered', 'canceled', 'done', etc.).
  - No complex allocation needed here.
- Fetch Receivings: Fetch data from receivings and receivings_products.
- Populate receivings:
  - For every line item fetched from receivings_products:
    - Perform necessary data validation (dates, numbers).
    - Insert a new row into receivings with all the relevant details (receiving_id, pid, received_qty, cost_each, received_date, received_by, source_po_id, source_receiving_status).
  - Use ON CONFLICT (receiving_id, pid) (or a similar unique key based on your source data) DO UPDATE SET ... for incremental updates if necessary, or simply delete and re-insert by receiving_id for simplicity if performance allows.
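The "Populate receivings" step could look roughly like the sketch below. It again uses Python's sqlite3 with an in-memory database as a stand-in so it is runnable; SQLite accepts the same ON CONFLICT ... DO UPDATE form as PostgreSQL (SQLite 3.24 or newer is assumed). The helper names make_line, is_valid, and import_receiving_lines, the validation rules, and the (receiving_id, pid) key are hypothetical; your actual import script and unique key may differ.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE receivings (
        receiving_id TEXT NOT NULL, pid INTEGER NOT NULL,
        received_qty INTEGER NOT NULL, cost_each REAL,
        received_date TEXT NOT NULL, received_by TEXT,
        source_po_id TEXT, source_receiving_status TEXT,
        PRIMARY KEY (receiving_id, pid)  -- assumed unique key
    )
""")

UPSERT = """
    INSERT INTO receivings (receiving_id, pid, received_qty, cost_each,
                            received_date, received_by, source_po_id,
                            source_receiving_status)
    VALUES (:receiving_id, :pid, :received_qty, :cost_each,
            :received_date, :received_by, :source_po_id,
            :source_receiving_status)
    ON CONFLICT (receiving_id, pid) DO UPDATE SET
        received_qty = excluded.received_qty,
        cost_each = excluded.cost_each,
        received_date = excluded.received_date,
        received_by = excluded.received_by,
        source_po_id = excluded.source_po_id,
        source_receiving_status = excluded.source_receiving_status
"""

def make_line(receiving_id, pid, qty, received_date, po_id=None):
    """Build one hypothetical line item as fetched from receivings_products."""
    return {
        "receiving_id": receiving_id, "pid": pid, "received_qty": qty,
        "cost_each": 2.5, "received_date": received_date,
        "received_by": "alice", "source_po_id": po_id,
        "source_receiving_status": "partial_received",
    }

def is_valid(line):
    """Basic validation: ISO date and a non-negative integer quantity."""
    try:
        date.fromisoformat(line["received_date"])
    except (TypeError, ValueError):
        return False
    return isinstance(line["received_qty"], int) and line["received_qty"] >= 0

def import_receiving_lines(conn, lines):
    """Upsert all valid lines in one transaction; return how many were written."""
    valid = [line for line in lines if is_valid(line)]
    with conn:
        conn.executemany(UPSERT, valid)
    return len(valid)

# First run: one good line, one line with an unparseable date (skipped).
first = import_receiving_lines(conn, [
    make_line("R1", 101, 5, "2024-01-10"),
    make_line("R2", 102, 4, "not-a-date"),
])
# Second run: the same line arrives again with a corrected quantity and PO link.
second = import_receiving_lines(conn, [make_line("R1", 101, 8, "2024-01-10", po_id="PO-1")])
rows = conn.execute(
    "SELECT receiving_id, pid, received_qty, source_po_id FROM receivings"
).fetchall()
```

Re-running the import updates the existing row instead of duplicating it, which is what makes incremental imports safe. If one receiving document can list the same pid on several lines, (receiving_id, pid) is not unique and a line-level identifier from the source would be needed instead.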
Impact on Downstream Scripts (and how to adapt):
- Initial Query (Active POs):
  - SELECT ... FROM purchase_orders po WHERE po.status NOT IN ('canceled', 'done', 'paid_equivalent_status?') AND po.date >= ...
  - active_pos: COUNT(DISTINCT po.po_id) based on the filtered POs.
  - overdue_pos: Add AND po.expected_date < CURRENT_DATE.
  - total_units: SUM(po.ordered). Represents total units ordered on active POs.
  - total_cost: SUM(po.ordered * po.po_cost_price). Cost of units ordered.
  - total_retail: SUM(po.ordered * pm.current_price). Retail value of units ordered.
  - Result: This query now cleanly reports on the status of orders placed, which seems closer to its original intent. The filter po.receiving_status NOT IN ('partial_received', 'full_received', 'paid') is replaced by po.status NOT IN ('canceled', 'done', 'paid_equivalent?'). The 90% received check is removed because received is no longer reliably tracked on the PO.
- daily_product_snapshots:
  - SalesData CTE: No change needed.
  - ReceivingData CTE: Must be changed. Query the receivings table instead of purchase_orders.

    ```sql
    ReceivingData AS (
        SELECT
            rl.pid,
            COUNT(DISTINCT rl.receiving_id) AS receiving_doc_count,
            SUM(rl.received_qty) AS units_received,
            SUM(rl.received_qty * rl.cost_each) AS cost_received
        FROM public.receivings rl
        WHERE rl.received_date::date = _date
          -- Optional: Filter out canceled receivings if needed
          -- AND rl.source_receiving_status <> 'canceled'
        GROUP BY rl.pid
    ),
    ```

  - Result: This now accurately reflects all units received on a given day from the definitive source.
- update_product_metrics:
  - CurrentInfo CTE: No change needed (pulls from products).
  - OnOrderInfo CTE: Needs re-evaluation. How do you want to define "On Order"?
    - Option A (Strict PO View): SUM(po.ordered) from purchase_orders po WHERE po.status NOT IN ('canceled', 'done', 'paid_equivalent?'). This is the quantity on open orders, ignoring fulfillment state. Simple, but it overestimates whenever units have already arrived, linked or unlinked, while the PO is still open.
    - Option B (Approximate Fulfillment): SUM(po.ordered) from open POs MINUS SUM(rl.received_qty) from receivings rl where rl.source_po_id = po.po_id (summing only directly linked receivings). Better, but still misses fulfillment via unlinked receivings.
    - Option C (Heuristic): SUM(po.ordered) from open POs MINUS SUM(rl.received_qty) from receivings rl where rl.pid = po.pid and rl.received_date >= po.date. This tries to account for unlinked receivings but is imprecise: for example, the same receiving can be subtracted from several open POs for the same product.
    - Recommendation: Start with Option A for simplicity, clearly labeling it "Quantity on Open POs". You might need a separate process or metric for a more nuanced view of expected vs. actual pipeline.

    ```sql
    -- Example for Option A
    OnOrderInfo AS (
        SELECT
            pid,
            SUM(ordered) AS on_order_qty,                  -- Total qty on open POs
            SUM(ordered * po_cost_price) AS on_order_cost  -- Cost of qty on open POs
        FROM public.purchase_orders
        WHERE status NOT IN ('canceled', 'done', 'paid_equivalent?') -- Define your open statuses
        GROUP BY pid
    ),
    ```

  - HistoricalDates CTE:
    - date_first_sold, max_order_date: No change (queries orders).
    - date_first_received_calc, date_last_received_calc: Must be changed. Query MIN(rl.received_date) and MAX(rl.received_date) from the receivings table grouped by pid.
  - SnapshotAggregates CTE:
    - received_qty_30d, received_cost_30d: These are calculated from daily_product_snapshots, which are now correctly sourced from receivings, so this part is fine.
  - Forecasting Calculations: Will use the chosen definition of on_order_qty. Be aware of the implications of Option A (potentially inflated if unlinked receivings fulfill orders).
  - Result: Metrics are calculated based on distinct order data and complete receiving data. The definition of "on order" needs careful consideration.
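To see how the downstream numbers differ in practice, the sketch below runs the daily receiving aggregate plus Options A and B side by side on a tiny made-up dataset. It uses Python's sqlite3 with an in-memory database so it can be run directly; date(received_date) stands in for the PostgreSQL ::date cast, the tables are trimmed to the columns these queries touch, and the open-status list and all sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase_orders (
        po_id TEXT, pid INTEGER, ordered INTEGER, po_cost_price REAL, status TEXT
    );
    CREATE TABLE receivings (
        receiving_id TEXT, pid INTEGER, received_qty INTEGER,
        cost_each REAL, received_date TEXT, source_po_id TEXT
    );

    -- Hypothetical data: two open POs and one closed PO.
    INSERT INTO purchase_orders VALUES
        ('PO-1', 101, 10, 2.0, 'ordered'),
        ('PO-2', 101,  6, 2.0, 'done'),
        ('PO-3', 102,  4, 5.0, 'ordered');

    -- One receiving linked to PO-1, two entered without any PO reference.
    INSERT INTO receivings VALUES
        ('R1', 101, 3, 2.0, '2024-01-10', 'PO-1'),
        ('R2', 101, 2, 2.5, '2024-01-10', NULL),
        ('R3', 102, 4, 5.0, '2024-01-11', NULL);
""")

# ReceivingData equivalent: everything received on one day, linked or not.
daily = conn.execute("""
    SELECT pid,
           COUNT(DISTINCT receiving_id) AS receiving_doc_count,
           SUM(received_qty)            AS units_received,
           SUM(received_qty * cost_each) AS cost_received
    FROM receivings
    WHERE date(received_date) = ?
    GROUP BY pid
    ORDER BY pid
""", ("2024-01-10",)).fetchall()

# Option A: quantity on open POs, ignoring fulfillment.
option_a = conn.execute("""
    SELECT pid, SUM(ordered) AS on_order_qty
    FROM purchase_orders
    WHERE status NOT IN ('canceled', 'done')
    GROUP BY pid
    ORDER BY pid
""").fetchall()

# Option B: subtract only receivings explicitly linked to each open PO.
# Receivings are pre-aggregated per (source_po_id, pid) so the join cannot
# multiply PO rows.
option_b = conn.execute("""
    SELECT po.pid, SUM(po.ordered - COALESCE(r.linked_qty, 0)) AS on_order_qty
    FROM purchase_orders po
    LEFT JOIN (
        SELECT source_po_id, pid, SUM(received_qty) AS linked_qty
        FROM receivings
        WHERE source_po_id IS NOT NULL
        GROUP BY source_po_id, pid
    ) r ON r.source_po_id = po.po_id AND r.pid = po.pid
    WHERE po.status NOT IN ('canceled', 'done')
    GROUP BY po.pid
    ORDER BY po.pid
""").fetchall()

print(daily)     # [(101, 2, 5, 11.0)]
print(option_a)  # [(101, 10), (102, 4)]
print(option_b)  # [(101, 7), (102, 4)]
```

Product 102 shows the limitation of both options: all 4 units arrived on an unlinked receiving, yet it still reports 4 on order because nothing ties R3 to PO-3. Option B can also go negative if more is received against a PO than was ordered, so a real query may want to clamp at zero.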
Summary of this Approach:
- Pros:
  - Accurately models distinct order and receiving events.
  - Provides a definitive source (receivings) for all received inventory.
  - Simplifies the purchase_orders table and its import logic.
  - Avoids complex/potentially inaccurate allocation logic for unlinked receivings within the main tables.
  - Avoids synthetic records.
  - Fixes downstream reporting (daily_snapshots receiving data).
- Cons:
  - Requires creating/managing the receivings table.
  - Requires modifying downstream queries (ReceivingData, OnOrderInfo, HistoricalDates).
  - Calculating a precise "net quantity still expected to arrive" (true on-order minus all relevant fulfillment) becomes more complex and may require specific business rules or heuristics outside the basic table structure if Option A for OnOrderInfo isn't sufficient.
This two-table approach (purchase_orders + receivings) seems the most robust and accurate way to handle your requirement for complete receiving records independent of potentially flawed PO linking. It directly addresses the shortcomings of the previous attempts.