Okay, I understand completely now. The core issue is that the previous approaches tried too hard to reconcile every receipt back to a specific PO line within the purchase_orders table structure, which doesn't reflect the reality where receipts can be independent events. Your downstream scripts, especially daily_snapshots and product_metrics, rely on having a complete picture of all receivings.

Let's pivot to a model that respects both distinct data streams: Orders (Intent) and Receivings (Actuals).

Proposed Solution: Separate purchase_orders and receivings Tables

This is the cleanest way to model the reality you've described.

  1. purchase_orders Table:

    • Purpose: Tracks the status and details of purchase orders placed. Represents the intent to receive goods.
    • Key Columns: po_id, pid, ordered (quantity ordered), po_cost_price, date (order/created date), expected_date, status (PO lifecycle: 'ordered', 'canceled', 'done'), vendor, notes, etc.
    • Crucially: This table does not need a received column or a receiving_history column derived from complex allocations. It focuses solely on the PO itself.
  2. receivings Table (New or Refined):

    • Purpose: Tracks every single line item received, regardless of whether it was linked to a PO during the receiving process. Represents the actual goods that arrived.
    • Key Columns:
      • receiving_id (Identifier for the overall receiving document/batch)
      • pid (Product ID received)
      • received_qty (Quantity received for this specific line)
      • cost_each (Actual cost paid for this item on this receiving)
      • received_date (Actual date the item was received)
      • received_by (Employee ID/Name)
      • source_po_id (The po_id entered on the receiving screen, nullable. Stores the original link attempt, even if it was wrong or missing)
      • source_receiving_status (The status from the source receivings table: 'partial_received', 'full_received', 'paid', 'canceled')
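
To make the two-table layout above concrete, here is a minimal sketch of the schema. It runs against an in-memory SQLite database purely so it can be executed as-is; the column types, the composite primary keys, and the `create_schema` helper are assumptions for illustration, not something taken from your existing database.

```python
import sqlite3

# Hypothetical DDL: column names follow the lists above, but the types and
# the composite keys are assumptions, not taken from the real database.
SCHEMA = """
CREATE TABLE purchase_orders (
    po_id         TEXT    NOT NULL,
    pid           INTEGER NOT NULL,
    ordered       INTEGER NOT NULL,
    po_cost_price REAL,
    date          TEXT,
    expected_date TEXT,
    status        TEXT    NOT NULL,   -- 'ordered', 'canceled', 'done', ...
    vendor        TEXT,
    notes         TEXT,
    PRIMARY KEY (po_id, pid)          -- note: no received / receiving_history column
);
CREATE TABLE receivings (
    receiving_id            TEXT    NOT NULL,
    pid                     INTEGER NOT NULL,
    received_qty            INTEGER NOT NULL,
    cost_each               REAL,
    received_date           TEXT,
    received_by             TEXT,
    source_po_id            TEXT,     -- nullable, deliberately not a foreign key
    source_receiving_status TEXT,
    PRIMARY KEY (receiving_id, pid)
);
"""


def create_schema(conn):
    """Create both tables on the given connection."""
    conn.executescript(SCHEMA)


conn = sqlite3.connect(":memory:")
create_schema(conn)

# A receiving with no PO link is a perfectly valid row in this model.
conn.execute(
    "INSERT INTO receivings (receiving_id, pid, received_qty, cost_each,"
    " received_date, source_po_id, source_receiving_status)"
    " VALUES ('R1', 42, 5, 2.5, '2024-01-10', NULL, 'full_received')"
)
```

Leaving `source_po_id` without a foreign key is the design choice that matters here: a wrong or missing PO number on the receiving screen can be stored as entered without blocking the insert.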

How the Import Script Changes:

  1. Fetch POs: Fetch data from po and po_products.
  2. Populate purchase_orders:
    • Insert/Update rows into purchase_orders based directly on the fetched PO data.
    • Set po_id, pid, ordered, po_cost_price, date (COALESCE(date_ordered, date_created)), expected_date.
    • Set status by mapping the source po.status code directly ('ordered', 'canceled', 'done', etc.).
    • No complex allocation needed here.
  3. Fetch Receivings: Fetch data from receivings and receivings_products.
  4. Populate receivings:
    • For every line item fetched from receivings_products:
      • Perform necessary data validation (dates, numbers).
      • Insert a new row into receivings with all the relevant details (receiving_id, pid, received_qty, cost_each, received_date, received_by, source_po_id, source_receiving_status).
    • Use ON CONFLICT (receiving_id, pid) DO UPDATE SET ... for incremental updates, substituting whatever unique key your source data actually supports. If performance allows, deleting and re-inserting all rows for a receiving_id is a simpler alternative.
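
The receivings step above could be sketched roughly as follows. This again uses in-memory SQLite so it is runnable on its own (SQLite 3.24+ accepts the same ON CONFLICT ... DO UPDATE form as PostgreSQL). The `is_valid` rules, the `import_receivings` helper, and the `(receiving_id, pid)` key are assumptions; if one receiving can list the same pid twice, you would need a different key.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE receivings (
        receiving_id TEXT NOT NULL, pid INTEGER NOT NULL,
        received_qty INTEGER NOT NULL, cost_each REAL,
        received_date TEXT, received_by TEXT,
        source_po_id TEXT, source_receiving_status TEXT,
        PRIMARY KEY (receiving_id, pid)   -- assumed unique key
    )
""")

UPSERT_SQL = """
    INSERT INTO receivings (receiving_id, pid, received_qty, cost_each,
                            received_date, received_by, source_po_id,
                            source_receiving_status)
    VALUES (:receiving_id, :pid, :received_qty, :cost_each,
            :received_date, :received_by, :source_po_id,
            :source_receiving_status)
    ON CONFLICT (receiving_id, pid) DO UPDATE SET
        received_qty            = excluded.received_qty,
        cost_each               = excluded.cost_each,
        received_date           = excluded.received_date,
        received_by             = excluded.received_by,
        source_po_id            = excluded.source_po_id,
        source_receiving_status = excluded.source_receiving_status
"""


def is_valid(row):
    """Assumed validation: ISO date, non-negative quantity and cost."""
    try:
        date.fromisoformat(row["received_date"])
        return float(row["received_qty"]) >= 0 and float(row["cost_each"]) >= 0
    except (KeyError, TypeError, ValueError):
        return False


def import_receivings(conn, rows):
    """Upsert every valid row in one transaction; return how many were written."""
    valid = [r for r in rows if is_valid(r)]
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT_SQL, valid)
    return len(valid)


rows = [{
    "receiving_id": "R1", "pid": 42, "received_qty": 5, "cost_each": 2.5,
    "received_date": "2024-01-10", "received_by": "emp-7",
    "source_po_id": None, "source_receiving_status": "partial_received",
}]
import_receivings(conn, rows)
rows[0]["received_qty"] = 8      # same key, corrected quantity
import_receivings(conn, rows)    # updates the existing row instead of duplicating it
```

Re-running the import with the same key updates the existing row, which is what makes incremental runs safe to repeat.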

Impact on Downstream Scripts (and how to adapt):

  • Initial Query (Active POs):

    • SELECT ... FROM purchase_orders po WHERE po.status NOT IN ('canceled', 'done', 'paid_equivalent?') AND po.date >= ...
    • active_pos: COUNT(DISTINCT po.po_id) based on the filtered POs.
    • overdue_pos: Add AND po.expected_date < CURRENT_DATE.
    • total_units: SUM(po.ordered). Represents total units ordered on active POs.
    • total_cost: SUM(po.ordered * po.po_cost_price). Cost of units ordered.
    • total_retail: SUM(po.ordered * pm.current_price). Retail value of units ordered.
    • Result: This query now cleanly reports on the status of orders placed, which seems closer to its original intent. The filter po.receiving_status NOT IN ('partial_received', 'full_received', 'paid') is replaced by po.status NOT IN ('canceled', 'done', 'paid_equivalent?'). The 90% received check is removed because purchase_orders no longer carries a received column to compare against.
  • daily_product_snapshots:

    • SalesData CTE: No change needed.
    • ReceivingData CTE: Must be changed. Query the receivings table instead of purchase_orders.
      ReceivingData AS (
          SELECT
              rl.pid,
              COUNT(DISTINCT rl.receiving_id) as receiving_doc_count, 
              SUM(rl.received_qty) AS units_received,
              SUM(rl.received_qty * rl.cost_each) AS cost_received
          FROM public.receivings rl
          WHERE rl.received_date::date = _date
            -- Optional: exclude canceled receivings if needed.
            -- IS DISTINCT FROM keeps rows whose status is NULL, which <> would silently drop.
            -- AND rl.source_receiving_status IS DISTINCT FROM 'canceled'
          GROUP BY rl.pid
      ),
      
    • Result: This now accurately reflects all units received on a given day from the definitive source.
  • update_product_metrics:

    • CurrentInfo CTE: No change needed (pulls from products).
    • OnOrderInfo CTE: Needs re-evaluation. How do you want to define "On Order"?
      • Option A (Strict PO View): SUM(po.ordered) FROM purchase_orders po WHERE po.status NOT IN ('canceled', 'done', 'paid_equivalent?'). This is the quantity on open orders, ignoring fulfillment state. Simple, but it will overestimate if items arrived unlinked.
      • Option B (Approximate Fulfillment): SUM(po.ordered) from open POs, minus SUM(rl.received_qty) FROM receivings rl WHERE rl.source_po_id = po.po_id AND rl.pid = po.pid (summing only directly linked receivings for the same product). Better, but it still misses fulfillment via unlinked receivings.
      • Option C (Heuristic): SUM(po.ordered) from open POs, minus SUM(rl.received_qty) FROM receivings rl WHERE rl.pid = po.pid AND rl.received_date >= po.date. This tries to account for unlinked receivings but is imprecise: for example, one receiving can be counted against several open POs for the same pid.
      • Recommendation: Start with Option A for simplicity, clearly labeling it "Quantity on Open POs". You might need a separate process or metric for a more nuanced view of the expected vs. actual pipeline.
       -- Example for Option A
       OnOrderInfo AS (
           SELECT
               pid,
               SUM(ordered) AS on_order_qty, -- Total qty on open POs
               SUM(ordered * po_cost_price) AS on_order_cost -- Cost of qty on open POs
           FROM public.purchase_orders
           WHERE status NOT IN ('canceled', 'done', 'paid_equivalent?') -- Define your open statuses
           GROUP BY pid
       ),
      
    • HistoricalDates CTE:
      • date_first_sold, max_order_date: No change (queries orders).
      • date_first_received_calc, date_last_received_calc: Must be changed. Query MIN(rl.received_date) and MAX(rl.received_date) from the receivings table grouped by pid.
    • SnapshotAggregates CTE:
      • received_qty_30d, received_cost_30d: These are calculated from daily_product_snapshots, which are now correctly sourced from receivings, so this part is fine.
    • Forecasting Calculations: Will use the chosen definition of on_order_qty. Be aware of the implications of Option A (potentially inflated if unlinked receivings fulfill orders).
    • Result: Metrics are calculated based on distinct order data and complete receiving data. The definition of "on order" needs careful consideration.
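
To show how much the "on order" definition matters, here is a small sketch comparing Option A and Option B on made-up data (in-memory SQLite, so it runs as-is). The sample rows, the trimmed-down table definitions, and the extra `pid` match in the join are assumptions for illustration; the placeholder 'paid_equivalent?' status is left out because it has not been defined yet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchase_orders (po_id TEXT, pid INTEGER, ordered INTEGER, status TEXT);
    CREATE TABLE receivings (receiving_id TEXT, pid INTEGER, received_qty INTEGER,
                             source_po_id TEXT);

    -- Made-up sample data
    INSERT INTO purchase_orders VALUES ('P1', 1, 10, 'ordered');  -- open PO
    INSERT INTO purchase_orders VALUES ('P2', 1,  5, 'done');     -- closed PO, ignored
    INSERT INTO receivings VALUES ('R1', 1, 4, 'P1');             -- linked to P1
    INSERT INTO receivings VALUES ('R2', 1, 3, NULL);             -- unlinked
""")

ON_ORDER_SQL = """
    SELECT po.pid,
           SUM(po.ordered)                             AS on_order_option_a,
           SUM(po.ordered - COALESCE(r.linked_qty, 0)) AS on_order_option_b
    FROM purchase_orders po
    LEFT JOIN (
        -- Pre-aggregate linked receivings so the join cannot duplicate PO rows
        SELECT source_po_id, pid, SUM(received_qty) AS linked_qty
        FROM receivings
        WHERE source_po_id IS NOT NULL
        GROUP BY source_po_id, pid
    ) r ON r.source_po_id = po.po_id AND r.pid = po.pid
    WHERE po.status NOT IN ('canceled', 'done')   -- define your open statuses
    GROUP BY po.pid
"""

on_order = conn.execute(ON_ORDER_SQL).fetchall()
print(on_order)  # [(1, 10, 6)]
```

For pid 1, Option A reports 10 units on order and Option B reports 6. Neither accounts for the 3 unlinked units on R2, which is exactly the gap that Option C tries to close heuristically.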

Summary of this Approach:

  • Pros:
    • Accurately models distinct order and receiving events.
    • Provides a definitive source (receivings) for all received inventory.
    • Simplifies the purchase_orders table and its import logic.
    • Avoids complex/potentially inaccurate allocation logic for unlinked receivings within the main tables.
    • Avoids synthetic records.
    • Fixes downstream reporting (daily_snapshots receiving data).
  • Cons:
    • Requires creating/managing the receivings table.
    • Requires modifying downstream queries (ReceivingData, OnOrderInfo, HistoricalDates).
    • A precise "net quantity still expected to arrive" (true on-order minus all relevant fulfillment) is harder to calculate. If Option A for OnOrderInfo isn't sufficient, it will need specific business rules or heuristics that live outside the basic table structure.

This two-table approach (purchase_orders + receivings) seems the most robust and accurate way to handle your requirement for complete receiving records independent of potentially flawed PO linking. It directly addresses the shortcomings of the previous attempts.