- Missing Updates for Reorder Point and Safety Stock [RESOLVED - product-metrics.js]
- Problem: In the `product_metrics` table (used by the inventory health view), the fields `reorder_point` and `safety_stock` are never updated in the product metrics calculations. Although a helper function (`calculateReorderQuantities`) exists and computes these values, the update query in the `calculateProductMetrics` function does not assign any values to these columns.
- Effect: The inventory health view relies on these fields (using COALESCE to default them to 0), which means that stock might never be classified as "Reorder" or "Healthy" based on the proper reorder point or safety stock calculations.
- Example: Even if a product's base metrics would require a reorder (for example, if its days of inventory are low), the view always shows a value of 0 for `reorder_point` and `safety_stock`.
- Fix: Update the product metrics query (or add a subsequent update) so that `pm.reorder_point` and `pm.safety_stock` are calculated (for instance, by integrating the logic from `calculateReorderQuantities`) and stored in the table.
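The fix can be sketched with the textbook reorder-point and safety-stock formulas. The helper names `safetyStock`/`reorderPoint`, the ~95% service-level z-score, and all input numbers are assumptions for illustration (only `calculateReorderQuantities` exists in the reviewed code):

```javascript
// Sketch only: standard reorder-point / safety-stock formulas. The inputs
// (demand stats, lead time) and the ~95% service-level z-score are assumptions.
const Z_SERVICE_95 = 1.65;

function safetyStock(demandStdDevPerDay, leadTimeDays) {
  // Buffer against demand variability during the lead time.
  return Math.ceil(Z_SERVICE_95 * demandStdDevPerDay * Math.sqrt(leadTimeDays));
}

function reorderPoint(avgDailyDemand, leadTimeDays, safety) {
  // Expected demand during the lead time plus the safety buffer.
  return Math.ceil(avgDailyDemand * leadTimeDays + safety);
}

// Example: 10 units/day average demand, std dev 3, 7-day lead time.
const ss = safetyStock(3, 7);        // 14
const rp = reorderPoint(10, 7, ss);  // 84

// The values could then be persisted alongside the other product metrics, e.g.:
// UPDATE product_metrics SET reorder_point = ?, safety_stock = ? WHERE pid = ?
```

Whatever formula is chosen, the key point is that the computed values must actually be written back to `pm.reorder_point` and `pm.safety_stock`.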
- Overwritten Module Exports When Combining Scripts [RESOLVED - calculate-metrics.js]
- Problem: The code provided shows two distinct exports. The main metrics calculation module exports `calculateMetrics` (along with `cancel` and `getProgress` helpers), but later in the same concatenated file the module exports are overwritten.
- Effect: If these two code sections end up in a single module file, the export for the main calculation will be lost. This would break any code that calls the overall metrics calculation.
- Example: An external caller expecting to run `calculateMetrics` would instead receive the `calculateProductMetrics` function.
- Fix: Make sure each script resides in its own module file. Verify that the module boundaries and exports are not accidentally merged or overwritten when deployed.
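A minimal simulation of the overwrite, using a stand-in for the CommonJS `module` object (the function bodies are placeholders, not the real implementations):

```javascript
// Simulates two scripts concatenated into one CommonJS file; the function
// bodies are stand-ins for the real implementations.
const fakeModule = { exports: {} };

function calculateMetrics() { return 'all metrics'; }
function calculateProductMetrics() { return 'product metrics'; }

fakeModule.exports = { calculateMetrics };        // end of first script
fakeModule.exports = { calculateProductMetrics }; // second script replaces the whole object

// calculateMetrics is now gone from the exports. If the scripts must share a
// file, merging instead of reassigning avoids the loss:
// Object.assign(fakeModule.exports, { calculateMetrics });
```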
- Potential Formula Issue in EOQ Calculation (Reorder Qty) [RESOLVED - product-metrics.js]
- Problem: The helper function `calculateReorderQuantities` uses an EOQ formula with a holding cost expressed as a percentage (0.25) rather than a per-unit cost.
- Effect: If the intent was to use the traditional EOQ formula (which expects a holding cost per unit rather than a percentage), this could lead to an incorrect reorder quantity.
- Example: For a given annual demand and fixed order cost, the computed reorder quantity might be higher or lower than expected.
- Fix: Double-check the EOQ formula. If the intention is to compute based on a percentage, document that clearly; otherwise, adjust the formula to use the proper per-unit holding cost value.
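To make the ambiguity concrete, here is the classic EOQ formula computed both ways. The demand, order-cost, and unit-cost figures are invented for illustration:

```javascript
// Classic EOQ: sqrt(2 * D * S / H), where H must be a holding cost PER UNIT
// per year. Passing a bare percentage like 0.25 only makes sense if it is
// first multiplied by the unit cost. All numbers below are illustrative.
function eoq(annualDemand, orderCostPerOrder, holdingCostPerUnit) {
  return Math.sqrt((2 * annualDemand * orderCostPerOrder) / holdingCostPerUnit);
}

const D = 1200;        // annual demand, units
const S = 50;          // fixed cost per order
const unitCost = 20;   // cost of one unit
const holdingPct = 0.25;

const suspect = eoq(D, S, holdingPct);              // treats 0.25 as $/unit/year -> ~693
const corrected = eoq(D, S, holdingPct * unitCost); // 25% of $20 = $5/unit/year -> ~155
```

The two interpretations differ by more than a factor of four here, so it matters which one the business intends.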
- Potential Overlap or Redundancy in GMROI Calculation
- Problem: In the time aggregates function, GMROI is calculated in two steps. The initial INSERT query computes GMROI as `CASE WHEN s.inventory_value > 0 THEN (s.total_revenue - s.total_cost) / s.inventory_value ELSE 0 END`, and a subsequent UPDATE query then recalculates it as an annualized value using gross profit and active days.
- Effect: Overwriting a computed value may be intentional to refine the metric, but if not coordinated it can cause confusion or unexpected output in the `product_time_aggregates` table.
- Example: A product's GMROI might first appear as a simple ratio but then be updated to a scaled value based on the number of active days, which could lead to inconsistent reporting if not documented.
- Fix: Confirm that the two-step process is intended. If only the annualized GMROI is desired, consolidate the calculation into one query or clearly document why both steps are needed. This observation complements the earlier note about duplicate or overwritten calculations in the previous script; in both cases, verify that updates (or recalculations) are intentional rather than an oversight.
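The two figures can be compared side by side. The annualization shown is an assumption based on the description above (scaling by 365 / active days), and the numbers are invented:

```javascript
// Simple GMROI (as in the INSERT) vs. an annualized GMROI (as the UPDATE is
// described): the same ratio, scaled by 365 / activeDays. Formula is a sketch.
function gmroiSimple(totalRevenue, totalCost, inventoryValue) {
  return inventoryValue > 0 ? (totalRevenue - totalCost) / inventoryValue : 0;
}

function gmroiAnnualized(totalRevenue, totalCost, inventoryValue, activeDays) {
  if (inventoryValue <= 0 || activeDays <= 0) return 0;
  return ((totalRevenue - totalCost) / inventoryValue) * (365 / activeDays);
}

// A product active for only 90 days: the two steps report very different values.
const simple = gmroiSimple(10000, 6000, 8000);         // 0.5
const annual = gmroiAnnualized(10000, 6000, 8000, 90); // ~2.03
```

Reports built on this table should state which of the two definitions they display.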
- Handling of Products Without Orders or Purchase Data
- Problem: In the INSERT query of the time aggregates function, the UNION covers two cases: one for products with order data (from `monthly_sales`) and one for products that have entries in `monthly_stock` but no matching order data.
- Effect: If a product has neither orders nor purchase orders, it won't get an entry in `product_time_aggregates`. Depending on business rules, this might be acceptable or might mean missing data.
- Example: A product that's new or rarely ordered might not appear in the time aggregates view, potentially affecting downstream calculations.
- Fix: If you need every product to have an aggregate record (even with zeros), add an additional query or logic to ensure that products with no matching records in either CTE are inserted with default values.
- Redundant Recalculation of Vendor Metrics
- Problem: As in prior scripts, cumulative metrics (like total_revenue and total_cost) are calculated in multiple query steps without necessary validation or optimization. In the vendor metrics script, calculations for total revenue and margin are performed within a `WITH` clause, which is then used in other parts of the process, making it more complex than needed.
- Effect: The same data is queried multiple times across subqueries. This duplication can degrade performance and cause excess computation if the subqueries are not optimized or correctly indexed.
- Example: Vendor sales and vendor purchase order (PO) metrics are calculated in separate `WITH` clauses, leading to repeated calculations.
- Fix: Consolidate the required metrics into fewer queries or reuse the results within the `WITH` clause itself. Avoid redundant calculations of revenue and cost unless truly necessary.
- Handling Products Without Orders or Purchase Orders
- Problem: In your `calculateVendorMetrics` script, the initial insert for vendor sales doesn't fully address products that might not have matching orders or purchase orders. If a vendor has products without any sales within the last 12 months, the results may not be fully accurate unless handled explicitly.
- Effect: If no orders exist for a product associated with a particular vendor, that product will not contribute to the vendor's metrics, potentially omitting important data when calculating total_orders or total_revenue.
- Example: The scripted statistics fill gaps, but products with no recent purchase or sales orders might not be counted accurately.
- Fix: Include logic to handle scenarios where these products still need to be part of the vendor calculation. Use a `LEFT JOIN` wherever possible to account for cases without sales or purchase orders.
- Redundant `ON DUPLICATE KEY UPDATE`
- Problem: Multiple queries in the `calculateVendorMetrics` script use `ON DUPLICATE KEY UPDATE` clauses to handle repeated metrics updates. This is useful for ensuring the most up-to-date calculations but can cause inconsistencies if multiple calculations happen for the same product or vendor simultaneously.
- Effect: This approach can lead to inaccurate updates of brand-specific data when insertion and update overlap. Each time a new batch is added, an existing entry could be overwritten if not handled correctly.
- Example: Vendor country, category, or sales-related metrics could unintentionally update during processing.
- Fix: Match on current status more robustly when rows already exist to avoid unnecessary updates. Ensure that the key used for `ON DUPLICATE KEY` aligns with any foreign key relationships that might indicate an already processed entry.
- SQL Query Performance with Multiple Nested `WITH` Clauses
- Problem: Heavily nested queries (especially WITH clauses) may lead to slow performance depending on the size of the dataset.
- Effect: Computational burden could be high when the database is large, e.g., querying purchase orders, vendor sales, and product info simultaneously. Even with proper indexes, the deployment might struggle in production environments.
- Example: Multiple `WITH` clauses in the vendor and brand metrics calculation scripts might work fine on small datasets but degrade performance in production.
- Fix: Combine some subqueries and reduce the layers of computation needed for calculating final metrics. Test performance on a production-sized dataset to see how the nested queries behave.
- Missing Updates for Reorder Metrics (Vendor/Brand)
- Previously Identified Issue: Inconsistent updates for reorder_point and safety_stock across earlier scripts.
- Current Impact on This Script: The vendor and brand metrics do not have explicit updates for reorder point or safety stock, which are essential for inventory evaluation.
- Effect: The correct thresholds and reorder logic for vendor product inventory aren't fully accounted for in these scripts.
- Fix: Integrate relevant logic to update reorder_point or safety_stock within the vendor and brand metrics calculations. Ensure that it's consistently computed and stored.
- Data Integrity and Consistency When Tracking Sales Growth or Performance
- Problem: Brand metrics include a sales growth clause whose results can be skewed severely if period data varies considerably.
- Effect: If period boundaries are incorrect or records are missing, this can produce drastic growth rate calculations.
- Example: If the "previous" period has no sales but the "current" period does, the growth rate will show as 100% (or the division is undefined), regardless of the actual trend.
- Fix: Implement checks that ensure both periods are valid and that the system calculates growth accurately, avoiding growth rates based solely on potential outliers. Replace consistent gaps with a no-growth rate or a meaningful zero.
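A guarded helper along these lines illustrates the fix; the zero-baseline policy shown is one option, and the business may prefer NULL or a capped value instead:

```javascript
// Growth rate with an explicit guard: when the previous period has no valid
// revenue, report 0 ("no growth") instead of dividing by zero or emitting a
// misleading fixed 100%. The fallback policy is an assumption.
function growthRate(currentRevenue, previousRevenue) {
  if (previousRevenue == null || previousRevenue <= 0) {
    return 0; // no valid baseline -> neutral result
  }
  return ((currentRevenue - previousRevenue) / previousRevenue) * 100;
}
```

For example, `growthRate(150, 100)` yields 50, while `growthRate(500, 0)` yields 0 rather than an outlier.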
- Exclusion of Vendors With No Sales
The vendor metrics query is driven by the `vendor_sales` CTE, which aggregates data only for vendors that have orders in the past 12 months.
- Impact: Vendors that have purchase activity (or simply exist in vendor_details) but no recent sales won't show up in vendor_metrics. This could cause the frontend to miss metrics for vendors that might still be important.
- Fix: Consider adding a UNION or changing the driving set so that all vendors (for example, from vendor_details) are included, even if they have zero sales.
- Identical Formulas for On-Time Delivery and Order Fill Rates
Both metrics are calculated as `(received_orders / total_orders) * 100`.
- Impact: If the business expects these to be distinct (for example, one might factor in on-time receipt versus mere receipt), then showing identical values on the frontend could be misleading.
- Fix: Verify and adjust the formulas if on-time delivery and order fill rates should be computed differently.
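If the two rates are meant to differ, one plausible split looks like this. The field names (`receivedAt`, `expectedAt`, `qtyOrdered`, `qtyReceived`) are assumptions for illustration; the project's actual definitions may differ:

```javascript
// On-time delivery: share of received orders that arrived by the expected
// date. Order fill rate: share of the ordered quantity actually received.
// Field names are assumptions, not the project's schema.
function onTimeDeliveryRate(orders) {
  const received = orders.filter(o => o.receivedAt != null);
  if (received.length === 0) return 0;
  const onTime = received.filter(o => o.receivedAt <= o.expectedAt);
  return (onTime.length / received.length) * 100;
}

function orderFillRate(orders) {
  const totalOrdered = orders.reduce((sum, o) => sum + o.qtyOrdered, 0);
  if (totalOrdered === 0) return 0;
  const totalReceived = orders.reduce((sum, o) => sum + o.qtyReceived, 0);
  return (totalReceived / totalOrdered) * 100;
}

const pos = [
  { receivedAt: 2, expectedAt: 3, qtyOrdered: 10, qtyReceived: 10 }, // on time, full
  { receivedAt: 5, expectedAt: 3, qtyOrdered: 10, qtyReceived: 5 },  // late, partial
];
// onTimeDeliveryRate(pos) -> 50, orderFillRate(pos) -> 75
```

With distinct definitions the two numbers diverge as soon as an order is late or partially filled, which is exactly the signal the frontend presumably wants.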
- Handling Nulls and Defaults in Aggregations
The query uses COALESCE in most places, but be sure that every aggregated value (like average lead time) correctly defaults when no data is present.
- Impact: Incorrect defaults might cause odd or missing numbers on the production interface.
- Fix: Double-check that all numeric aggregates reliably default to 0 where needed.
- Inconsistent Stock Filtering Conditions
In the main brand metrics query, the CTE filters products with the condition `p.stock_quantity <= 5000 AND p.stock_quantity >= 0`, whereas in the brand time-based metrics query the condition is only `p.stock_quantity <= 5000`.
- Impact: This discrepancy may lead to inconsistent numbers between overall brand metrics and time-based metrics on the frontend (for example, if any products have negative stock due to data issues).
- Fix: Standardize the filtering criteria so that both queries treat out-of-range stock values in the same way.
- Growth Rate Calculation Periods
The growth rate is computed by comparing revenue from the last 3 months ("current") against a period from 15 to 12 months ago ("previous").
- Impact: This narrow window may not reflect typical year-over-year performance and could lead to volatile or unexpected growth percentages on the frontend.
- Fix: Revisit the business logic for growth; if a longer or different comparison period is preferred, adjust the date intervals accordingly.
- Potential NULLs in Aggregated Time-Based Metrics
In the brand time-based metrics query, aggregate expressions such as `SUM(o.quantity * o.price)` aren't wrapped with COALESCE.
- Impact: If there are no orders for a given brand/month, these sums might return NULL rather than 0, which could propagate into the frontend display.
- Fix: Wrap such aggregates in COALESCE (e.g. `COALESCE(SUM(o.quantity * o.price), 0)`) to ensure a default numeric value.
- Grouping by Category Status in Base Metrics Insert
- Problem: The INSERT for base category metrics groups by both `c.cat_id` and `c.status` even though the table's primary key is just `category_id`.
- Effect: If a category's status changes over time, the grouping may produce unexpected updates (or even multiple groups before the duplicate key update kicks in), possibly causing the wrong status or aggregated figures to be stored.
- Example: A category that toggles between "active" and "inactive" might have its metrics calculated differently on different runs.
- Fix: Ensure that the grouping keys match the primary key (or that the status update logic is exactly as intended) so that a single row per category is maintained.
- Potential Null Handling in Margin Calculations
- Problem: In the query for category time metrics, the calculation of average margin uses expressions such as `SUM(o.quantity * (o.price - GREATEST(p.cost_price, 0)))` without using COALESCE on `p.cost_price`.
- Effect: If any product's `cost_price` is NULL, then `GREATEST(p.cost_price, 0)` returns NULL and the resulting sum (and thus the margin) could become NULL rather than defaulting to 0. This might lead to missing or misleading margin figures on the frontend.
- Example: A product with a missing cost price would make the entire margin expression evaluate to NULL even when sales exist.
- Fix: Replace `GREATEST(p.cost_price, 0)` with `GREATEST(COALESCE(p.cost_price, 0), 0)` (or simply use `COALESCE(p.cost_price, 0)`) to ensure that missing values are handled.
- Data Coverage in Growth Rate Calculation
- Problem: The growth rate update depends on multiple CTEs (current period, previous period, and trend analysis) that require a minimum amount of data (for instance, `HAVING COUNT(*) >= 6` in the trend_stats CTE).
- Effect: Categories with insufficient historical data will fall into the ELSE branch (or may even be skipped if no revenue is present), which might result in a growth rate of 0.0 or an unexpected value.
- Example: A newly created category that has only two months of data won't have trend analysis, so its growth rate will be calculated solely from the simple difference, which might not reflect true performance.
- Fix: Confirm that this fallback behavior is acceptable for production; if not, adjust the logic so that every category receives a consistent growth rate even with sparse data.
- Omission of Forecasts for Zero-Sales Categories
- Observation: The category sales metrics query uses a `HAVING AVG(cs.daily_quantity) > 0` clause.
- Effect: Categories without any average daily sales will not receive a forecast record in `category_sales_metrics`. If the frontend expects a row (even with zeros) for every category, this will lead to missing data.
- Fix: Verify that it's acceptable for categories with no sales to have no forecast entry. If not, adjust the query so that a default forecast (with zeros) is inserted.
- Randomness in Category-Level Forecast Revenue Calculation
- Problem: In the category-level forecasts query, the forecast revenue is multiplied by a factor of `(0.95 + (RAND() * 0.1))`.
- Effect: This introduces randomness into the forecast figures, so repeated runs could yield slightly different values. If deterministic forecasts are expected on the production frontend, this could lead to inconsistent displays.
- Example: The same category might show a 5% higher forecast on one run and 3% on another because of the random multiplier.
- Fix: Confirm that this randomness is intentional for your forecasting model; if forecasts are meant to be reproducible, remove the `RAND()` factor or replace it with a fixed multiplier.
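If reproducibility is the goal, the multiplier can simply be made deterministic; since `0.95 + RAND() * 0.1` is centered on 1.0, a fixed 1.0 preserves the average behavior (sketch with invented numbers):

```javascript
// Forecast revenue with an optional jitter flag. With jitter off (the
// default here), repeated runs return identical figures.
function forecastRevenue(baseRevenue, { jitter = false } = {}) {
  const factor = jitter ? 0.95 + Math.random() * 0.1 : 1.0;
  return baseRevenue * factor;
}

const runA = forecastRevenue(1000); // 1000, every run
const runB = forecastRevenue(1000); // identical to runA
```

Keeping the jitter behind an explicit flag also documents that any remaining randomness is a deliberate modeling choice.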
- Multi-Statement Cleanup of Temporary Tables
- Problem: The cleanup query drops multiple temporary tables in one call (separated by semicolons).
- Effect: If your Node.js MySQL driver isn't configured to allow multi-statement execution, this query may fail, leaving temporary tables behind. Leftover temporary tables might eventually cause conflicts or resource issues.
- Example: Running the cleanup query could produce an error like "multi-statement queries not enabled," preventing proper cleanup.
- Fix: Either configure your database connection to allow multi-statements or issue a separate query for each temporary table drop to ensure that the cleanup runs successfully.
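One safe pattern is to split the cleanup string into individual statements before execution. `splitStatements` is a hypothetical helper (the naive split on `;` is fine for plain DROP statements but not for arbitrary SQL), and `temp_daily_stats` is an invented second table name:

```javascript
// Split a multi-statement cleanup string into individual statements so each
// can be executed on its own, avoiding the multi-statement driver setting.
// Note: splitting on ';' is only safe for simple statements like these.
function splitStatements(sql) {
  return sql
    .split(';')
    .map(stmt => stmt.trim())
    .filter(stmt => stmt.length > 0);
}

const cleanupSql = `
  DROP TEMPORARY TABLE IF EXISTS temp_revenue_ranks;
  DROP TEMPORARY TABLE IF EXISTS temp_daily_stats;
`;

const statements = splitStatements(cleanupSql);
// With a promise-based driver, each would then run separately, e.g.:
// for (const stmt of statements) await connection.query(stmt);
```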
- Handling Products with No Sales Data
- Problem: In the product-level forecast calculation, the CTE `daily_stats` includes a `HAVING AVG(ds.daily_quantity) > 0` clause.
- Effect: Products that have no sales (or a zero average daily quantity) will be excluded from the forecasts. This means the frontend won't show forecasts for non-selling products, which might be acceptable but could also be a completeness issue.
- Example: A product that has never sold will not appear in the `sales_forecasts` table.
- Fix: Confirm that it is intended for forecasts to be generated only for products with some sales activity. If forecasts are required for all products, adjust the query to insert default forecast records for products with zero sales.
- Complexity of the Forecast Formula Involving the Seasonality Factor
- Issue: The sales forecast calculations incorporate an adjustment factor using `COALESCE(sf.seasonality_factor, 0)` to modify forecast units and revenue. This means that if the seasonality data is missing (or not populated), the factor defaults to 0.
- Potential Problem: A default value of 0 will drastically alter the forecast calculations, often producing a forecast of 0 or an overly dampened forecast, when the intended behavior is likely a neutral multiplier (typically 1.0). This could result in forecasts that do not reflect the actual seasonal impact, skewing the figures that reach the frontend.
- Fix: Review your data source for seasonality (the `sales_seasonality` table) and ensure it's consistently populated. Alternatively, if missing seasonality data is possible, use a more neutral default (such as 1.0) in the COALESCE. This prevents the forecast formulas from over-simplifying (or even nullifying) the output due to missing seasonality factors.
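The effect of the default is easy to demonstrate on a simple multiplicative adjustment; the formula here is a sketch of how the factor is described to be applied, not the exact production expression:

```javascript
// COALESCE-style fallback: `??` substitutes the default only when the
// seasonality factor is null/undefined. A 0 default wipes the forecast out;
// a neutral 1.0 leaves the base forecast unchanged.
function adjustedForecast(baseUnits, seasonalityFactor, defaultFactor) {
  const factor = seasonalityFactor ?? defaultFactor;
  return baseUnits * factor;
}

const zeroDefault = adjustedForecast(120, null, 0);      // 0   -- forecast nullified
const neutralDefault = adjustedForecast(120, null, 1.0); // 120 -- base preserved
const withData = adjustedForecast(120, 1.25, 1.0);       // 150 -- real factor wins
```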
- Group By with Seasonality Factor Variability
- Observation: In the forecast insertion query, the GROUP BY clause includes `sf.seasonality_factor` along with other fields.
- Effect: If the seasonality factor differs (or is NULL versus a value) for different forecast dates, this might result in multiple rows for the same product and forecast date. The `ON DUPLICATE KEY UPDATE` clause will merge them, but only if the primary key (pid, forecast_date) is truly unique.
- Fix: Verify that the grouping produces exactly one row per product per forecast date. If there's potential for multiple rows due to seasonality variability, consider applying a COALESCE or an aggregation on the seasonality factor so that it does not affect grouping.
- Memory Management for Temporary Tables [RESOLVED - calculate-metrics.js]
- Problem: In metrics calculations, temporary tables aren't always properly cleaned up if the process fails between creation and the DROP statement.
- Effect: If a process fails after creating temporary tables but before dropping them, these tables remain in memory until the connection is closed. In a production environment with multiple calculation runs, this could lead to memory leaks or table name conflicts.
- Example: The `temp_revenue_ranks` table created in ABC classification could remain if the process fails before reaching the DROP statement.
- Fix: Implement proper cleanup in a finally block or use transaction management that ensures temporary tables are always cleaned up, even in failure scenarios.
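A try/finally sketch of the suggested fix. The connection API is a stand-in for whatever promise-based MySQL driver the project uses, and `withTempTable` is a hypothetical wrapper name:

```javascript
// Guarantees the temporary table is dropped even if the calculation throws.
// `connection.query` stands in for a promise-based MySQL driver call.
async function withTempTable(connection, runCalculation) {
  await connection.query(
    'CREATE TEMPORARY TABLE temp_revenue_ranks (pid INT, revenue_rank INT)'
  );
  try {
    return await runCalculation(connection);
  } finally {
    // Runs on success AND on failure, so the table never leaks.
    await connection.query('DROP TEMPORARY TABLE IF EXISTS temp_revenue_ranks');
  }
}

// Demo with a stub connection that records which statements ran:
const executed = [];
const stubConnection = {
  query: async (sql) => { executed.push(sql.trim().split(' ')[0]); },
};
const demo = withTempTable(stubConnection, async () => {
  throw new Error('calculation failed mid-run');
}).catch(() => executed); // the DROP has already run by the time this rejects
```

Wrapping every calculation that creates temporary tables in a helper like this keeps the cleanup guarantee in one place instead of repeating it per script.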