# Metrics Calculation System Documentation This document provides a comprehensive description of the metrics calculation system in the inventory management application, detailing the calculation modules, data transformations, formulas, and overall architecture. ## Table of Contents 1. [Overview](#overview) 2. [System Architecture](#system-architecture) 3. [Calculation Process](#calculation-process) 4. [Metrics Modules](#metrics-modules) - [Product Metrics](#product-metrics) - [Time Aggregates](#time-aggregates) - [Financial Metrics](#financial-metrics) - [Vendor Metrics](#vendor-metrics) - [Category Metrics](#category-metrics) - [Brand Metrics](#brand-metrics) - [Sales Forecasts](#sales-forecasts) 5. [Database Schema](#database-schema) 6. [Performance Optimizations](#performance-optimizations) 7. [Progress Tracking and Reporting](#progress-tracking-and-reporting) 8. [Error Handling](#error-handling) 9. [Utility Components](#utility-components) 10. [Implementation Notes](#implementation-notes) ## Overview The metrics calculation system processes inventory, sales, and purchase data to generate insights and forecasts for business decision-making. It powers dashboards, reports, and automated inventory recommendations that are core to the business operations. The system can operate in two modes: - **Complete Calculation**: Calculates all metrics from scratch - **Selective Calculation**: Calculates only specified metric types Key features of the system include: - Progress tracking with estimated completion time - Cancellation capability for long-running operations - Fault tolerance with automatic recovery - Temporary table usage for optimal performance - Batched processing for memory efficiency - Multiple calculation modules that can run independently ## System Architecture The calculation system follows a modular architecture with these components: ``` ├── calculate-metrics.js # Main orchestration script ├── metrics/ # Calculation modules │ ├── product-metrics.js # Product-level metrics calculation │ ├── time-aggregates.js # Time-based aggregation calculations │ ├── financial-metrics.js # Financial performance metrics │ ├── vendor-metrics.js # Supplier/vendor analytics │ ├── category-metrics.js # Category-level performance metrics │ ├── brand-metrics.js # Brand performance analytics │ ├── sales-forecasts.js # Future sales prediction │ └── utils/ # Shared utilities │ ├── db.js # Database connection management │ └── progress.js # Progress tracking utilities ``` The system uses PostgreSQL for data storage and processing, leveraging: - Temporary tables for intermediate calculations - Database transactions for data consistency - SQL window functions for ranking and trends - Batch processing for memory efficiency ## Calculation Process The calculation process follows a sequential flow with the following steps: 1. **Initialization** - Establish database connection - Create calculation history record - Set up progress tracking - Prepare environment 2. **Module Execution** - Product Metrics: Calculate inventory metrics and ABC classification - Time Aggregates: Calculate time-series data aggregations - Financial Metrics: Calculate profit, margin, and ROI metrics - Vendor Metrics: Calculate supplier performance metrics - Category Metrics: Calculate category performance metrics - Brand Metrics: Calculate brand performance metrics - Sales Forecasts: Generate future sales predictions 3. **Finalization** - Clean up temporary tables - Update calculation history status - Record completion time - Release database connections Each module creates its own temporary tables, performs calculations, updates the appropriate destination tables, and then cleans up. Modules can be enabled or disabled through configuration flags: ```javascript // Configuration flags for controlling which metrics to calculate // Set to 1 to skip the corresponding calculation, 0 to run it const SKIP_PRODUCT_METRICS = 0; const SKIP_TIME_AGGREGATES = 0; const SKIP_FINANCIAL_METRICS = 0; const SKIP_VENDOR_METRICS = 0; const SKIP_CATEGORY_METRICS = 0; const SKIP_BRAND_METRICS = 0; const SKIP_SALES_FORECASTS = 0; ``` ## Metrics Modules ### Product Metrics **Purpose**: Calculate core metrics for individual products including stock status, sales rates, inventory classifications, and reorder recommendations. **Source Tables**: - `products`: Core product information and current stock - `orders`: Sales order data - `purchase_orders`: Supplier orders **Destination Tables**: - `product_metrics`: Main repository for calculated metrics - `abc_classification`: Product importance classification **Key Calculations**: | Metric | Formula/Logic | Description | |--------|--------------|-------------| | Inventory Value | `stock_quantity * cost_price` | Current value of inventory at cost | | Daily Sales Avg | `SUM(quantity) / COUNT(DISTINCT DATE(date))` | Average daily unit sales in past 90 days | | Weekly Sales Avg | `SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 7)` | Average weekly unit sales in past 90 days | | Monthly Sales Avg | `SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 30)` | Average monthly unit sales in past 90 days | | Total Revenue | `SUM(quantity * price)` | Total revenue in past 90 days | | Avg Margin Percent | `((SUM(quantity * price) - SUM(quantity * cost_price)) / SUM(quantity * price)) * 100` | Average margin percentage | | Avg Lead Time | `AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)` | Average lead time in days from order to receipt | | Days of Inventory | `floor(stock_quantity / daily_sales_avg)` | How many days current stock will last | | Weeks of Inventory | `floor(stock_quantity / weekly_sales_avg)` | How many weeks current stock will last | | Stock Status | Complex conditional logic (see below) | Current inventory status category | | Safety Stock | `CEIL(daily_sales_avg * SQRT(avg_lead_time_days) * 1.96)` | Buffer stock to maintain service level (95%) | | Reorder Point | `CEIL(daily_sales_avg * avg_lead_time_days) + safety_stock` | When to reorder based on lead time and safety stock | | Reorder Qty | `SQRT((2 * annual_demand * order_cost) / holding_cost)` | Economic Order Quantity (EOQ) formula | | Overstocked Amt | `MAX(0, stock_quantity - (daily_sales_avg * overstock_days))` | Units considered excess inventory | | ABC Class | Percentile ranking by revenue | 'A': top 20%, 'B': 21-50%, 'C': remainder | | Turnover Rate | `(total_sold / avg_nonzero_stock) * (365.0 / active_days)` | Inventory turnover rate adjusted for zero-stock periods | | Forecast Accuracy | `100 - MIN(AVG(ABS(forecast_quantity - actual_quantity) / actual_quantity * 100), 100)` | Historic forecast accuracy percentage | | Forecast Bias | `AVG((forecast_quantity - actual_quantity) / actual_quantity * 100)` | Directional bias of forecasts (positive = over-forecasting) | **Stock Status Logic**: ```sql CASE WHEN stock_quantity <= 0 THEN 'Out of Stock' WHEN daily_sales_avg = 0 AND stock_quantity <= low_stock_threshold THEN 'Low Stock' WHEN daily_sales_avg = 0 THEN 'In Stock' WHEN stock_quantity / daily_sales_avg <= critical_days THEN 'Critical' WHEN stock_quantity / daily_sales_avg <= reorder_days THEN 'Reorder' WHEN stock_quantity / daily_sales_avg > overstock_days THEN 'Overstocked' ELSE 'Healthy' END ``` **Process Flow**: 1. Create temporary tables for sales and purchase metrics 2. Calculate sales averages by product 3. Calculate lead times and purchase metrics 4. Update product_metrics with calculated values 5. Calculate ABC classification using revenue ranking 6. Calculate turnover rates with zero-stock period adjustment 7. Calculate forecast accuracy and bias using historical data **Optimizations**: - Batch processing of products (5,000 per batch) - Timeout protection for long-running queries - Indexed temporary tables for performance - Strategic transaction management ### Time Aggregates **Purpose**: Aggregate sales and purchase data by time periods (year/month) for trend analysis and historical reporting. **Source Tables**: - `products`: Core product information - `orders`: Sales order data - `purchase_orders`: Supplier orders **Destination Tables**: - `product_time_aggregates`: Time-based metrics by product, year, and month **Key Calculations**: | Metric | Formula/Logic | Description | |--------|--------------|-------------| | Total Quantity Sold | `SUM(quantity)` | Units sold in the time period | | Total Revenue | `SUM((price - discount) * quantity)` | Net revenue in the time period | | Total Cost | `SUM(cost_price * quantity)` | Cost of goods sold in time period | | Order Count | `COUNT(DISTINCT order_number)` | Number of orders in the time period | | Stock Received | `SUM(received)` | Units received from suppliers in time period | | Stock Ordered | `SUM(ordered)` | Units ordered from suppliers in time period | | Avg Price | `AVG(price - discount)` | Average net selling price in time period | | Profit Margin | `((total_revenue - total_cost) / total_revenue) * 100` | Percentage profit margin in time period | | GMROI | `((total_revenue - total_cost) * (365.0 / active_days)) / inventory_value` | Gross Margin Return on Inventory Investment | **Process Flow**: 1. Calculate monthly aggregates for sales data 2. Calculate monthly aggregates for purchase data 3. Combine sales and purchase data for complete monthly picture 4. Update financial metrics for each time period 5. Calculate efficiency metrics like GMROI **SQL Example (Monthly Aggregation)**: ```sql WITH monthly_sales AS ( SELECT o.pid, EXTRACT(YEAR FROM o.date) as year, EXTRACT(MONTH FROM o.date) as month, SUM(o.quantity) as total_quantity_sold, SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) as total_revenue, SUM(COALESCE(p.cost_price, 0) * o.quantity) as total_cost, COUNT(DISTINCT o.order_number) as order_count, AVG(o.price - COALESCE(o.discount, 0)) as avg_price, -- Profit margin calculation CASE WHEN SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) > 0 THEN ((SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) - SUM(COALESCE(p.cost_price, 0) * o.quantity)) / SUM((o.price - COALESCE(o.discount, 0)) * o.quantity)) * 100 ELSE 0 END as profit_margin, p.cost_price * p.stock_quantity as inventory_value, COUNT(DISTINCT DATE(o.date)) as active_days FROM orders o JOIN products p ON o.pid = p.pid WHERE o.canceled = false GROUP BY o.pid, EXTRACT(YEAR FROM o.date), EXTRACT(MONTH FROM o.date), p.cost_price, p.stock_quantity ) -- Insert results into product_time_aggregates table ``` ### Financial Metrics **Purpose**: Calculate profit, revenue, margin, and return on investment metrics for financial analysis. **Source Tables**: - `products`: Core product information - `orders`: Sales order data - `product_metrics`: Previously calculated product metrics **Destination Tables**: - `financial_metrics`: Financial performance metrics - `product_metrics`: Updated with financial calculations **Key Calculations**: | Metric | Formula/Logic | Description | |--------|--------------|-------------| | Gross Profit | `SUM(quantity * (price - cost_price))` | Total profit across all sales | | Gross Margin | `(gross_profit / SUM(quantity * price)) * 100` | Profit margin percentage | | GMROI | `gross_profit / AVG(inventory_value)` | Gross Margin Return on Inventory Investment | | GMROII | `GMROI / inventory_turns` | GMROI Index (normalized by inventory turns) | | Revenue per Unit | `SUM(price * quantity) / SUM(quantity)` | Average revenue per unit sold | | Cost per Unit | `SUM(cost_price * quantity) / SUM(quantity)` | Average cost per unit sold | | COGS | `SUM(cost_price * quantity)` | Cost of Goods Sold total | | Total Discounts | `SUM(discount * quantity)` | Total discount amount | | Discount Rate | `total_discounts / SUM(price * quantity)` | Percentage of revenue discounted | | Revenue Rank | `RANK() OVER (ORDER BY total_revenue DESC)` | Revenue ranking compared to other products | | Profit Rank | `RANK() OVER (ORDER BY gross_profit DESC)` | Profit ranking compared to other products | | ROI | `(gross_profit / SUM(cost_price * quantity)) * 100` | Return on Investment percentage | **Process Flow**: 1. Calculate base financial metrics for each product 2. Calculate comparative rankings across products 3. Calculate inventory efficiency metrics 4. Update appropriate destination tables ### Vendor Metrics **Purpose**: Analyze supplier performance including lead times, reliability, and cost metrics. **Source Tables**: - `products`: Core product information - `purchase_orders`: Supplier orders - `vendors`: Supplier information **Destination Tables**: - `vendor_metrics`: Supplier performance metrics - `vendor_time_aggregates`: Time-based supplier metrics **Key Calculations**: | Metric | Formula/Logic | Description | |--------|--------------|-------------| | Avg Lead Time | `AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)` | Average lead time in days | | Lead Time Variance | `STDDEV(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)` | Variance in lead times | | Fill Rate | `SUM(received) / NULLIF(SUM(ordered), 0) * 100` | Percentage of ordered units received | | On-Time Delivery | `COUNT(CASE WHEN received_date <= expected_date THEN 1 END) / COUNT(*) * 100` | Percentage of on-time deliveries | | Avg Cost | `AVG(cost_price)` | Average cost per unit | | Cost Variance | `STDDEV(cost_price) / AVG(cost_price) * 100` | Variation in costs (%) | | Total Spend | `SUM(ordered * cost_price)` | Total amount spent with vendor | | Order Frequency | `COUNT(DISTINCT po_id) / COUNT(DISTINCT DATE_TRUNC('month', date))` | Average orders per month | | Quality Rating | Complex formula based on returns | Rating of product quality from vendor | | Vendor Score | Weighted score of all metrics | Overall performance score | **Process Flow**: 1. Calculate lead time metrics by vendor 2. Calculate fill rate and delivery metrics 3. Calculate cost and spend metrics 4. Calculate quality and reliability metrics 5. Generate overall vendor score 6. Update time-based aggregates by month ### Category Metrics **Purpose**: Analyze performance metrics at the category level for merchandising insights. **Source Tables**: - `products`: Core product information - `product_categories`: Category relationships - `orders`: Sales order data - `categories`: Category information **Destination Tables**: - `category_metrics`: Category performance metrics - `category_time_aggregates`: Time-based category metrics **Key Calculations**: | Metric | Formula/Logic | Description | |--------|--------------|-------------| | Total Products | `COUNT(DISTINCT p.pid)` | Number of products in category | | Active Products | `COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END)` | Products with sales in period | | Total Revenue | `SUM(o.quantity * o.price)` | Total revenue from category | | Total Units | `SUM(o.quantity)` | Total units sold from category | | Avg Price | `SUM(o.quantity * o.price) / SUM(o.quantity)` | Average selling price in category | | Gross Profit | `SUM(o.quantity * (o.price - p.cost_price))` | Total profit from category | | Margin Percent | `(gross_profit / total_revenue) * 100` | Profit margin percentage | | Inventory Value | `SUM(p.stock_quantity * p.cost_price)` | Current inventory value at cost | | Inventory Turns | `COGS / AVG(inventory_value)` | How many times inventory turns over | | Revenue Growth | `(current_period_revenue / previous_period_revenue - 1) * 100` | Revenue growth percentage | | Category Share | `category_revenue / total_revenue * 100` | Percentage of total revenue | | Avg Weekly Sales | `SUM(o.quantity) / COUNT(DISTINCT DATE_TRUNC('week', o.date))` | Average weekly units sold | **Process Flow**: 1. Calculate category product counts 2. Calculate category sales metrics 3. Calculate category profitability 4. Calculate category growth rates 5. Calculate category share metrics 6. Update time-based aggregates by month ### Brand Metrics **Purpose**: Analyze performance metrics at the brand level for brand management insights. **Source Tables**: - `products`: Core product information - `product_categories`: Brand relationships (brands are stored as categories) - `orders`: Sales order data **Destination Tables**: - `brand_metrics`: Brand performance metrics - `brand_time_aggregates`: Time-based brand metrics **Key Calculations**: | Metric | Formula/Logic | Description | |--------|--------------|-------------| | Total Products | `COUNT(DISTINCT p.pid)` | Number of products in brand | | Active Products | `COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END)` | Products with sales in period | | Total Revenue | `SUM(o.quantity * o.price)` | Total revenue from brand | | Total Units | `SUM(o.quantity)` | Total units sold from brand | | Avg Price | `SUM(o.quantity * o.price) / SUM(o.quantity)` | Average selling price in brand | | Gross Profit | `SUM(o.quantity * (o.price - p.cost_price))` | Total profit from brand | | Margin Percent | `(gross_profit / total_revenue) * 100` | Profit margin percentage | | Inventory Value | `SUM(p.stock_quantity * p.cost_price)` | Current inventory value at cost | | Inventory Turns | `COGS / AVG(inventory_value)` | How many times inventory turns over | | Revenue Growth | `(current_period_revenue / previous_period_revenue - 1) * 100` | Revenue growth percentage | | Brand Share | `brand_revenue / total_revenue * 100` | Percentage of total revenue | | Brand Velocity | `SUM(o.quantity) / COUNT(DISTINCT DATE(o.date))` | Daily sales velocity | **Process Flow**: 1. Calculate brand product counts 2. Calculate brand sales metrics 3. Calculate brand profitability 4. Calculate brand growth rates 5. Calculate brand share metrics 6. Update time-based aggregates by month ### Sales Forecasts **Purpose**: Generate sales forecasts for future periods using historical data and seasonality factors. **Source Tables**: - `products`: Core product information - `orders`: Sales order data - `product_categories`: Category relationships - `sales_seasonality`: Seasonal adjustment factors **Destination Tables**: - `sales_forecasts`: Product-level sales forecasts - `category_forecasts`: Category-level sales forecasts **Key Calculations**: | Metric | Formula/Logic | Description | |--------|--------------|-------------| | Daily Sales Avg | `AVG(daily_quantity)` for each day of week | Average sales by day of week | | Sales Std Dev | `STDDEV(daily_quantity)` | Standard deviation of daily sales | | Sales Variance Ratio | `STDDEV(daily_quantity) / AVG(daily_quantity)` | Measure of sales variability | | Forecast Quantity | `avg_daily_qty * (1 + seasonality_factor) * confidence_adjustment` | Predicted daily sales | | Confidence Level | Complex scale based on data quality | Confidence in forecast (60-90%) | | Seasonality Factor | Monthly adjustment factor | Sales variation by month (-30% to +30%) | | Forecast Revenue | `forecast_quantity * avg_price` | Predicted daily revenue | | Category Forecast | Sum of product forecasts with adjustments | Forecast at category level | **Confidence Level Calculation**: ```sql CASE WHEN total_days >= 60 AND daily_variance_ratio < 0.5 THEN 90 WHEN total_days >= 60 THEN 85 WHEN total_days >= 30 AND daily_variance_ratio < 0.5 THEN 80 WHEN total_days >= 30 THEN 75 WHEN total_days >= 14 AND daily_variance_ratio < 0.5 THEN 70 WHEN total_days >= 14 THEN 65 ELSE 60 END ``` **Process Flow**: 1. Create temporary tables for forecast dates (next 30 days) 2. Calculate daily sales statistics by product and day of week 3. Calculate product-level statistics (variability, seasonality) 4. Generate product-level forecasts with confidence levels 5. Aggregate to category level forecasts 6. Apply seasonality adjustments 7. Store forecasts in respective tables **Method Details**: - Uses a combination of time-series analysis and statistical methods - Applies day-of-week patterns from historical data - Adjusts for seasonal variations by month - Accounts for data quality in confidence calculation - Handles products with intermittent demand differently - Provides both unit and revenue forecasts ## Database Schema ### Main Tables #### calculate_history | Column | Type | Description | |--------|------|-------------| | id | SERIAL | Primary key | | start_time | TIMESTAMP | Calculation start time | | end_time | TIMESTAMP | Calculation end time | | duration_seconds | INTEGER | Elapsed time in seconds | | status | TEXT | 'running', 'completed', 'cancelled', 'failed' | | total_products | INTEGER | Total products processed | | total_orders | INTEGER | Total orders processed | | total_purchase_orders | INTEGER | Total purchase orders processed | | processed_products | INTEGER | Number of processed products | | processed_orders | INTEGER | Number of processed orders | | processed_purchase_orders | INTEGER | Number of processed purchase orders | | error_message | TEXT | Error message if failed | | additional_info | JSONB | Configuration and results | #### calculate_status | Column | Type | Description | |--------|------|-------------| | module_name | TEXT | Name of calculation module | | last_calculation_timestamp | TIMESTAMP | When module last completed | #### product_metrics | Column | Type | Description | |--------|------|-------------| | pid | BIGINT | Product ID (PK) | | inventory_value | DECIMAL(10,2) | Current inventory value | | daily_sales_avg | DECIMAL(10,3) | Average daily sales (units) | | weekly_sales_avg | DECIMAL(10,3) | Average weekly sales (units) | | monthly_sales_avg | DECIMAL(10,3) | Average monthly sales (units) | | total_revenue | DECIMAL(10,2) | Total revenue from product | | avg_margin_percent | DECIMAL(10,2) | Average margin percentage | | first_sale_date | DATE | Date of first sale | | last_sale_date | DATE | Date of last sale | | avg_lead_time_days | DOUBLE PRECISION | Average lead time in days | | days_of_inventory | INTEGER | Days until stockout at current rate | | weeks_of_inventory | INTEGER | Weeks until stockout at current rate | | stock_status | TEXT | Current stock status | | safety_stock | INTEGER | Recommended safety stock | | reorder_point | INTEGER | Recommended reorder point | | reorder_qty | INTEGER | Recommended order quantity | | overstocked_amt | INTEGER | Units considered excess inventory | | abc_class | CHAR(1) | ABC classification | | turnover_rate | DECIMAL(10,2) | Inventory turnover rate | | forecast_accuracy | DECIMAL(10,2) | Accuracy of past forecasts (%) | | forecast_bias | DECIMAL(10,2) | Directional bias of forecasts (%) | | last_forecast_date | DATE | Date of most recent forecast | | last_calculated_at | TIMESTAMP | Last calculation time | #### product_time_aggregates | Column | Type | Description | |--------|------|-------------| | pid | BIGINT | Product ID (PK) | | year | INTEGER | Year (PK) | | month | INTEGER | Month (PK) | | total_quantity_sold | INTEGER | Units sold in period | | total_revenue | DECIMAL(10,2) | Revenue in period | | total_cost | DECIMAL(10,2) | Cost in period | | order_count | INTEGER | Number of orders | | stock_received | INTEGER | Units received | | stock_ordered | INTEGER | Units ordered | | avg_price | DECIMAL(10,2) | Average price | | profit_margin | DECIMAL(10,2) | Profit margin (%) | | inventory_value | DECIMAL(10,2) | Inventory value | | gmroi | DECIMAL(10,3) | Gross Margin Return on Investment | #### sales_forecasts | Column | Type | Description | |--------|------|-------------| | pid | BIGINT | Product ID (PK) | | forecast_date | DATE | Date of forecast (PK) | | forecast_quantity | DECIMAL(10,2) | Predicted quantity | | confidence_level | INTEGER | Confidence (60-90%) | | created_at | TIMESTAMP | Creation time | #### category_forecasts | Column | Type | Description | |--------|------|-------------| | category_id | BIGINT | Category ID (PK) | | forecast_date | DATE | Date of forecast (PK) | | forecast_units | DECIMAL(10,2) | Predicted units | | forecast_revenue | DECIMAL(10,2) | Predicted revenue | | confidence_level | INTEGER | Confidence (60-90%) | | created_at | TIMESTAMP | Creation time | #### vendor_metrics | Column | Type | Description | |--------|------|-------------| | vendor_id | BIGINT | Vendor ID (PK) | | avg_lead_time | DECIMAL(10,2) | Average lead time in days | | lead_time_variance | DECIMAL(10,2) | Variance in lead times | | fill_rate | DECIMAL(10,2) | Percentage of ordered units received | | on_time_delivery | DECIMAL(10,2) | Percentage of on-time deliveries | | avg_cost | DECIMAL(10,2) | Average cost per unit | | cost_variance | DECIMAL(10,2) | Variation in costs (%) | | total_spend | DECIMAL(10,2) | Total amount spent | | order_frequency | DECIMAL(10,2) | Average orders per month | | quality_rating | DECIMAL(10,2) | Rating of product quality | | vendor_score | DECIMAL(10,2) | Overall performance score | | last_calculated_at | TIMESTAMP | Last calculation time | ## Performance Optimizations The metrics calculation system employs several optimizations for performance: ### 1. Batch Processing Products are processed in configurable batch sizes (default 5,000) to: - Reduce memory usage - Improve transaction management - Enable partial progress in case of failure **Implementation**: ```javascript // Process updates in batches let lastPid = 0; let batchCount = 0; const MAX_BATCHES = 1000; // Safety limit const BATCH_SIZE = 5000; while (batchCount < MAX_BATCHES) { batchCount++; const batch = await connection.query( 'SELECT pid FROM products WHERE pid > $1 ORDER BY pid LIMIT $2', [lastPid, BATCH_SIZE] ); if (batch.rows.length === 0) break; // Process the entire batch in a single efficient query await connection.query(` UPDATE product_metrics pm SET /* fields to update */ FROM /* joined tables */ WHERE p.pid = ANY($1::bigint[]) AND pm.pid = p.pid `, [batch.rows.map(row => row.pid)]); lastPid = batch.rows[batch.rows.length - 1].pid; } ``` ### 2. Temporary Tables The system creates temporary tables for intermediate calculations to: - Reduce query complexity - Improve join performance - Enable indexing of intermediate results **Example**: ```javascript // Create temp_sales_metrics await connection.query(` CREATE TEMPORARY TABLE temp_sales_metrics ( pid BIGINT NOT NULL, daily_sales_avg DECIMAL(10,3), weekly_sales_avg DECIMAL(10,3), monthly_sales_avg DECIMAL(10,3), total_revenue DECIMAL(10,3), avg_margin_percent DECIMAL(10,3), first_sale_date DATE, last_sale_date DATE, PRIMARY KEY (pid) ) `); // Create indexes on temporary tables await connection.query('CREATE INDEX ON temp_revenue_ranks (rank_num)'); ``` ### 3. Query Timeouts Long-running queries have timeout protection: ```javascript await Promise.race([ connection.query(` INSERT INTO temp_purchase_metrics /* Complex query here */ `), new Promise((_, reject) => setTimeout(() => reject(new Error('Timeout: query took too long')), 60000) ) ]).catch(async (err) => { logError(err, 'Error populating table, continuing with fallback'); // Fallback mechanism }); ``` ### 4. Connection Pooling Database connections are managed through a connection pool: ```javascript const dbConfig = { // Database connection details max: 10, // connection pool max size idleTimeoutMillis: 30000, connectionTimeoutMillis: 60000 }; const pool = new Pool(dbConfig); ``` ### 5. Transaction Management Each module uses appropriate transaction boundaries: ```javascript // Start transaction await connection.query('BEGIN'); try { // Perform operations... // Commit transaction await connection.query('COMMIT'); } catch (error) { // Rollback on error await connection.query('ROLLBACK'); throw error; } ``` ## Progress Tracking and Reporting The system includes a comprehensive progress tracking system that: 1. Estimates completion time based on processed items 2. Reports progress in real-time 3. Persists progress information for UI display 4. Handles interruptions gracefully ### Progress Output Format ```javascript { status: 'running', // running, complete, error, cancelled operation: 'Operation description', current: 1000, // Current number of processed items total: 10000, // Total items to process elapsed: '2m 30s', // Formatted elapsed time remaining: '7m 15s', // Estimated remaining time rate: 120, // Processing rate (items/second) percentage: '10.0', // Progress percentage timing: { start_time: '2023-03-26T12:00:00Z', end_time: '2023-03-26T12:02:30Z', elapsed_seconds: 150 } } ``` ### Key Functions **formatElapsedTime**: Converts milliseconds to human-readable format ```javascript function formatElapsedTime(elapsed) { // If elapsed is a timestamp, convert to elapsed milliseconds if (elapsed instanceof Date || elapsed > 1000000000000) { elapsed = Date.now() - elapsed; } else { // If elapsed is in seconds, convert to milliseconds elapsed = elapsed * 1000; } const seconds = Math.floor(elapsed / 1000); const minutes = Math.floor(seconds / 60); const hours = Math.floor(minutes / 60); if (hours > 0) { return `${hours}h ${minutes % 60}m`; } else if (minutes > 0) { return `${minutes}m ${seconds % 60}s`; } else { return `${seconds}s`; } } ``` **estimateRemaining**: Calculates estimated time to completion ```javascript function estimateRemaining(startTime, current, total) { if (current === 0) return null; const elapsed = Date.now() - startTime; const rate = current / elapsed; const remaining = (total - current) / rate; const minutes = Math.floor(remaining / 60000); const seconds = Math.floor((remaining % 60000) / 1000); if (minutes > 0) { return `${minutes}m ${seconds}s`; } else { return `${seconds}s`; } } ``` **outputProgress**: Outputs progress information ```javascript function outputProgress(data) { // Save progress to file for resumption saveProgress(data); // Format as JSON const event = { progress: data }; // Send to stdout for frontend process.stdout.write(JSON.stringify(event) + '\n'); // Log significant events to disk if (isSignificantEvent(data)) { logImport(`${data.operation || 'Operation'}${data.message ? ': ' + data.message : ''}${data.error ? ' Error: ' + data.error : ''}${data.status ? ' Status: ' + data.status : ''}`); } } ``` ## Error Handling The system includes comprehensive error handling: ### 1. Global Error Handlers ```javascript // Add error handler for uncaught exceptions process.on('uncaughtException', (error) => { console.error('Uncaught Exception:', error); process.exit(1); }); // Add error handler for unhandled promise rejections process.on('unhandledRejection', (reason, promise) => { console.error('Unhandled Rejection at:', promise, 'reason:', reason); process.exit(1); }); ``` ### 2. Module-Level Error Handling Each module uses try/catch blocks with detailed error reporting: ```javascript try { // Module operations } catch (error) { success = false; logError(error, 'Error calculating product metrics'); // Update history with error await connection.query(` UPDATE calculate_history SET end_time = NOW(), duration_seconds = $1, processed_products = $2, processed_orders = $3, processed_purchase_orders = $4, status = $5, error_message = $6 WHERE id = $7 `, [ totalElapsedSeconds, processedProducts || 0, processedOrders || 0, processedPurchaseOrders || 0, isCancelled ? 'cancelled' : 'failed', error.message, calculateHistoryId ]); throw error; } finally { // Cleanup operations if (connection) { try { await cleanupTemporaryTables(connection); connection.release(); } catch (err) { console.error('Error in final cleanup:', err); } } } ``` ### 3. Error Logging Errors are logged to file and console: ```javascript function logError(error, context = '') { const timestamp = new Date().toISOString(); const errorMessage = `[${timestamp}] ${context}\nError: ${error.message}\nStack: ${error.stack}\n\n`; // Log to error file fs.appendFileSync(ERROR_LOG, errorMessage); // Also log to console console.error(`\n${context}\nError: ${error.message}`); } ``` ### 4. Timeout Management A global timeout prevents runaway processes: ```javascript // Set a maximum execution time (30 minutes) const MAX_EXECUTION_TIME = 30 * 60 * 1000; const timeout = setTimeout(() => { console.error(`Calculation timed out after ${MAX_EXECUTION_TIME/1000} seconds, forcing termination`); // Call cancel and force exit cancelCalculation(); process.exit(1); }, MAX_EXECUTION_TIME); // Clear timeout when process completes clearTimeout(timeout); ``` ### 5. Cancellation Support Users can cancel long-running calculations: ```javascript function cancelCalculation() { isCancelled = true; console.log('Calculation has been cancelled by user'); // Force-terminate any query that's been running for more than 5 seconds try { const connection = getConnection(); connection.then(async (conn) => { try { // Identify and terminate long-running queries from our application await conn.query(` SELECT pg_cancel_backend(pid) FROM pg_stat_activity WHERE query_start < now() - interval '5 seconds' AND application_name LIKE '%node%' AND query NOT LIKE '%pg_cancel_backend%' `); // Clean up any temporary tables await cleanupTemporaryTables(conn); // Release connection conn.release(); } catch (err) { console.error('Error during force cancellation:', err); conn.release(); } }).catch(err => { console.error('Could not get connection for cancellation:', err); }); } catch (err) { console.error('Failed to terminate running queries:', err); } return { success: true, message: 'Calculation has been cancelled' }; } ``` ## Utility Components ### Database Utilities (db.js) **Purpose**: Manage database connections and provide a connection pool for efficient resource usage. **Key Functions**: ```javascript async function getConnection() { return await pool.connect(); } async function closePool() { await pool.end(); } ``` **Configuration**: ```javascript const dbConfig = { host: process.env.DB_HOST, user: process.env.DB_USER, password: process.env.DB_PASSWORD, database: process.env.DB_NAME, port: process.env.DB_PORT || 5432, ssl: process.env.DB_SSL === 'true', // Performance optimizations max: 10, // connection pool max size idleTimeoutMillis: 30000, connectionTimeoutMillis: 60000 }; ``` ### Progress Utilities (progress.js) **Purpose**: Provide functions for tracking and reporting calculation progress. **Key Functions**: - `formatElapsedTime`: Convert milliseconds to human-readable format - `estimateRemaining`: Estimate remaining time based on progress - `calculateRate`: Calculate processing rate in items per second - `logError`: Log errors to file and console - `outputProgress`: Output progress information to multiple channels - `saveProgress`: Persist progress information to disk - `clearProgress`: Clear saved progress information - `getProgress`: Retrieve previously saved progress **Progress Persistence**: ```javascript function saveProgress(progress) { try { fs.writeFileSync(STATUS_FILE, JSON.stringify({ ...progress, timestamp: Date.now() })); } catch (err) { console.error('Failed to save progress:', err); } } function getProgress() { try { if (fs.existsSync(STATUS_FILE)) { const progress = JSON.parse(fs.readFileSync(STATUS_FILE, 'utf8')); // Check if the progress is still valid (less than 1 hour old) if (progress.timestamp && Date.now() - progress.timestamp < 3600000) { return progress; } else { // Clear old progress clearProgress(); } } } catch (err) { console.error('Failed to read progress:', err); clearProgress(); } return null; } ``` ## Implementation Notes ### Module Dependencies The main `calculate-metrics.js` script requires the following modules: ```javascript const progress = require('./metrics/utils/progress'); const { getConnection, closePool } = require('./metrics/utils/db'); const calculateProductMetrics = require('./metrics/product-metrics'); const calculateTimeAggregates = require('./metrics/time-aggregates'); const calculateFinancialMetrics = require('./metrics/financial-metrics'); const calculateVendorMetrics = require('./metrics/vendor-metrics'); const calculateCategoryMetrics = require('./metrics/category-metrics'); const calculateBrandMetrics = require('./metrics/brand-metrics'); const calculateSalesForecasts = require('./metrics/sales-forecasts'); ``` ### Temporary Tables The system uses these temporary tables during calculation: ```javascript const TEMP_TABLES = [ 'temp_revenue_ranks', 'temp_sales_metrics', 'temp_purchase_metrics', 'temp_product_metrics', 'temp_vendor_metrics', 'temp_category_metrics', 'temp_brand_metrics', 'temp_forecast_dates', 'temp_daily_sales', 'temp_product_stats', 'temp_category_sales', 'temp_category_stats' ]; ``` ### Ensuring Valid Progress Data The system includes checks to ensure progress data is valid: ```javascript // Helper function to ensure valid progress numbers const ensureValidProgress = (current, total) => ({ current: Number(current) || 0, total: Number(total) || 1, // Default to 1 to avoid division by zero percentage: (((Number(current) || 0) / (Number(total) || 1)) * 100).toFixed(1) }); ``` ### Module Execution Logic The main calculation function executes modules conditionally: ```javascript if (!SKIP_PRODUCT_METRICS) { const result = await calculateProductMetrics(startTime, totalProducts); await updateProgress(result.processedProducts, result.processedOrders, result.processedPurchaseOrders); if (!result.success) { throw new Error('Product metrics calculation failed'); } } else { console.log('Skipping product metrics calculation...'); // Update progress proportionally } ``` ### Command Line Usage When run from the command line: ```javascript // Run directly if called from command line if (require.main === module) { calculateMetrics().catch(error => { if (!error.message.includes('Operation cancelled')) { console.error('Error:', error); } process.exit(1); }); } ``` ### Module Export The script exports functions for use by other components: ```javascript module.exports = { calculateMetrics, cancelCalculation, getProgress: global.getProgress }; ```