Files
inventory/docs/metrics-calculation-system.md

39 KiB

Metrics Calculation System Documentation

This document provides a comprehensive description of the metrics calculation system in the inventory management application, detailing the calculation modules, data transformations, formulas, and overall architecture.

Table of Contents

  1. Overview
  2. System Architecture
  3. Calculation Process
  4. Metrics Modules
  5. Database Schema
  6. Performance Optimizations
  7. Progress Tracking and Reporting
  8. Error Handling
  9. Utility Components
  10. Implementation Notes

Overview

The metrics calculation system processes inventory, sales, and purchase data to generate insights and forecasts for business decision-making. It powers dashboards, reports, and automated inventory recommendations that are core to the business operations.

The system can operate in two modes:

  • Complete Calculation: Calculates all metrics from scratch
  • Selective Calculation: Calculates only specified metric types

Key features of the system include:

  • Progress tracking with estimated completion time
  • Cancellation capability for long-running operations
  • Fault tolerance with automatic recovery
  • Temporary table usage for optimal performance
  • Batched processing for memory efficiency
  • Multiple calculation modules that can run independently

System Architecture

The calculation system follows a modular architecture with these components:

├── calculate-metrics.js           # Main orchestration script
├── metrics/                       # Calculation modules
│   ├── product-metrics.js         # Product-level metrics calculation
│   ├── time-aggregates.js         # Time-based aggregation calculations
│   ├── financial-metrics.js       # Financial performance metrics
│   ├── vendor-metrics.js          # Supplier/vendor analytics
│   ├── category-metrics.js        # Category-level performance metrics
│   ├── brand-metrics.js           # Brand performance analytics
│   ├── sales-forecasts.js         # Future sales prediction
│   └── utils/                     # Shared utilities
│       ├── db.js                  # Database connection management
│       └── progress.js            # Progress tracking utilities

The system uses PostgreSQL for data storage and processing, leveraging:

  • Temporary tables for intermediate calculations
  • Database transactions for data consistency
  • SQL window functions for ranking and trends
  • Batch processing for memory efficiency

Calculation Process

The calculation process follows a sequential flow with the following steps:

  1. Initialization

    • Establish database connection
    • Create calculation history record
    • Set up progress tracking
    • Prepare environment
  2. Module Execution

    • Product Metrics: Calculate inventory metrics and ABC classification
    • Time Aggregates: Calculate time-series data aggregations
    • Financial Metrics: Calculate profit, margin, and ROI metrics
    • Vendor Metrics: Calculate supplier performance metrics
    • Category Metrics: Calculate category performance metrics
    • Brand Metrics: Calculate brand performance metrics
    • Sales Forecasts: Generate future sales predictions
  3. Finalization

    • Clean up temporary tables
    • Update calculation history status
    • Record completion time
    • Release database connections

Each module creates its own temporary tables, performs calculations, updates the appropriate destination tables, and then cleans up. Modules can be enabled or disabled through configuration flags:

// Configuration flags for controlling which metrics to calculate
// Set to 1 to skip the corresponding calculation, 0 to run it
const SKIP_PRODUCT_METRICS = 0;
const SKIP_TIME_AGGREGATES = 0;
const SKIP_FINANCIAL_METRICS = 0;
const SKIP_VENDOR_METRICS = 0;
const SKIP_CATEGORY_METRICS = 0;
const SKIP_BRAND_METRICS = 0;
const SKIP_SALES_FORECASTS = 0;

Metrics Modules

Product Metrics

Purpose: Calculate core metrics for individual products including stock status, sales rates, inventory classifications, and reorder recommendations.

Source Tables:

  • products: Core product information and current stock
  • orders: Sales order data
  • purchase_orders: Supplier orders

Destination Tables:

  • product_metrics: Main repository for calculated metrics
  • abc_classification: Product importance classification

Key Calculations:

Metric Formula/Logic Description
Inventory Value stock_quantity * cost_price Current value of inventory at cost
Daily Sales Avg SUM(quantity) / COUNT(DISTINCT DATE(date)) Average daily unit sales in past 90 days
Weekly Sales Avg SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 7) Average weekly unit sales in past 90 days
Monthly Sales Avg SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 30) Average monthly unit sales in past 90 days
Total Revenue SUM(quantity * price) Total revenue in past 90 days
Avg Margin Percent ((SUM(quantity * price) - SUM(quantity * cost_price)) / SUM(quantity * price)) * 100 Average margin percentage
Avg Lead Time AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0) Average lead time in days from order to receipt
Days of Inventory floor(stock_quantity / daily_sales_avg) How many days current stock will last
Weeks of Inventory floor(stock_quantity / weekly_sales_avg) How many weeks current stock will last
Stock Status Complex conditional logic (see below) Current inventory status category
Safety Stock CEIL(daily_sales_avg * SQRT(avg_lead_time_days) * 1.96) Buffer stock to maintain service level (95%)
Reorder Point CEIL(daily_sales_avg * avg_lead_time_days) + safety_stock When to reorder based on lead time and safety stock
Reorder Qty SQRT((2 * annual_demand * order_cost) / holding_cost) Economic Order Quantity (EOQ) formula
Overstocked Amt MAX(0, stock_quantity - (daily_sales_avg * overstock_days)) Units considered excess inventory
ABC Class Percentile ranking by revenue 'A': top 20%, 'B': 21-50%, 'C': remainder
Turnover Rate (total_sold / avg_nonzero_stock) * (365.0 / active_days) Inventory turnover rate adjusted for zero-stock periods
Forecast Accuracy 100 - MIN(AVG(ABS(forecast_quantity - actual_quantity) / actual_quantity * 100), 100) Historic forecast accuracy percentage
Forecast Bias AVG((forecast_quantity - actual_quantity) / actual_quantity * 100) Directional bias of forecasts (positive = over-forecasting)

Stock Status Logic:

CASE
    WHEN stock_quantity <= 0 THEN 'Out of Stock'
    WHEN daily_sales_avg = 0 AND stock_quantity <= low_stock_threshold THEN 'Low Stock'
    WHEN daily_sales_avg = 0 THEN 'In Stock'
    WHEN stock_quantity / daily_sales_avg <= critical_days THEN 'Critical'
    WHEN stock_quantity / daily_sales_avg <= reorder_days THEN 'Reorder'
    WHEN stock_quantity / daily_sales_avg > overstock_days THEN 'Overstocked'
    ELSE 'Healthy'
END

Process Flow:

  1. Create temporary tables for sales and purchase metrics
  2. Calculate sales averages by product
  3. Calculate lead times and purchase metrics
  4. Update product_metrics with calculated values
  5. Calculate ABC classification using revenue ranking
  6. Calculate turnover rates with zero-stock period adjustment
  7. Calculate forecast accuracy and bias using historical data

Optimizations:

  • Batch processing of products (5,000 per batch)
  • Timeout protection for long-running queries
  • Indexed temporary tables for performance
  • Strategic transaction management

Time Aggregates

Purpose: Aggregate sales and purchase data by time periods (year/month) for trend analysis and historical reporting.

Source Tables:

  • products: Core product information
  • orders: Sales order data
  • purchase_orders: Supplier orders

Destination Tables:

  • product_time_aggregates: Time-based metrics by product, year, and month

Key Calculations:

Metric Formula/Logic Description
Total Quantity Sold SUM(quantity) Units sold in the time period
Total Revenue SUM((price - discount) * quantity) Net revenue in the time period
Total Cost SUM(cost_price * quantity) Cost of goods sold in time period
Order Count COUNT(DISTINCT order_number) Number of orders in the time period
Stock Received SUM(received) Units received from suppliers in time period
Stock Ordered SUM(ordered) Units ordered from suppliers in time period
Avg Price AVG(price - discount) Average net selling price in time period
Profit Margin ((total_revenue - total_cost) / total_revenue) * 100 Percentage profit margin in time period
GMROI ((total_revenue - total_cost) * (365.0 / active_days)) / inventory_value Gross Margin Return on Inventory Investment

Process Flow:

  1. Calculate monthly aggregates for sales data
  2. Calculate monthly aggregates for purchase data
  3. Combine sales and purchase data for complete monthly picture
  4. Update financial metrics for each time period
  5. Calculate efficiency metrics like GMROI

SQL Example (Monthly Aggregation):

WITH monthly_sales AS (
    SELECT 
        o.pid,
        EXTRACT(YEAR FROM o.date) as year,
        EXTRACT(MONTH FROM o.date) as month,
        SUM(o.quantity) as total_quantity_sold,
        SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) as total_revenue,
        SUM(COALESCE(p.cost_price, 0) * o.quantity) as total_cost,
        COUNT(DISTINCT o.order_number) as order_count,
        AVG(o.price - COALESCE(o.discount, 0)) as avg_price,
        -- Profit margin calculation
        CASE 
            WHEN SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) > 0 
            THEN ((SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) - SUM(COALESCE(p.cost_price, 0) * o.quantity)) 
                / SUM((o.price - COALESCE(o.discount, 0)) * o.quantity)) * 100
            ELSE 0 
        END as profit_margin,
        p.cost_price * p.stock_quantity as inventory_value,
        COUNT(DISTINCT DATE(o.date)) as active_days
    FROM orders o
    JOIN products p ON o.pid = p.pid
    WHERE o.canceled = false
    GROUP BY o.pid, EXTRACT(YEAR FROM o.date), EXTRACT(MONTH FROM o.date), p.cost_price, p.stock_quantity
)
-- Insert results into product_time_aggregates table

Financial Metrics

Purpose: Calculate profit, revenue, margin, and return on investment metrics for financial analysis.

Source Tables:

  • products: Core product information
  • orders: Sales order data
  • product_metrics: Previously calculated product metrics

Destination Tables:

  • financial_metrics: Financial performance metrics
  • product_metrics: Updated with financial calculations

Key Calculations:

Metric Formula/Logic Description
Gross Profit SUM(quantity * (price - cost_price)) Total profit across all sales
Gross Margin (gross_profit / SUM(quantity * price)) * 100 Profit margin percentage
GMROI gross_profit / AVG(inventory_value) Gross Margin Return on Inventory Investment
GMROII GMROI / inventory_turns GMROI Index (normalized by inventory turns)
Revenue per Unit SUM(price * quantity) / SUM(quantity) Average revenue per unit sold
Cost per Unit SUM(cost_price * quantity) / SUM(quantity) Average cost per unit sold
COGS SUM(cost_price * quantity) Cost of Goods Sold total
Total Discounts SUM(discount * quantity) Total discount amount
Discount Rate total_discounts / SUM(price * quantity) Percentage of revenue discounted
Revenue Rank RANK() OVER (ORDER BY total_revenue DESC) Revenue ranking compared to other products
Profit Rank RANK() OVER (ORDER BY gross_profit DESC) Profit ranking compared to other products
ROI (gross_profit / SUM(cost_price * quantity)) * 100 Return on Investment percentage

Process Flow:

  1. Calculate base financial metrics for each product
  2. Calculate comparative rankings across products
  3. Calculate inventory efficiency metrics
  4. Update appropriate destination tables

Vendor Metrics

Purpose: Analyze supplier performance including lead times, reliability, and cost metrics.

Source Tables:

  • products: Core product information
  • purchase_orders: Supplier orders
  • vendors: Supplier information

Destination Tables:

  • vendor_metrics: Supplier performance metrics
  • vendor_time_aggregates: Time-based supplier metrics

Key Calculations:

Metric Formula/Logic Description
Avg Lead Time AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0) Average lead time in days
Lead Time Variance STDDEV(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0) Variance in lead times
Fill Rate SUM(received) / NULLIF(SUM(ordered), 0) * 100 Percentage of ordered units received
On-Time Delivery COUNT(CASE WHEN received_date <= expected_date THEN 1 END) / COUNT(*) * 100 Percentage of on-time deliveries
Avg Cost AVG(cost_price) Average cost per unit
Cost Variance STDDEV(cost_price) / AVG(cost_price) * 100 Variation in costs (%)
Total Spend SUM(ordered * cost_price) Total amount spent with vendor
Order Frequency COUNT(DISTINCT po_id) / COUNT(DISTINCT DATE_TRUNC('month', date)) Average orders per month
Quality Rating Complex formula based on returns Rating of product quality from vendor
Vendor Score Weighted score of all metrics Overall performance score

Process Flow:

  1. Calculate lead time metrics by vendor
  2. Calculate fill rate and delivery metrics
  3. Calculate cost and spend metrics
  4. Calculate quality and reliability metrics
  5. Generate overall vendor score
  6. Update time-based aggregates by month

Category Metrics

Purpose: Analyze performance metrics at the category level for merchandising insights.

Source Tables:

  • products: Core product information
  • product_categories: Category relationships
  • orders: Sales order data
  • categories: Category information

Destination Tables:

  • category_metrics: Category performance metrics
  • category_time_aggregates: Time-based category metrics

Key Calculations:

Metric Formula/Logic Description
Total Products COUNT(DISTINCT p.pid) Number of products in category
Active Products COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END) Products with sales in period
Total Revenue SUM(o.quantity * o.price) Total revenue from category
Total Units SUM(o.quantity) Total units sold from category
Avg Price SUM(o.quantity * o.price) / SUM(o.quantity) Average selling price in category
Gross Profit SUM(o.quantity * (o.price - p.cost_price)) Total profit from category
Margin Percent (gross_profit / total_revenue) * 100 Profit margin percentage
Inventory Value SUM(p.stock_quantity * p.cost_price) Current inventory value at cost
Inventory Turns COGS / AVG(inventory_value) How many times inventory turns over
Revenue Growth (current_period_revenue / previous_period_revenue - 1) * 100 Revenue growth percentage
Category Share category_revenue / total_revenue * 100 Percentage of total revenue
Avg Weekly Sales SUM(o.quantity) / COUNT(DISTINCT DATE_TRUNC('week', o.date)) Average weekly units sold

Process Flow:

  1. Calculate category product counts
  2. Calculate category sales metrics
  3. Calculate category profitability
  4. Calculate category growth rates
  5. Calculate category share metrics
  6. Update time-based aggregates by month

Brand Metrics

Purpose: Analyze performance metrics at the brand level for brand management insights.

Source Tables:

  • products: Core product information
  • product_categories: Brand relationships (brands are stored as categories)
  • orders: Sales order data

Destination Tables:

  • brand_metrics: Brand performance metrics
  • brand_time_aggregates: Time-based brand metrics

Key Calculations:

Metric Formula/Logic Description
Total Products COUNT(DISTINCT p.pid) Number of products in brand
Active Products COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END) Products with sales in period
Total Revenue SUM(o.quantity * o.price) Total revenue from brand
Total Units SUM(o.quantity) Total units sold from brand
Avg Price SUM(o.quantity * o.price) / SUM(o.quantity) Average selling price in brand
Gross Profit SUM(o.quantity * (o.price - p.cost_price)) Total profit from brand
Margin Percent (gross_profit / total_revenue) * 100 Profit margin percentage
Inventory Value SUM(p.stock_quantity * p.cost_price) Current inventory value at cost
Inventory Turns COGS / AVG(inventory_value) How many times inventory turns over
Revenue Growth (current_period_revenue / previous_period_revenue - 1) * 100 Revenue growth percentage
Brand Share brand_revenue / total_revenue * 100 Percentage of total revenue
Brand Velocity SUM(o.quantity) / COUNT(DISTINCT DATE(o.date)) Daily sales velocity

Process Flow:

  1. Calculate brand product counts
  2. Calculate brand sales metrics
  3. Calculate brand profitability
  4. Calculate brand growth rates
  5. Calculate brand share metrics
  6. Update time-based aggregates by month

Sales Forecasts

Purpose: Generate sales forecasts for future periods using historical data and seasonality factors.

Source Tables:

  • products: Core product information
  • orders: Sales order data
  • product_categories: Category relationships
  • sales_seasonality: Seasonal adjustment factors

Destination Tables:

  • sales_forecasts: Product-level sales forecasts
  • category_forecasts: Category-level sales forecasts

Key Calculations:

Metric Formula/Logic Description
Daily Sales Avg AVG(daily_quantity) for each day of week Average sales by day of week
Sales Std Dev STDDEV(daily_quantity) Standard deviation of daily sales
Sales Variance Ratio STDDEV(daily_quantity) / AVG(daily_quantity) Measure of sales variability
Forecast Quantity avg_daily_qty * (1 + seasonality_factor) * confidence_adjustment Predicted daily sales
Confidence Level Complex scale based on data quality Confidence in forecast (60-90%)
Seasonality Factor Monthly adjustment factor Sales variation by month (-30% to +30%)
Forecast Revenue forecast_quantity * avg_price Predicted daily revenue
Category Forecast Sum of product forecasts with adjustments Forecast at category level

Confidence Level Calculation:

CASE
    WHEN total_days >= 60 AND daily_variance_ratio < 0.5 THEN 90
    WHEN total_days >= 60 THEN 85
    WHEN total_days >= 30 AND daily_variance_ratio < 0.5 THEN 80
    WHEN total_days >= 30 THEN 75
    WHEN total_days >= 14 AND daily_variance_ratio < 0.5 THEN 70
    WHEN total_days >= 14 THEN 65
    ELSE 60
END

Process Flow:

  1. Create temporary tables for forecast dates (next 30 days)
  2. Calculate daily sales statistics by product and day of week
  3. Calculate product-level statistics (variability, seasonality)
  4. Generate product-level forecasts with confidence levels
  5. Aggregate to category level forecasts
  6. Apply seasonality adjustments
  7. Store forecasts in respective tables

Method Details:

  • Uses a combination of time-series analysis and statistical methods
  • Applies day-of-week patterns from historical data
  • Adjusts for seasonal variations by month
  • Accounts for data quality in confidence calculation
  • Handles products with intermittent demand differently
  • Provides both unit and revenue forecasts

Database Schema

Main Tables

calculate_history

Column Type Description
id SERIAL Primary key
start_time TIMESTAMP Calculation start time
end_time TIMESTAMP Calculation end time
duration_seconds INTEGER Elapsed time in seconds
status TEXT 'running', 'completed', 'cancelled', 'failed'
total_products INTEGER Total products processed
total_orders INTEGER Total orders processed
total_purchase_orders INTEGER Total purchase orders processed
processed_products INTEGER Number of processed products
processed_orders INTEGER Number of processed orders
processed_purchase_orders INTEGER Number of processed purchase orders
error_message TEXT Error message if failed
additional_info JSONB Configuration and results

calculate_status

Column Type Description
module_name TEXT Name of calculation module
last_calculation_timestamp TIMESTAMP When module last completed

product_metrics

Column Type Description
pid BIGINT Product ID (PK)
inventory_value DECIMAL(10,2) Current inventory value
daily_sales_avg DECIMAL(10,3) Average daily sales (units)
weekly_sales_avg DECIMAL(10,3) Average weekly sales (units)
monthly_sales_avg DECIMAL(10,3) Average monthly sales (units)
total_revenue DECIMAL(10,2) Total revenue from product
avg_margin_percent DECIMAL(10,2) Average margin percentage
first_sale_date DATE Date of first sale
last_sale_date DATE Date of last sale
avg_lead_time_days DOUBLE PRECISION Average lead time in days
days_of_inventory INTEGER Days until stockout at current rate
weeks_of_inventory INTEGER Weeks until stockout at current rate
stock_status TEXT Current stock status
safety_stock INTEGER Recommended safety stock
reorder_point INTEGER Recommended reorder point
reorder_qty INTEGER Recommended order quantity
overstocked_amt INTEGER Units considered excess inventory
abc_class CHAR(1) ABC classification
turnover_rate DECIMAL(10,2) Inventory turnover rate
forecast_accuracy DECIMAL(10,2) Accuracy of past forecasts (%)
forecast_bias DECIMAL(10,2) Directional bias of forecasts (%)
last_forecast_date DATE Date of most recent forecast
last_calculated_at TIMESTAMP Last calculation time

product_time_aggregates

Column Type Description
pid BIGINT Product ID (PK)
year INTEGER Year (PK)
month INTEGER Month (PK)
total_quantity_sold INTEGER Units sold in period
total_revenue DECIMAL(10,2) Revenue in period
total_cost DECIMAL(10,2) Cost in period
order_count INTEGER Number of orders
stock_received INTEGER Units received
stock_ordered INTEGER Units ordered
avg_price DECIMAL(10,2) Average price
profit_margin DECIMAL(10,2) Profit margin (%)
inventory_value DECIMAL(10,2) Inventory value
gmroi DECIMAL(10,3) Gross Margin Return on Investment

sales_forecasts

Column Type Description
pid BIGINT Product ID (PK)
forecast_date DATE Date of forecast (PK)
forecast_quantity DECIMAL(10,2) Predicted quantity
confidence_level INTEGER Confidence (60-90%)
created_at TIMESTAMP Creation time

category_forecasts

Column Type Description
category_id BIGINT Category ID (PK)
forecast_date DATE Date of forecast (PK)
forecast_units DECIMAL(10,2) Predicted units
forecast_revenue DECIMAL(10,2) Predicted revenue
confidence_level INTEGER Confidence (60-90%)
created_at TIMESTAMP Creation time

vendor_metrics

Column Type Description
vendor_id BIGINT Vendor ID (PK)
avg_lead_time DECIMAL(10,2) Average lead time in days
lead_time_variance DECIMAL(10,2) Variance in lead times
fill_rate DECIMAL(10,2) Percentage of ordered units received
on_time_delivery DECIMAL(10,2) Percentage of on-time deliveries
avg_cost DECIMAL(10,2) Average cost per unit
cost_variance DECIMAL(10,2) Variation in costs (%)
total_spend DECIMAL(10,2) Total amount spent
order_frequency DECIMAL(10,2) Average orders per month
quality_rating DECIMAL(10,2) Rating of product quality
vendor_score DECIMAL(10,2) Overall performance score
last_calculated_at TIMESTAMP Last calculation time

Performance Optimizations

The metrics calculation system employs several optimizations for performance:

1. Batch Processing

Products are processed in configurable batch sizes (default 5,000) to:

  • Reduce memory usage
  • Improve transaction management
  • Enable partial progress in case of failure

Implementation:

// Process updates in batches
let lastPid = 0;
let batchCount = 0;
const MAX_BATCHES = 1000; // Safety limit
const BATCH_SIZE = 5000;

while (batchCount < MAX_BATCHES) {
    batchCount++;
    const batch = await connection.query(
        'SELECT pid FROM products WHERE pid > $1 ORDER BY pid LIMIT $2',
        [lastPid, BATCH_SIZE]
    );

    if (batch.rows.length === 0) break;

    // Process the entire batch in a single efficient query
    await connection.query(`
        UPDATE product_metrics pm
        SET /* fields to update */
        FROM /* joined tables */
        WHERE p.pid = ANY($1::bigint[])
        AND pm.pid = p.pid
    `, [batch.rows.map(row => row.pid)]);

    lastPid = batch.rows[batch.rows.length - 1].pid;
}

2. Temporary Tables

The system creates temporary tables for intermediate calculations to:

  • Reduce query complexity
  • Improve join performance
  • Enable indexing of intermediate results

Example:

// Create temp_sales_metrics
await connection.query(`
    CREATE TEMPORARY TABLE temp_sales_metrics (
        pid BIGINT NOT NULL,
        daily_sales_avg DECIMAL(10,3),
        weekly_sales_avg DECIMAL(10,3),
        monthly_sales_avg DECIMAL(10,3),
        total_revenue DECIMAL(10,3),
        avg_margin_percent DECIMAL(10,3),
        first_sale_date DATE,
        last_sale_date DATE,
        PRIMARY KEY (pid)
    )
`);

// Create indexes on temporary tables
await connection.query('CREATE INDEX ON temp_revenue_ranks (rank_num)');

3. Query Timeouts

Long-running queries have timeout protection:

await Promise.race([
    connection.query(`
        INSERT INTO temp_purchase_metrics
        /* Complex query here */
    `),
    new Promise((_, reject) => 
        setTimeout(() => reject(new Error('Timeout: query took too long')), 60000)
    )
]).catch(async (err) => {
    logError(err, 'Error populating table, continuing with fallback');
    // Fallback mechanism
});

4. Connection Pooling

Database connections are managed through a connection pool:

const dbConfig = {
    // Database connection details
    max: 10, // connection pool max size
    idleTimeoutMillis: 30000,
    connectionTimeoutMillis: 60000
};

const pool = new Pool(dbConfig);

5. Transaction Management

Each module uses appropriate transaction boundaries:

// Start transaction
await connection.query('BEGIN');
try {
    // Perform operations...
    
    // Commit transaction
    await connection.query('COMMIT');
} catch (error) {
    // Rollback on error
    await connection.query('ROLLBACK');
    throw error;
}

Progress Tracking and Reporting

The system includes a comprehensive progress tracking system that:

  1. Estimates completion time based on processed items
  2. Reports progress in real-time
  3. Persists progress information for UI display
  4. Handles interruptions gracefully

Progress Output Format

{
    status: 'running', // running, complete, error, cancelled
    operation: 'Operation description',
    current: 1000, // Current number of processed items
    total: 10000, // Total items to process
    elapsed: '2m 30s', // Formatted elapsed time
    remaining: '7m 15s', // Estimated remaining time
    rate: 120, // Processing rate (items/second)
    percentage: '10.0', // Progress percentage
    timing: {
        start_time: '2023-03-26T12:00:00Z',
        end_time: '2023-03-26T12:02:30Z',
        elapsed_seconds: 150
    }
}

Key Functions

formatElapsedTime: Converts milliseconds to human-readable format

function formatElapsedTime(elapsed) {
    // If elapsed is a timestamp, convert to elapsed milliseconds
    if (elapsed instanceof Date || elapsed > 1000000000000) {
        elapsed = Date.now() - elapsed;
    } else {
        // If elapsed is in seconds, convert to milliseconds
        elapsed = elapsed * 1000;
    }
    
    const seconds = Math.floor(elapsed / 1000);
    const minutes = Math.floor(seconds / 60);
    const hours = Math.floor(minutes / 60);
    
    if (hours > 0) {
        return `${hours}h ${minutes % 60}m`;
    } else if (minutes > 0) {
        return `${minutes}m ${seconds % 60}s`;
    } else {
        return `${seconds}s`;
    }
}

estimateRemaining: Calculates estimated time to completion

function estimateRemaining(startTime, current, total) {
    if (current === 0) return null;
    const elapsed = Date.now() - startTime;
    const rate = current / elapsed;
    const remaining = (total - current) / rate;
    
    const minutes = Math.floor(remaining / 60000);
    const seconds = Math.floor((remaining % 60000) / 1000);
    
    if (minutes > 0) {
        return `${minutes}m ${seconds}s`;
    } else {
        return `${seconds}s`;
    }
}

outputProgress: Outputs progress information

function outputProgress(data) {
    // Save progress to file for resumption
    saveProgress(data);
    // Format as JSON
    const event = {
        progress: data
    };
    // Send to stdout for frontend
    process.stdout.write(JSON.stringify(event) + '\n');
    
    // Log significant events to disk
    if (isSignificantEvent(data)) {
        logImport(`${data.operation || 'Operation'}${data.message ? ': ' + data.message : ''}${data.error ? ' Error: ' + data.error : ''}${data.status ? ' Status: ' + data.status : ''}`);
    }
}

Error Handling

The system includes comprehensive error handling:

1. Global Error Handlers

// Add error handler for uncaught exceptions
process.on('uncaughtException', (error) => {
    console.error('Uncaught Exception:', error);
    process.exit(1);
});

// Add error handler for unhandled promise rejections
process.on('unhandledRejection', (reason, promise) => {
    console.error('Unhandled Rejection at:', promise, 'reason:', reason);
    process.exit(1);
});

2. Module-Level Error Handling

Each module uses try/catch blocks with detailed error reporting:

try {
    // Module operations
} catch (error) {
    success = false;
    logError(error, 'Error calculating product metrics');
    
    // Update history with error
    await connection.query(`
        UPDATE calculate_history 
        SET 
            end_time = NOW(),
            duration_seconds = $1,
            processed_products = $2,
            processed_orders = $3,
            processed_purchase_orders = $4,
            status = $5,
            error_message = $6
        WHERE id = $7
    `, [
        totalElapsedSeconds,
        processedProducts || 0,
        processedOrders || 0,
        processedPurchaseOrders || 0,
        isCancelled ? 'cancelled' : 'failed',
        error.message,
        calculateHistoryId
    ]);
    
    throw error;
} finally {
    // Cleanup operations
    if (connection) {
        try {
            await cleanupTemporaryTables(connection);
            connection.release();
        } catch (err) {
            console.error('Error in final cleanup:', err);
        }
    }
}

3. Error Logging

Errors are logged to file and console:

function logError(error, context = '') {
    const timestamp = new Date().toISOString();
    const errorMessage = `[${timestamp}] ${context}\nError: ${error.message}\nStack: ${error.stack}\n\n`;
    
    // Log to error file
    fs.appendFileSync(ERROR_LOG, errorMessage);
    
    // Also log to console
    console.error(`\n${context}\nError: ${error.message}`);
}

4. Timeout Management

A global timeout prevents runaway processes:

// Set a maximum execution time (30 minutes)
const MAX_EXECUTION_TIME = 30 * 60 * 1000;
const timeout = setTimeout(() => {
    console.error(`Calculation timed out after ${MAX_EXECUTION_TIME/1000} seconds, forcing termination`);
    // Call cancel and force exit
    cancelCalculation();
    process.exit(1);
}, MAX_EXECUTION_TIME);

// Clear timeout when process completes
clearTimeout(timeout);

5. Cancellation Support

Users can cancel long-running calculations:

function cancelCalculation() {
    isCancelled = true;
    console.log('Calculation has been cancelled by user');
    
    // Force-terminate any query that's been running for more than 5 seconds
    try {
        const connection = getConnection();
        connection.then(async (conn) => {
            try {
                // Identify and terminate long-running queries from our application
                await conn.query(`
                    SELECT pg_cancel_backend(pid) 
                    FROM pg_stat_activity 
                    WHERE query_start < now() - interval '5 seconds'
                    AND application_name LIKE '%node%'
                    AND query NOT LIKE '%pg_cancel_backend%'
                `);
                
                // Clean up any temporary tables
                await cleanupTemporaryTables(conn);
                
                // Release connection
                conn.release();
            } catch (err) {
                console.error('Error during force cancellation:', err);
                conn.release();
            }
        }).catch(err => {
            console.error('Could not get connection for cancellation:', err);
        });
    } catch (err) {
        console.error('Failed to terminate running queries:', err);
    }
    
    return {
        success: true,
        message: 'Calculation has been cancelled'
    };
}

Utility Components

Database Utilities (db.js)

Purpose: Manage database connections and provide a connection pool for efficient resource usage.

Key Functions:

async function getConnection() {
    return await pool.connect();
}

async function closePool() {
    await pool.end();
}

Configuration:

const dbConfig = {
    host: process.env.DB_HOST,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME,
    port: process.env.DB_PORT || 5432,
    ssl: process.env.DB_SSL === 'true',
    // Performance optimizations
    max: 10, // connection pool max size
    idleTimeoutMillis: 30000,
    connectionTimeoutMillis: 60000
};

Progress Utilities (progress.js)

Purpose: Provide functions for tracking and reporting calculation progress.

Key Functions:

  • formatElapsedTime: Convert milliseconds to human-readable format
  • estimateRemaining: Estimate remaining time based on progress
  • calculateRate: Calculate processing rate in items per second
  • logError: Log errors to file and console
  • outputProgress: Output progress information to multiple channels
  • saveProgress: Persist progress information to disk
  • clearProgress: Clear saved progress information
  • getProgress: Retrieve previously saved progress

Progress Persistence:

function saveProgress(progress) {
    try {
        fs.writeFileSync(STATUS_FILE, JSON.stringify({
            ...progress,
            timestamp: Date.now()
        }));
    } catch (err) {
        console.error('Failed to save progress:', err);
    }
}

function getProgress() {
    try {
        if (fs.existsSync(STATUS_FILE)) {
            const progress = JSON.parse(fs.readFileSync(STATUS_FILE, 'utf8'));
            // Check if the progress is still valid (less than 1 hour old)
            if (progress.timestamp && Date.now() - progress.timestamp < 3600000) {
                return progress;
            } else {
                // Clear old progress
                clearProgress();
            }
        }
    } catch (err) {
        console.error('Failed to read progress:', err);
        clearProgress();
    }
    return null;
}

Implementation Notes

Module Dependencies

The main calculate-metrics.js script requires the following modules:

const progress = require('./metrics/utils/progress');
const { getConnection, closePool } = require('./metrics/utils/db');
const calculateProductMetrics = require('./metrics/product-metrics');
const calculateTimeAggregates = require('./metrics/time-aggregates');
const calculateFinancialMetrics = require('./metrics/financial-metrics');
const calculateVendorMetrics = require('./metrics/vendor-metrics');
const calculateCategoryMetrics = require('./metrics/category-metrics');
const calculateBrandMetrics = require('./metrics/brand-metrics');
const calculateSalesForecasts = require('./metrics/sales-forecasts');

Temporary Tables

The system uses these temporary tables during calculation:

const TEMP_TABLES = [
    'temp_revenue_ranks',
    'temp_sales_metrics',
    'temp_purchase_metrics',
    'temp_product_metrics',
    'temp_vendor_metrics',
    'temp_category_metrics',
    'temp_brand_metrics',
    'temp_forecast_dates',
    'temp_daily_sales',
    'temp_product_stats',
    'temp_category_sales',
    'temp_category_stats'
];

Ensuring Valid Progress Data

The system includes checks to ensure progress data is valid:

// Helper function to ensure valid progress numbers
const ensureValidProgress = (current, total) => ({
    current: Number(current) || 0,
    total: Number(total) || 1,  // Default to 1 to avoid division by zero
    percentage: (((Number(current) || 0) / (Number(total) || 1)) * 100).toFixed(1)
});

Module Execution Logic

The main calculation function executes modules conditionally:

if (!SKIP_PRODUCT_METRICS) {
    const result = await calculateProductMetrics(startTime, totalProducts);
    await updateProgress(result.processedProducts, result.processedOrders, result.processedPurchaseOrders);
    if (!result.success) {
        throw new Error('Product metrics calculation failed');
    }
} else {
    console.log('Skipping product metrics calculation...');
    // Update progress proportionally
}

Command Line Usage

When run from the command line:

// Run directly if called from command line
if (require.main === module) {
    calculateMetrics().catch(error => {
        if (!error.message.includes('Operation cancelled')) {
            console.error('Error:', error);
        }
        process.exit(1);
    });
}

Module Export

The script exports functions for use by other components:

module.exports = {
    calculateMetrics,
    cancelCalculation,
    getProgress: global.getProgress
};