Files

Matt e5c4f617c5 Get frontend pages loading data again, remove unused components

2025-03-26 21:47:24 -04:00

39 KiB

Raw Blame History

Metrics Calculation System Documentation

This document provides a comprehensive description of the metrics calculation system in the inventory management application, detailing the calculation modules, data transformations, formulas, and overall architecture.

Overview
System Architecture
Calculation Process
Metrics Modules
Database Schema
Performance Optimizations
Progress Tracking and Reporting
Error Handling
Utility Components
Implementation Notes

Overview

The metrics calculation system processes inventory, sales, and purchase data to generate insights and forecasts for business decision-making. It powers dashboards, reports, and automated inventory recommendations that are core to the business operations.

The system can operate in two modes:

Complete Calculation: Calculates all metrics from scratch
Selective Calculation: Calculates only specified metric types

Key features of the system include:

Progress tracking with estimated completion time
Cancellation capability for long-running operations
Fault tolerance with automatic recovery
Temporary table usage for optimal performance
Batched processing for memory efficiency
Multiple calculation modules that can run independently

System Architecture

The calculation system follows a modular architecture with these components:

├── calculate-metrics.js           # Main orchestration script
├── metrics/                       # Calculation modules
│   ├── product-metrics.js         # Product-level metrics calculation
│   ├── time-aggregates.js         # Time-based aggregation calculations
│   ├── financial-metrics.js       # Financial performance metrics
│   ├── vendor-metrics.js          # Supplier/vendor analytics
│   ├── category-metrics.js        # Category-level performance metrics
│   ├── brand-metrics.js           # Brand performance analytics
│   ├── sales-forecasts.js         # Future sales prediction
│   └── utils/                     # Shared utilities
│       ├── db.js                  # Database connection management
│       └── progress.js            # Progress tracking utilities

The system uses PostgreSQL for data storage and processing, leveraging:

Temporary tables for intermediate calculations
Database transactions for data consistency
SQL window functions for ranking and trends
Batch processing for memory efficiency

Calculation Process

The calculation process follows a sequential flow with the following steps:

Initialization
- Establish database connection
- Create calculation history record
- Set up progress tracking
- Prepare environment
Module Execution
- Product Metrics: Calculate inventory metrics and ABC classification
- Time Aggregates: Calculate time-series data aggregations
- Financial Metrics: Calculate profit, margin, and ROI metrics
- Vendor Metrics: Calculate supplier performance metrics
- Category Metrics: Calculate category performance metrics
- Brand Metrics: Calculate brand performance metrics
- Sales Forecasts: Generate future sales predictions
Finalization
- Clean up temporary tables
- Update calculation history status
- Record completion time
- Release database connections

Each module creates its own temporary tables, performs calculations, updates the appropriate destination tables, and then cleans up. Modules can be enabled or disabled through configuration flags:

// Configuration flags for controlling which metrics to calculate
// Set to 1 to skip the corresponding calculation, 0 to run it
const SKIP_PRODUCT_METRICS = 0;
const SKIP_TIME_AGGREGATES = 0;
const SKIP_FINANCIAL_METRICS = 0;
const SKIP_VENDOR_METRICS = 0;
const SKIP_CATEGORY_METRICS = 0;
const SKIP_BRAND_METRICS = 0;
const SKIP_SALES_FORECASTS = 0;

Metrics Modules

Product Metrics

Purpose: Calculate core metrics for individual products including stock status, sales rates, inventory classifications, and reorder recommendations.

Source Tables:

products: Core product information and current stock
orders: Sales order data
purchase_orders: Supplier orders

Destination Tables:

product_metrics: Main repository for calculated metrics
abc_classification: Product importance classification

Key Calculations:

Metric	Formula/Logic	Description
Inventory Value	`stock_quantity * cost_price`	Current value of inventory at cost
Daily Sales Avg	`SUM(quantity) / COUNT(DISTINCT DATE(date))`	Average daily unit sales in past 90 days
Weekly Sales Avg	`SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 7)`	Average weekly unit sales in past 90 days
Monthly Sales Avg	`SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 30)`	Average monthly unit sales in past 90 days
Total Revenue	`SUM(quantity * price)`	Total revenue in past 90 days
Avg Margin Percent	`((SUM(quantity * price) - SUM(quantity * cost_price)) / SUM(quantity * price)) * 100`	Average margin percentage
Avg Lead Time	`AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)`	Average lead time in days from order to receipt
Days of Inventory	`floor(stock_quantity / daily_sales_avg)`	How many days current stock will last
Weeks of Inventory	`floor(stock_quantity / weekly_sales_avg)`	How many weeks current stock will last
Stock Status	Complex conditional logic (see below)	Current inventory status category
Safety Stock	`CEIL(daily_sales_avg * SQRT(avg_lead_time_days) * 1.96)`	Buffer stock to maintain service level (95%)
Reorder Point	`CEIL(daily_sales_avg * avg_lead_time_days) + safety_stock`	When to reorder based on lead time and safety stock
Reorder Qty	`SQRT((2 * annual_demand * order_cost) / holding_cost)`	Economic Order Quantity (EOQ) formula
Overstocked Amt	`MAX(0, stock_quantity - (daily_sales_avg * overstock_days))`	Units considered excess inventory
ABC Class	Percentile ranking by revenue	'A': top 20%, 'B': 21-50%, 'C': remainder
Turnover Rate	`(total_sold / avg_nonzero_stock) * (365.0 / active_days)`	Inventory turnover rate adjusted for zero-stock periods
Forecast Accuracy	`100 - MIN(AVG(ABS(forecast_quantity - actual_quantity) / actual_quantity * 100), 100)`	Historic forecast accuracy percentage
Forecast Bias	`AVG((forecast_quantity - actual_quantity) / actual_quantity * 100)`	Directional bias of forecasts (positive = over-forecasting)

Stock Status Logic:

CASE
    WHEN stock_quantity <= 0 THEN 'Out of Stock'
    WHEN daily_sales_avg = 0 AND stock_quantity <= low_stock_threshold THEN 'Low Stock'
    WHEN daily_sales_avg = 0 THEN 'In Stock'
    WHEN stock_quantity / daily_sales_avg <= critical_days THEN 'Critical'
    WHEN stock_quantity / daily_sales_avg <= reorder_days THEN 'Reorder'
    WHEN stock_quantity / daily_sales_avg > overstock_days THEN 'Overstocked'
    ELSE 'Healthy'
END

Process Flow:

Create temporary tables for sales and purchase metrics
Calculate sales averages by product
Calculate lead times and purchase metrics
Update product_metrics with calculated values
Calculate ABC classification using revenue ranking
Calculate turnover rates with zero-stock period adjustment
Calculate forecast accuracy and bias using historical data

Optimizations:

Batch processing of products (5,000 per batch)
Timeout protection for long-running queries
Indexed temporary tables for performance
Strategic transaction management

Time Aggregates

Purpose: Aggregate sales and purchase data by time periods (year/month) for trend analysis and historical reporting.

Source Tables:

products: Core product information
orders: Sales order data
purchase_orders: Supplier orders

Destination Tables:

product_time_aggregates: Time-based metrics by product, year, and month

Key Calculations:

Metric	Formula/Logic	Description
Total Quantity Sold	`SUM(quantity)`	Units sold in the time period
Total Revenue	`SUM((price - discount) * quantity)`	Net revenue in the time period
Total Cost	`SUM(cost_price * quantity)`	Cost of goods sold in time period
Order Count	`COUNT(DISTINCT order_number)`	Number of orders in the time period
Stock Received	`SUM(received)`	Units received from suppliers in time period
Stock Ordered	`SUM(ordered)`	Units ordered from suppliers in time period
Avg Price	`AVG(price - discount)`	Average net selling price in time period
Profit Margin	`((total_revenue - total_cost) / total_revenue) * 100`	Percentage profit margin in time period
GMROI	`((total_revenue - total_cost) * (365.0 / active_days)) / inventory_value`	Gross Margin Return on Inventory Investment

Process Flow:

Calculate monthly aggregates for sales data
Calculate monthly aggregates for purchase data
Combine sales and purchase data for complete monthly picture
Update financial metrics for each time period
Calculate efficiency metrics like GMROI

SQL Example (Monthly Aggregation):

WITH monthly_sales AS (
    SELECT 
        o.pid,
        EXTRACT(YEAR FROM o.date) as year,
        EXTRACT(MONTH FROM o.date) as month,
        SUM(o.quantity) as total_quantity_sold,
        SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) as total_revenue,
        SUM(COALESCE(p.cost_price, 0) * o.quantity) as total_cost,
        COUNT(DISTINCT o.order_number) as order_count,
        AVG(o.price - COALESCE(o.discount, 0)) as avg_price,
        -- Profit margin calculation
        CASE 
            WHEN SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) > 0 
            THEN ((SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) - SUM(COALESCE(p.cost_price, 0) * o.quantity)) 
                / SUM((o.price - COALESCE(o.discount, 0)) * o.quantity)) * 100
            ELSE 0 
        END as profit_margin,
        p.cost_price * p.stock_quantity as inventory_value,
        COUNT(DISTINCT DATE(o.date)) as active_days
    FROM orders o
    JOIN products p ON o.pid = p.pid
    WHERE o.canceled = false
    GROUP BY o.pid, EXTRACT(YEAR FROM o.date), EXTRACT(MONTH FROM o.date), p.cost_price, p.stock_quantity
)
-- Insert results into product_time_aggregates table

Financial Metrics

Purpose: Calculate profit, revenue, margin, and return on investment metrics for financial analysis.

Source Tables:

products: Core product information
orders: Sales order data
product_metrics: Previously calculated product metrics

Destination Tables:

financial_metrics: Financial performance metrics
product_metrics: Updated with financial calculations

Key Calculations:

Metric	Formula/Logic	Description
Gross Profit	`SUM(quantity * (price - cost_price))`	Total profit across all sales
Gross Margin	`(gross_profit / SUM(quantity * price)) * 100`	Profit margin percentage
GMROI	`gross_profit / AVG(inventory_value)`	Gross Margin Return on Inventory Investment
GMROII	`GMROI / inventory_turns`	GMROI Index (normalized by inventory turns)
Revenue per Unit	`SUM(price * quantity) / SUM(quantity)`	Average revenue per unit sold
Cost per Unit	`SUM(cost_price * quantity) / SUM(quantity)`	Average cost per unit sold
COGS	`SUM(cost_price * quantity)`	Cost of Goods Sold total
Total Discounts	`SUM(discount * quantity)`	Total discount amount
Discount Rate	`total_discounts / SUM(price * quantity)`	Percentage of revenue discounted
Revenue Rank	`RANK() OVER (ORDER BY total_revenue DESC)`	Revenue ranking compared to other products
Profit Rank	`RANK() OVER (ORDER BY gross_profit DESC)`	Profit ranking compared to other products
ROI	`(gross_profit / SUM(cost_price * quantity)) * 100`	Return on Investment percentage

Process Flow:

Calculate base financial metrics for each product
Calculate comparative rankings across products
Calculate inventory efficiency metrics
Update appropriate destination tables

Vendor Metrics

Purpose: Analyze supplier performance including lead times, reliability, and cost metrics.

Source Tables:

products: Core product information
purchase_orders: Supplier orders
vendors: Supplier information

Destination Tables:

vendor_metrics: Supplier performance metrics
vendor_time_aggregates: Time-based supplier metrics

Key Calculations:

Metric	Formula/Logic	Description
Avg Lead Time	`AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)`	Average lead time in days
Lead Time Variance	`STDDEV(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)`	Variance in lead times
Fill Rate	`SUM(received) / NULLIF(SUM(ordered), 0) * 100`	Percentage of ordered units received
On-Time Delivery	`COUNT(CASE WHEN received_date <= expected_date THEN 1 END) / COUNT() 100`	Percentage of on-time deliveries
Avg Cost	`AVG(cost_price)`	Average cost per unit
Cost Variance	`STDDEV(cost_price) / AVG(cost_price) * 100`	Variation in costs (%)
Total Spend	`SUM(ordered * cost_price)`	Total amount spent with vendor
Order Frequency	`COUNT(DISTINCT po_id) / COUNT(DISTINCT DATE_TRUNC('month', date))`	Average orders per month
Quality Rating	Complex formula based on returns	Rating of product quality from vendor
Vendor Score	Weighted score of all metrics	Overall performance score

Process Flow:

Calculate lead time metrics by vendor
Calculate fill rate and delivery metrics
Calculate cost and spend metrics
Calculate quality and reliability metrics
Generate overall vendor score
Update time-based aggregates by month

Category Metrics

Purpose: Analyze performance metrics at the category level for merchandising insights.

Source Tables:

products: Core product information
product_categories: Category relationships
orders: Sales order data
categories: Category information

Destination Tables:

category_metrics: Category performance metrics
category_time_aggregates: Time-based category metrics

Key Calculations:

Metric	Formula/Logic	Description
Total Products	`COUNT(DISTINCT p.pid)`	Number of products in category
Active Products	`COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END)`	Products with sales in period
Total Revenue	`SUM(o.quantity * o.price)`	Total revenue from category
Total Units	`SUM(o.quantity)`	Total units sold from category
Avg Price	`SUM(o.quantity * o.price) / SUM(o.quantity)`	Average selling price in category
Gross Profit	`SUM(o.quantity * (o.price - p.cost_price))`	Total profit from category
Margin Percent	`(gross_profit / total_revenue) * 100`	Profit margin percentage
Inventory Value	`SUM(p.stock_quantity * p.cost_price)`	Current inventory value at cost
Inventory Turns	`COGS / AVG(inventory_value)`	How many times inventory turns over
Revenue Growth	`(current_period_revenue / previous_period_revenue - 1) * 100`	Revenue growth percentage
Category Share	`category_revenue / total_revenue * 100`	Percentage of total revenue
Avg Weekly Sales	`SUM(o.quantity) / COUNT(DISTINCT DATE_TRUNC('week', o.date))`	Average weekly units sold

Process Flow:

Calculate category product counts
Calculate category sales metrics
Calculate category profitability
Calculate category growth rates
Calculate category share metrics
Update time-based aggregates by month

Brand Metrics

Purpose: Analyze performance metrics at the brand level for brand management insights.

Source Tables:

products: Core product information
product_categories: Brand relationships (brands are stored as categories)
orders: Sales order data

Destination Tables:

brand_metrics: Brand performance metrics
brand_time_aggregates: Time-based brand metrics

Key Calculations:

Metric	Formula/Logic	Description
Total Products	`COUNT(DISTINCT p.pid)`	Number of products in brand
Active Products	`COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END)`	Products with sales in period
Total Revenue	`SUM(o.quantity * o.price)`	Total revenue from brand
Total Units	`SUM(o.quantity)`	Total units sold from brand
Avg Price	`SUM(o.quantity * o.price) / SUM(o.quantity)`	Average selling price in brand
Gross Profit	`SUM(o.quantity * (o.price - p.cost_price))`	Total profit from brand
Margin Percent	`(gross_profit / total_revenue) * 100`	Profit margin percentage
Inventory Value	`SUM(p.stock_quantity * p.cost_price)`	Current inventory value at cost
Inventory Turns	`COGS / AVG(inventory_value)`	How many times inventory turns over
Revenue Growth	`(current_period_revenue / previous_period_revenue - 1) * 100`	Revenue growth percentage
Brand Share	`brand_revenue / total_revenue * 100`	Percentage of total revenue
Brand Velocity	`SUM(o.quantity) / COUNT(DISTINCT DATE(o.date))`	Daily sales velocity

Process Flow:

Calculate brand product counts
Calculate brand sales metrics
Calculate brand profitability
Calculate brand growth rates
Calculate brand share metrics
Update time-based aggregates by month

Sales Forecasts

Purpose: Generate sales forecasts for future periods using historical data and seasonality factors.

Source Tables:

products: Core product information
orders: Sales order data
product_categories: Category relationships
sales_seasonality: Seasonal adjustment factors

Destination Tables:

sales_forecasts: Product-level sales forecasts
category_forecasts: Category-level sales forecasts

Key Calculations:

Metric	Formula/Logic	Description
Daily Sales Avg	`AVG(daily_quantity)` for each day of week	Average sales by day of week
Sales Std Dev	`STDDEV(daily_quantity)`	Standard deviation of daily sales
Sales Variance Ratio	`STDDEV(daily_quantity) / AVG(daily_quantity)`	Measure of sales variability
Forecast Quantity	`avg_daily_qty * (1 + seasonality_factor) * confidence_adjustment`	Predicted daily sales
Confidence Level	Complex scale based on data quality	Confidence in forecast (60-90%)
Seasonality Factor	Monthly adjustment factor	Sales variation by month (-30% to +30%)
Forecast Revenue	`forecast_quantity * avg_price`	Predicted daily revenue
Category Forecast	Sum of product forecasts with adjustments	Forecast at category level

Confidence Level Calculation:

CASE
    WHEN total_days >= 60 AND daily_variance_ratio < 0.5 THEN 90
    WHEN total_days >= 60 THEN 85
    WHEN total_days >= 30 AND daily_variance_ratio < 0.5 THEN 80
    WHEN total_days >= 30 THEN 75
    WHEN total_days >= 14 AND daily_variance_ratio < 0.5 THEN 70
    WHEN total_days >= 14 THEN 65
    ELSE 60
END

Process Flow:

Create temporary tables for forecast dates (next 30 days)
Calculate daily sales statistics by product and day of week
Calculate product-level statistics (variability, seasonality)
Generate product-level forecasts with confidence levels
Aggregate to category level forecasts
Apply seasonality adjustments
Store forecasts in respective tables

Method Details:

Uses a combination of time-series analysis and statistical methods
Applies day-of-week patterns from historical data
Adjusts for seasonal variations by month
Accounts for data quality in confidence calculation
Handles products with intermittent demand differently
Provides both unit and revenue forecasts

Database Schema

Main Tables

calculate_history

Column	Type	Description
id	SERIAL	Primary key
start_time	TIMESTAMP	Calculation start time
end_time	TIMESTAMP	Calculation end time
duration_seconds	INTEGER	Elapsed time in seconds
status	TEXT	'running', 'completed', 'cancelled', 'failed'
total_products	INTEGER	Total products processed
total_orders	INTEGER	Total orders processed
total_purchase_orders	INTEGER	Total purchase orders processed
processed_products	INTEGER	Number of processed products
processed_orders	INTEGER	Number of processed orders
processed_purchase_orders	INTEGER	Number of processed purchase orders
error_message	TEXT	Error message if failed
additional_info	JSONB	Configuration and results

calculate_status

Column	Type	Description
module_name	TEXT	Name of calculation module
last_calculation_timestamp	TIMESTAMP	When module last completed

product_metrics

Column	Type	Description
pid	BIGINT	Product ID (PK)
inventory_value	DECIMAL(10,2)	Current inventory value
daily_sales_avg	DECIMAL(10,3)	Average daily sales (units)
weekly_sales_avg	DECIMAL(10,3)	Average weekly sales (units)
monthly_sales_avg	DECIMAL(10,3)	Average monthly sales (units)
total_revenue	DECIMAL(10,2)	Total revenue from product
avg_margin_percent	DECIMAL(10,2)	Average margin percentage
first_sale_date	DATE	Date of first sale
last_sale_date	DATE	Date of last sale
avg_lead_time_days	DOUBLE PRECISION	Average lead time in days
days_of_inventory	INTEGER	Days until stockout at current rate
weeks_of_inventory	INTEGER	Weeks until stockout at current rate
stock_status	TEXT	Current stock status
safety_stock	INTEGER	Recommended safety stock
reorder_point	INTEGER	Recommended reorder point
reorder_qty	INTEGER	Recommended order quantity
overstocked_amt	INTEGER	Units considered excess inventory
abc_class	CHAR(1)	ABC classification
turnover_rate	DECIMAL(10,2)	Inventory turnover rate
forecast_accuracy	DECIMAL(10,2)	Accuracy of past forecasts (%)
forecast_bias	DECIMAL(10,2)	Directional bias of forecasts (%)
last_forecast_date	DATE	Date of most recent forecast
last_calculated_at	TIMESTAMP	Last calculation time

product_time_aggregates

Column	Type	Description
pid	BIGINT	Product ID (PK)
year	INTEGER	Year (PK)
month	INTEGER	Month (PK)
total_quantity_sold	INTEGER	Units sold in period
total_revenue	DECIMAL(10,2)	Revenue in period
total_cost	DECIMAL(10,2)	Cost in period
order_count	INTEGER	Number of orders
stock_received	INTEGER	Units received
stock_ordered	INTEGER	Units ordered
avg_price	DECIMAL(10,2)	Average price
profit_margin	DECIMAL(10,2)	Profit margin (%)
inventory_value	DECIMAL(10,2)	Inventory value
gmroi	DECIMAL(10,3)	Gross Margin Return on Investment

sales_forecasts

Column	Type	Description
pid	BIGINT	Product ID (PK)
forecast_date	DATE	Date of forecast (PK)
forecast_quantity	DECIMAL(10,2)	Predicted quantity
confidence_level	INTEGER	Confidence (60-90%)
created_at	TIMESTAMP	Creation time

category_forecasts

Column	Type	Description
category_id	BIGINT	Category ID (PK)
forecast_date	DATE	Date of forecast (PK)
forecast_units	DECIMAL(10,2)	Predicted units
forecast_revenue	DECIMAL(10,2)	Predicted revenue
confidence_level	INTEGER	Confidence (60-90%)
created_at	TIMESTAMP	Creation time

vendor_metrics

Column	Type	Description
vendor_id	BIGINT	Vendor ID (PK)
avg_lead_time	DECIMAL(10,2)	Average lead time in days
lead_time_variance	DECIMAL(10,2)	Variance in lead times
fill_rate	DECIMAL(10,2)	Percentage of ordered units received
on_time_delivery	DECIMAL(10,2)	Percentage of on-time deliveries
avg_cost	DECIMAL(10,2)	Average cost per unit
cost_variance	DECIMAL(10,2)	Variation in costs (%)
total_spend	DECIMAL(10,2)	Total amount spent
order_frequency	DECIMAL(10,2)	Average orders per month
quality_rating	DECIMAL(10,2)	Rating of product quality
vendor_score	DECIMAL(10,2)	Overall performance score
last_calculated_at	TIMESTAMP	Last calculation time

Performance Optimizations

The metrics calculation system employs several optimizations for performance:

1. Batch Processing

Products are processed in configurable batch sizes (default 5,000) to:

Reduce memory usage
Improve transaction management
Enable partial progress in case of failure

Implementation:

// Process updates in batches
let lastPid = 0;
let batchCount = 0;
const MAX_BATCHES = 1000; // Safety limit
const BATCH_SIZE = 5000;

while (batchCount < MAX_BATCHES) {
    batchCount++;
    const batch = await connection.query(
        'SELECT pid FROM products WHERE pid > $1 ORDER BY pid LIMIT $2',
        [lastPid, BATCH_SIZE]
    );

    if (batch.rows.length === 0) break;

    // Process the entire batch in a single efficient query
    await connection.query(`
        UPDATE product_metrics pm
        SET /* fields to update */
        FROM /* joined tables */
        WHERE p.pid = ANY($1::bigint[])
        AND pm.pid = p.pid
    `, [batch.rows.map(row => row.pid)]);

    lastPid = batch.rows[batch.rows.length - 1].pid;
}

2. Temporary Tables

The system creates temporary tables for intermediate calculations to:

Reduce query complexity
Improve join performance
Enable indexing of intermediate results

Example:

// Create temp_sales_metrics
await connection.query(`
    CREATE TEMPORARY TABLE temp_sales_metrics (
        pid BIGINT NOT NULL,
        daily_sales_avg DECIMAL(10,3),
        weekly_sales_avg DECIMAL(10,3),
        monthly_sales_avg DECIMAL(10,3),
        total_revenue DECIMAL(10,3),
        avg_margin_percent DECIMAL(10,3),
        first_sale_date DATE,
        last_sale_date DATE,
        PRIMARY KEY (pid)
    )
`);

// Create indexes on temporary tables
await connection.query('CREATE INDEX ON temp_revenue_ranks (rank_num)');

3. Query Timeouts

Long-running queries have timeout protection:

await Promise.race([
    connection.query(`
        INSERT INTO temp_purchase_metrics
        /* Complex query here */
    `),
    new Promise((_, reject) => 
        setTimeout(() => reject(new Error('Timeout: query took too long')), 60000)
    )
]).catch(async (err) => {
    logError(err, 'Error populating table, continuing with fallback');
    // Fallback mechanism
});

4. Connection Pooling

Database connections are managed through a connection pool:

const dbConfig = {
    // Database connection details
    max: 10, // connection pool max size
    idleTimeoutMillis: 30000,
    connectionTimeoutMillis: 60000
};

const pool = new Pool(dbConfig);

5. Transaction Management

Each module uses appropriate transaction boundaries:

// Start transaction
await connection.query('BEGIN');
try {
    // Perform operations...
    
    // Commit transaction
    await connection.query('COMMIT');
} catch (error) {
    // Rollback on error
    await connection.query('ROLLBACK');
    throw error;
}

Progress Tracking and Reporting

The system includes a comprehensive progress tracking system that:

Estimates completion time based on processed items
Reports progress in real-time
Persists progress information for UI display
Handles interruptions gracefully

Progress Output Format

{
    status: 'running', // running, complete, error, cancelled
    operation: 'Operation description',
    current: 1000, // Current number of processed items
    total: 10000, // Total items to process
    elapsed: '2m 30s', // Formatted elapsed time
    remaining: '7m 15s', // Estimated remaining time
    rate: 120, // Processing rate (items/second)
    percentage: '10.0', // Progress percentage
    timing: {
        start_time: '2023-03-26T12:00:00Z',
        end_time: '2023-03-26T12:02:30Z',
        elapsed_seconds: 150
    }
}

Key Functions

formatElapsedTime: Converts milliseconds to human-readable format

function formatElapsedTime(elapsed) {
    // If elapsed is a timestamp, convert to elapsed milliseconds
    if (elapsed instanceof Date || elapsed > 1000000000000) {
        elapsed = Date.now() - elapsed;
    } else {
        // If elapsed is in seconds, convert to milliseconds
        elapsed = elapsed * 1000;
    }
    
    const seconds = Math.floor(elapsed / 1000);
    const minutes = Math.floor(seconds / 60);
    const hours = Math.floor(minutes / 60);
    
    if (hours > 0) {
        return `${hours}h ${minutes % 60}m`;
    } else if (minutes > 0) {
        return `${minutes}m ${seconds % 60}s`;
    } else {
        return `${seconds}s`;
    }
}

estimateRemaining: Calculates estimated time to completion

function estimateRemaining(startTime, current, total) {
    if (current === 0) return null;
    const elapsed = Date.now() - startTime;
    const rate = current / elapsed;
    const remaining = (total - current) / rate;
    
    const minutes = Math.floor(remaining / 60000);
    const seconds = Math.floor((remaining % 60000) / 1000);
    
    if (minutes > 0) {
        return `${minutes}m ${seconds}s`;
    } else {
        return `${seconds}s`;
    }
}

outputProgress: Outputs progress information

function outputProgress(data) {
    // Save progress to file for resumption
    saveProgress(data);
    // Format as JSON
    const event = {
        progress: data
    };
    // Send to stdout for frontend
    process.stdout.write(JSON.stringify(event) + '\n');
    
    // Log significant events to disk
    if (isSignificantEvent(data)) {
        logImport(`${data.operation || 'Operation'}${data.message ? ': ' + data.message : ''}${data.error ? ' Error: ' + data.error : ''}${data.status ? ' Status: ' + data.status : ''}`);
    }
}

Error Handling

The system includes comprehensive error handling:

1. Global Error Handlers

// Add error handler for uncaught exceptions
process.on('uncaughtException', (error) => {
    console.error('Uncaught Exception:', error);
    process.exit(1);
});

// Add error handler for unhandled promise rejections
process.on('unhandledRejection', (reason, promise) => {
    console.error('Unhandled Rejection at:', promise, 'reason:', reason);
    process.exit(1);
});

2. Module-Level Error Handling

Each module uses try/catch blocks with detailed error reporting:

try {
    // Module operations
} catch (error) {
    success = false;
    logError(error, 'Error calculating product metrics');
    
    // Update history with error
    await connection.query(`
        UPDATE calculate_history 
        SET 
            end_time = NOW(),
            duration_seconds = $1,
            processed_products = $2,
            processed_orders = $3,
            processed_purchase_orders = $4,
            status = $5,
            error_message = $6
        WHERE id = $7
    `, [
        totalElapsedSeconds,
        processedProducts || 0,
        processedOrders || 0,
        processedPurchaseOrders || 0,
        isCancelled ? 'cancelled' : 'failed',
        error.message,
        calculateHistoryId
    ]);
    
    throw error;
} finally {
    // Cleanup operations
    if (connection) {
        try {
            await cleanupTemporaryTables(connection);
            connection.release();
        } catch (err) {
            console.error('Error in final cleanup:', err);
        }
    }
}

3. Error Logging

Errors are logged to file and console:

function logError(error, context = '') {
    const timestamp = new Date().toISOString();
    const errorMessage = `[${timestamp}] ${context}\nError: ${error.message}\nStack: ${error.stack}\n\n`;
    
    // Log to error file
    fs.appendFileSync(ERROR_LOG, errorMessage);
    
    // Also log to console
    console.error(`\n${context}\nError: ${error.message}`);
}

4. Timeout Management

A global timeout prevents runaway processes:

// Set a maximum execution time (30 minutes)
const MAX_EXECUTION_TIME = 30 * 60 * 1000;
const timeout = setTimeout(() => {
    console.error(`Calculation timed out after ${MAX_EXECUTION_TIME/1000} seconds, forcing termination`);
    // Call cancel and force exit
    cancelCalculation();
    process.exit(1);
}, MAX_EXECUTION_TIME);

// Clear timeout when process completes
clearTimeout(timeout);

5. Cancellation Support

Users can cancel long-running calculations:

function cancelCalculation() {
    isCancelled = true;
    console.log('Calculation has been cancelled by user');
    
    // Force-terminate any query that's been running for more than 5 seconds
    try {
        const connection = getConnection();
        connection.then(async (conn) => {
            try {
                // Identify and terminate long-running queries from our application
                await conn.query(`
                    SELECT pg_cancel_backend(pid) 
                    FROM pg_stat_activity 
                    WHERE query_start < now() - interval '5 seconds'
                    AND application_name LIKE '%node%'
                    AND query NOT LIKE '%pg_cancel_backend%'
                `);
                
                // Clean up any temporary tables
                await cleanupTemporaryTables(conn);
                
                // Release connection
                conn.release();
            } catch (err) {
                console.error('Error during force cancellation:', err);
                conn.release();
            }
        }).catch(err => {
            console.error('Could not get connection for cancellation:', err);
        });
    } catch (err) {
        console.error('Failed to terminate running queries:', err);
    }
    
    return {
        success: true,
        message: 'Calculation has been cancelled'
    };
}

Utility Components

Database Utilities (db.js)

Purpose: Manage database connections and provide a connection pool for efficient resource usage.

Key Functions:

async function getConnection() {
    return await pool.connect();
}

async function closePool() {
    await pool.end();
}

Configuration:

const dbConfig = {
    host: process.env.DB_HOST,
    user: process.env.DB_USER,
    password: process.env.DB_PASSWORD,
    database: process.env.DB_NAME,
    port: process.env.DB_PORT || 5432,
    ssl: process.env.DB_SSL === 'true',
    // Performance optimizations
    max: 10, // connection pool max size
    idleTimeoutMillis: 30000,
    connectionTimeoutMillis: 60000
};

Progress Utilities (progress.js)

Purpose: Provide functions for tracking and reporting calculation progress.

Key Functions:

formatElapsedTime: Convert milliseconds to human-readable format
estimateRemaining: Estimate remaining time based on progress
calculateRate: Calculate processing rate in items per second
logError: Log errors to file and console
outputProgress: Output progress information to multiple channels
saveProgress: Persist progress information to disk
clearProgress: Clear saved progress information
getProgress: Retrieve previously saved progress

Progress Persistence:

function saveProgress(progress) {
    try {
        fs.writeFileSync(STATUS_FILE, JSON.stringify({
            ...progress,
            timestamp: Date.now()
        }));
    } catch (err) {
        console.error('Failed to save progress:', err);
    }
}

function getProgress() {
    try {
        if (fs.existsSync(STATUS_FILE)) {
            const progress = JSON.parse(fs.readFileSync(STATUS_FILE, 'utf8'));
            // Check if the progress is still valid (less than 1 hour old)
            if (progress.timestamp && Date.now() - progress.timestamp < 3600000) {
                return progress;
            } else {
                // Clear old progress
                clearProgress();
            }
        }
    } catch (err) {
        console.error('Failed to read progress:', err);
        clearProgress();
    }
    return null;
}

Implementation Notes

Module Dependencies

The main calculate-metrics.js script requires the following modules:

const progress = require('./metrics/utils/progress');
const { getConnection, closePool } = require('./metrics/utils/db');
const calculateProductMetrics = require('./metrics/product-metrics');
const calculateTimeAggregates = require('./metrics/time-aggregates');
const calculateFinancialMetrics = require('./metrics/financial-metrics');
const calculateVendorMetrics = require('./metrics/vendor-metrics');
const calculateCategoryMetrics = require('./metrics/category-metrics');
const calculateBrandMetrics = require('./metrics/brand-metrics');
const calculateSalesForecasts = require('./metrics/sales-forecasts');

Temporary Tables

The system uses these temporary tables during calculation:

const TEMP_TABLES = [
    'temp_revenue_ranks',
    'temp_sales_metrics',
    'temp_purchase_metrics',
    'temp_product_metrics',
    'temp_vendor_metrics',
    'temp_category_metrics',
    'temp_brand_metrics',
    'temp_forecast_dates',
    'temp_daily_sales',
    'temp_product_stats',
    'temp_category_sales',
    'temp_category_stats'
];

Ensuring Valid Progress Data

The system includes checks to ensure progress data is valid:

// Helper function to ensure valid progress numbers
const ensureValidProgress = (current, total) => ({
    current: Number(current) || 0,
    total: Number(total) || 1,  // Default to 1 to avoid division by zero
    percentage: (((Number(current) || 0) / (Number(total) || 1)) * 100).toFixed(1)
});

Module Execution Logic

The main calculation function executes modules conditionally:

if (!SKIP_PRODUCT_METRICS) {
    const result = await calculateProductMetrics(startTime, totalProducts);
    await updateProgress(result.processedProducts, result.processedOrders, result.processedPurchaseOrders);
    if (!result.success) {
        throw new Error('Product metrics calculation failed');
    }
} else {
    console.log('Skipping product metrics calculation...');
    // Update progress proportionally
}

Command Line Usage

When run from the command line:

// Run directly if called from command line
if (require.main === module) {
    calculateMetrics().catch(error => {
        if (!error.message.includes('Operation cancelled')) {
            console.error('Error:', error);
        }
        process.exit(1);
    });
}

Module Export

The script exports functions for use by other components:

module.exports = {
    calculateMetrics,
    cancelCalculation,
    getProgress: global.getProgress
};

39 KiB Raw Blame History

Metrics Calculation System Documentation

Table of Contents

Overview

System Architecture

Calculation Process

Metrics Modules

Product Metrics

Time Aggregates

Financial Metrics

Vendor Metrics

Category Metrics

Brand Metrics

Sales Forecasts

Database Schema

Main Tables

calculate_history

calculate_status

product_metrics

product_time_aggregates

sales_forecasts

category_forecasts

vendor_metrics

Performance Optimizations

1. Batch Processing

2. Temporary Tables

3. Query Timeouts

4. Connection Pooling

5. Transaction Management

Progress Tracking and Reporting

Progress Output Format

Key Functions

Error Handling

1. Global Error Handlers

2. Module-Level Error Handling

3. Error Logging

4. Timeout Management

5. Cancellation Support

Utility Components

Database Utilities (db.js)

Progress Utilities (progress.js)

Implementation Notes

Module Dependencies

Temporary Tables

Ensuring Valid Progress Data

Module Execution Logic

Command Line Usage

Module Export

39 KiB

Raw Blame History