39 KiB
Metrics Calculation System Documentation
This document provides a comprehensive description of the metrics calculation system in the inventory management application, detailing the calculation modules, data transformations, formulas, and overall architecture.
Table of Contents
- Overview
- System Architecture
- Calculation Process
- Metrics Modules
- Database Schema
- Performance Optimizations
- Progress Tracking and Reporting
- Error Handling
- Utility Components
- Implementation Notes
Overview
The metrics calculation system processes inventory, sales, and purchase data to generate insights and forecasts for business decision-making. It powers dashboards, reports, and automated inventory recommendations that are core to the business operations.
The system can operate in two modes:
- Complete Calculation: Calculates all metrics from scratch
- Selective Calculation: Calculates only specified metric types
Key features of the system include:
- Progress tracking with estimated completion time
- Cancellation capability for long-running operations
- Fault tolerance with automatic recovery
- Temporary table usage for optimal performance
- Batched processing for memory efficiency
- Multiple calculation modules that can run independently
System Architecture
The calculation system follows a modular architecture with these components:
├── calculate-metrics.js # Main orchestration script
├── metrics/ # Calculation modules
│ ├── product-metrics.js # Product-level metrics calculation
│ ├── time-aggregates.js # Time-based aggregation calculations
│ ├── financial-metrics.js # Financial performance metrics
│ ├── vendor-metrics.js # Supplier/vendor analytics
│ ├── category-metrics.js # Category-level performance metrics
│ ├── brand-metrics.js # Brand performance analytics
│ ├── sales-forecasts.js # Future sales prediction
│ └── utils/ # Shared utilities
│ ├── db.js # Database connection management
│ └── progress.js # Progress tracking utilities
The system uses PostgreSQL for data storage and processing, leveraging:
- Temporary tables for intermediate calculations
- Database transactions for data consistency
- SQL window functions for ranking and trends
- Batch processing for memory efficiency
Calculation Process
The calculation process follows a sequential flow with the following steps:
-
Initialization
- Establish database connection
- Create calculation history record
- Set up progress tracking
- Prepare environment
-
Module Execution
- Product Metrics: Calculate inventory metrics and ABC classification
- Time Aggregates: Calculate time-series data aggregations
- Financial Metrics: Calculate profit, margin, and ROI metrics
- Vendor Metrics: Calculate supplier performance metrics
- Category Metrics: Calculate category performance metrics
- Brand Metrics: Calculate brand performance metrics
- Sales Forecasts: Generate future sales predictions
-
Finalization
- Clean up temporary tables
- Update calculation history status
- Record completion time
- Release database connections
Each module creates its own temporary tables, performs calculations, updates the appropriate destination tables, and then cleans up. Modules can be enabled or disabled through configuration flags:
// Configuration flags for controlling which metrics to calculate
// Set to 1 to skip the corresponding calculation, 0 to run it
const SKIP_PRODUCT_METRICS = 0;
const SKIP_TIME_AGGREGATES = 0;
const SKIP_FINANCIAL_METRICS = 0;
const SKIP_VENDOR_METRICS = 0;
const SKIP_CATEGORY_METRICS = 0;
const SKIP_BRAND_METRICS = 0;
const SKIP_SALES_FORECASTS = 0;
Metrics Modules
Product Metrics
Purpose: Calculate core metrics for individual products including stock status, sales rates, inventory classifications, and reorder recommendations.
Source Tables:
products: Core product information and current stockorders: Sales order datapurchase_orders: Supplier orders
Destination Tables:
product_metrics: Main repository for calculated metricsabc_classification: Product importance classification
Key Calculations:
| Metric | Formula/Logic | Description |
|---|---|---|
| Inventory Value | stock_quantity * cost_price |
Current value of inventory at cost |
| Daily Sales Avg | SUM(quantity) / COUNT(DISTINCT DATE(date)) |
Average daily unit sales in past 90 days |
| Weekly Sales Avg | SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 7) |
Average weekly unit sales in past 90 days |
| Monthly Sales Avg | SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 30) |
Average monthly unit sales in past 90 days |
| Total Revenue | SUM(quantity * price) |
Total revenue in past 90 days |
| Avg Margin Percent | ((SUM(quantity * price) - SUM(quantity * cost_price)) / SUM(quantity * price)) * 100 |
Average margin percentage |
| Avg Lead Time | AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0) |
Average lead time in days from order to receipt |
| Days of Inventory | floor(stock_quantity / daily_sales_avg) |
How many days current stock will last |
| Weeks of Inventory | floor(stock_quantity / weekly_sales_avg) |
How many weeks current stock will last |
| Stock Status | Complex conditional logic (see below) | Current inventory status category |
| Safety Stock | CEIL(daily_sales_avg * SQRT(avg_lead_time_days) * 1.96) |
Buffer stock to maintain service level (95%) |
| Reorder Point | CEIL(daily_sales_avg * avg_lead_time_days) + safety_stock |
When to reorder based on lead time and safety stock |
| Reorder Qty | SQRT((2 * annual_demand * order_cost) / holding_cost) |
Economic Order Quantity (EOQ) formula |
| Overstocked Amt | MAX(0, stock_quantity - (daily_sales_avg * overstock_days)) |
Units considered excess inventory |
| ABC Class | Percentile ranking by revenue | 'A': top 20%, 'B': 21-50%, 'C': remainder |
| Turnover Rate | (total_sold / avg_nonzero_stock) * (365.0 / active_days) |
Inventory turnover rate adjusted for zero-stock periods |
| Forecast Accuracy | 100 - MIN(AVG(ABS(forecast_quantity - actual_quantity) / actual_quantity * 100), 100) |
Historic forecast accuracy percentage |
| Forecast Bias | AVG((forecast_quantity - actual_quantity) / actual_quantity * 100) |
Directional bias of forecasts (positive = over-forecasting) |
Stock Status Logic:
CASE
WHEN stock_quantity <= 0 THEN 'Out of Stock'
WHEN daily_sales_avg = 0 AND stock_quantity <= low_stock_threshold THEN 'Low Stock'
WHEN daily_sales_avg = 0 THEN 'In Stock'
WHEN stock_quantity / daily_sales_avg <= critical_days THEN 'Critical'
WHEN stock_quantity / daily_sales_avg <= reorder_days THEN 'Reorder'
WHEN stock_quantity / daily_sales_avg > overstock_days THEN 'Overstocked'
ELSE 'Healthy'
END
Process Flow:
- Create temporary tables for sales and purchase metrics
- Calculate sales averages by product
- Calculate lead times and purchase metrics
- Update product_metrics with calculated values
- Calculate ABC classification using revenue ranking
- Calculate turnover rates with zero-stock period adjustment
- Calculate forecast accuracy and bias using historical data
Optimizations:
- Batch processing of products (5,000 per batch)
- Timeout protection for long-running queries
- Indexed temporary tables for performance
- Strategic transaction management
Time Aggregates
Purpose: Aggregate sales and purchase data by time periods (year/month) for trend analysis and historical reporting.
Source Tables:
products: Core product informationorders: Sales order datapurchase_orders: Supplier orders
Destination Tables:
product_time_aggregates: Time-based metrics by product, year, and month
Key Calculations:
| Metric | Formula/Logic | Description |
|---|---|---|
| Total Quantity Sold | SUM(quantity) |
Units sold in the time period |
| Total Revenue | SUM((price - discount) * quantity) |
Net revenue in the time period |
| Total Cost | SUM(cost_price * quantity) |
Cost of goods sold in time period |
| Order Count | COUNT(DISTINCT order_number) |
Number of orders in the time period |
| Stock Received | SUM(received) |
Units received from suppliers in time period |
| Stock Ordered | SUM(ordered) |
Units ordered from suppliers in time period |
| Avg Price | AVG(price - discount) |
Average net selling price in time period |
| Profit Margin | ((total_revenue - total_cost) / total_revenue) * 100 |
Percentage profit margin in time period |
| GMROI | ((total_revenue - total_cost) * (365.0 / active_days)) / inventory_value |
Gross Margin Return on Inventory Investment |
Process Flow:
- Calculate monthly aggregates for sales data
- Calculate monthly aggregates for purchase data
- Combine sales and purchase data for complete monthly picture
- Update financial metrics for each time period
- Calculate efficiency metrics like GMROI
SQL Example (Monthly Aggregation):
WITH monthly_sales AS (
SELECT
o.pid,
EXTRACT(YEAR FROM o.date) as year,
EXTRACT(MONTH FROM o.date) as month,
SUM(o.quantity) as total_quantity_sold,
SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) as total_revenue,
SUM(COALESCE(p.cost_price, 0) * o.quantity) as total_cost,
COUNT(DISTINCT o.order_number) as order_count,
AVG(o.price - COALESCE(o.discount, 0)) as avg_price,
-- Profit margin calculation
CASE
WHEN SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) > 0
THEN ((SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) - SUM(COALESCE(p.cost_price, 0) * o.quantity))
/ SUM((o.price - COALESCE(o.discount, 0)) * o.quantity)) * 100
ELSE 0
END as profit_margin,
p.cost_price * p.stock_quantity as inventory_value,
COUNT(DISTINCT DATE(o.date)) as active_days
FROM orders o
JOIN products p ON o.pid = p.pid
WHERE o.canceled = false
GROUP BY o.pid, EXTRACT(YEAR FROM o.date), EXTRACT(MONTH FROM o.date), p.cost_price, p.stock_quantity
)
-- Insert results into product_time_aggregates table
Financial Metrics
Purpose: Calculate profit, revenue, margin, and return on investment metrics for financial analysis.
Source Tables:
products: Core product informationorders: Sales order dataproduct_metrics: Previously calculated product metrics
Destination Tables:
financial_metrics: Financial performance metricsproduct_metrics: Updated with financial calculations
Key Calculations:
| Metric | Formula/Logic | Description |
|---|---|---|
| Gross Profit | SUM(quantity * (price - cost_price)) |
Total profit across all sales |
| Gross Margin | (gross_profit / SUM(quantity * price)) * 100 |
Profit margin percentage |
| GMROI | gross_profit / AVG(inventory_value) |
Gross Margin Return on Inventory Investment |
| GMROII | GMROI / inventory_turns |
GMROI Index (normalized by inventory turns) |
| Revenue per Unit | SUM(price * quantity) / SUM(quantity) |
Average revenue per unit sold |
| Cost per Unit | SUM(cost_price * quantity) / SUM(quantity) |
Average cost per unit sold |
| COGS | SUM(cost_price * quantity) |
Cost of Goods Sold total |
| Total Discounts | SUM(discount * quantity) |
Total discount amount |
| Discount Rate | total_discounts / SUM(price * quantity) |
Percentage of revenue discounted |
| Revenue Rank | RANK() OVER (ORDER BY total_revenue DESC) |
Revenue ranking compared to other products |
| Profit Rank | RANK() OVER (ORDER BY gross_profit DESC) |
Profit ranking compared to other products |
| ROI | (gross_profit / SUM(cost_price * quantity)) * 100 |
Return on Investment percentage |
Process Flow:
- Calculate base financial metrics for each product
- Calculate comparative rankings across products
- Calculate inventory efficiency metrics
- Update appropriate destination tables
Vendor Metrics
Purpose: Analyze supplier performance including lead times, reliability, and cost metrics.
Source Tables:
products: Core product informationpurchase_orders: Supplier ordersvendors: Supplier information
Destination Tables:
vendor_metrics: Supplier performance metricsvendor_time_aggregates: Time-based supplier metrics
Key Calculations:
| Metric | Formula/Logic | Description |
|---|---|---|
| Avg Lead Time | AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0) |
Average lead time in days |
| Lead Time Variance | STDDEV(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0) |
Variance in lead times |
| Fill Rate | SUM(received) / NULLIF(SUM(ordered), 0) * 100 |
Percentage of ordered units received |
| On-Time Delivery | COUNT(CASE WHEN received_date <= expected_date THEN 1 END) / COUNT(*) * 100 |
Percentage of on-time deliveries |
| Avg Cost | AVG(cost_price) |
Average cost per unit |
| Cost Variance | STDDEV(cost_price) / AVG(cost_price) * 100 |
Variation in costs (%) |
| Total Spend | SUM(ordered * cost_price) |
Total amount spent with vendor |
| Order Frequency | COUNT(DISTINCT po_id) / COUNT(DISTINCT DATE_TRUNC('month', date)) |
Average orders per month |
| Quality Rating | Complex formula based on returns | Rating of product quality from vendor |
| Vendor Score | Weighted score of all metrics | Overall performance score |
Process Flow:
- Calculate lead time metrics by vendor
- Calculate fill rate and delivery metrics
- Calculate cost and spend metrics
- Calculate quality and reliability metrics
- Generate overall vendor score
- Update time-based aggregates by month
Category Metrics
Purpose: Analyze performance metrics at the category level for merchandising insights.
Source Tables:
products: Core product informationproduct_categories: Category relationshipsorders: Sales order datacategories: Category information
Destination Tables:
category_metrics: Category performance metricscategory_time_aggregates: Time-based category metrics
Key Calculations:
| Metric | Formula/Logic | Description |
|---|---|---|
| Total Products | COUNT(DISTINCT p.pid) |
Number of products in category |
| Active Products | COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END) |
Products with sales in period |
| Total Revenue | SUM(o.quantity * o.price) |
Total revenue from category |
| Total Units | SUM(o.quantity) |
Total units sold from category |
| Avg Price | SUM(o.quantity * o.price) / SUM(o.quantity) |
Average selling price in category |
| Gross Profit | SUM(o.quantity * (o.price - p.cost_price)) |
Total profit from category |
| Margin Percent | (gross_profit / total_revenue) * 100 |
Profit margin percentage |
| Inventory Value | SUM(p.stock_quantity * p.cost_price) |
Current inventory value at cost |
| Inventory Turns | COGS / AVG(inventory_value) |
How many times inventory turns over |
| Revenue Growth | (current_period_revenue / previous_period_revenue - 1) * 100 |
Revenue growth percentage |
| Category Share | category_revenue / total_revenue * 100 |
Percentage of total revenue |
| Avg Weekly Sales | SUM(o.quantity) / COUNT(DISTINCT DATE_TRUNC('week', o.date)) |
Average weekly units sold |
Process Flow:
- Calculate category product counts
- Calculate category sales metrics
- Calculate category profitability
- Calculate category growth rates
- Calculate category share metrics
- Update time-based aggregates by month
Brand Metrics
Purpose: Analyze performance metrics at the brand level for brand management insights.
Source Tables:
products: Core product informationproduct_categories: Brand relationships (brands are stored as categories)orders: Sales order data
Destination Tables:
brand_metrics: Brand performance metricsbrand_time_aggregates: Time-based brand metrics
Key Calculations:
| Metric | Formula/Logic | Description |
|---|---|---|
| Total Products | COUNT(DISTINCT p.pid) |
Number of products in brand |
| Active Products | COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END) |
Products with sales in period |
| Total Revenue | SUM(o.quantity * o.price) |
Total revenue from brand |
| Total Units | SUM(o.quantity) |
Total units sold from brand |
| Avg Price | SUM(o.quantity * o.price) / SUM(o.quantity) |
Average selling price in brand |
| Gross Profit | SUM(o.quantity * (o.price - p.cost_price)) |
Total profit from brand |
| Margin Percent | (gross_profit / total_revenue) * 100 |
Profit margin percentage |
| Inventory Value | SUM(p.stock_quantity * p.cost_price) |
Current inventory value at cost |
| Inventory Turns | COGS / AVG(inventory_value) |
How many times inventory turns over |
| Revenue Growth | (current_period_revenue / previous_period_revenue - 1) * 100 |
Revenue growth percentage |
| Brand Share | brand_revenue / total_revenue * 100 |
Percentage of total revenue |
| Brand Velocity | SUM(o.quantity) / COUNT(DISTINCT DATE(o.date)) |
Daily sales velocity |
Process Flow:
- Calculate brand product counts
- Calculate brand sales metrics
- Calculate brand profitability
- Calculate brand growth rates
- Calculate brand share metrics
- Update time-based aggregates by month
Sales Forecasts
Purpose: Generate sales forecasts for future periods using historical data and seasonality factors.
Source Tables:
products: Core product informationorders: Sales order dataproduct_categories: Category relationshipssales_seasonality: Seasonal adjustment factors
Destination Tables:
sales_forecasts: Product-level sales forecastscategory_forecasts: Category-level sales forecasts
Key Calculations:
| Metric | Formula/Logic | Description |
|---|---|---|
| Daily Sales Avg | AVG(daily_quantity) for each day of week |
Average sales by day of week |
| Sales Std Dev | STDDEV(daily_quantity) |
Standard deviation of daily sales |
| Sales Variance Ratio | STDDEV(daily_quantity) / AVG(daily_quantity) |
Measure of sales variability |
| Forecast Quantity | avg_daily_qty * (1 + seasonality_factor) * confidence_adjustment |
Predicted daily sales |
| Confidence Level | Complex scale based on data quality | Confidence in forecast (60-90%) |
| Seasonality Factor | Monthly adjustment factor | Sales variation by month (-30% to +30%) |
| Forecast Revenue | forecast_quantity * avg_price |
Predicted daily revenue |
| Category Forecast | Sum of product forecasts with adjustments | Forecast at category level |
Confidence Level Calculation:
CASE
WHEN total_days >= 60 AND daily_variance_ratio < 0.5 THEN 90
WHEN total_days >= 60 THEN 85
WHEN total_days >= 30 AND daily_variance_ratio < 0.5 THEN 80
WHEN total_days >= 30 THEN 75
WHEN total_days >= 14 AND daily_variance_ratio < 0.5 THEN 70
WHEN total_days >= 14 THEN 65
ELSE 60
END
Process Flow:
- Create temporary tables for forecast dates (next 30 days)
- Calculate daily sales statistics by product and day of week
- Calculate product-level statistics (variability, seasonality)
- Generate product-level forecasts with confidence levels
- Aggregate to category level forecasts
- Apply seasonality adjustments
- Store forecasts in respective tables
Method Details:
- Uses a combination of time-series analysis and statistical methods
- Applies day-of-week patterns from historical data
- Adjusts for seasonal variations by month
- Accounts for data quality in confidence calculation
- Handles products with intermittent demand differently
- Provides both unit and revenue forecasts
Database Schema
Main Tables
calculate_history
| Column | Type | Description |
|---|---|---|
| id | SERIAL | Primary key |
| start_time | TIMESTAMP | Calculation start time |
| end_time | TIMESTAMP | Calculation end time |
| duration_seconds | INTEGER | Elapsed time in seconds |
| status | TEXT | 'running', 'completed', 'cancelled', 'failed' |
| total_products | INTEGER | Total products processed |
| total_orders | INTEGER | Total orders processed |
| total_purchase_orders | INTEGER | Total purchase orders processed |
| processed_products | INTEGER | Number of processed products |
| processed_orders | INTEGER | Number of processed orders |
| processed_purchase_orders | INTEGER | Number of processed purchase orders |
| error_message | TEXT | Error message if failed |
| additional_info | JSONB | Configuration and results |
calculate_status
| Column | Type | Description |
|---|---|---|
| module_name | TEXT | Name of calculation module |
| last_calculation_timestamp | TIMESTAMP | When module last completed |
product_metrics
| Column | Type | Description |
|---|---|---|
| pid | BIGINT | Product ID (PK) |
| inventory_value | DECIMAL(10,2) | Current inventory value |
| daily_sales_avg | DECIMAL(10,3) | Average daily sales (units) |
| weekly_sales_avg | DECIMAL(10,3) | Average weekly sales (units) |
| monthly_sales_avg | DECIMAL(10,3) | Average monthly sales (units) |
| total_revenue | DECIMAL(10,2) | Total revenue from product |
| avg_margin_percent | DECIMAL(10,2) | Average margin percentage |
| first_sale_date | DATE | Date of first sale |
| last_sale_date | DATE | Date of last sale |
| avg_lead_time_days | DOUBLE PRECISION | Average lead time in days |
| days_of_inventory | INTEGER | Days until stockout at current rate |
| weeks_of_inventory | INTEGER | Weeks until stockout at current rate |
| stock_status | TEXT | Current stock status |
| safety_stock | INTEGER | Recommended safety stock |
| reorder_point | INTEGER | Recommended reorder point |
| reorder_qty | INTEGER | Recommended order quantity |
| overstocked_amt | INTEGER | Units considered excess inventory |
| abc_class | CHAR(1) | ABC classification |
| turnover_rate | DECIMAL(10,2) | Inventory turnover rate |
| forecast_accuracy | DECIMAL(10,2) | Accuracy of past forecasts (%) |
| forecast_bias | DECIMAL(10,2) | Directional bias of forecasts (%) |
| last_forecast_date | DATE | Date of most recent forecast |
| last_calculated_at | TIMESTAMP | Last calculation time |
product_time_aggregates
| Column | Type | Description |
|---|---|---|
| pid | BIGINT | Product ID (PK) |
| year | INTEGER | Year (PK) |
| month | INTEGER | Month (PK) |
| total_quantity_sold | INTEGER | Units sold in period |
| total_revenue | DECIMAL(10,2) | Revenue in period |
| total_cost | DECIMAL(10,2) | Cost in period |
| order_count | INTEGER | Number of orders |
| stock_received | INTEGER | Units received |
| stock_ordered | INTEGER | Units ordered |
| avg_price | DECIMAL(10,2) | Average price |
| profit_margin | DECIMAL(10,2) | Profit margin (%) |
| inventory_value | DECIMAL(10,2) | Inventory value |
| gmroi | DECIMAL(10,3) | Gross Margin Return on Investment |
sales_forecasts
| Column | Type | Description |
|---|---|---|
| pid | BIGINT | Product ID (PK) |
| forecast_date | DATE | Date of forecast (PK) |
| forecast_quantity | DECIMAL(10,2) | Predicted quantity |
| confidence_level | INTEGER | Confidence (60-90%) |
| created_at | TIMESTAMP | Creation time |
category_forecasts
| Column | Type | Description |
|---|---|---|
| category_id | BIGINT | Category ID (PK) |
| forecast_date | DATE | Date of forecast (PK) |
| forecast_units | DECIMAL(10,2) | Predicted units |
| forecast_revenue | DECIMAL(10,2) | Predicted revenue |
| confidence_level | INTEGER | Confidence (60-90%) |
| created_at | TIMESTAMP | Creation time |
vendor_metrics
| Column | Type | Description |
|---|---|---|
| vendor_id | BIGINT | Vendor ID (PK) |
| avg_lead_time | DECIMAL(10,2) | Average lead time in days |
| lead_time_variance | DECIMAL(10,2) | Variance in lead times |
| fill_rate | DECIMAL(10,2) | Percentage of ordered units received |
| on_time_delivery | DECIMAL(10,2) | Percentage of on-time deliveries |
| avg_cost | DECIMAL(10,2) | Average cost per unit |
| cost_variance | DECIMAL(10,2) | Variation in costs (%) |
| total_spend | DECIMAL(10,2) | Total amount spent |
| order_frequency | DECIMAL(10,2) | Average orders per month |
| quality_rating | DECIMAL(10,2) | Rating of product quality |
| vendor_score | DECIMAL(10,2) | Overall performance score |
| last_calculated_at | TIMESTAMP | Last calculation time |
Performance Optimizations
The metrics calculation system employs several optimizations for performance:
1. Batch Processing
Products are processed in configurable batch sizes (default 5,000) to:
- Reduce memory usage
- Improve transaction management
- Enable partial progress in case of failure
Implementation:
// Process updates in batches
let lastPid = 0;
let batchCount = 0;
const MAX_BATCHES = 1000; // Safety limit
const BATCH_SIZE = 5000;
while (batchCount < MAX_BATCHES) {
batchCount++;
const batch = await connection.query(
'SELECT pid FROM products WHERE pid > $1 ORDER BY pid LIMIT $2',
[lastPid, BATCH_SIZE]
);
if (batch.rows.length === 0) break;
// Process the entire batch in a single efficient query
await connection.query(`
UPDATE product_metrics pm
SET /* fields to update */
FROM /* joined tables */
WHERE p.pid = ANY($1::bigint[])
AND pm.pid = p.pid
`, [batch.rows.map(row => row.pid)]);
lastPid = batch.rows[batch.rows.length - 1].pid;
}
2. Temporary Tables
The system creates temporary tables for intermediate calculations to:
- Reduce query complexity
- Improve join performance
- Enable indexing of intermediate results
Example:
// Create temp_sales_metrics
await connection.query(`
CREATE TEMPORARY TABLE temp_sales_metrics (
pid BIGINT NOT NULL,
daily_sales_avg DECIMAL(10,3),
weekly_sales_avg DECIMAL(10,3),
monthly_sales_avg DECIMAL(10,3),
total_revenue DECIMAL(10,3),
avg_margin_percent DECIMAL(10,3),
first_sale_date DATE,
last_sale_date DATE,
PRIMARY KEY (pid)
)
`);
// Create indexes on temporary tables
await connection.query('CREATE INDEX ON temp_revenue_ranks (rank_num)');
3. Query Timeouts
Long-running queries have timeout protection:
await Promise.race([
connection.query(`
INSERT INTO temp_purchase_metrics
/* Complex query here */
`),
new Promise((_, reject) =>
setTimeout(() => reject(new Error('Timeout: query took too long')), 60000)
)
]).catch(async (err) => {
logError(err, 'Error populating table, continuing with fallback');
// Fallback mechanism
});
4. Connection Pooling
Database connections are managed through a connection pool:
const dbConfig = {
// Database connection details
max: 10, // connection pool max size
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 60000
};
const pool = new Pool(dbConfig);
5. Transaction Management
Each module uses appropriate transaction boundaries:
// Start transaction
await connection.query('BEGIN');
try {
// Perform operations...
// Commit transaction
await connection.query('COMMIT');
} catch (error) {
// Rollback on error
await connection.query('ROLLBACK');
throw error;
}
Progress Tracking and Reporting
The system includes a comprehensive progress tracking system that:
- Estimates completion time based on processed items
- Reports progress in real-time
- Persists progress information for UI display
- Handles interruptions gracefully
Progress Output Format
{
status: 'running', // running, complete, error, cancelled
operation: 'Operation description',
current: 1000, // Current number of processed items
total: 10000, // Total items to process
elapsed: '2m 30s', // Formatted elapsed time
remaining: '7m 15s', // Estimated remaining time
rate: 120, // Processing rate (items/second)
percentage: '10.0', // Progress percentage
timing: {
start_time: '2023-03-26T12:00:00Z',
end_time: '2023-03-26T12:02:30Z',
elapsed_seconds: 150
}
}
Key Functions
formatElapsedTime: Converts milliseconds to human-readable format
function formatElapsedTime(elapsed) {
// If elapsed is a timestamp, convert to elapsed milliseconds
if (elapsed instanceof Date || elapsed > 1000000000000) {
elapsed = Date.now() - elapsed;
} else {
// If elapsed is in seconds, convert to milliseconds
elapsed = elapsed * 1000;
}
const seconds = Math.floor(elapsed / 1000);
const minutes = Math.floor(seconds / 60);
const hours = Math.floor(minutes / 60);
if (hours > 0) {
return `${hours}h ${minutes % 60}m`;
} else if (minutes > 0) {
return `${minutes}m ${seconds % 60}s`;
} else {
return `${seconds}s`;
}
}
estimateRemaining: Calculates estimated time to completion
function estimateRemaining(startTime, current, total) {
if (current === 0) return null;
const elapsed = Date.now() - startTime;
const rate = current / elapsed;
const remaining = (total - current) / rate;
const minutes = Math.floor(remaining / 60000);
const seconds = Math.floor((remaining % 60000) / 1000);
if (minutes > 0) {
return `${minutes}m ${seconds}s`;
} else {
return `${seconds}s`;
}
}
outputProgress: Outputs progress information
function outputProgress(data) {
// Save progress to file for resumption
saveProgress(data);
// Format as JSON
const event = {
progress: data
};
// Send to stdout for frontend
process.stdout.write(JSON.stringify(event) + '\n');
// Log significant events to disk
if (isSignificantEvent(data)) {
logImport(`${data.operation || 'Operation'}${data.message ? ': ' + data.message : ''}${data.error ? ' Error: ' + data.error : ''}${data.status ? ' Status: ' + data.status : ''}`);
}
}
Error Handling
The system includes comprehensive error handling:
1. Global Error Handlers
// Add error handler for uncaught exceptions
process.on('uncaughtException', (error) => {
console.error('Uncaught Exception:', error);
process.exit(1);
});
// Add error handler for unhandled promise rejections
process.on('unhandledRejection', (reason, promise) => {
console.error('Unhandled Rejection at:', promise, 'reason:', reason);
process.exit(1);
});
2. Module-Level Error Handling
Each module uses try/catch blocks with detailed error reporting:
try {
// Module operations
} catch (error) {
success = false;
logError(error, 'Error calculating product metrics');
// Update history with error
await connection.query(`
UPDATE calculate_history
SET
end_time = NOW(),
duration_seconds = $1,
processed_products = $2,
processed_orders = $3,
processed_purchase_orders = $4,
status = $5,
error_message = $6
WHERE id = $7
`, [
totalElapsedSeconds,
processedProducts || 0,
processedOrders || 0,
processedPurchaseOrders || 0,
isCancelled ? 'cancelled' : 'failed',
error.message,
calculateHistoryId
]);
throw error;
} finally {
// Cleanup operations
if (connection) {
try {
await cleanupTemporaryTables(connection);
connection.release();
} catch (err) {
console.error('Error in final cleanup:', err);
}
}
}
3. Error Logging
Errors are logged to file and console:
function logError(error, context = '') {
const timestamp = new Date().toISOString();
const errorMessage = `[${timestamp}] ${context}\nError: ${error.message}\nStack: ${error.stack}\n\n`;
// Log to error file
fs.appendFileSync(ERROR_LOG, errorMessage);
// Also log to console
console.error(`\n${context}\nError: ${error.message}`);
}
4. Timeout Management
A global timeout prevents runaway processes:
// Set a maximum execution time (30 minutes)
const MAX_EXECUTION_TIME = 30 * 60 * 1000;
const timeout = setTimeout(() => {
console.error(`Calculation timed out after ${MAX_EXECUTION_TIME/1000} seconds, forcing termination`);
// Call cancel and force exit
cancelCalculation();
process.exit(1);
}, MAX_EXECUTION_TIME);
// Clear timeout when process completes
clearTimeout(timeout);
5. Cancellation Support
Users can cancel long-running calculations:
function cancelCalculation() {
isCancelled = true;
console.log('Calculation has been cancelled by user');
// Force-terminate any query that's been running for more than 5 seconds
try {
const connection = getConnection();
connection.then(async (conn) => {
try {
// Identify and terminate long-running queries from our application
await conn.query(`
SELECT pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE query_start < now() - interval '5 seconds'
AND application_name LIKE '%node%'
AND query NOT LIKE '%pg_cancel_backend%'
`);
// Clean up any temporary tables
await cleanupTemporaryTables(conn);
// Release connection
conn.release();
} catch (err) {
console.error('Error during force cancellation:', err);
conn.release();
}
}).catch(err => {
console.error('Could not get connection for cancellation:', err);
});
} catch (err) {
console.error('Failed to terminate running queries:', err);
}
return {
success: true,
message: 'Calculation has been cancelled'
};
}
Utility Components
Database Utilities (db.js)
Purpose: Manage database connections and provide a connection pool for efficient resource usage.
Key Functions:
async function getConnection() {
return await pool.connect();
}
async function closePool() {
await pool.end();
}
Configuration:
const dbConfig = {
host: process.env.DB_HOST,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
database: process.env.DB_NAME,
port: process.env.DB_PORT || 5432,
ssl: process.env.DB_SSL === 'true',
// Performance optimizations
max: 10, // connection pool max size
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 60000
};
Progress Utilities (progress.js)
Purpose: Provide functions for tracking and reporting calculation progress.
Key Functions:
formatElapsedTime: Convert milliseconds to human-readable formatestimateRemaining: Estimate remaining time based on progresscalculateRate: Calculate processing rate in items per secondlogError: Log errors to file and consoleoutputProgress: Output progress information to multiple channelssaveProgress: Persist progress information to diskclearProgress: Clear saved progress informationgetProgress: Retrieve previously saved progress
Progress Persistence:
function saveProgress(progress) {
try {
fs.writeFileSync(STATUS_FILE, JSON.stringify({
...progress,
timestamp: Date.now()
}));
} catch (err) {
console.error('Failed to save progress:', err);
}
}
function getProgress() {
try {
if (fs.existsSync(STATUS_FILE)) {
const progress = JSON.parse(fs.readFileSync(STATUS_FILE, 'utf8'));
// Check if the progress is still valid (less than 1 hour old)
if (progress.timestamp && Date.now() - progress.timestamp < 3600000) {
return progress;
} else {
// Clear old progress
clearProgress();
}
}
} catch (err) {
console.error('Failed to read progress:', err);
clearProgress();
}
return null;
}
Implementation Notes
Module Dependencies
The main calculate-metrics.js script requires the following modules:
const progress = require('./metrics/utils/progress');
const { getConnection, closePool } = require('./metrics/utils/db');
const calculateProductMetrics = require('./metrics/product-metrics');
const calculateTimeAggregates = require('./metrics/time-aggregates');
const calculateFinancialMetrics = require('./metrics/financial-metrics');
const calculateVendorMetrics = require('./metrics/vendor-metrics');
const calculateCategoryMetrics = require('./metrics/category-metrics');
const calculateBrandMetrics = require('./metrics/brand-metrics');
const calculateSalesForecasts = require('./metrics/sales-forecasts');
Temporary Tables
The system uses these temporary tables during calculation:
const TEMP_TABLES = [
'temp_revenue_ranks',
'temp_sales_metrics',
'temp_purchase_metrics',
'temp_product_metrics',
'temp_vendor_metrics',
'temp_category_metrics',
'temp_brand_metrics',
'temp_forecast_dates',
'temp_daily_sales',
'temp_product_stats',
'temp_category_sales',
'temp_category_stats'
];
Ensuring Valid Progress Data
The system includes checks to ensure progress data is valid:
// Helper function to ensure valid progress numbers
const ensureValidProgress = (current, total) => ({
current: Number(current) || 0,
total: Number(total) || 1, // Default to 1 to avoid division by zero
percentage: (((Number(current) || 0) / (Number(total) || 1)) * 100).toFixed(1)
});
Module Execution Logic
The main calculation function executes modules conditionally:
if (!SKIP_PRODUCT_METRICS) {
const result = await calculateProductMetrics(startTime, totalProducts);
await updateProgress(result.processedProducts, result.processedOrders, result.processedPurchaseOrders);
if (!result.success) {
throw new Error('Product metrics calculation failed');
}
} else {
console.log('Skipping product metrics calculation...');
// Update progress proportionally
}
Command Line Usage
When run from the command line:
// Run directly if called from command line
if (require.main === module) {
calculateMetrics().catch(error => {
if (!error.message.includes('Operation cancelled')) {
console.error('Error:', error);
}
process.exit(1);
});
}
Module Export
The script exports functions for use by other components:
module.exports = {
calculateMetrics,
cancelCalculation,
getProgress: global.getProgress
};