Files
inventory/docs/metrics-calculation-system.md

1065 lines
39 KiB
Markdown

# Metrics Calculation System Documentation
This document provides a comprehensive description of the metrics calculation system in the inventory management application, detailing the calculation modules, data transformations, formulas, and overall architecture.
## Table of Contents
1. [Overview](#overview)
2. [System Architecture](#system-architecture)
3. [Calculation Process](#calculation-process)
4. [Metrics Modules](#metrics-modules)
- [Product Metrics](#product-metrics)
- [Time Aggregates](#time-aggregates)
- [Financial Metrics](#financial-metrics)
- [Vendor Metrics](#vendor-metrics)
- [Category Metrics](#category-metrics)
- [Brand Metrics](#brand-metrics)
- [Sales Forecasts](#sales-forecasts)
5. [Database Schema](#database-schema)
6. [Performance Optimizations](#performance-optimizations)
7. [Progress Tracking and Reporting](#progress-tracking-and-reporting)
8. [Error Handling](#error-handling)
9. [Utility Components](#utility-components)
10. [Implementation Notes](#implementation-notes)
## Overview
The metrics calculation system processes inventory, sales, and purchase data to generate insights and forecasts for business decision-making. It powers dashboards, reports, and automated inventory recommendations that are core to the business operations.
The system can operate in two modes:
- **Complete Calculation**: Calculates all metrics from scratch
- **Selective Calculation**: Calculates only specified metric types
Key features of the system include:
- Progress tracking with estimated completion time
- Cancellation capability for long-running operations
- Fault tolerance with automatic recovery
- Temporary table usage for optimal performance
- Batched processing for memory efficiency
- Multiple calculation modules that can run independently
## System Architecture
The calculation system follows a modular architecture with these components:
```
├── calculate-metrics.js # Main orchestration script
├── metrics/ # Calculation modules
│ ├── product-metrics.js # Product-level metrics calculation
│ ├── time-aggregates.js # Time-based aggregation calculations
│ ├── financial-metrics.js # Financial performance metrics
│ ├── vendor-metrics.js # Supplier/vendor analytics
│ ├── category-metrics.js # Category-level performance metrics
│ ├── brand-metrics.js # Brand performance analytics
│ ├── sales-forecasts.js # Future sales prediction
│ └── utils/ # Shared utilities
│ ├── db.js # Database connection management
│ └── progress.js # Progress tracking utilities
```
The system uses PostgreSQL for data storage and processing, leveraging:
- Temporary tables for intermediate calculations
- Database transactions for data consistency
- SQL window functions for ranking and trends
- Batch processing for memory efficiency
## Calculation Process
The calculation process follows a sequential flow with the following steps:
1. **Initialization**
- Establish database connection
- Create calculation history record
- Set up progress tracking
- Prepare environment
2. **Module Execution**
- Product Metrics: Calculate inventory metrics and ABC classification
- Time Aggregates: Calculate time-series data aggregations
- Financial Metrics: Calculate profit, margin, and ROI metrics
- Vendor Metrics: Calculate supplier performance metrics
- Category Metrics: Calculate category performance metrics
- Brand Metrics: Calculate brand performance metrics
- Sales Forecasts: Generate future sales predictions
3. **Finalization**
- Clean up temporary tables
- Update calculation history status
- Record completion time
- Release database connections
Each module creates its own temporary tables, performs calculations, updates the appropriate destination tables, and then cleans up. Modules can be enabled or disabled through configuration flags:
```javascript
// Configuration flags for controlling which metrics to calculate
// Set to 1 to skip the corresponding calculation, 0 to run it
const SKIP_PRODUCT_METRICS = 0;
const SKIP_TIME_AGGREGATES = 0;
const SKIP_FINANCIAL_METRICS = 0;
const SKIP_VENDOR_METRICS = 0;
const SKIP_CATEGORY_METRICS = 0;
const SKIP_BRAND_METRICS = 0;
const SKIP_SALES_FORECASTS = 0;
```
## Metrics Modules
### Product Metrics
**Purpose**: Calculate core metrics for individual products including stock status, sales rates, inventory classifications, and reorder recommendations.
**Source Tables**:
- `products`: Core product information and current stock
- `orders`: Sales order data
- `purchase_orders`: Supplier orders
**Destination Tables**:
- `product_metrics`: Main repository for calculated metrics
- `abc_classification`: Product importance classification
**Key Calculations**:
| Metric | Formula/Logic | Description |
|--------|--------------|-------------|
| Inventory Value | `stock_quantity * cost_price` | Current value of inventory at cost |
| Daily Sales Avg | `SUM(quantity) / COUNT(DISTINCT DATE(date))` | Average daily unit sales in past 90 days |
| Weekly Sales Avg | `SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 7)` | Average weekly unit sales in past 90 days |
| Monthly Sales Avg | `SUM(quantity) / CEIL(COUNT(DISTINCT DATE(date)) / 30)` | Average monthly unit sales in past 90 days |
| Total Revenue | `SUM(quantity * price)` | Total revenue in past 90 days |
| Avg Margin Percent | `((SUM(quantity * price) - SUM(quantity * cost_price)) / SUM(quantity * price)) * 100` | Average margin percentage |
| Avg Lead Time | `AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)` | Average lead time in days from order to receipt |
| Days of Inventory | `floor(stock_quantity / daily_sales_avg)` | How many days current stock will last |
| Weeks of Inventory | `floor(stock_quantity / weekly_sales_avg)` | How many weeks current stock will last |
| Stock Status | Complex conditional logic (see below) | Current inventory status category |
| Safety Stock | `CEIL(daily_sales_avg * SQRT(avg_lead_time_days) * 1.96)` | Buffer stock to maintain service level (95%) |
| Reorder Point | `CEIL(daily_sales_avg * avg_lead_time_days) + safety_stock` | When to reorder based on lead time and safety stock |
| Reorder Qty | `SQRT((2 * annual_demand * order_cost) / holding_cost)` | Economic Order Quantity (EOQ) formula |
| Overstocked Amt | `MAX(0, stock_quantity - (daily_sales_avg * overstock_days))` | Units considered excess inventory |
| ABC Class | Percentile ranking by revenue | 'A': top 20%, 'B': 21-50%, 'C': remainder |
| Turnover Rate | `(total_sold / avg_nonzero_stock) * (365.0 / active_days)` | Inventory turnover rate adjusted for zero-stock periods |
| Forecast Accuracy | `100 - MIN(AVG(ABS(forecast_quantity - actual_quantity) / actual_quantity * 100), 100)` | Historic forecast accuracy percentage |
| Forecast Bias | `AVG((forecast_quantity - actual_quantity) / actual_quantity * 100)` | Directional bias of forecasts (positive = over-forecasting) |
**Stock Status Logic**:
```sql
CASE
WHEN stock_quantity <= 0 THEN 'Out of Stock'
WHEN daily_sales_avg = 0 AND stock_quantity <= low_stock_threshold THEN 'Low Stock'
WHEN daily_sales_avg = 0 THEN 'In Stock'
WHEN stock_quantity / daily_sales_avg <= critical_days THEN 'Critical'
WHEN stock_quantity / daily_sales_avg <= reorder_days THEN 'Reorder'
WHEN stock_quantity / daily_sales_avg > overstock_days THEN 'Overstocked'
ELSE 'Healthy'
END
```
**Process Flow**:
1. Create temporary tables for sales and purchase metrics
2. Calculate sales averages by product
3. Calculate lead times and purchase metrics
4. Update product_metrics with calculated values
5. Calculate ABC classification using revenue ranking
6. Calculate turnover rates with zero-stock period adjustment
7. Calculate forecast accuracy and bias using historical data
**Optimizations**:
- Batch processing of products (5,000 per batch)
- Timeout protection for long-running queries
- Indexed temporary tables for performance
- Strategic transaction management
### Time Aggregates
**Purpose**: Aggregate sales and purchase data by time periods (year/month) for trend analysis and historical reporting.
**Source Tables**:
- `products`: Core product information
- `orders`: Sales order data
- `purchase_orders`: Supplier orders
**Destination Tables**:
- `product_time_aggregates`: Time-based metrics by product, year, and month
**Key Calculations**:
| Metric | Formula/Logic | Description |
|--------|--------------|-------------|
| Total Quantity Sold | `SUM(quantity)` | Units sold in the time period |
| Total Revenue | `SUM((price - discount) * quantity)` | Net revenue in the time period |
| Total Cost | `SUM(cost_price * quantity)` | Cost of goods sold in time period |
| Order Count | `COUNT(DISTINCT order_number)` | Number of orders in the time period |
| Stock Received | `SUM(received)` | Units received from suppliers in time period |
| Stock Ordered | `SUM(ordered)` | Units ordered from suppliers in time period |
| Avg Price | `AVG(price - discount)` | Average net selling price in time period |
| Profit Margin | `((total_revenue - total_cost) / total_revenue) * 100` | Percentage profit margin in time period |
| GMROI | `((total_revenue - total_cost) * (365.0 / active_days)) / inventory_value` | Gross Margin Return on Inventory Investment |
**Process Flow**:
1. Calculate monthly aggregates for sales data
2. Calculate monthly aggregates for purchase data
3. Combine sales and purchase data for complete monthly picture
4. Update financial metrics for each time period
5. Calculate efficiency metrics like GMROI
**SQL Example (Monthly Aggregation)**:
```sql
WITH monthly_sales AS (
SELECT
o.pid,
EXTRACT(YEAR FROM o.date) as year,
EXTRACT(MONTH FROM o.date) as month,
SUM(o.quantity) as total_quantity_sold,
SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) as total_revenue,
SUM(COALESCE(p.cost_price, 0) * o.quantity) as total_cost,
COUNT(DISTINCT o.order_number) as order_count,
AVG(o.price - COALESCE(o.discount, 0)) as avg_price,
-- Profit margin calculation
CASE
WHEN SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) > 0
THEN ((SUM((o.price - COALESCE(o.discount, 0)) * o.quantity) - SUM(COALESCE(p.cost_price, 0) * o.quantity))
/ SUM((o.price - COALESCE(o.discount, 0)) * o.quantity)) * 100
ELSE 0
END as profit_margin,
p.cost_price * p.stock_quantity as inventory_value,
COUNT(DISTINCT DATE(o.date)) as active_days
FROM orders o
JOIN products p ON o.pid = p.pid
WHERE o.canceled = false
GROUP BY o.pid, EXTRACT(YEAR FROM o.date), EXTRACT(MONTH FROM o.date), p.cost_price, p.stock_quantity
)
-- Insert results into product_time_aggregates table
```
### Financial Metrics
**Purpose**: Calculate profit, revenue, margin, and return on investment metrics for financial analysis.
**Source Tables**:
- `products`: Core product information
- `orders`: Sales order data
- `product_metrics`: Previously calculated product metrics
**Destination Tables**:
- `financial_metrics`: Financial performance metrics
- `product_metrics`: Updated with financial calculations
**Key Calculations**:
| Metric | Formula/Logic | Description |
|--------|--------------|-------------|
| Gross Profit | `SUM(quantity * (price - cost_price))` | Total profit across all sales |
| Gross Margin | `(gross_profit / SUM(quantity * price)) * 100` | Profit margin percentage |
| GMROI | `gross_profit / AVG(inventory_value)` | Gross Margin Return on Inventory Investment |
| GMROII | `GMROI / inventory_turns` | GMROI Index (normalized by inventory turns) |
| Revenue per Unit | `SUM(price * quantity) / SUM(quantity)` | Average revenue per unit sold |
| Cost per Unit | `SUM(cost_price * quantity) / SUM(quantity)` | Average cost per unit sold |
| COGS | `SUM(cost_price * quantity)` | Cost of Goods Sold total |
| Total Discounts | `SUM(discount * quantity)` | Total discount amount |
| Discount Rate | `total_discounts / SUM(price * quantity)` | Percentage of revenue discounted |
| Revenue Rank | `RANK() OVER (ORDER BY total_revenue DESC)` | Revenue ranking compared to other products |
| Profit Rank | `RANK() OVER (ORDER BY gross_profit DESC)` | Profit ranking compared to other products |
| ROI | `(gross_profit / SUM(cost_price * quantity)) * 100` | Return on Investment percentage |
**Process Flow**:
1. Calculate base financial metrics for each product
2. Calculate comparative rankings across products
3. Calculate inventory efficiency metrics
4. Update appropriate destination tables
### Vendor Metrics
**Purpose**: Analyze supplier performance including lead times, reliability, and cost metrics.
**Source Tables**:
- `products`: Core product information
- `purchase_orders`: Supplier orders
- `vendors`: Supplier information
**Destination Tables**:
- `vendor_metrics`: Supplier performance metrics
- `vendor_time_aggregates`: Time-based supplier metrics
**Key Calculations**:
| Metric | Formula/Logic | Description |
|--------|--------------|-------------|
| Avg Lead Time | `AVG(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)` | Average lead time in days |
| Lead Time Variance | `STDDEV(EXTRACT(EPOCH FROM (received_date - date)) / 86400.0)` | Variance in lead times |
| Fill Rate | `SUM(received) / NULLIF(SUM(ordered), 0) * 100` | Percentage of ordered units received |
| On-Time Delivery | `COUNT(CASE WHEN received_date <= expected_date THEN 1 END) / COUNT(*) * 100` | Percentage of on-time deliveries |
| Avg Cost | `AVG(cost_price)` | Average cost per unit |
| Cost Variance | `STDDEV(cost_price) / AVG(cost_price) * 100` | Variation in costs (%) |
| Total Spend | `SUM(ordered * cost_price)` | Total amount spent with vendor |
| Order Frequency | `COUNT(DISTINCT po_id) / COUNT(DISTINCT DATE_TRUNC('month', date))` | Average orders per month |
| Quality Rating | Complex formula based on returns | Rating of product quality from vendor |
| Vendor Score | Weighted score of all metrics | Overall performance score |
**Process Flow**:
1. Calculate lead time metrics by vendor
2. Calculate fill rate and delivery metrics
3. Calculate cost and spend metrics
4. Calculate quality and reliability metrics
5. Generate overall vendor score
6. Update time-based aggregates by month
### Category Metrics
**Purpose**: Analyze performance metrics at the category level for merchandising insights.
**Source Tables**:
- `products`: Core product information
- `product_categories`: Category relationships
- `orders`: Sales order data
- `categories`: Category information
**Destination Tables**:
- `category_metrics`: Category performance metrics
- `category_time_aggregates`: Time-based category metrics
**Key Calculations**:
| Metric | Formula/Logic | Description |
|--------|--------------|-------------|
| Total Products | `COUNT(DISTINCT p.pid)` | Number of products in category |
| Active Products | `COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END)` | Products with sales in period |
| Total Revenue | `SUM(o.quantity * o.price)` | Total revenue from category |
| Total Units | `SUM(o.quantity)` | Total units sold from category |
| Avg Price | `SUM(o.quantity * o.price) / SUM(o.quantity)` | Average selling price in category |
| Gross Profit | `SUM(o.quantity * (o.price - p.cost_price))` | Total profit from category |
| Margin Percent | `(gross_profit / total_revenue) * 100` | Profit margin percentage |
| Inventory Value | `SUM(p.stock_quantity * p.cost_price)` | Current inventory value at cost |
| Inventory Turns | `COGS / AVG(inventory_value)` | How many times inventory turns over |
| Revenue Growth | `(current_period_revenue / previous_period_revenue - 1) * 100` | Revenue growth percentage |
| Category Share | `category_revenue / total_revenue * 100` | Percentage of total revenue |
| Avg Weekly Sales | `SUM(o.quantity) / COUNT(DISTINCT DATE_TRUNC('week', o.date))` | Average weekly units sold |
**Process Flow**:
1. Calculate category product counts
2. Calculate category sales metrics
3. Calculate category profitability
4. Calculate category growth rates
5. Calculate category share metrics
6. Update time-based aggregates by month
### Brand Metrics
**Purpose**: Analyze performance metrics at the brand level for brand management insights.
**Source Tables**:
- `products`: Core product information
- `product_categories`: Brand relationships (brands are stored as categories)
- `orders`: Sales order data
**Destination Tables**:
- `brand_metrics`: Brand performance metrics
- `brand_time_aggregates`: Time-based brand metrics
**Key Calculations**:
| Metric | Formula/Logic | Description |
|--------|--------------|-------------|
| Total Products | `COUNT(DISTINCT p.pid)` | Number of products in brand |
| Active Products | `COUNT(DISTINCT CASE WHEN o.pid IS NOT NULL THEN p.pid END)` | Products with sales in period |
| Total Revenue | `SUM(o.quantity * o.price)` | Total revenue from brand |
| Total Units | `SUM(o.quantity)` | Total units sold from brand |
| Avg Price | `SUM(o.quantity * o.price) / SUM(o.quantity)` | Average selling price in brand |
| Gross Profit | `SUM(o.quantity * (o.price - p.cost_price))` | Total profit from brand |
| Margin Percent | `(gross_profit / total_revenue) * 100` | Profit margin percentage |
| Inventory Value | `SUM(p.stock_quantity * p.cost_price)` | Current inventory value at cost |
| Inventory Turns | `COGS / AVG(inventory_value)` | How many times inventory turns over |
| Revenue Growth | `(current_period_revenue / previous_period_revenue - 1) * 100` | Revenue growth percentage |
| Brand Share | `brand_revenue / total_revenue * 100` | Percentage of total revenue |
| Brand Velocity | `SUM(o.quantity) / COUNT(DISTINCT DATE(o.date))` | Daily sales velocity |
**Process Flow**:
1. Calculate brand product counts
2. Calculate brand sales metrics
3. Calculate brand profitability
4. Calculate brand growth rates
5. Calculate brand share metrics
6. Update time-based aggregates by month
### Sales Forecasts
**Purpose**: Generate sales forecasts for future periods using historical data and seasonality factors.
**Source Tables**:
- `products`: Core product information
- `orders`: Sales order data
- `product_categories`: Category relationships
- `sales_seasonality`: Seasonal adjustment factors
**Destination Tables**:
- `sales_forecasts`: Product-level sales forecasts
- `category_forecasts`: Category-level sales forecasts
**Key Calculations**:
| Metric | Formula/Logic | Description |
|--------|--------------|-------------|
| Daily Sales Avg | `AVG(daily_quantity)` for each day of week | Average sales by day of week |
| Sales Std Dev | `STDDEV(daily_quantity)` | Standard deviation of daily sales |
| Sales Variance Ratio | `STDDEV(daily_quantity) / AVG(daily_quantity)` | Measure of sales variability |
| Forecast Quantity | `avg_daily_qty * (1 + seasonality_factor) * confidence_adjustment` | Predicted daily sales |
| Confidence Level | Complex scale based on data quality | Confidence in forecast (60-90%) |
| Seasonality Factor | Monthly adjustment factor | Sales variation by month (-30% to +30%) |
| Forecast Revenue | `forecast_quantity * avg_price` | Predicted daily revenue |
| Category Forecast | Sum of product forecasts with adjustments | Forecast at category level |
**Confidence Level Calculation**:
```sql
CASE
WHEN total_days >= 60 AND daily_variance_ratio < 0.5 THEN 90
WHEN total_days >= 60 THEN 85
WHEN total_days >= 30 AND daily_variance_ratio < 0.5 THEN 80
WHEN total_days >= 30 THEN 75
WHEN total_days >= 14 AND daily_variance_ratio < 0.5 THEN 70
WHEN total_days >= 14 THEN 65
ELSE 60
END
```
**Process Flow**:
1. Create temporary tables for forecast dates (next 30 days)
2. Calculate daily sales statistics by product and day of week
3. Calculate product-level statistics (variability, seasonality)
4. Generate product-level forecasts with confidence levels
5. Aggregate to category level forecasts
6. Apply seasonality adjustments
7. Store forecasts in respective tables
**Method Details**:
- Uses a combination of time-series analysis and statistical methods
- Applies day-of-week patterns from historical data
- Adjusts for seasonal variations by month
- Accounts for data quality in confidence calculation
- Handles products with intermittent demand differently
- Provides both unit and revenue forecasts
## Database Schema
### Main Tables
#### calculate_history
| Column | Type | Description |
|--------|------|-------------|
| id | SERIAL | Primary key |
| start_time | TIMESTAMP | Calculation start time |
| end_time | TIMESTAMP | Calculation end time |
| duration_seconds | INTEGER | Elapsed time in seconds |
| status | TEXT | 'running', 'completed', 'cancelled', 'failed' |
| total_products | INTEGER | Total products processed |
| total_orders | INTEGER | Total orders processed |
| total_purchase_orders | INTEGER | Total purchase orders processed |
| processed_products | INTEGER | Number of processed products |
| processed_orders | INTEGER | Number of processed orders |
| processed_purchase_orders | INTEGER | Number of processed purchase orders |
| error_message | TEXT | Error message if failed |
| additional_info | JSONB | Configuration and results |
#### calculate_status
| Column | Type | Description |
|--------|------|-------------|
| module_name | TEXT | Name of calculation module |
| last_calculation_timestamp | TIMESTAMP | When module last completed |
#### product_metrics
| Column | Type | Description |
|--------|------|-------------|
| pid | BIGINT | Product ID (PK) |
| inventory_value | DECIMAL(10,2) | Current inventory value |
| daily_sales_avg | DECIMAL(10,3) | Average daily sales (units) |
| weekly_sales_avg | DECIMAL(10,3) | Average weekly sales (units) |
| monthly_sales_avg | DECIMAL(10,3) | Average monthly sales (units) |
| total_revenue | DECIMAL(10,2) | Total revenue from product |
| avg_margin_percent | DECIMAL(10,2) | Average margin percentage |
| first_sale_date | DATE | Date of first sale |
| last_sale_date | DATE | Date of last sale |
| avg_lead_time_days | DOUBLE PRECISION | Average lead time in days |
| days_of_inventory | INTEGER | Days until stockout at current rate |
| weeks_of_inventory | INTEGER | Weeks until stockout at current rate |
| stock_status | TEXT | Current stock status |
| safety_stock | INTEGER | Recommended safety stock |
| reorder_point | INTEGER | Recommended reorder point |
| reorder_qty | INTEGER | Recommended order quantity |
| overstocked_amt | INTEGER | Units considered excess inventory |
| abc_class | CHAR(1) | ABC classification |
| turnover_rate | DECIMAL(10,2) | Inventory turnover rate |
| forecast_accuracy | DECIMAL(10,2) | Accuracy of past forecasts (%) |
| forecast_bias | DECIMAL(10,2) | Directional bias of forecasts (%) |
| last_forecast_date | DATE | Date of most recent forecast |
| last_calculated_at | TIMESTAMP | Last calculation time |
#### product_time_aggregates
| Column | Type | Description |
|--------|------|-------------|
| pid | BIGINT | Product ID (PK) |
| year | INTEGER | Year (PK) |
| month | INTEGER | Month (PK) |
| total_quantity_sold | INTEGER | Units sold in period |
| total_revenue | DECIMAL(10,2) | Revenue in period |
| total_cost | DECIMAL(10,2) | Cost in period |
| order_count | INTEGER | Number of orders |
| stock_received | INTEGER | Units received |
| stock_ordered | INTEGER | Units ordered |
| avg_price | DECIMAL(10,2) | Average price |
| profit_margin | DECIMAL(10,2) | Profit margin (%) |
| inventory_value | DECIMAL(10,2) | Inventory value |
| gmroi | DECIMAL(10,3) | Gross Margin Return on Investment |
#### sales_forecasts
| Column | Type | Description |
|--------|------|-------------|
| pid | BIGINT | Product ID (PK) |
| forecast_date | DATE | Date of forecast (PK) |
| forecast_quantity | DECIMAL(10,2) | Predicted quantity |
| confidence_level | INTEGER | Confidence (60-90%) |
| created_at | TIMESTAMP | Creation time |
#### category_forecasts
| Column | Type | Description |
|--------|------|-------------|
| category_id | BIGINT | Category ID (PK) |
| forecast_date | DATE | Date of forecast (PK) |
| forecast_units | DECIMAL(10,2) | Predicted units |
| forecast_revenue | DECIMAL(10,2) | Predicted revenue |
| confidence_level | INTEGER | Confidence (60-90%) |
| created_at | TIMESTAMP | Creation time |
#### vendor_metrics
| Column | Type | Description |
|--------|------|-------------|
| vendor_id | BIGINT | Vendor ID (PK) |
| avg_lead_time | DECIMAL(10,2) | Average lead time in days |
| lead_time_variance | DECIMAL(10,2) | Variance in lead times |
| fill_rate | DECIMAL(10,2) | Percentage of ordered units received |
| on_time_delivery | DECIMAL(10,2) | Percentage of on-time deliveries |
| avg_cost | DECIMAL(10,2) | Average cost per unit |
| cost_variance | DECIMAL(10,2) | Variation in costs (%) |
| total_spend | DECIMAL(10,2) | Total amount spent |
| order_frequency | DECIMAL(10,2) | Average orders per month |
| quality_rating | DECIMAL(10,2) | Rating of product quality |
| vendor_score | DECIMAL(10,2) | Overall performance score |
| last_calculated_at | TIMESTAMP | Last calculation time |
## Performance Optimizations
The metrics calculation system employs several optimizations for performance:
### 1. Batch Processing
Products are processed in configurable batch sizes (default 5,000) to:
- Reduce memory usage
- Improve transaction management
- Enable partial progress in case of failure
**Implementation**:
```javascript
// Process updates in batches
let lastPid = 0;
let batchCount = 0;
const MAX_BATCHES = 1000; // Safety limit
const BATCH_SIZE = 5000;
while (batchCount < MAX_BATCHES) {
batchCount++;
const batch = await connection.query(
'SELECT pid FROM products WHERE pid > $1 ORDER BY pid LIMIT $2',
[lastPid, BATCH_SIZE]
);
if (batch.rows.length === 0) break;
// Process the entire batch in a single efficient query
await connection.query(`
UPDATE product_metrics pm
SET /* fields to update */
FROM /* joined tables */
WHERE p.pid = ANY($1::bigint[])
AND pm.pid = p.pid
`, [batch.rows.map(row => row.pid)]);
lastPid = batch.rows[batch.rows.length - 1].pid;
}
```
### 2. Temporary Tables
The system creates temporary tables for intermediate calculations to:
- Reduce query complexity
- Improve join performance
- Enable indexing of intermediate results
**Example**:
```javascript
// Create temp_sales_metrics
await connection.query(`
CREATE TEMPORARY TABLE temp_sales_metrics (
pid BIGINT NOT NULL,
daily_sales_avg DECIMAL(10,3),
weekly_sales_avg DECIMAL(10,3),
monthly_sales_avg DECIMAL(10,3),
total_revenue DECIMAL(10,3),
avg_margin_percent DECIMAL(10,3),
first_sale_date DATE,
last_sale_date DATE,
PRIMARY KEY (pid)
)
`);
// Create indexes on temporary tables
await connection.query('CREATE INDEX ON temp_revenue_ranks (rank_num)');
```
### 3. Query Timeouts
Long-running queries have timeout protection:
```javascript
await Promise.race([
connection.query(`
INSERT INTO temp_purchase_metrics
/* Complex query here */
`),
new Promise((_, reject) =>
setTimeout(() => reject(new Error('Timeout: query took too long')), 60000)
)
]).catch(async (err) => {
logError(err, 'Error populating table, continuing with fallback');
// Fallback mechanism
});
```
### 4. Connection Pooling
Database connections are managed through a connection pool:
```javascript
const dbConfig = {
// Database connection details
max: 10, // connection pool max size
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 60000
};
const pool = new Pool(dbConfig);
```
### 5. Transaction Management
Each module uses appropriate transaction boundaries:
```javascript
// Start transaction
await connection.query('BEGIN');
try {
// Perform operations...
// Commit transaction
await connection.query('COMMIT');
} catch (error) {
// Rollback on error
await connection.query('ROLLBACK');
throw error;
}
```
## Progress Tracking and Reporting
The system includes a comprehensive progress tracking system that:
1. Estimates completion time based on processed items
2. Reports progress in real-time
3. Persists progress information for UI display
4. Handles interruptions gracefully
### Progress Output Format
```javascript
{
status: 'running', // running, complete, error, cancelled
operation: 'Operation description',
current: 1000, // Current number of processed items
total: 10000, // Total items to process
elapsed: '2m 30s', // Formatted elapsed time
remaining: '7m 15s', // Estimated remaining time
rate: 120, // Processing rate (items/second)
percentage: '10.0', // Progress percentage
timing: {
start_time: '2023-03-26T12:00:00Z',
end_time: '2023-03-26T12:02:30Z',
elapsed_seconds: 150
}
}
```
### Key Functions
**formatElapsedTime**: Converts milliseconds to human-readable format
```javascript
function formatElapsedTime(elapsed) {
// If elapsed is a timestamp, convert to elapsed milliseconds
if (elapsed instanceof Date || elapsed > 1000000000000) {
elapsed = Date.now() - elapsed;
} else {
// If elapsed is in seconds, convert to milliseconds
elapsed = elapsed * 1000;
}
const seconds = Math.floor(elapsed / 1000);
const minutes = Math.floor(seconds / 60);
const hours = Math.floor(minutes / 60);
if (hours > 0) {
return `${hours}h ${minutes % 60}m`;
} else if (minutes > 0) {
return `${minutes}m ${seconds % 60}s`;
} else {
return `${seconds}s`;
}
}
```
**estimateRemaining**: Calculates estimated time to completion
```javascript
function estimateRemaining(startTime, current, total) {
if (current === 0) return null;
const elapsed = Date.now() - startTime;
const rate = current / elapsed;
const remaining = (total - current) / rate;
const minutes = Math.floor(remaining / 60000);
const seconds = Math.floor((remaining % 60000) / 1000);
if (minutes > 0) {
return `${minutes}m ${seconds}s`;
} else {
return `${seconds}s`;
}
}
```
**outputProgress**: Outputs progress information
```javascript
function outputProgress(data) {
// Save progress to file for resumption
saveProgress(data);
// Format as JSON
const event = {
progress: data
};
// Send to stdout for frontend
process.stdout.write(JSON.stringify(event) + '\n');
// Log significant events to disk
if (isSignificantEvent(data)) {
logImport(`${data.operation || 'Operation'}${data.message ? ': ' + data.message : ''}${data.error ? ' Error: ' + data.error : ''}${data.status ? ' Status: ' + data.status : ''}`);
}
}
```
## Error Handling
The system includes comprehensive error handling:
### 1. Global Error Handlers
```javascript
// Add error handler for uncaught exceptions
process.on('uncaughtException', (error) => {
console.error('Uncaught Exception:', error);
process.exit(1);
});
// Add error handler for unhandled promise rejections
process.on('unhandledRejection', (reason, promise) => {
console.error('Unhandled Rejection at:', promise, 'reason:', reason);
process.exit(1);
});
```
### 2. Module-Level Error Handling
Each module uses try/catch blocks with detailed error reporting:
```javascript
try {
// Module operations
} catch (error) {
success = false;
logError(error, 'Error calculating product metrics');
// Update history with error
await connection.query(`
UPDATE calculate_history
SET
end_time = NOW(),
duration_seconds = $1,
processed_products = $2,
processed_orders = $3,
processed_purchase_orders = $4,
status = $5,
error_message = $6
WHERE id = $7
`, [
totalElapsedSeconds,
processedProducts || 0,
processedOrders || 0,
processedPurchaseOrders || 0,
isCancelled ? 'cancelled' : 'failed',
error.message,
calculateHistoryId
]);
throw error;
} finally {
// Cleanup operations
if (connection) {
try {
await cleanupTemporaryTables(connection);
connection.release();
} catch (err) {
console.error('Error in final cleanup:', err);
}
}
}
```
### 3. Error Logging
Errors are logged to file and console:
```javascript
function logError(error, context = '') {
const timestamp = new Date().toISOString();
const errorMessage = `[${timestamp}] ${context}\nError: ${error.message}\nStack: ${error.stack}\n\n`;
// Log to error file
fs.appendFileSync(ERROR_LOG, errorMessage);
// Also log to console
console.error(`\n${context}\nError: ${error.message}`);
}
```
### 4. Timeout Management
A global timeout prevents runaway processes:
```javascript
// Set a maximum execution time (30 minutes)
const MAX_EXECUTION_TIME = 30 * 60 * 1000;
const timeout = setTimeout(() => {
console.error(`Calculation timed out after ${MAX_EXECUTION_TIME/1000} seconds, forcing termination`);
// Call cancel and force exit
cancelCalculation();
process.exit(1);
}, MAX_EXECUTION_TIME);
// Clear timeout when process completes
clearTimeout(timeout);
```
### 5. Cancellation Support
Users can cancel long-running calculations:
```javascript
function cancelCalculation() {
isCancelled = true;
console.log('Calculation has been cancelled by user');
// Force-terminate any query that's been running for more than 5 seconds
try {
const connection = getConnection();
connection.then(async (conn) => {
try {
// Identify and terminate long-running queries from our application
await conn.query(`
SELECT pg_cancel_backend(pid)
FROM pg_stat_activity
WHERE query_start < now() - interval '5 seconds'
AND application_name LIKE '%node%'
AND query NOT LIKE '%pg_cancel_backend%'
`);
// Clean up any temporary tables
await cleanupTemporaryTables(conn);
// Release connection
conn.release();
} catch (err) {
console.error('Error during force cancellation:', err);
conn.release();
}
}).catch(err => {
console.error('Could not get connection for cancellation:', err);
});
} catch (err) {
console.error('Failed to terminate running queries:', err);
}
return {
success: true,
message: 'Calculation has been cancelled'
};
}
```
## Utility Components
### Database Utilities (db.js)
**Purpose**: Manage database connections and provide a connection pool for efficient resource usage.
**Key Functions**:
```javascript
async function getConnection() {
return await pool.connect();
}
async function closePool() {
await pool.end();
}
```
**Configuration**:
```javascript
const dbConfig = {
host: process.env.DB_HOST,
user: process.env.DB_USER,
password: process.env.DB_PASSWORD,
database: process.env.DB_NAME,
port: process.env.DB_PORT || 5432,
ssl: process.env.DB_SSL === 'true',
// Performance optimizations
max: 10, // connection pool max size
idleTimeoutMillis: 30000,
connectionTimeoutMillis: 60000
};
```
### Progress Utilities (progress.js)
**Purpose**: Provide functions for tracking and reporting calculation progress.
**Key Functions**:
- `formatElapsedTime`: Convert milliseconds to human-readable format
- `estimateRemaining`: Estimate remaining time based on progress
- `calculateRate`: Calculate processing rate in items per second
- `logError`: Log errors to file and console
- `outputProgress`: Output progress information to multiple channels
- `saveProgress`: Persist progress information to disk
- `clearProgress`: Clear saved progress information
- `getProgress`: Retrieve previously saved progress
**Progress Persistence**:
```javascript
function saveProgress(progress) {
try {
fs.writeFileSync(STATUS_FILE, JSON.stringify({
...progress,
timestamp: Date.now()
}));
} catch (err) {
console.error('Failed to save progress:', err);
}
}
function getProgress() {
try {
if (fs.existsSync(STATUS_FILE)) {
const progress = JSON.parse(fs.readFileSync(STATUS_FILE, 'utf8'));
// Check if the progress is still valid (less than 1 hour old)
if (progress.timestamp && Date.now() - progress.timestamp < 3600000) {
return progress;
} else {
// Clear old progress
clearProgress();
}
}
} catch (err) {
console.error('Failed to read progress:', err);
clearProgress();
}
return null;
}
```
## Implementation Notes
### Module Dependencies
The main `calculate-metrics.js` script requires the following modules:
```javascript
const progress = require('./metrics/utils/progress');
const { getConnection, closePool } = require('./metrics/utils/db');
const calculateProductMetrics = require('./metrics/product-metrics');
const calculateTimeAggregates = require('./metrics/time-aggregates');
const calculateFinancialMetrics = require('./metrics/financial-metrics');
const calculateVendorMetrics = require('./metrics/vendor-metrics');
const calculateCategoryMetrics = require('./metrics/category-metrics');
const calculateBrandMetrics = require('./metrics/brand-metrics');
const calculateSalesForecasts = require('./metrics/sales-forecasts');
```
### Temporary Tables
The system uses these temporary tables during calculation:
```javascript
const TEMP_TABLES = [
'temp_revenue_ranks',
'temp_sales_metrics',
'temp_purchase_metrics',
'temp_product_metrics',
'temp_vendor_metrics',
'temp_category_metrics',
'temp_brand_metrics',
'temp_forecast_dates',
'temp_daily_sales',
'temp_product_stats',
'temp_category_sales',
'temp_category_stats'
];
```
### Ensuring Valid Progress Data
The system includes checks to ensure progress data is valid:
```javascript
// Helper function to ensure valid progress numbers
const ensureValidProgress = (current, total) => ({
current: Number(current) || 0,
total: Number(total) || 1, // Default to 1 to avoid division by zero
percentage: (((Number(current) || 0) / (Number(total) || 1)) * 100).toFixed(1)
});
```
### Module Execution Logic
The main calculation function executes modules conditionally:
```javascript
if (!SKIP_PRODUCT_METRICS) {
const result = await calculateProductMetrics(startTime, totalProducts);
await updateProgress(result.processedProducts, result.processedOrders, result.processedPurchaseOrders);
if (!result.success) {
throw new Error('Product metrics calculation failed');
}
} else {
console.log('Skipping product metrics calculation...');
// Update progress proportionally
}
```
### Command Line Usage
When run from the command line:
```javascript
// Run directly if called from command line
if (require.main === module) {
calculateMetrics().catch(error => {
if (!error.message.includes('Operation cancelled')) {
console.error('Error:', error);
}
process.exit(1);
});
}
```
### Module Export
The script exports functions for use by other components:
```javascript
module.exports = {
calculateMetrics,
cancelCalculation,
getProgress: global.getProgress
};
```