Validation Process Issues


1. Error Filtering Logic Inconsistency (RESOLVED)

Note: This issue has been resolved by implementing a type-based error system.

The filtering logic in ValidationCell.tsx previously relied on string matching, which was fragile:

// Old implementation (string-based matching)
const filteredErrors = React.useMemo(() => {
  return !isEmpty(value) 
    ? errors.filter(error => !error.message?.toLowerCase().includes('required'))
    : errors;
}, [value, errors]);

// New implementation (type-based filtering)
const filteredErrors = React.useMemo(() => {
  return !isEmpty(value) 
    ? errors.filter(error => error.type !== ErrorType.Required)
    : errors;
}, [value, errors]);

The solution implemented:

  • Added an ErrorType enum in types.ts to standardize error categorization
  • Updated all error creation to include the appropriate error type
  • Modified error filtering to use the type property instead of string matching
  • Ensured consistent error handling across the application

Guidelines for future development:

  • Always use the ErrorType enum when creating errors
  • Never rely on string matching for error filtering
  • Ensure all error objects include the type property
  • Use the appropriate error type for each validation rule (a sketch of the enum follows this list):
    • ErrorType.Required for required field validations
    • ErrorType.Regex for regex validations
    • ErrorType.Unique for uniqueness validations
    • ErrorType.Custom for custom validations
    • ErrorType.Api for API-based validations
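
For reference, a minimal sketch of what the ErrorType enum in types.ts might look like (the exact declaration may differ):

// Hypothetical sketch of the ErrorType enum described above
export enum ErrorType {
  Required = 'required', // required field validations
  Regex = 'regex',       // regex validations
  Unique = 'unique',     // uniqueness validations
  Custom = 'custom',     // custom validations
  Api = 'api',           // API-based validations
}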

2. Redundant Error Processing

The system processes errors in multiple places:

  • In useValidation.tsx, errors are filtered once during validation
  • In ValidationCell.tsx, the same errors are filtered and processed again
  • In ValidationContainer.tsx, errors are manipulated directly

This redundancy can lead to inconsistent behavior and makes the code harder to maintain.

3. Race Conditions in Async Validation

The UPC validation and other async validations could create race conditions:

  • If a user types quickly, multiple validation requests might be in flight
  • Responses that complete out of order can let an older request overwrite the result of a newer one
  • Debouncing helps but doesn't fully close the window (see the mitigation sketch below)
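
One common mitigation, sketched below, is to tag each request with a per-row sequence number and discard responses that are no longer the latest. The names validateUpc and setFieldErrors are illustrative assumptions, not the actual API; ErrorObject is the error shape shown under Key Insights below:

import { useRef } from 'react';

// Hypothetical sketch: drop async validation results that arrive out of order.
function useLatestUpcValidation(
  validateUpc: (value: string) => Promise<ErrorObject[]>,
  setFieldErrors: (rowIndex: number, field: string, errors: ErrorObject[]) => void,
) {
  // Most recent request id issued for each row
  const latestRequestIds = useRef<Record<number, number>>({});

  return async (rowIndex: number, value: string) => {
    const requestId = (latestRequestIds.current[rowIndex] ?? 0) + 1;
    latestRequestIds.current[rowIndex] = requestId;

    const errors = await validateUpc(value);

    // A newer request for this row started while this one was in flight; drop the stale result
    if (latestRequestIds.current[rowIndex] !== requestId) return;
    setFieldErrors(rowIndex, 'upc', errors);
  };
}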

4. Memory Leaks in Timeout Management

The validation timeouts are stored in refs:

const validationTimeoutsRef = useRef<Record<number, NodeJS.Timeout>>({});

While there is cleanup on unmount, if rows are added/removed dynamically, timeouts for deleted rows might not be properly cleared.
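
One way to close that gap, as a sketch, is to clear a row's pending timeout at the moment the row is removed. Here removeRowData is an illustrative stand-in for the existing row-removal logic:

// Hypothetical sketch: clear a deleted row's pending validation timeout
const removeRow = (rowIndex: number) => {
  const pending = validationTimeoutsRef.current[rowIndex];
  if (pending !== undefined) {
    clearTimeout(pending);
    delete validationTimeoutsRef.current[rowIndex];
  }
  removeRowData(rowIndex);
};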

5. Inefficient Error Storage (RESOLVED)

Problem

Previously, validation errors were stored in multiple locations:

  • In the validationErrors Map in useValidationState
  • In the row data itself as __errors

This redundancy caused several issues:

  • Inconsistent error states between the two storage locations
  • Increased memory usage by storing the same information twice
  • Complex state management to keep both sources in sync
  • Difficulty reasoning about where errors should be accessed from

Solution

We've implemented a unified error storage approach by:

  • Making the validationErrors Map in useValidationState the single source of truth for all validation errors
  • Removing the __errors property from row data
  • Updating all validation functions to interact with the central error store instead of modifying row data
  • Modifying UPC validation to use the central error store
  • Updating all components to read errors from the validationErrors Map instead of row data

Key Changes

  1. Modified dataMutations.ts to stop storing errors in row data
  2. Updated the Meta type to remove the __errors property
  3. Modified the RowData type to remove the __errors property
  4. Updated the useValidation hook to return errors separately from row data
  5. Modified the useAiValidation hook to work with the central error store
  6. Updated the useFilters hook to check for errors in the validationErrors Map
  7. Modified the ValidationTable and ValidationCell components to read errors from the validationErrors Map

Benefits

  • Single Source of Truth: All validation errors are now stored in one place
  • Reduced Memory Usage: No duplicate storage of error information
  • Simplified State Management: Only one state to update when errors change
  • Cleaner Data Structure: Row data no longer contains validation metadata
  • More Maintainable Code: Clearer separation of concerns between data and validation

Future Improvements

While this refactoring addresses the core issue of inefficient error storage, there are still opportunities for further optimization:

  1. Redundant Error Processing: The validation process still performs some redundant calculations that could be optimized.
  2. Race Conditions: Async validation can lead to race conditions when multiple validations are triggered in quick succession.
  3. Memory Leaks: The timeout management for validation could be improved to prevent potential memory leaks.
  4. Tight Coupling: Components are still tightly coupled to the validation state structure.
  5. Error Prioritization: The system doesn't prioritize errors well, showing all errors at once rather than focusing on the most critical ones first.

Validation Flow

The validation process now works as follows:

  1. Error Generation:

    • Field-level validations generate errors based on validation rules
    • Row-level hooks add custom validation errors
    • Table-level validations (like uniqueness checks) add errors across rows
  2. Error Storage:

    • All errors are stored in the validationErrors Map in useValidationState
    • The Map uses row indices as keys and objects of field errors as values (see the shape sketch after this list)
  3. Error Display:

    • The ValidationTable component checks the validationErrors Map to highlight rows with errors
    • The ValidationCell component receives errors for specific fields from the validationErrors Map
    • Errors are filtered in the UI to avoid showing "required" errors for fields with values
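
Concretely, the Map might have a shape along these lines (whether each field maps to a single error or a list may differ in the actual types):

// Sketch of the validationErrors Map shape: row index -> field key -> errors
type ValidationErrors = Map<number, Record<string, ErrorObject[]>>;

// Illustrative lookup: errors for the 'upc' field of row 3
const upcErrors = validationErrors.get(3)?.['upc'] ?? [];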

This focused refactoring approach has successfully addressed a critical issue while keeping changes manageable and targeted.

6. Excessive Re-rendering

Despite optimization attempts, the system might still cause excessive re-renders:

  • Each cell validation can trigger updates to the entire data structure
  • The batch update system helps but still has limitations
  • The memoization in ValidationCell might not catch all cases where re-rendering is unnecessary
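
A sketch of the kind of comparator-based memoization that can reduce this, assuming the cell's props include value and errors (the custom comparator is an assumption, not the current implementation):

// Hypothetical sketch: only re-render a cell when its own value or errors change.
// Note: this relies on the errors array keeping a stable reference when unchanged.
const MemoizedValidationCell = React.memo(
  ValidationCell,
  (prev, next) => prev.value === next.value && prev.errors === next.errors,
);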

7. Complex Error Merging Logic

When merging errors from different sources, the logic is complex and potentially error-prone:

// Merge field errors and row hook errors
const mergedErrors: Record<string, InfoWithSource> = {}

// Convert field errors to InfoWithSource
Object.entries(fieldErrors).forEach(([key, errors]) => {
  if (errors.length > 0) {
    mergedErrors[key] = {
      message: errors[0].message,
      level: errors[0].level,
      source: ErrorSources.Row,
      type: errors[0].type || ErrorType.Custom
    }
  }
})

This only takes the first error for each field, potentially hiding important validation issues.
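
A possible refinement, sketched below, is to keep every error per field rather than only the first. This assumes the consuming code is changed to accept a list of InfoWithSource values per field:

// Hypothetical sketch: preserve all errors per field instead of only the first
const mergedErrors: Record<string, InfoWithSource[]> = {}

Object.entries(fieldErrors).forEach(([key, errors]) => {
  if (errors.length > 0) {
    mergedErrors[key] = errors.map(error => ({
      message: error.message,
      level: error.level,
      source: ErrorSources.Row,
      type: error.type || ErrorType.Custom
    }))
  }
})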

8. Inconsistent Error Handling for Empty Values (PARTIALLY RESOLVED)

Note: This issue has been partially resolved by standardizing the isEmpty function and error type system.

The system previously had different approaches to handling empty values:

  • Some validations skipped empty values unless the field was required
  • Others processed empty values differently
  • The isEmpty function was defined multiple times with slight variations

The solution implemented:

  • Standardized the isEmpty function implementation
  • Ensured consistent error type usage for required field validations
  • Made error filtering consistent across the application

Guidelines for future development:

  • Always use the shared isEmpty function for checking empty values
  • Ensure consistent handling of empty values across all validation rules
  • Use the ErrorType.Required type for all required field validations
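
For reference, a minimal sketch of what the shared isEmpty helper might look like (the actual implementation may differ):

// Hypothetical sketch of a shared isEmpty helper
export const isEmpty = (value: unknown): boolean =>
  value === null ||
  value === undefined ||
  (typeof value === 'string' && value.trim() === '');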

9. Tight Coupling Between Components

The validation system is tightly coupled across components:

  • ValidationCell needs to understand the structure of errors
  • ValidationTable needs to extract and pass the right errors
  • ValidationContainer directly manipulates the error structure

This makes it harder to refactor or reuse components independently.
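
One way to loosen this coupling, as a sketch, is a selector hook that hides the error-store shape from components. The name useFieldErrors and the Map type below are assumptions:

// Hypothetical sketch: components depend on this hook, not on the Map's shape
function useFieldErrors(
  validationErrors: Map<number, Record<string, ErrorObject[]>>,
  rowIndex: number,
  field: string,
): ErrorObject[] {
  return React.useMemo(
    () => validationErrors.get(rowIndex)?.[field] ?? [],
    [validationErrors, rowIndex, field],
  );
}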

10. Limited Error Prioritization

There's no clear prioritization of errors:

  • When multiple errors exist for a field, which one should be shown first?
  • Are some errors more important than others?
  • The current system mostly shows the first error it finds

A more robust approach would be to have a consistent error source identification system and a clear prioritization strategy for displaying errors.
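
As an illustration of such a strategy (the ordering below is an assumption, not an existing convention), errors could be sorted by a priority map before display:

// Hypothetical sketch: lower number = higher display priority
const errorPriority: Record<ErrorType, number> = {
  [ErrorType.Required]: 0,
  [ErrorType.Api]: 1,
  [ErrorType.Unique]: 2,
  [ErrorType.Regex]: 3,
  [ErrorType.Custom]: 4,
};

const prioritizedErrors = [...errors].sort(
  (a, b) => errorPriority[a.type] - errorPriority[b.type],
);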


The following explains how the validation hooks fit together to produce the errors that are eventually filtered in the ValidationCell component:

The Validation Flow

  1. useValidationState Hook: This is the main state management hook used by the ValidationContainer component. It:

    • Manages the core data state (data)
    • Tracks validation errors in a Map (validationErrors)
    • Provides functions to update and validate rows
  2. useValidation Hook: This is a utility hook that provides the core validation logic:

    • validateField: Validates a single field against its validation rules
    • validateRow: Validates an entire row, field by field
    • validateTable: Runs table-level validations
    • validateUnique: Checks for uniqueness constraints
    • validateData: Orchestrates the complete validation process

How Errors Are Generated

Validation errors come from multiple sources:

  1. Field-Level Validations: In useValidation.tsx, the validateField function checks individual fields against rules like the following (see the sketch after this list):

    • required: Field must have a value
    • regex: Value must match a pattern
    • min/max: Numeric constraints
  2. Row-Level Validations: The validateRow function in useValidation.tsx runs:

    • Field validations for each field in the row
    • Special validations for required fields like supplier and company
    • Custom row hooks provided by the application
  3. Table-Level Validations:

    • validateUnique checks for duplicate values in fields marked as unique
    • validateTable runs custom table hooks for cross-row validations
  4. API-Based Validations: In useValidationState.tsx and ValidationContainer.tsx:

    • UPC validation via API calls
    • Item number uniqueness checks
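
As a rough sketch of how the field-level rules above could be checked inside validateField (the FieldRules shape and message strings are assumptions, not the real types):

// Hypothetical sketch of the rule checks inside validateField
type FieldRules = { required?: boolean; regex?: string };

function validateField(value: unknown, rules: FieldRules): ErrorObject[] {
  const errors: ErrorObject[] = [];
  if (rules.required && isEmpty(value)) {
    errors.push({ message: 'Field is required', level: 'error', type: ErrorType.Required });
  }
  if (rules.regex && typeof value === 'string' && !new RegExp(rules.regex).test(value)) {
    errors.push({ message: 'Value does not match pattern', level: 'error', type: ErrorType.Regex });
  }
  return errors;
}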

The Error Flow

  1. Errors are collected in the validationErrors Map in useValidationState
  2. This Map is passed to ValidationTable as a prop
  3. ValidationTable extracts the relevant errors for each cell and passes them to ValidationCell
  4. In ValidationCell, the errors are filtered based on whether the cell has a value:
    // Updated implementation using type-based filtering
    const filteredErrors = React.useMemo(() => {
      return !isEmpty(value) 
        ? errors.filter(error => error.type !== ErrorType.Required)
        : errors;
    }, [value, errors]);
    

Key Insights

  1. Error Structure: Errors now have a consistent structure with type information:

    type ErrorObject = {
      message: string;
      level: string;  // 'error', 'warning', etc.
      source?: ErrorSources; // Where the error came from
      type: ErrorType; // The type of error (Required, Regex, Unique, etc.)
    }
    
  2. Error Sources: Errors can come from:

    • Field validations (required, regex, etc.)
    • Row validations (custom business logic)
    • Table validations (uniqueness checks)
    • API validations (UPC checks)
  3. Error Types: Errors are now categorized by type:

    • ErrorType.Required: Field is required but empty
    • ErrorType.Regex: Value doesn't match the regex pattern
    • ErrorType.Unique: Value must be unique across rows
    • ErrorType.Custom: Custom validation errors
    • ErrorType.Api: Errors from API calls
  4. Error Filtering: The filtering in ValidationCell is now more robust:

    • When a field has a value, errors of type ErrorType.Required are filtered out
    • When a field is empty, all errors are shown
  5. Performance Optimizations:

    • Batch processing of validations
    • Debounced updates to avoid excessive re-renders (sketched below)
    • Memoization of computed values
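
For example, the debounced updates might follow a pattern like this (a sketch reusing validationTimeoutsRef from issue 4, not the actual implementation):

// Hypothetical sketch: debounce per-row validation so rapid edits run it once
const scheduleValidation = (rowIndex: number) => {
  clearTimeout(validationTimeoutsRef.current[rowIndex]);
  validationTimeoutsRef.current[rowIndex] = setTimeout(() => {
    validateRow(rowIndex); // runs once typing pauses
    delete validationTimeoutsRef.current[rowIndex];
  }, 300);
};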