Enhance AI validation with progress tracking and prompt debugging

2025-02-22 20:53:13 -05:00
parent 694014934c
commit 959a64aebc
4 changed files with 660 additions and 67 deletions

@@ -1,6 +1,6 @@
I will provide a JSON array with product data. Process the array by combining all products from validData and invalidData arrays into a single array, excluding any fields starting with “__”, such as “__index” or “__errors”. Process each product according to the reference guidelines below. If a field is not included in the data, do not include it in your response unless the specific field guidelines below say otherwise. Please respond with: I will provide a JSON array with product data. Process the array by combining all products from validData and invalidData arrays into a single array, excluding any fields starting with “__”, such as “__index” or “__errors”. Process each product according to the reference guidelines below. If a field is not included in the data, do not include it in your response unless the specific field guidelines below say otherwise.
Respond in the following JSON format: Respond in the following JSON structure in minified format (single line, no whitespace):
{ {
"correctedData": [], // Array of corrected products "correctedData": [], // Array of corrected products
"changes": [], // Array of strings describing each change made "changes": [], // Array of strings describing each change made
@@ -11,7 +11,203 @@ Using the provided guidelines, focus on:
1. Correcting typos and any incorrect spelling or grammar 1. Correcting typos and any incorrect spelling or grammar
2. Standardizing product names 2. Standardizing product names
3. Correcting and enhancing descriptions by adding details, keywords, and SEO-friendly language 3. Correcting and enhancing descriptions by adding details, keywords, and SEO-friendly language
4. Fixing any obvious errors in measurements, prices, or quantities 4. Fixing any obvious errors or inconsistencies between similar products in measurements, prices, or quantities
5. Adding correct categories, themes, and colors 5. Adding correct categories, themes, and colors
Use only the provided data and your own knowledge to make changes. Do not make assumptions or make up information that you're not sure about. If you're unable to make a change you're confident about, leave the field as is. Use only the provided data and your own knowledge to make changes. Do not make assumptions or make up information that you're not sure about. If you're unable to make a change you're confident about, leave the field as is. All data passed in should be validated, corrected, and returned. All values should be strings. Do not leave out any fields that were present in the original data.
----------PRODUCT FIELD GUIDELINES----------
Fields: supplier, private_notes, company, line, subline, artist
Changes: Not allowed
Required: Return if present in the original data
Instructions: If present, return these fields exactly as provided with no changes
Fields: upc, supplier_no, notions_no, item_number
Changes: Formatting only
Required: Return if present in the original data
Instructions: If present, trim outside white space and return these fields exactly as provided with no other changes
Fields: hts_code
Changes: Minimal, you can correct formatting, obvious errors or inconsistencies
Required: Return if present in the original data
Instructions: If present, trim white space and any characters that are not a number or decimal point and return as a string
Fields: image_url
Changes: Formatting only
Required: Return if present in the original data
Instructions: If present, convert all comma-separated values to valid https:// URLs and return
Fields: msrp, cost_each
Changes: Minimal, you can correct formatting, obvious errors or inconsistencies
Required: Return if present in the original data
Instructions: If present, strip any currency symbols and return as a string with exactly two decimal places, even if the last place is a 0.
Fields: qty_per_unit, case_qty
Changes: Minimal, you can correct formatting, obvious errors or inconsistencies
Required: Return if present in the original data
Instructions: If present, strip non-numeric characters and return
Fields: ship_restrictions
Changes: Only add a value if it's not already present
Required: You must always return a value for this field, even if it's not provided in the original data
Instructions: Always return a value exactly as provided, or return 0 if no value is provided. Do not leave this field out even if it's not provided.
Fields: eta
Changes: Minimal, you can correct formatting, obvious errors or inconsistencies
Required: Return if present in the original data
Instructions: If present, return a full month name, day is optional, no year ever (e.g. “January” or “March 3”). This value is not required if not provided.
Fields: name
Changes: Allowed to conform to guidelines, to fix typos or formatting
Required: You must always return a value for this field, even if it's not provided in the original data
Instructions: Always return a value that is corrected and enhanced per additional guidelines below
Fields: description
Changes: Full creative control allowed within guidelines
Required: You must always return a value for this field, even if it's not provided in the original data
Instructions: Always return a value that is corrected and enhanced per additional guidelines below
Fields: weight, length, width, height
Changes: Allowed to correct obvious errors or inconsistencies or to add missing values
Required: You must always return a value for this field, even if it's not provided in the original data
Instructions: Always return a reasonable value (weights in ounces and dimensions in inches) that is validated against similar provided products and your knowledge of general object measurements (e.g. a sheet of paper is not going to be 3 inches thick, a pack of stickers is not going to be 250 ounces, this sheet of paper is very likely going to be the same size as that other sheet of paper from the same line). If a value is unusual or unreasonable, change it to match similar products or to be more reasonable. Do not return 0 or null for any of these fields.
Fields: coo
Changes: Formatting only
Required: Return if present in the original data
Instructions: If present, return a valid two character country code, using capital letters
Fields: tax_cat
Changes: Allowed to correct obvious errors or inconsistencies or to add missing values
Required: You must always return a value for this field, even if it's not provided in the original data
Instructions: Always return a valid numerical tax code ID from the Available Tax Codes array below. Give preference to the value provided, but correct it if another value is more accurate. You must return a value for this field. 0 should be the default value in most cases.
Fields: size_cat
Changes: Allowed to correct obvious errors or inconsistencies or to add missing values
Required: Return if present in the original data or if not present and applicable
Instructions: If present or if applicable, return a valid numerical size category ID from the Available Size Categories array below. Give preference to the value provided, but correct it if another value is more accurate. A value is not required if none of the size categories apply.
Fields: themes
Changes: Allowed to correct obvious errors or inconsistencies or to add missing values
Required: Return if present in the original data or if not present and applicable
Instructions: If present, confirm that each provided theme matches what you understand to be a theme of the product. Remove any themes that do not match and add any themes that are missing. Most products will have zero or one theme. Return a comma-separated list of numerical theme IDs from the Available Themes array below. If you choose a sub-theme, you do not need to include its parent theme in the list.
Fields: colors
Changes: Allowed to correct obvious errors or inconsistencies or to add missing values
Required: Return if present in the original data or if not present and applicable
Instructions: If present or if applicable, return a comma-separated list of numerical color IDs from the Available Colors array below, using the product name as the primary guide. A value is not required if none of the colors apply. Most products will have zero colors.
Fields: categories
Changes: Allowed to correct obvious errors or inconsistencies or to add missing values
Required: You must always return at least one value for this field, even if it's not provided in the original data
Instructions: Always return a comma-separated list of one or more valid numerical category IDs from the Available Categories array below. Give preference to the values provided, particularly if the other information isn't enough to determine a category, but correct them or add new categories if another value is more accurate. Do not return categories in the Deals or Black Friday categories, and strip these from the list if present. If you choose a subcategory at any level, you do not need to include its parent categories in the list. You must return at least one category.
----------PRODUCT NAMING GUIDELINES----------
If there's only one of this type of product in a line: [Line Name] [Product Name] - [Company]
Example: "Cosmos Infinity Chipboard - Stamperia"
Example: "Serene Petals 6x6 Paper Pad - Prima"
Multiple similar products in a line: [Differentiator] [Product Type] - [Line Name] - [Company]
Example: "Ice & Shells Stencil - Arctic Antarctic - Stamperia"
Example: "Astronomy Paper - Cosmos Infinity - Stamperia"
Standalone products: [Product Name] - [Company]
Example: "Hedwig Puffy Stickers - Paper House Productions"
Example: "Heart Tree Dies - Lawn Fawn"
Color-based products: [Color] [Product Name] - [Company]
Example: "Green Valley Enamel Dots - Altenew"
Example: "Magenta Aqua Pigment - Brutus Monroe"
Complex products: [Differentiator] [Line] [Product Type] - [Company]
Example: "Size 6 Round Black Velvet Watercolor Brush - Silver Brush Limited" (Size 6 Round is the differentiator, Black Velvet is the line, Watercolor Brush is the product type)
These should not be included in the name, unless there are multiple products that are otherwise identical:
- Product size
- Product weight
- Number of pages
- How many are in the package
Naming Conventions:
- Paper sizes: Use "12x12", "8x8", "6x6" (no spaces or units of measure)
- Company names must match backend exactly
- Always capitalize every word in the name, including short articles like "The" and "An"
- Use "Idea-ology" (not "idea-ology" or "Ideaology")
- All stamps are "Stamp Set" (not "Clear Stamps" or "Rubber Stamps")
- All dies are "Dies" or "Die" (not "Die Set")
- Brands with their own naming conventions should be respected, such as "Doodle Cuts" for dies from Doodlebug
Special Brand Rules - Ranger:
Format: [Product Name] - [Designer Line] - Ranger
Possible Designers: Dylusions, Dina Wakley MEdia, Simon Hurley create., Wendy Vecchi
Example: "Stacked Stencil - Dina Wakley MEdia - Ranger"
Special Brand Rules - Tim Holtz products from Ranger:
Format: [Color] [Product Name/Type] - Tim Holtz Distress - Ranger
Example: "Mermaid Lagoon Tim Holtz Distress Oxide Ink Pad - Ranger"
Special Brand Rules - Tim Holtz products from Sizzix or Stampers Anonymous:
Format: [Product Name] [Product Type] by Tim Holtz - [Company]
Example: "Leaf Fragments Thinlits Dies by Tim Holtz - Sizzix"
Special Brand Rules - Tim Holtz products from Advantus/Idea-ology:
Format: [Product Name] - Tim Holtz Idea-ology
Example: "Tiny Vials - Tim Holtz Idea-ology"
Special Brand Rules - Dies from Sizzix:
Include die type plus "Dies" or "Die"
Examples:
"Art Nouveau 3-D Textured Impressions Embossing Folder - Sizzix"
"Pocket Pals Thinlits Dies - Sizzix"
"Butterfly Wishes Framelits Dies & Stamps - Sizzix"
Important Notes
- Ensure that product names are consistent across all products of the same type
- Use the minimum amount of information needed to uniquely identify the product
- Put detailed specifications in the product description, not its name
Incorrect example: MVP Rugby - Collection Pack - Photoplay
Notes: there should be no dash between the line and the product
Incorrect Example: A2 Easel Cards - Black - Photoplay
Notes: the differentiating factor should come first: “Black A2 Easel Cards - Photoplay”. Size is ok to include here because this is the name printed on the package.
Incorrect Example: 6” - Scriber Needle Modeling Tool
Notes: this product only comes in one size, so 6” isn't needed. The company name should also be included.
Incorrect Example: Slick - White - Tulip Dimensional Fabric Paint 4oz
Notes: color should be first, then type, then product, then company, so “White Slick Dimensional Fabric Paint - Tulip”. It appears there's only one size available, so there is no need to differentiate in the name.
Incorrect Example: Silhouette Adhesive Cork Sheets 5”X7” 8/Pkg
Notes: should be “Adhesive Cork Sheets - Silhouette”
Incorrect Example: Galaxy - Opaque - American Crafts Color Pour Resin Dyes
Notes: “Galaxy Opaque Dye Set - Color Pour Resin - American Crafts”
Incorrect Example: Slate - Lion Brand Truboo Yarn
Notes: [Differentiator] [Line] [Product Type] - [Company] : “Slate Truboo Yarn - Lion Brand”
Incorrect Example: Rose Quartz Dylusions Shimmer Paint
Notes: “Rose Quartz Shimmer Paint - Dylusions - Ranger”
----------PRODUCT DESCRIPTION GUIDELINES----------
Product descriptions are an extremely important part of the listing and are the most important part of your response. Care should be taken to ensure they are correct, helpful, and SEO-friendly.
If a description is provided in the data, use it as a starting point. Correct any spelling errors, typos, poor grammar, or awkward phrasing. If necessary and you have the information, add more details, describe how the customer could use it, etc. Use complete sentences and keep SEO in mind.
If no description is provided, make one up using the product name, the information you have, and the other provided guidelines. At minimum, a description should be one complete sentence that starts with a capital letter and ends with a period. Unless the product is extremely complex, 2-4 sentences is usually sufficient if you have enough information.
Important Notes:
- Every description should state exactly what's included in the product (e.g. "Includes one 12x12 sheet of patterned cardstock." or "Includes one 6x12 sheet with 27 unique stickers." or "Includes 55 pieces." or "Package includes machine, power cord, 12 sheets of cardstock, 3 dies, and project instructions.")
- Do not use the word "our" in the description (this usually shows up when we copy a description from the manufacturer). Instead use "these" or "[Company name] [product]" or similar. (e.g. don't use "Our journals are hand-made in the USA", instead use "These journals are hand made..." or "Archer & Olive journals are handmade...")
- Don't include fluff like “this is perfect for all your paper crafts” most of the time. If the product helps to solve a unique problem or has a unique feature, by all means describe it, but if it's just a normal sheet of paper or pack of stickers, you don't have to pretend like it's the best thing ever.
- State as many facts as you can about the product, considering the viewpoint of the customer and what they would want to know when looking at it. They probably want to know dimensions, what products it's compatible with, how thick the paper is, how many sheets are included, whether the sheets are double-sided or not, which items are in the kit, etc. Say as much as you possibly can with the information that you have.
- !!DO NOT make up information if you aren't sure about it. A minimal correct description is better than a long incorrect one!!
Avoid/remove:
- The word "Imported"
- Any warnings about Prop 65, choking hazards, etc
- The manufacturer's name if it's included as the very first thing in the description
- Any statement similar to "comes in a variety of colors, each sold separately"
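
Several of the field guidelines above (msrp, hts_code, coo, ship_restrictions) are purely mechanical, so the model's output could also be checked or normalized in code. The following is a minimal sketch under that assumption. The helper names are hypothetical and are not part of this commit.

```javascript
// Hypothetical helpers (not part of this commit) mirroring a few field rules
// from the prompt so the model's output can be checked after the fact.

// msrp / cost_each: strip currency symbols and other non-numeric characters,
// then return a string with exactly two decimal places (null if unparseable).
function formatPrice(value) {
  const cleaned = String(value).replace(/[^0-9.]/g, '');
  const num = Number.parseFloat(cleaned);
  return Number.isFinite(num) ? num.toFixed(2) : null;
}

// hts_code: keep only digits and decimal points, returned as a string.
function formatHtsCode(value) {
  return String(value).replace(/[^0-9.]/g, '');
}

// coo: two-character country code in capital letters, or null if the
// trimmed value is not exactly two letters.
function formatCoo(value) {
  const code = String(value).trim().toUpperCase();
  return /^[A-Z]{2}$/.test(code) ? code : null;
}

// ship_restrictions: return the value as provided, or "0" when it is missing.
function formatShipRestrictions(value) {
  return value === undefined || value === null || value === ''
    ? '0'
    : String(value);
}

console.log(formatPrice('$12.5'));              // "12.50"
console.log(formatHtsCode(' 4820.10.2000 '));   // "4820.10.2000"
console.log(formatCoo(' us '));                 // "US"
console.log(formatShipRestrictions(undefined)); // "0"
```

A check like this would only cover the formatting-only fields. Names, descriptions, and category choices still depend on the model's judgment.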

@@ -38,51 +38,119 @@ router.get('/debug', async (req, res) => {
console.log('Debug endpoint called'); console.log('Debug endpoint called');
const pool = req.app.locals.pool; const pool = req.app.locals.pool;
// Load taxonomy data first // Get a real supplier, company, and artist ID from the database
console.log('Loading taxonomy data...'); const [suppliers] = await pool.query('SELECT supplierid FROM suppliers LIMIT 1');
const taxonomy = await getTaxonomyData(pool); const [companies] = await pool.query('SELECT cat_id FROM product_categories WHERE type = 1 LIMIT 1');
console.log('Taxonomy data loaded:', { const [artists] = await pool.query('SELECT cat_id FROM product_categories WHERE type = 40 LIMIT 1');
categoriesCount: taxonomy.categories.length,
themesCount: taxonomy.themes.length, // Create a sample product with real IDs
colorsCount: taxonomy.colors.length, const productsToUse = [{
taxCodesCount: taxonomy.taxCodes.length, supplierid: suppliers[0]?.supplierid || 1234,
sizeCategoriesCount: taxonomy.sizeCategories.length company: companies[0]?.cat_id || 567,
}); artist: artists[0]?.cat_id || 890
}];
// Then load the prompt
console.log('Loading prompt...'); return await generateDebugResponse(pool, productsToUse, res);
const currentPrompt = await loadPrompt(pool);
const sampleData = [{ name: "Sample Product" }];
const fullPrompt = currentPrompt + '\n' + JSON.stringify(sampleData);
const response = {
cacheStatus: {
isCacheValid: isCacheValid(),
lastUpdated: cache.lastUpdated ? new Date(cache.lastUpdated).toISOString() : null,
timeUntilExpiry: cache.lastUpdated ?
Math.max(0, CACHE_TTL - (Date.now() - cache.lastUpdated)) / 1000 + ' seconds' :
'expired',
},
taxonomyStats: taxonomy ? {
categories: countItems(taxonomy.categories),
themes: taxonomy.themes.length,
colors: taxonomy.colors.length,
taxCodes: taxonomy.taxCodes.length,
sizeCategories: taxonomy.sizeCategories.length
} : null,
basePrompt: currentPrompt,
sampleFullPrompt: fullPrompt,
promptLength: fullPrompt.length,
};
console.log('Sending response with stats:', response.taxonomyStats);
res.json(response);
} catch (error) { } catch (error) {
console.error('Debug endpoint error:', error); console.error('Debug endpoint error:', error);
res.status(500).json({ error: error.message }); res.status(500).json({ error: error.message });
} }
}); });
// New POST endpoint for debug with products
router.post('/debug', async (req, res) => {
try {
console.log('Debug POST endpoint called');
const pool = req.app.locals.pool;
const { products } = req.body;
console.log('Received products:', {
isArray: Array.isArray(products),
length: products?.length,
firstProduct: products?.[0],
lastProduct: products?.[products?.length - 1]
});
if (!Array.isArray(products)) {
console.error('Invalid input: products is not an array');
return res.status(400).json({ error: 'Products must be an array' });
}
if (products.length === 0) {
console.error('Invalid input: products array is empty');
return res.status(400).json({ error: 'Products array cannot be empty' });
}
// Clean the products array to remove any internal fields
const cleanedProducts = products.map(product => {
const { __errors, __index, ...cleanProduct } = product;
return cleanProduct;
});
return await generateDebugResponse(pool, cleanedProducts, res);
} catch (error) {
console.error('Debug POST endpoint error:', error);
res.status(500).json({ error: error.message });
}
});
// Helper function to generate debug response
async function generateDebugResponse(pool, productsToUse, res) {
// Load taxonomy data first
console.log('Loading taxonomy data...');
const taxonomy = await getTaxonomyData(pool);
console.log('Taxonomy data loaded:', {
categoriesCount: taxonomy.categories.length,
themesCount: taxonomy.themes.length,
colorsCount: taxonomy.colors.length,
taxCodesCount: taxonomy.taxCodes.length,
sizeCategoriesCount: taxonomy.sizeCategories.length,
suppliersCount: taxonomy.suppliers.length,
companiesCount: taxonomy.companies.length,
artistsCount: taxonomy.artists.length
});
// Load the prompt using the same function used by validation
console.log('Loading prompt...');
const prompt = await loadPrompt(pool, productsToUse);
const fullPrompt = prompt + '\n' + JSON.stringify(productsToUse);
const response = {
cacheStatus: {
isCacheValid: isCacheValid(),
lastUpdated: cache.lastUpdated ? new Date(cache.lastUpdated).toISOString() : null,
timeUntilExpiry: cache.lastUpdated ?
Math.max(0, CACHE_TTL - (Date.now() - cache.lastUpdated)) / 1000 + ' seconds' :
'expired',
},
taxonomyStats: taxonomy ? {
categories: countItems(taxonomy.categories),
themes: taxonomy.themes.length,
colors: taxonomy.colors.length,
taxCodes: taxonomy.taxCodes.length,
sizeCategories: taxonomy.sizeCategories.length,
suppliers: taxonomy.suppliers.length,
companies: taxonomy.companies.length,
artists: taxonomy.artists.length,
// Add filtered counts when products are provided
filtered: productsToUse ? {
suppliers: taxonomy.suppliers.filter(([id]) =>
productsToUse.some(p => Number(p.supplierid) === Number(id))).length,
companies: taxonomy.companies.filter(([id]) =>
productsToUse.some(p => Number(p.company) === Number(id))).length,
artists: taxonomy.artists.filter(([id]) =>
productsToUse.some(p => Number(p.artist) === Number(id))).length
} : null
} : null,
basePrompt: prompt,
sampleFullPrompt: fullPrompt,
promptLength: fullPrompt.length,
};
console.log('Sending response with stats:', response.taxonomyStats);
return res.json(response);
}
// Helper function to count total items in hierarchical structure // Helper function to count total items in hierarchical structure
function countItems(items) { function countItems(items) {
return items.reduce((count, item) => { return items.reduce((count, item) => {
@@ -167,6 +235,46 @@ async function getTaxonomyData(pool) {
// Fetch size categories // Fetch size categories
const [sizeCategories] = await pool.query('SELECT cat_id, name FROM product_categories WHERE type=50 ORDER BY name'); const [sizeCategories] = await pool.query('SELECT cat_id, name FROM product_categories WHERE type=50 ORDER BY name');
// Fetch suppliers
const [suppliers] = await pool.query(`
SELECT supplierid, companyname as name
FROM suppliers
WHERE companyname <> ''
ORDER BY companyname
`);
// Fetch companies (type 1)
const [companies] = await pool.query(`
SELECT cat_id, name
FROM product_categories
WHERE type = 1
ORDER BY name
`);
// Fetch artists (type 40)
const [artists] = await pool.query(`
SELECT cat_id, name
FROM product_categories
WHERE type = 40
ORDER BY name
`);
// Fetch lines (type 2)
const [lines] = await pool.query(`
SELECT cat_id, name
FROM product_categories
WHERE type = 2
ORDER BY name
`);
// Fetch sub-lines (type 3)
const [subLines] = await pool.query(`
SELECT cat_id, name
FROM product_categories
WHERE type = 3
ORDER BY name
`);
// Format categories into a hierarchical structure // Format categories into a hierarchical structure
const formatHierarchy = (items, level = 1, parentId = null) => { const formatHierarchy = (items, level = 1, parentId = null) => {
return items return items
@@ -198,7 +306,12 @@ async function getTaxonomyData(pool) {
themes: formatThemes(themes), themes: formatThemes(themes),
colors: colors.map(c => [c.color, c.name]), colors: colors.map(c => [c.color, c.name]),
taxCodes: (taxCodes || []).map(tc => [tc.tax_code_id, tc.name]), taxCodes: (taxCodes || []).map(tc => [tc.tax_code_id, tc.name]),
sizeCategories: (sizeCategories || []).map(sc => [sc.cat_id, sc.name]) sizeCategories: (sizeCategories || []).map(sc => [sc.cat_id, sc.name]),
suppliers: suppliers.map(s => [s.supplierid, s.name]),
companies: companies.map(c => [c.cat_id, c.name]),
artists: artists.map(a => [a.cat_id, a.name]),
lines: lines.map(l => [l.cat_id, l.name]),
subLines: subLines.map(sl => [sl.cat_id, sl.name])
}; };
cache.lastUpdated = Date.now(); cache.lastUpdated = Date.now();
@@ -206,18 +319,113 @@ async function getTaxonomyData(pool) {
} }
// Load the prompt from file and inject taxonomy data // Load the prompt from file and inject taxonomy data
async function loadPrompt(pool) { async function loadPrompt(pool, productsToValidate = null) {
if (cache.validationPrompt && isCacheValid()) {
return cache.validationPrompt;
}
const promptPath = path.join(__dirname, '..', 'prompts', 'product-validation.txt'); const promptPath = path.join(__dirname, '..', 'prompts', 'product-validation.txt');
const basePrompt = await fs.readFile(promptPath, 'utf8'); const basePrompt = await fs.readFile(promptPath, 'utf8');
// Get taxonomy data // Get taxonomy data
const taxonomy = await getTaxonomyData(pool); const taxonomy = await getTaxonomyData(pool);
// Format taxonomy data for the prompt // Add system instructions to the prompt
const systemInstructions = `You are a specialized e-commerce product data processor for a crafting supplies website tasked with providing complete, correct, appealing, and SEO-friendly product listings. You should write professionally, but in a friendly and engaging tone.
`;
// If we have products to validate, create a filtered prompt
if (productsToValidate) {
console.log('Creating filtered prompt for products:', productsToValidate);
// Extract unique values from products for non-core attributes
const uniqueValues = {
supplierIds: new Set(),
companyIds: new Set(),
artistIds: new Set(),
lineIds: new Set(),
subLineIds: new Set()
};
// Collect any values that exist in the products
productsToValidate.forEach(product => {
Object.entries(product).forEach(([key, value]) => {
if (value === undefined || value === null) return;
// Map field names to their respective sets
const fieldMap = {
supplierid: 'supplierIds',
supplier: 'supplierIds',
company: 'companyIds',
artist: 'artistIds',
line: 'lineIds',
subline: 'subLineIds'
};
if (fieldMap[key]) {
uniqueValues[fieldMap[key]].add(Number(value));
}
});
});
console.log('Unique values collected:', {
suppliers: Array.from(uniqueValues.supplierIds),
companies: Array.from(uniqueValues.companyIds),
artists: Array.from(uniqueValues.artistIds),
lines: Array.from(uniqueValues.lineIds),
subLines: Array.from(uniqueValues.subLineIds)
});
// Create mixed taxonomy with filtered non-core data and full core data
const mixedTaxonomy = {
// Keep full data for core attributes
categories: taxonomy.categories,
themes: taxonomy.themes,
colors: taxonomy.colors,
taxCodes: taxonomy.taxCodes,
sizeCategories: taxonomy.sizeCategories,
// For non-core data, only include items that are actually used
suppliers: taxonomy.suppliers.filter(([id]) => uniqueValues.supplierIds.has(Number(id))),
companies: taxonomy.companies.filter(([id]) => uniqueValues.companyIds.has(Number(id))),
artists: taxonomy.artists.filter(([id]) => uniqueValues.artistIds.has(Number(id))),
lines: taxonomy.lines.filter(([id]) => uniqueValues.lineIds.has(Number(id))),
subLines: taxonomy.subLines.filter(([id]) => uniqueValues.subLineIds.has(Number(id)))
};
console.log('Filtered taxonomy counts:', {
suppliers: mixedTaxonomy.suppliers.length,
companies: mixedTaxonomy.companies.length,
artists: mixedTaxonomy.artists.length,
lines: mixedTaxonomy.lines.length,
subLines: mixedTaxonomy.subLines.length
});
// Format taxonomy data for the prompt, only including sections with values
const taxonomySection = `
All Available Categories:
${JSON.stringify(mixedTaxonomy.categories)}
All Available Themes:
${JSON.stringify(mixedTaxonomy.themes)}
All Available Colors:
${JSON.stringify(mixedTaxonomy.colors)}
All Available Tax Codes:
${JSON.stringify(mixedTaxonomy.taxCodes)}
All Available Size Categories:
${JSON.stringify(mixedTaxonomy.sizeCategories)}${mixedTaxonomy.suppliers.length ? `\n\nSuppliers Used In This Data:\n${JSON.stringify(mixedTaxonomy.suppliers)}` : ''}${mixedTaxonomy.companies.length ? `\n\nCompanies Used In This Data:\n${JSON.stringify(mixedTaxonomy.companies)}` : ''}${mixedTaxonomy.artists.length ? `\n\nArtists Used In This Data:\n${JSON.stringify(mixedTaxonomy.artists)}` : ''}${mixedTaxonomy.lines.length ? `\n\nLines Used In This Data:\n${JSON.stringify(mixedTaxonomy.lines)}` : ''}${mixedTaxonomy.subLines.length ? `\n\nSub-Lines Used In This Data:\n${JSON.stringify(mixedTaxonomy.subLines)}` : ''}
----------Here is the product data to validate----------`;
// Return the filtered prompt without caching
return systemInstructions + basePrompt + '\n' + taxonomySection;
}
// For debug/display purposes, if no products provided and cache is valid, return cached prompt
if (!productsToValidate && cache.validationPrompt && isCacheValid()) {
return cache.validationPrompt;
}
// Generate and cache the full unfiltered prompt
const taxonomySection = ` const taxonomySection = `
Available Categories: Available Categories:
${JSON.stringify(taxonomy.categories)} ${JSON.stringify(taxonomy.categories)}
@@ -234,10 +442,22 @@ ${JSON.stringify(taxonomy.taxCodes)}
Available Size Categories: Available Size Categories:
${JSON.stringify(taxonomy.sizeCategories)} ${JSON.stringify(taxonomy.sizeCategories)}
Available Suppliers:
${JSON.stringify(taxonomy.suppliers)}
Available Companies:
${JSON.stringify(taxonomy.companies)}
Available Artists:
${JSON.stringify(taxonomy.artists)}
Available Shipping Restrictions:
${JSON.stringify(taxonomy.shippingRestrictions)}
Here is the product data to validate:`; Here is the product data to validate:`;
// Combine the prompt sections // Cache the full prompt only when no specific products are provided
cache.validationPrompt = basePrompt + '\n' + taxonomySection; cache.validationPrompt = systemInstructions + basePrompt + '\n' + taxonomySection;
cache.lastUpdated = Date.now(); cache.lastUpdated = Date.now();
return cache.validationPrompt; return cache.validationPrompt;
@@ -256,19 +476,15 @@ router.post('/validate', async (req, res) => {
return res.status(400).json({ error: 'Products must be an array' }); return res.status(400).json({ error: 'Products must be an array' });
} }
// Load the prompt and append the products data // Load the prompt with the products data to filter taxonomy
const basePrompt = await loadPrompt(req.app.locals.pool); const prompt = await loadPrompt(req.app.locals.pool, products);
const fullPrompt = basePrompt + '\n' + JSON.stringify(products); const fullPrompt = prompt + '\n' + JSON.stringify(products);
console.log('📝 Generated prompt:', fullPrompt); console.log('📝 Generated prompt:', fullPrompt);
console.log('🤖 Sending request to OpenAI...'); console.log('🤖 Sending request to OpenAI...');
const completion = await openai.chat.completions.create({ const completion = await openai.chat.completions.create({
model: "gpt-4-turbo-preview", model: "gpt-4o",
messages: [ messages: [
{
role: "system",
content: "You are a specialized e-commerce product data processor for a crafting supplies website tasked with providing complete, correct, appealing, and SEO-friendly product listings. You should write professionally, but in a friendly and engaging tone."
},
{ {
role: "user", role: "user",
content: fullPrompt content: fullPrompt
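
The filtered prompt built in loadPrompt keeps only the suppliers, companies, artists, lines, and sub-lines whose IDs appear in the submitted products. The standalone sketch below illustrates that idea with made-up sample data. The function name `filterTaxonomyByUsage` is hypothetical. Unlike the diff, which adds `Number(value)` to the set directly, this version skips values that do not convert to a finite number, on the assumption that a field such as `supplier` may sometimes hold a name rather than an ID.

```javascript
// Simplified, standalone illustration of the filtering done in loadPrompt.
// The sample taxonomy and products below are made up for demonstration.
function filterTaxonomyByUsage(entries, products, fields) {
  const usedIds = new Set();
  for (const product of products) {
    for (const field of fields) {
      const value = product[field];
      if (value === undefined || value === null || value === '') continue;
      const id = Number(value);
      // Skip non-numeric values (e.g. a supplier name) instead of adding NaN.
      if (Number.isFinite(id)) usedIds.add(id);
    }
  }
  // Taxonomy entries are [id, name] pairs, as produced by getTaxonomyData.
  return entries.filter(([id]) => usedIds.has(Number(id)));
}

const suppliers = [[1, 'Acme Paper'], [2, 'Blue Ink Co'], [3, 'Craft Depot']];
const products = [
  { supplierid: '2', name: 'Sample A' },
  { supplier: 3, name: 'Sample B' },
  { supplier: 'Unknown Vendor', name: 'Sample C' },
];

const filtered = filterTaxonomyByUsage(suppliers, products, ['supplierid', 'supplier']);
console.log(JSON.stringify(filtered)); // [[2,"Blue Ink Co"],[3,"Craft Depot"]]
```

Filtering this way keeps the prompt short for the non-core lists while the core lists (categories, themes, colors, tax codes, size categories) are still sent in full.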

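The POST /debug handler in the file above removes `__errors` and `__index` by name, while the prompt asks for every field starting with "__" to be excluded. A prefix-based variant, assuming other internal fields may appear in the data, could look like the sketch below. `stripInternalFields` is a hypothetical helper, not something this commit adds.

```javascript
// Hypothetical helper: drop every key that starts with "__" rather than
// listing internal fields one by one.
function stripInternalFields(product) {
  return Object.fromEntries(
    Object.entries(product).filter(([key]) => !key.startsWith('__'))
  );
}

const cleaned = stripInternalFields({
  name: 'Sample Product',
  __index: 'abc',
  __errors: {},
  __meta: 1, // made-up internal field, used only for this example
});
console.log(JSON.stringify(cleaned)); // {"name":"Sample Product"}
```
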
@@ -65,6 +65,7 @@ import {
DialogTitle, DialogTitle,
} from "@/components/ui/dialog" } from "@/components/ui/dialog"
import { ScrollArea } from "@/components/ui/scroll-area" import { ScrollArea } from "@/components/ui/scroll-area"
import { Code } from "@/components/ui/code"
type Props<T extends string> = { type Props<T extends string> = {
initialData: (Data<T> & Meta)[] initialData: (Data<T> & Meta)[]
@@ -724,6 +725,26 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
isOpen: false, isOpen: false,
}); });
const [aiValidationProgress, setAiValidationProgress] = useState<{
isOpen: boolean;
status: string;
step: number;
}>({
isOpen: false,
status: "",
step: 0,
});
const [currentPrompt, setCurrentPrompt] = useState<{
isOpen: boolean;
prompt: string | null;
isLoading: boolean;
}>({
isOpen: false,
prompt: null,
isLoading: false,
});
// Memoize filtered data to prevent recalculation on every render // Memoize filtered data to prevent recalculation on every render
const filteredData = useMemo(() => { const filteredData = useMemo(() => {
if (!filterByErrors) return data if (!filterByErrors) return data
@@ -967,7 +988,18 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
const handleAiValidation = async () => { const handleAiValidation = async () => {
try { try {
setIsAiValidating(true); setIsAiValidating(true);
console.log('Sending data for AI validation:', data); setAiValidationProgress({
isOpen: true,
status: "Preparing data for validation...",
step: 1
});
console.log('Sending data for validation:', data);
setAiValidationProgress(prev => ({
...prev,
status: "Sending data to AI service and awaiting response...",
step: 2
}));
const response = await fetch(`${config.apiUrl}/ai-validation/validate`, { const response = await fetch(`${config.apiUrl}/ai-validation/validate`, {
method: 'POST', method: 'POST',
@@ -981,6 +1013,12 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
throw new Error('AI validation failed'); throw new Error('AI validation failed');
} }
setAiValidationProgress(prev => ({
...prev,
status: "Processing AI response...",
step: 3
}));
const result = await response.json(); const result = await response.json();
console.log('AI validation response:', result); console.log('AI validation response:', result);
@@ -988,6 +1026,12 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
throw new Error(result.error || 'AI validation failed'); throw new Error(result.error || 'AI validation failed');
} }
setAiValidationProgress(prev => ({
...prev,
status: "Applying corrections...",
step: 4
}));
// Update the data with AI suggestions // Update the data with AI suggestions
if (result.correctedData && Array.isArray(result.correctedData)) { if (result.correctedData && Array.isArray(result.correctedData)) {
// Log the differences // Log the differences
@@ -1027,6 +1071,16 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
isOpen: true, isOpen: true,
}); });
setAiValidationProgress(prev => ({
...prev,
status: "Validation complete!",
step: 5
}));
setTimeout(() => {
setAiValidationProgress(prev => ({ ...prev, isOpen: false }));
}, 1000);
} catch (error) { } catch (error) {
console.error('AI Validation Error:', error); console.error('AI Validation Error:', error);
toast({ toast({
@@ -1034,11 +1088,82 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
description: error instanceof Error ? error.message : "An error occurred during AI validation", description: error instanceof Error ? error.message : "An error occurred during AI validation",
variant: "destructive", variant: "destructive",
}); });
setAiValidationProgress(prev => ({
...prev,
status: "Validation failed",
step: -1
}));
} finally { } finally {
setIsAiValidating(false); setIsAiValidating(false);
} }
}; };
// Add function to fetch current prompt
const showCurrentPrompt = async () => {
try {
setCurrentPrompt(prev => ({ ...prev, isLoading: true, isOpen: true }));
// Debug log the data being sent
console.log('Sending products data:', {
dataLength: data.length,
firstProduct: data[0],
lastProduct: data[data.length - 1]
});
// Clean the data to ensure we only send what's needed
const cleanedData = data.map(item => {
const { __errors, __index, ...rest } = item;
return rest;
});
console.log('Cleaned data sample:', {
length: cleanedData.length,
firstProduct: cleanedData[0],
lastProduct: cleanedData[cleanedData.length - 1]
});
// Use POST to send products in request body
const response = await fetch(`${config.apiUrl}/ai-validation/debug`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ products: cleanedData })
});
if (!response.ok) {
const errorText = await response.text();
console.error('Debug endpoint error:', {
status: response.status,
statusText: response.statusText,
body: errorText
});
throw new Error(`Failed to fetch prompt: ${response.status} ${response.statusText}`);
}
const debugData = await response.json();
// Log the response stats
console.log('Debug response stats:', {
promptLength: debugData.promptLength,
taxonomyStats: debugData.taxonomyStats
});
setCurrentPrompt(prev => ({
...prev,
prompt: debugData.sampleFullPrompt,
isLoading: false
}));
} catch (error) {
console.error('Error fetching prompt:', error);
toast({
title: "Error",
description: error instanceof Error ? error.message : "Failed to fetch current prompt",
variant: "destructive",
});
setCurrentPrompt(prev => ({ ...prev, isLoading: false }));
}
};
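
The cleaning step in `showCurrentPrompt` removes `__errors` and `__index` by name, while the prompt text asks for every field starting with "__" to be excluded. A minimal sketch of a prefix-based variant follows; `stripMetaFields` is a hypothetical name, and whether other "__" fields ever reach this point is an assumption:

```typescript
// Hypothetical helper, not part of the commit: drops every key that starts with "__".
function stripMetaFields<T extends Record<string, unknown>>(
  item: T
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(item)) {
    // slice(0, 2) is used instead of startsWith to avoid depending on the ES2015 lib.
    if (key.slice(0, 2) !== "__") {
      result[key] = item[key];
    }
  }
  return result;
}

// Example usage on a row carrying importer metadata.
const cleanedRow = stripMetaFields({ name: "Yarn", __index: 3, __errors: {} });
console.log(Object.keys(cleanedRow));
```

With this variant the client-side cleaning and the prompt instruction follow the same rule, so a newly added metadata field would not need a matching code change here.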
return ( return (
<div className="flex h-[calc(100vh-9.5rem)] flex-col"> <div className="flex h-[calc(100vh-9.5rem)] flex-col">
<CopyDownDialog <CopyDownDialog
@@ -1047,6 +1172,25 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
onConfirm={executeCopyDown} onConfirm={executeCopyDown}
fieldLabel={copyDownField?.label || ""} fieldLabel={copyDownField?.label || ""}
/> />
<Dialog open={currentPrompt.isOpen} onOpenChange={(open) => setCurrentPrompt(prev => ({ ...prev, isOpen: open }))}>
<DialogContent className="max-w-4xl h-[80vh]">
<DialogHeader>
<DialogTitle>Current AI Prompt</DialogTitle>
<DialogDescription>
This is the exact prompt that would be sent to the AI for validation
</DialogDescription>
</DialogHeader>
<ScrollArea className="flex-1">
{currentPrompt.isLoading ? (
<div className="flex items-center justify-center h-full">
<Loader2 className="h-8 w-8 animate-spin" />
</div>
) : (
<Code className="whitespace-pre-wrap p-4">{currentPrompt.prompt}</Code>
)}
</ScrollArea>
</DialogContent>
</Dialog>
<AlertDialog open={showSubmitAlert} onOpenChange={setShowSubmitAlert}> <AlertDialog open={showSubmitAlert} onOpenChange={setShowSubmitAlert}>
<AlertDialogPortal> <AlertDialogPortal>
<AlertDialogOverlay className="z-[1400]" /> <AlertDialogOverlay className="z-[1400]" />
@@ -1074,6 +1218,34 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
</AlertDialogContent> </AlertDialogContent>
</AlertDialogPortal> </AlertDialogPortal>
</AlertDialog> </AlertDialog>
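{/* Progress dialog: dismissible only after a failure (step === -1), since the catch block leaves isOpen true */}
<Dialog open={aiValidationProgress.isOpen} onOpenChange={(open) => { if (!open && aiValidationProgress.step === -1) setAiValidationProgress(prev => ({ ...prev, isOpen: false })) }}>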
<DialogContent className="sm:max-w-md">
<DialogHeader>
<DialogTitle>AI Validation Progress</DialogTitle>
</DialogHeader>
<div className="space-y-4 py-4">
<div className="flex items-center gap-4">
<div className="flex-1">
<div className="h-2 w-full bg-secondary rounded-full overflow-hidden">
<div
className="h-full bg-primary transition-all duration-500"
style={{
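width: `${aiValidationProgress.step === -1 ? 100 : (aiValidationProgress.step / 5) * 100}%`, // fill the bar on failure so the destructive color is visible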
backgroundColor: aiValidationProgress.step === -1 ? 'var(--destructive)' : undefined
}}
/>
</div>
</div>
<div className="text-sm text-muted-foreground w-12 text-right">
{aiValidationProgress.step === -1 ? '❌' : `${Math.round((aiValidationProgress.step / 5) * 100)}%`}
</div>
</div>
<p className="text-center text-sm text-muted-foreground">
{aiValidationProgress.status}
</p>
</div>
</DialogContent>
</Dialog>
<Dialog <Dialog
open={aiValidationDetails.isOpen} open={aiValidationDetails.isOpen}
onOpenChange={(open) => setAiValidationDetails(prev => ({ ...prev, isOpen: open }))} onOpenChange={(open) => setAiValidationDetails(prev => ({ ...prev, isOpen: open }))}
@@ -1141,6 +1313,14 @@ export const ValidationStep = <T extends string>({ initialData, file, onBack }:
)} )}
AI Validate AI Validate
</Button> </Button>
<Button
variant="outline"
size="sm"
onClick={showCurrentPrompt}
disabled={data.length === 0}
>
Show Prompt
</Button>
<div className="flex items-center gap-2"> <div className="flex items-center gap-2">
<Switch <Switch
checked={filterByErrors} checked={filterByErrors}
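
The progress dialog derives both the bar width and the percentage label from `step`, with 5 steps in total and -1 marking a failure. A step of -1 passed through `(step / 5) * 100` gives -20, which is not a usable width. The sketch below shows one way to centralize that arithmetic; `progressPercent`, `TOTAL_STEPS`, and `FAILED_STEP` are hypothetical names, and filling the bar on failure is an assumption about the intended look:

```typescript
// Hypothetical helper, not part of the commit.
const TOTAL_STEPS = 5;
const FAILED_STEP = -1;

function progressPercent(step: number): number {
  // Assumption: on failure the bar is shown full so the destructive color is visible.
  if (step === FAILED_STEP) return 100;
  // Clamp so unexpected values can never produce a width outside 0..100.
  const clamped = Math.min(Math.max(step, 0), TOTAL_STEPS);
  return Math.round((clamped / TOTAL_STEPS) * 100);
}

// Example usage: the five steps used by handleAiValidation.
for (let step = 1; step <= TOTAL_STEPS; step++) {
  console.log(`step ${step}: ${progressPercent(step)}%`);
}
```

Both the `style.width` value and the text label could then read from the same function instead of repeating the expression.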

@@ -40,7 +40,7 @@ const BASE_IMPORT_FIELDS = [
label: "Supplier #", label: "Supplier #",
key: "supplier_no", key: "supplier_no",
description: "Supplier's product identifier", description: "Supplier's product identifier",
alternateMatches: ["sku", "item#", "mfg item #", "item"], alternateMatches: ["sku", "item#", "mfg item #", "item", "supplier #"],
fieldType: { type: "input" }, fieldType: { type: "input" },
width: 180, width: 180,
validations: [ validations: [
@@ -52,6 +52,7 @@ const BASE_IMPORT_FIELDS = [
label: "Notions #", label: "Notions #",
key: "notions_no", key: "notions_no",
description: "Internal notions number", description: "Internal notions number",
alternateMatches: ["notions #"],
fieldType: { type: "input" }, fieldType: { type: "input" },
width: 110, width: 110,
validations: [ validations: [
@@ -109,7 +110,7 @@ const BASE_IMPORT_FIELDS = [
label: "Qty Per Unit", label: "Qty Per Unit",
key: "qty_per_unit", key: "qty_per_unit",
description: "Quantity of items per individual unit", description: "Quantity of items per individual unit",
alternateMatches: ["inner pack", "inner", "min qty", "unit qty", "min. order qty"], alternateMatches: ["inner pack", "inner", "min qty", "unit qty", "min. order qty", "supplier qty/unit"],
fieldType: { type: "input" }, fieldType: { type: "input" },
width: 90, width: 90,
validations: [ validations: [
@@ -121,7 +122,7 @@ const BASE_IMPORT_FIELDS = [
label: "Cost Each", label: "Cost Each",
key: "cost_each", key: "cost_each",
description: "Wholesale cost per unit", description: "Wholesale cost per unit",
alternateMatches: ["wholesale", "wholesale price"], alternateMatches: ["wholesale", "wholesale price", "supplier cost each"],
fieldType: { fieldType: {
type: "input", type: "input",
price: true price: true
@@ -264,7 +265,7 @@ const BASE_IMPORT_FIELDS = [
label: "Country Of Origin", label: "Country Of Origin",
key: "coo", key: "coo",
description: "2-letter country code (ISO)", description: "2-letter country code (ISO)",
alternateMatches: ["coo"], alternateMatches: ["coo", "country of origin"],
fieldType: { type: "input" }, fieldType: { type: "input" },
width: 100, width: 100,
validations: [ validations: [
@@ -275,7 +276,7 @@ const BASE_IMPORT_FIELDS = [
label: "HTS Code", label: "HTS Code",
key: "hts_code", key: "hts_code",
description: "Harmonized Tariff Schedule code", description: "Harmonized Tariff Schedule code",
alternateMatches: ["taric"], alternateMatches: ["taric", "hts"],
fieldType: { type: "input" }, fieldType: { type: "input" },
width: 130, width: 130,
validations: [ validations: [
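
Several of the new `alternateMatches` entries are lowercase forms of the field labels ("supplier #", "notions #", "country of origin"). The matching logic itself is not shown in this diff, so the following is only a sketch of how a case-insensitive header lookup over these definitions could work; `ImportField`, `normalizeHeader`, and `findFieldKey` are hypothetical names:

```typescript
// Hypothetical types and helpers, not part of the commit.
type ImportField = { label: string; key: string; alternateMatches?: string[] };

function normalizeHeader(header: string): string {
  // Trim, lowercase, and collapse runs of whitespace to a single space.
  return header.trim().toLowerCase().replace(/\s+/g, " ");
}

function findFieldKey(header: string, fields: ImportField[]): string | undefined {
  const wanted = normalizeHeader(header);
  for (const field of fields) {
    const candidates = [field.key, field.label].concat(field.alternateMatches || []);
    if (candidates.some((c) => normalizeHeader(c) === wanted)) {
      return field.key;
    }
  }
  return undefined;
}

// Example usage with two of the fields touched by this commit.
const sampleFields: ImportField[] = [
  { label: "Country Of Origin", key: "coo", alternateMatches: ["coo", "country of origin"] },
  { label: "HTS Code", key: "hts_code", alternateMatches: ["taric", "hts"] },
];
console.log(findFieldKey("HTS", sampleFields));
```

If the real matcher already normalizes case and compares against `label`, entries that only repeat the label in lowercase would be redundant; if it compares `alternateMatches` literally, they are what makes those spreadsheet headers match.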