Product Research & Addition Skill
Comprehensive skill for researching, validating, and adding new products to True Baby Cost.
Overview
This skill handles the full lifecycle of adding new baby products:
- Category Research — Understand the market before adding products
- Product Discovery — Find all relevant products in each market
- Deduplication — Merge duplicate entries across sources
- Data Collection — Gather prices, specs, photos, and links per country
- Quality Assurance — Flag items needing human review
Supported Product Categories
- Strollers (established)
- Car Seats (planned)
- High Chairs (planned)
- Cribs/Bassinets (planned)
- Baby Monitors (planned)
- Breast Pumps (established)
- Bottles (established)
- Diapers (established)
- Formula (established)
Phase 0: Category Research (New Categories Only)
When to run: Starting a new product category that doesn’t exist yet.
0.1 Market Overview
Research goals:
- Market size and growth trends
- Key purchase factors for parents
- Price range tiers (budget/mid/luxury)
- Major brands by region
- Safety standards (US: CPSC/JPMA, EU: EN, etc.)
- Seasonal buying patterns
0.2 Competitive Landscape
Identify:
- Top 5 brands globally
- Regional leaders (US, EU, Latin America)
- Direct-to-consumer disruptors
- Common features that differentiate products
- Price anchors ($X for budget, $Y for mid, $Z for luxury)
0.3 Feature Taxonomy
Define standard features for the category:
- Must-have features (safety, basic function)
- Nice-to-have features (convenience)
- Luxury features (premium only)
- Create feature detection keywords for each
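One way to apply feature detection keywords is a case-insensitive substring match against scraped bullet points. The sketch below assumes that approach; the `FEATURE_KEYWORDS` map, its feature ids, and `detectFeatures` are hypothetical names for illustration and are not part of the existing scripts.

```javascript
// Hypothetical keyword map: feature id -> phrases that signal the feature.
const FEATURE_KEYWORDS = {
  'one-hand-fold': ['one-hand fold', 'one hand fold'],
  'reversible-seat': ['reversible seat', 'parent-facing'],
  'all-terrain': ['all-terrain', 'all terrain'],
};

// Return the sorted feature ids whose keywords appear in any bullet point.
function detectFeatures(bulletPoints, keywords = FEATURE_KEYWORDS) {
  const text = bulletPoints.join(' ').toLowerCase();
  return Object.entries(keywords)
    .filter(([, phrases]) => phrases.some((p) => text.includes(p.toLowerCase())))
    .map(([feature]) => feature)
    .sort();
}

console.log(detectFeatures([
  'Easy one-hand fold for travel',
  'All-terrain wheels with suspension',
])); // [ 'all-terrain', 'one-hand-fold' ]
```

Substring matching is cheap but can miss paraphrases, so detected features are best treated as candidates for review.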
0.4 Create Category Schema
Output: data/{category}-schema.json
{
  "category": "car-seats",
  "types": ["infant", "convertible", "booster", "all-in-one"],
  "features": [...],
  "priceRanges": { "budget": [0, 150], "mid-range": [150, 350], "luxury": [350, 1000] },
  "safetyStandards": ["FMVSS 213", "ECE R44/04", "i-Size"],
  "keySpecs": ["weightRange", "heightRange", "installMethod", "harness"]
}
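The `priceRanges` in the example schema share boundary values (150 and 350), so any code that assigns a tier has to choose which side a boundary price falls on. A minimal sketch, assuming the lower bound is inclusive and the upper bound exclusive; `priceTier` is a hypothetical helper, not an existing script function.

```javascript
// Price ranges copied from the example schema.
const priceRanges = { budget: [0, 150], 'mid-range': [150, 350], luxury: [350, 1000] };

// Return the first tier whose [min, max) interval contains the price, or null.
// Assumption: lower bound inclusive, upper bound exclusive, so 150 is mid-range.
function priceTier(price, ranges = priceRanges) {
  for (const [tier, [min, max]] of Object.entries(ranges)) {
    if (price >= min && price < max) return tier;
  }
  return null; // outside every range, e.g. 1000 or more
}

console.log(priceTier(149.99)); // budget
console.log(priceTier(150));    // mid-range
console.log(priceTier(1200));   // null
```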
Phase 1: Product Discovery
1.1 Amazon Product Lists
For each market (US, MX, ES, CA), scrape Amazon bestsellers and search results:
const AMAZON_SOURCES = {
  US: [
    'https://www.amazon.com/Best-Sellers-Baby-Strollers/zgbs/baby-products/166842011',
    'https://www.amazon.com/s?k=stroller&rh=n%3A166842011',
  ],
  MX: [
    'https://www.amazon.com.mx/gp/bestsellers/baby/9725084011',
    'https://www.amazon.com.mx/s?k=carriola',
  ],
  ES: [
    'https://www.amazon.es/gp/bestsellers/baby/1703531031',
    'https://www.amazon.es/s?k=silla+paseo+bebe',
  ],
  CA: [
    'https://www.amazon.ca/Best-Sellers-Baby-Standard-Strollers/zgbs/baby/2237497011',
  ],
};
Output per product:
{
  "source": "amazon-us-bestseller",
  "brand": "UPPAbaby",
  "model": "Vista V3",
  "asin": "B0BX5JKLMN",
  "url": "https://www.amazon.com/dp/B0BX5JKLMN",
  "price": 1099.99,
  "rating": 4.8,
  "reviewCount": 2341,
  "imageUrls": ["https://..."],
  "bulletPoints": ["..."],
  "scrapedAt": "2026-03-12T17:00:00Z"
}
1.2 Editorial Lists & Guides
Search for authoritative recommendation lists:
Search queries per market:
- US: "best strollers 2026", "wirecutter stroller", "babylist stroller guide"
- MX: "mejores carriolas 2026", "guia carriolas mexico"
- ES: "mejores sillas paseo 2026", "comparativa cochecitos"
- CA: "best strollers canada 2026"
Sources to prioritize:
- Wirecutter, BabyGearLab, What to Expect
- BabyList, Babyletto, Buy Buy Baby guides
- Mumsnet, Netmums (UK/EU)
- Local parenting blogs with >10k monthly visitors
Extract:
- Product names mentioned
- Pros/cons noted
- Price points
- “Best for X” categorizations
1.3 Manufacturer Catalogs
For established brands, check official product lines:
Major brands to always check:
- Strollers: UPPAbaby, Bugaboo, Nuna, Cybex, Baby Jogger, Chicco, Graco, Britax
- Car Seats: Chicco, Graco, Britax, Nuna, Clek, Cybex
- High Chairs: Stokke, Peg Perego, Inglesina, OXO Tot
Get from each brand site:
- Full product catalog
- Regional availability
- MSRP prices
- Product specifications
- Official images
Phase 2: Deduplication
2.1 Normalize Product Names
function normalizeProductName(brand, model) {
  return `${brand.toLowerCase().replace(/[^a-z0-9]/g, '')}-${model.toLowerCase().replace(/[^a-z0-9]/g, '')}`;
}
// Handle variations:
// "UPPAbaby Vista V3" = "UPPAbaby VISTA V3" = "Uppababy Vista v3"
// "Chicco Bravo 3-in-1" = "Chicco Bravo 3in1"
2.2 Merge Records
When same product found in multiple sources:
1. Keep all URLs (Amazon US, Amazon MX, manufacturer, etc.)
2. Use highest-resolution image
3. Merge feature lists
4. Note price from each source with timestamp
5. Flag discrepancies for review
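The merge steps can be sketched as a group-by on the normalized name. This is an illustrative sketch only: it covers steps 1, 3, and 4 (URLs, feature lists, per-source prices) and leaves image selection and discrepancy flagging out. The record shape follows the "Output per product" example from 1.1; `normalizeKey` and `mergeRecords` are hypothetical names.

```javascript
// Same normalization rule as normalizeProductName in 2.1.
function normalizeKey(brand, model) {
  const clean = (s) => s.toLowerCase().replace(/[^a-z0-9]/g, '');
  return `${clean(brand)}-${clean(model)}`;
}

// Group scraped records by normalized key and merge each group into one entry.
function mergeRecords(records) {
  const byKey = new Map();
  for (const rec of records) {
    const key = normalizeKey(rec.brand, rec.model);
    if (!byKey.has(key)) {
      // The first record seen supplies the display brand and model.
      byKey.set(key, { key, brand: rec.brand, model: rec.model, urls: [], features: [], prices: [] });
    }
    const entry = byKey.get(key);
    // Step 1: keep every distinct URL.
    if (!entry.urls.includes(rec.url)) entry.urls.push(rec.url);
    // Step 3: merge feature lists without duplicates.
    for (const f of rec.bulletPoints ?? []) {
      if (!entry.features.includes(f)) entry.features.push(f);
    }
    // Step 4: note the price from each source with its timestamp.
    entry.prices.push({ source: rec.source, price: rec.price, scrapedAt: rec.scrapedAt });
  }
  return [...byKey.values()];
}
```

Prices from different markets are in different currencies, so comparing them for discrepancies would need a currency field that the example record does not carry.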
2.3 Generate Canonical ID
Format: {brand-slug}-{model-slug}
Examples:
- uppababy-vista-v3
- bugaboo-fox-5
- chicco-bravo-3-in-1
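A small slug helper that reproduces the example IDs is sketched below. `slugify` and `canonicalId` are hypothetical names; the rule (lowercase, collapse each run of non-alphanumerics into one hyphen) is an assumption inferred from the examples.

```javascript
// Lowercase, collapse every run of non-alphanumerics into one hyphen, trim hyphens.
function slugify(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

// Build the {brand-slug}-{model-slug} canonical ID.
function canonicalId(brand, model) {
  return `${slugify(brand)}-${slugify(model)}`;
}

console.log(canonicalId('UPPAbaby', 'Vista V3'));   // uppababy-vista-v3
console.log(canonicalId('Bugaboo', 'Fox 5'));       // bugaboo-fox-5
console.log(canonicalId('Chicco', 'Bravo 3-in-1')); // chicco-bravo-3-in-1
```

Unlike the matching key from 2.1, this slug keeps word boundaries, so "Bravo 3in1" and "Bravo 3-in-1" produce different IDs. Generating the ID from the merged record's chosen display name avoids that.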
Phase 3: Per-Country Data Collection
For each unique product, collect market-specific data:
3.1 Availability Check
For each market (US, MX, ES, CA):
1. Search Amazon for exact product name
2. Check manufacturer's regional site
3. Search major local retailers:
   - US: Target, Buy Buy Baby, Nordstrom
   - MX: Liverpool, Palacio de Hierro, Walmart MX
   - ES: El Corte Inglés, Hipercor, Prenatal
   - CA: Indigo, Well.ca, Snuggle Bugz
Output:
{
  "market": "US",
  "available": true,
  "retailers": [
    { "name": "Amazon", "url": "https://...", "price": 899.99 },
    { "name": "Target", "url": "https://...", "price": 899.99 }
  ]
}
3.2 Price Collection
Priority order:
1. Amazon price (most consistent)
2. Manufacturer MSRP
3. Major retailer price
Record:
- Current price
- List price (if on sale)
- Currency
- Timestamp
- Source URL
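The priority order can be applied as a simple ranked selection over the collected price observations. In this sketch the observation shape and the `sourceType` values (`amazon`, `manufacturer`, `retailer`) are assumptions for illustration, as is the `pickPrice` helper.

```javascript
// Source priority from the list above; a lower index wins.
const PRICE_PRIORITY = ['amazon', 'manufacturer', 'retailer'];

// Pick the highest-priority observation that has a numeric price.
// Assumed observation shape: { sourceType, price, listPrice?, currency, timestamp, url }
function pickPrice(observations) {
  const usable = observations.filter(
    (o) => typeof o.price === 'number' && PRICE_PRIORITY.includes(o.sourceType)
  );
  // filter() returned a new array, so sorting does not mutate the input.
  usable.sort(
    (a, b) => PRICE_PRIORITY.indexOf(a.sourceType) - PRICE_PRIORITY.indexOf(b.sourceType)
  );
  return usable[0] ?? null;
}

const chosen = pickPrice([
  { sourceType: 'retailer', price: 879.0, currency: 'USD' },
  { sourceType: 'manufacturer', price: 899.99, currency: 'USD' },
  { sourceType: 'amazon', price: null, currency: 'USD' }, // no Amazon price found
]);
console.log(chosen.sourceType, chosen.price); // manufacturer 899.99
```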
3.3 Product Specifications
Collect (if available):
- Weight (lbs/kg)
- Dimensions (folded/unfolded)
- Age/weight range
- Type classification
- Color options
- Included accessories
- Compatible accessories
- Safety certifications
3.4 Photo Collection
Priority order:
1. Manufacturer official images (highest quality)
2. Amazon main product image
3. Retailer images
Requirements:
- Minimum 800px width
- White or neutral background preferred
- Show product from front/side angle
- No lifestyle shots (people, rooms)
If multiple candidates:
- Download all to images/review/{product-id}/
- Tag with source and confidence score
- Flag for human review
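The source priority and minimum-width requirement can be combined into a ranking step before human review. This sketch assumes each candidate already carries a `source` label and a measured `width` in pixels; those field names and `rankImageCandidates` are hypothetical. Background and lifestyle-shot checks are not automated here.

```javascript
const IMAGE_PRIORITY = ['manufacturer', 'amazon', 'retailer'];
const MIN_WIDTH_PX = 800;

// Unknown sources rank after every known source.
const sourceRank = (source) => {
  const i = IMAGE_PRIORITY.indexOf(source);
  return i === -1 ? IMAGE_PRIORITY.length : i;
};

// Drop images narrower than the minimum, then order by source priority,
// breaking ties by width (wider first).
function rankImageCandidates(candidates) {
  const ranked = candidates
    .filter((c) => c.width >= MIN_WIDTH_PX)
    .sort((a, b) => sourceRank(a.source) - sourceRank(b.source) || b.width - a.width);
  // More than one acceptable candidate means a human should pick.
  return { ranked, needsReview: ranked.length > 1 };
}

const result = rankImageCandidates([
  { source: 'amazon', width: 1500, file: 'amazon-main.jpg' },
  { source: 'manufacturer', width: 1000, file: 'official.jpg' },
  { source: 'retailer', width: 600, file: 'small.jpg' },
]);
console.log(result.ranked.map((c) => c.file), result.needsReview);
// [ 'official.jpg', 'amazon-main.jpg' ] true
```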
Phase 4: Quality Assurance
4.1 Automated Validation
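const fs = require('fs');
const path = require('path');

const VALIDATION_RULES = {
  required: ['brand', 'model', 'markets.US.price'],
  priceRange: { min: 20, max: 3000 },
  weightRange: { min: 3, max: 50 }, // lbs
  imageMinSize: 10000, // bytes
};

// Resolve a dotted path such as 'markets.US.price'; undefined if any segment is missing.
function getNestedValue(obj, dottedPath) {
  return dottedPath.split('.').reduce((value, key) => value?.[key], obj);
}

function validate(product) {
  const issues = [];
  // Required fields: undefined, null, and '' count as missing; 0 does not
  for (const field of VALIDATION_RULES.required) {
    const value = getNestedValue(product, field);
    if (value === undefined || value === null || value === '') {
      issues.push({ type: 'missing', field });
    }
  }
  // Price sanity check against the configured range
  const price = product.markets?.US?.price;
  const { min, max } = VALIDATION_RULES.priceRange;
  if (typeof price === 'number' && (price < min || price > max)) {
    issues.push({ type: 'suspicious_price', value: price });
  }
  // Image check
  // Assumption: product.imageUrl is a local path relative to the working directory
  const imagePath = product.imageUrl ? path.resolve(product.imageUrl) : null;
  if (!imagePath || !fs.existsSync(imagePath)) {
    issues.push({ type: 'missing_image' });
  }
  return issues;
}
4.2 Review Queue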
Generate review file for human verification:
// data/review-queue.json
{
  "pendingReview": [
    {
      "productId": "xyz-stroller-x1",
      "issues": [
        { "type": "suspicious_price", "value": 15.99, "note": "Unusually low for stroller" },
        { "type": "multiple_images", "candidates": ["img1.jpg", "img2.jpg"] }
      ],
      "addedAt": "2026-03-12T17:00:00Z"
    }
  ]
}
4.3 Duplicate Detection
Flag potential duplicates for review:
Criteria:
- Same brand and similar model name (Levenshtein distance < 3)
- Same price within 5%
- Same specifications
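The name and price criteria can be checked with a standard edit-distance function. The sketch below makes two assumptions that the criteria list does not state: the criteria are combined with AND, and the specification comparison is left to the human reviewer. `levenshtein` and `isPotentialDuplicate` are hypothetical helpers.

```javascript
// Dynamic-programming Levenshtein distance: insert, delete, and substitute each cost 1.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Flag a pair when brand matches, model names are within distance 2,
// and prices differ by at most 5% of the higher price.
function isPotentialDuplicate(p, q) {
  const sameBrand = p.brand.toLowerCase() === q.brand.toLowerCase();
  const similarModel = levenshtein(p.model.toLowerCase(), q.model.toLowerCase()) < 3;
  const priceClose = Math.abs(p.price - q.price) <= 0.05 * Math.max(p.price, q.price);
  return sameBrand && similarModel && priceClose;
}

console.log(isPotentialDuplicate(
  { brand: 'Bugaboo', model: 'Fox 5', price: 1299 },
  { brand: 'bugaboo', model: 'Fox5', price: 1279 }
)); // true
```

A threshold this loose also matches genuinely different models such as "Fox 3" and "Fox 5" when their prices are close, which is why the result is a flag for review and not an automatic merge.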
Output Files
Main Product Data
data/strollers.json — Production data
Staging Data
data/staging/{category}-new.json — New products pending review
Review Queue
data/review-queue.json — Items needing human attention
Price History
data/price-history.json — Historical price tracking
Image Review
images/review/ — Multiple image candidates per product
Scripts
scripts/discover-products.js
Full discovery pipeline for a market
scripts/update-prices.js
Refresh prices from Amazon
scripts/download-images.js
Download missing product images
scripts/validate-data.js
Run validation rules on all products
scripts/import-staging.js
Move reviewed products from staging to production
Usage Examples
Add products from a new market
node scripts/discover-products.js --market MX --category strollers
Update all prices
node scripts/update-prices.js --market US
node scripts/update-prices.js --market MX
Validate before deploy
node scripts/validate-data.js
Process review queue
# Human reviews data/review-queue.json
# Then run:
node scripts/import-staging.js --approved
Maintenance Schedule
| Task | Frequency | Script |
|---|---|---|
| Price updates (US) | Weekly | update-prices.js --market US |
| Price updates (MX, ES, CA) | Every two weeks | update-prices.js --market {market} |
| New product discovery | Monthly | discover-products.js |
| Full validation | Before deploy | validate-data.js |
| Image audit | Quarterly | Manual review |
Notes
- Always respect robots.txt and rate limits
- Use 2-3 second delays between requests
- Rotate user agents if blocked
- Cache responses to avoid redundant scraping
- Log all scraping activity for debugging
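The delay, caching, and logging notes can be combined in one wrapper around whatever function performs the request. This is a minimal sketch for sequential use: `createPoliteFetcher` is a hypothetical helper, the cache is in-memory only, and robots.txt checking is not covered.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wrap a fetch-like function (url -> Promise<string>) with a cache, a randomized
// 2-3 second gap between network requests, and a log line per request.
function createPoliteFetcher(fetchText, { minDelayMs = 2000, maxDelayMs = 3000 } = {}) {
  const cache = new Map();
  let lastRequestAt = 0;
  return async function politeGet(url) {
    // Cached responses skip both the network and the delay.
    if (cache.has(url)) return cache.get(url);
    const wait = minDelayMs + Math.random() * (maxDelayMs - minDelayMs);
    const elapsed = Date.now() - lastRequestAt;
    if (elapsed < wait) await sleep(wait - elapsed);
    lastRequestAt = Date.now();
    console.log(`[scrape] ${new Date().toISOString()} GET ${url}`);
    const body = await fetchText(url);
    cache.set(url, body);
    return body;
  };
}

// Example wiring with the global fetch available in Node 18+ (not executed here):
// const politeGet = createPoliteFetcher((url) => fetch(url).then((res) => res.text()));
// const html = await politeGet('https://www.amazon.com.mx/s?k=carriola');
```

Passing the request function in as a parameter keeps the wrapper testable without network access.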