Product Research & Addition Skill

Comprehensive skill for researching, validating, and adding new products to True Baby Cost.

Overview

This skill handles the full lifecycle of adding new baby products:

  1. Category Research — Understand the market before adding products
  2. Product Discovery — Find all relevant products in each market
  3. Deduplication — Merge duplicate entries across sources
  4. Data Collection — Gather prices, specs, photos, and links per country
  5. Quality Assurance — Flag items needing human review

Supported Product Categories

  • Strollers (established)
  • Car Seats (planned)
  • High Chairs (planned)
  • Cribs/Bassinets (planned)
  • Baby Monitors (planned)
  • Breast Pumps (established)
  • Bottles (established)
  • Diapers (established)
  • Formula (established)

Phase 0: Category Research (New Categories Only)

When to run: Starting a new product category that doesn’t exist yet.

0.1 Market Overview

Research goals:
- Market size and growth trends
- Key purchase factors for parents
- Price range tiers (budget/mid/luxury)
- Major brands by region
- Safety standards (US: CPSC/JPMA, EU: EN, etc.)
- Seasonal buying patterns

0.2 Competitive Landscape

Identify:
- Top 5 brands globally
- Regional leaders (US, EU, Latin America)
- Direct-to-consumer disruptors
- Common features that differentiate products
- Price anchors ($X for budget, $Y for mid, $Z for luxury)

0.3 Feature Taxonomy

Define standard features for the category:
- Must-have features (safety, basic function)
- Nice-to-have features (convenience)
- Luxury features (premium only)
- Create feature detection keywords for each
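
A minimal sketch of keyword-based feature detection, assuming features are matched against scraped bullet points by case-insensitive substring search. The `FEATURE_KEYWORDS` map, the feature ids, and the `detectFeatures` helper are hypothetical names chosen for illustration, not part of the existing codebase.

```javascript
// Hypothetical keyword map: feature id -> phrases that signal the feature.
const FEATURE_KEYWORDS = {
  'one-hand-fold': ['one-hand fold', 'one hand fold', 'single-hand fold'],
  'reversible-seat': ['reversible seat', 'parent-facing'],
  'car-seat-compatible': ['travel system', 'car seat compatible'],
};

// Return the sorted list of feature ids whose keywords appear in the text.
function detectFeatures(bulletPoints, keywordMap = FEATURE_KEYWORDS) {
  const text = bulletPoints.join(' ').toLowerCase();
  return Object.entries(keywordMap)
    .filter(([, phrases]) => phrases.some((p) => text.includes(p.toLowerCase())))
    .map(([feature]) => feature)
    .sort();
}

console.log(detectFeatures(['Easy one-hand fold', 'Reversible seat for parent-facing rides']));
// [ 'one-hand-fold', 'reversible-seat' ]
```

Substring matching is deliberately simple; it will miss paraphrases and can produce false positives, so detected features are best treated as candidates for review.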

0.4 Create Category Schema

Output: data/{category}-schema.json
{
  "category": "car-seats",
  "types": ["infant", "convertible", "booster", "all-in-one"],
  "features": [...],
  "priceRanges": { "budget": [0, 150], "mid-range": [150, 350], "luxury": [350, 1000] },
  "safetyStandards": ["FMVSS 213", "ECE R44/04", "i-Size"],
  "keySpecs": ["weightRange", "heightRange", "installMethod", "harness"]
}
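
The `priceRanges` in the example schema share boundary values (150 and 350), so a classifier has to pick a convention. The sketch below assumes the lower bound is inclusive and the upper bound exclusive; `classifyPriceTier` is a hypothetical helper name.

```javascript
// Price ranges copied from the example schema above.
const priceRanges = { budget: [0, 150], 'mid-range': [150, 350], luxury: [350, 1000] };

// Assumption: lower bound inclusive, upper bound exclusive, so 150 is "mid-range".
// Returns null when the price falls outside every range.
function classifyPriceTier(price, ranges = priceRanges) {
  for (const [tier, [min, max]] of Object.entries(ranges)) {
    if (price >= min && price < max) return tier;
  }
  return null;
}

console.log(classifyPriceTier(149.99)); // budget
console.log(classifyPriceTier(150));    // mid-range
console.log(classifyPriceTier(1200));   // null
```

A `null` result is a useful signal in itself: it may indicate a scraping error or a schema whose top range is too low.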

Phase 1: Product Discovery

1.1 Amazon Product Lists

For each market (US, MX, ES, CA), scrape Amazon bestsellers and search results:

const AMAZON_SOURCES = {
  US: [
    'https://www.amazon.com/Best-Sellers-Baby-Strollers/zgbs/baby-products/166842011',
    'https://www.amazon.com/s?k=stroller&rh=n%3A166842011',
  ],
  MX: [
    'https://www.amazon.com.mx/gp/bestsellers/baby/9725084011',
    'https://www.amazon.com.mx/s?k=carriola',
  ],
  ES: [
    'https://www.amazon.es/gp/bestsellers/baby/1703531031',
    'https://www.amazon.es/s?k=silla+paseo+bebe',
  ],
  CA: [
    'https://www.amazon.ca/Best-Sellers-Baby-Standard-Strollers/zgbs/baby/2237497011',
  ]
};

Output per product:

{
  "source": "amazon-us-bestseller",
  "brand": "UPPAbaby",
  "model": "Vista V3",
  "asin": "B0BX5JKLMN",
  "url": "https://www.amazon.com/dp/B0BX5JKLMN",
  "price": 1099.99,
  "rating": 4.8,
  "reviewCount": 2341,
  "imageUrls": ["https://..."],
  "bulletPoints": ["..."],
  "scrapedAt": "2026-03-12T17:00:00Z"
}
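
Since each record carries both an `asin` and a `url`, one may want to derive the ASIN from the URL rather than scrape it separately. The sketch below assumes the common `/dp/` and `/gp/product/` path forms; `extractAsin` is a hypothetical helper, and other URL shapes would return `null`.

```javascript
// Extract a 10-character ASIN from an Amazon product URL, or null if none is found.
// Assumption: the ASIN follows "/dp/" or "/gp/product/" in the URL path.
function extractAsin(url) {
  const { pathname } = new URL(url);
  const match = pathname.match(/\/(?:dp|gp\/product)\/([A-Z0-9]{10})(?:\/|$)/);
  return match ? match[1] : null;
}

console.log(extractAsin('https://www.amazon.com/dp/B0BX5JKLMN'));   // B0BX5JKLMN
console.log(extractAsin('https://www.amazon.com/s?k=stroller'));    // null
```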

1.2 Editorial Lists & Guides

Search for authoritative recommendation lists:

Search queries per market:
- US: "best strollers 2026", "wirecutter stroller", "babylist stroller guide"
- MX: "mejores carriolas 2026", "guia carriolas mexico"
- ES: "mejores sillas paseo 2026", "comparativa cochecitos"
- CA: "best strollers canada 2026"

Sources to prioritize:
- Wirecutter, BabyGearLab, What to Expect
- BabyList, Babyletto, Buy Buy Baby guides
- Mumsnet, Netmums (UK/EU)
- Local parenting blogs with >10k monthly visitors

Extract:

  • Product names mentioned
  • Pros/cons noted
  • Price points
  • “Best for X” categorizations

1.3 Manufacturer Catalogs

For established brands, check official product lines:

Major brands to always check:
- Strollers: UPPAbaby, Bugaboo, Nuna, Cybex, Baby Jogger, Chicco, Graco, Britax
- Car Seats: Chicco, Graco, Britax, Nuna, Clek, Cybex
- High Chairs: Stokke, Peg Perego, Inglesina, OXO Tot

Get from each brand site:
- Full product catalog
- Regional availability
- MSRP prices
- Product specifications
- Official images

Phase 2: Deduplication

2.1 Normalize Product Names

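// Build a comparison key: lowercase, with everything except a-z and 0-9 removed.
function normalizeProductName(brand, model) {
  const strip = (s) => s.toLowerCase().replace(/[^a-z0-9]/g, '');
  return `${strip(brand)}-${strip(model)}`;
}

// Variations that collapse to the same key:
// "UPPAbaby Vista V3" = "UPPAbaby VISTA V3" = "Uppababy Vista v3"  -> "uppababy-vistav3"
// "Chicco Bravo 3-in-1" = "Chicco Bravo 3in1"                      -> "chicco-bravo3in1"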

2.2 Merge Records

When the same product is found in multiple sources:

1. Keep all URLs (Amazon US, Amazon MX, manufacturer, etc.)
2. Use highest-resolution image
3. Merge feature lists
4. Note price from each source with timestamp
5. Flag discrepancies for review
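
The merge steps above can be sketched as follows. This is one possible shape, not the project's actual implementation: field names follow the "Output per product" example, while the `images` array with a `width` property and the 5% price-discrepancy threshold are assumptions. Prices are only compared meaningfully when the records share a currency.

```javascript
// Merge discovery records that refer to the same product.
// Assumptions: each record may carry images: [{ url, width }], and a price
// spread above 5% (same currency) counts as a discrepancy worth reviewing.
function mergeRecords(records) {
  const urls = [...new Set(records.map((r) => r.url))];
  const features = [...new Set(records.flatMap((r) => r.bulletPoints ?? []))];
  const prices = records.map((r) => ({ source: r.source, price: r.price, scrapedAt: r.scrapedAt }));

  // Keep the widest image as a stand-in for "highest resolution".
  const images = records.flatMap((r) => r.images ?? []);
  const bestImage = images.reduce((best, img) => (!best || img.width > best.width ? img : best), null);

  const values = prices.map((p) => p.price);
  const spread = (Math.max(...values) - Math.min(...values)) / Math.min(...values);

  return { urls, features, prices, bestImage, needsReview: spread > 0.05 };
}

const merged = mergeRecords([
  { source: 'amazon-us-bestseller', url: 'https://www.amazon.com/dp/B0BX5JKLMN', price: 1099.99,
    scrapedAt: '2026-03-12T17:00:00Z', bulletPoints: ['One-hand fold'], images: [{ url: 'a.jpg', width: 800 }] },
  { source: 'manufacturer', url: 'https://example.com/vista-v3', price: 1199.99,
    scrapedAt: '2026-03-12T17:05:00Z', bulletPoints: ['One-hand fold', 'Bassinet included'], images: [{ url: 'b.jpg', width: 1600 }] },
]);
console.log(merged.bestImage.url, merged.needsReview); // b.jpg true
```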

2.3 Generate Canonical ID

Format: {brand-slug}-{model-slug}
Examples:
- uppababy-vista-v3
- bugaboo-fox-5
- chicco-bravo-3-in-1
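
The comparison key from 2.1 removes word separators, so it cannot produce IDs such as `uppababy-vista-v3`. A separate slug function is needed; the sketch below is one way to get the format shown above, with `slugify` and `canonicalId` as hypothetical helper names.

```javascript
// Turn arbitrary text into a lowercase, hyphen-separated slug.
function slugify(text) {
  return text
    .normalize('NFKD')                // split accented letters into base + diacritic
    .replace(/[\u0300-\u036f]/g, '')  // drop the diacritics
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // collapse every other run of characters into one hyphen
    .replace(/^-+|-+$/g, '');         // trim leading and trailing hyphens
}

function canonicalId(brand, model) {
  return `${slugify(brand)}-${slugify(model)}`;
}

console.log(canonicalId('UPPAbaby', 'Vista V3'));   // uppababy-vista-v3
console.log(canonicalId('Bugaboo', 'Fox 5'));       // bugaboo-fox-5
console.log(canonicalId('Chicco', 'Bravo 3-in-1')); // chicco-bravo-3-in-1
```

Note that "Bravo 3in1" would slugify to `chicco-bravo-3in1`, so records should be grouped by the normalized key first and the canonical ID generated once per group, from the preferred spelling.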

Phase 3: Per-Country Data Collection

For each unique product, collect market-specific data:

3.1 Availability Check

For each market (US, MX, ES, CA):
1. Search Amazon for exact product name
2. Check manufacturer's regional site
3. Search major local retailers:
   - US: Target, Buy Buy Baby, Nordstrom
   - MX: Liverpool, Palacio de Hierro, Walmart MX
   - ES: El Corte Inglés, Hipercor, Prenatal
   - CA: Indigo, Well.ca, Snuggle Bugz

Output:
{
  "market": "US",
  "available": true,
  "retailers": [
    { "name": "Amazon", "url": "https://...", "price": 899.99 },
    { "name": "Target", "url": "https://...", "price": 899.99 }
  ]
}

3.2 Price Collection

Priority order:
1. Amazon price (most consistent)
2. Manufacturer MSRP
3. Major retailer price

Record:
- Current price
- List price (if on sale)
- Currency
- Timestamp
- Source URL
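
The priority order can be applied mechanically once each collected price is tagged with where it came from. In this sketch the `sourceType` tag and the record field names (`currentPrice`, `listPrice`, `currency`, `timestamp`, `sourceUrl`) are assumptions that mirror the list above.

```javascript
// Source types in priority order: Amazon, then manufacturer MSRP, then retailer.
const SOURCE_PRIORITY = ['amazon', 'manufacturer', 'retailer'];

// Return the highest-priority candidate with a usable price, or null if there is none.
function pickPrice(candidates) {
  const ranked = candidates
    .filter((c) => Number.isFinite(c.currentPrice) && SOURCE_PRIORITY.includes(c.sourceType))
    .sort((a, b) => SOURCE_PRIORITY.indexOf(a.sourceType) - SOURCE_PRIORITY.indexOf(b.sourceType));
  return ranked[0] ?? null;
}

const chosen = pickPrice([
  { sourceType: 'retailer', currentPrice: 879.0, currency: 'USD',
    timestamp: '2026-03-12T17:00:00Z', sourceUrl: 'https://...' },
  { sourceType: 'amazon', currentPrice: 899.99, listPrice: 999.99, currency: 'USD',
    timestamp: '2026-03-12T17:00:00Z', sourceUrl: 'https://...' },
]);
console.log(chosen.sourceType, chosen.currentPrice); // amazon 899.99
```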

3.3 Product Specifications

Collect (if available):
- Weight (lbs/kg)
- Dimensions (folded/unfolded)
- Age/weight range
- Type classification
- Color options
- Included accessories
- Compatible accessories
- Safety certifications

3.4 Photo Collection

Priority order:
1. Manufacturer official images (highest quality)
2. Amazon main product image
3. Retailer images

Requirements:
- Minimum 800px width
- White or neutral background preferred
- Show product from front/side angle
- No lifestyle shots (people, rooms)

If multiple candidates:
- Download all to /images/review/{product-id}/
- Tag with source and confidence score
- Flag for human review

Phase 4: Quality Assurance

4.1 Automated Validation

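const fs = require('fs');
const path = require('path');

const VALIDATION_RULES = {
  required: ['brand', 'model', 'markets.US.price'],
  priceRange: { min: 20, max: 3000 },
  weightRange: { min: 3, max: 50 }, // lbs
  imageMinSize: 10000, // bytes
};

// Resolve a dotted path such as 'markets.US.price'; returns undefined if any step is missing.
function getNestedValue(obj, dottedPath) {
  return dottedPath.split('.').reduce((value, key) => value?.[key], obj);
}

function validate(product) {
  const issues = [];

  // Required fields (undefined, null, and '' count as missing; a price of 0 does not)
  for (const field of VALIDATION_RULES.required) {
    const value = getNestedValue(product, field);
    if (value === undefined || value === null || value === '') {
      issues.push({ type: 'missing', field });
    }
  }

  // Price sanity check (skipped when the price is missing; that case is reported above)
  const price = product.markets?.US?.price;
  const { min, max } = VALIDATION_RULES.priceRange;
  if (typeof price === 'number' && (price < min || price > max)) {
    issues.push({ type: 'suspicious_price', value: price });
  }

  // Image check: the file must exist and be at least imageMinSize bytes.
  // Assumption: product.imageUrl is a local path relative to the working directory.
  const imagePath = product.imageUrl ? path.resolve(product.imageUrl) : null;
  if (!imagePath || !fs.existsSync(imagePath) ||
      fs.statSync(imagePath).size < VALIDATION_RULES.imageMinSize) {
    issues.push({ type: 'missing_image' });
  }

  return issues;
}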

4.2 Review Queue

Generate review file for human verification:

// data/review-queue.json
{
  "pendingReview": [
    {
      "productId": "xyz-stroller-x1",
      "issues": [
        { "type": "suspicious_price", "value": 15.99, "note": "Unusually low for stroller" },
        { "type": "multiple_images", "candidates": ["img1.jpg", "img2.jpg"] }
      ],
      "addedAt": "2026-03-12T17:00:00Z"
    }
  ]
}

4.3 Duplicate Detection

Flag potential duplicates for review:

Criteria:
- Same brand + similar model name (Levenshtein < 3)
- Same price within 5%
- Same specifications
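
A sketch of the first two criteria, assuming products are compared pairwise and each matched criterion is reported as a separate signal for the reviewer. `levenshtein` is a standard edit-distance implementation; `duplicateSignals` and the signal names are hypothetical, and specification comparison is left out because it depends on the category schema.

```javascript
// Classic dynamic-programming edit distance, keeping only two rows in memory.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Report which duplicate criteria a pair of products satisfies.
function duplicateSignals(a, b) {
  const signals = [];
  const sameBrand = a.brand.toLowerCase() === b.brand.toLowerCase();
  if (sameBrand && levenshtein(a.model.toLowerCase(), b.model.toLowerCase()) < 3) {
    signals.push('similar_name');
  }
  if (Math.abs(a.price - b.price) <= 0.05 * Math.max(a.price, b.price)) {
    signals.push('similar_price');
  }
  return signals;
}

console.log(duplicateSignals(
  { brand: 'UPPAbaby', model: 'Vista V3', price: 1099.99 },
  { brand: 'Uppababy', model: 'Vista V2', price: 1049.99 },
)); // [ 'similar_name', 'similar_price' ]
```

The example pair shows why these are flags rather than automatic merges: successive model versions differ by a single character and are often priced within 5% of each other, yet they are distinct products.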

Output Files

Main Product Data

data/strollers.json — Production data

Staging Data

data/staging/{category}-new.json — New products pending review

Review Queue

data/review-queue.json — Items needing human attention

Price History

data/price-history.json — Historical price tracking

Image Review

images/review/ — Multiple image candidates per product


Scripts

scripts/discover-products.js

Full discovery pipeline for a market

scripts/update-prices.js

Refresh prices from Amazon

scripts/download-images.js

Download missing product images

scripts/validate-data.js

Run validation rules on all products

scripts/import-staging.js

Move reviewed products from staging to production


Usage Examples

Add products from a new market

node scripts/discover-products.js --market MX --category strollers

Update all prices

node scripts/update-prices.js --market US
node scripts/update-prices.js --market MX

Validate before deploy

node scripts/validate-data.js

Process review queue

# Human reviews data/review-queue.json
# Then run:
node scripts/import-staging.js --approved

Maintenance Schedule

Task                    Frequency       Script
Price updates (US)      Weekly          update-prices.js --market US
Price updates (other)   Bi-weekly       update-prices.js --market X
New product discovery   Monthly         discover-products.js
Full validation         Before deploy   validate-data.js
Image audit             Quarterly       Manual review

Notes

  • Always respect robots.txt and rate limits
  • Use 2-3 second delays between requests
  • Rotate user agents if blocked
  • Cache responses to avoid redundant scraping
  • Log all scraping activity for debugging
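
The delay, caching, and logging notes can be combined in a small wrapper. This is a sketch under stated assumptions: `createPoliteFetcher` is a hypothetical helper, the actual HTTP call is injected as `fetchFn` so no particular library is implied, and calls are assumed to be made sequentially. It does not check robots.txt, which would need to happen before a URL is passed in.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wrap fetchFn(url) with a random 2-3 second gap between requests,
// an in-memory response cache, and a log line per real request.
function createPoliteFetcher(fetchFn, { minDelayMs = 2000, maxDelayMs = 3000, log = console.log } = {}) {
  const cache = new Map();
  let lastRequestAt = 0;

  return async function politeFetch(url) {
    // Cached responses are returned immediately and never hit the network.
    if (cache.has(url)) return cache.get(url);

    // Wait until the randomized delay since the previous request has elapsed.
    const delay = minDelayMs + Math.random() * (maxDelayMs - minDelayMs);
    const wait = lastRequestAt + delay - Date.now();
    if (wait > 0) await sleep(wait);

    log(`[scrape] GET ${url}`);
    const body = await fetchFn(url);
    lastRequestAt = Date.now();
    cache.set(url, body);
    return body;
  };
}

// Usage with a stand-in fetch function:
// const politeFetch = createPoliteFetcher(async (url) => (await fetch(url)).text());
// const html = await politeFetch('https://www.amazon.com/dp/B0BX5JKLMN');
```

The cache here lives only for the lifetime of the process; persisting it to disk would be needed to avoid redundant requests across runs.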