Hacker News
by FrenchDevRemote 716 days ago
First, list all the category/subcategory URLs for the domain you want to target (you'll probably need to do this for each country's site); that part is pretty straightforward.

Then figure out how the pagination works; usually it's a numeric GET parameter or a cursor in the URL.
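For the page-number case, a minimal sketch in Python could look like this. The parameter name `page`, the `fetch_page` callable and the `max_pages` safety limit are assumptions for illustration, not anything a particular site guarantees; a cursor-based site would instead pass along whatever cursor value the previous response returned.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def page_url(base_url, page, param="page"):
    """Return base_url with the (assumed) page-number query parameter set."""
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    query[param] = str(page)
    return urlunsplit(parts._replace(query=urlencode(query)))


def crawl_pages(base_url, fetch_page, max_pages=1000):
    """Yield items page by page until a page comes back empty.

    fetch_page is a hypothetical callable: it takes a URL and returns
    the list of items found on that page.
    """
    for page in range(1, max_pages + 1):
        items = fetch_page(page_url(base_url, page))
        if not items:
            break
        yield from items
```

Stopping on the first empty page is only one possible stop condition; some sites repeat the last page or redirect instead, so that check may need adjusting.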

Some websites cap the number of results returned for a single search, so you'll need to tinker with the faceted search to split the results into smaller slices. It's a "simple" loop, for example:

    for (category in categories)
        for (price_range in price_ranges)
            for (page in pages(category, price_range))
                get_products_on_the_page(category, price_range, page)
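Roughly the same loop as a runnable Python sketch. Here `fetch_page` is a hypothetical callable standing in for whatever request/parse code you use; it is assumed to return the products for one (category, price range, page) combination and an empty list once you run past the last page.

```python
from itertools import count


def crawl_facets(categories, price_ranges, fetch_page):
    """Walk every category x price range x page and yield each product.

    price_ranges is an iterable of (low, high) tuples.
    fetch_page(category, low, high, page) is a hypothetical callable
    that returns a list of products, empty when there are no more pages.
    """
    for category in categories:
        for low, high in price_ranges:
            for page in count(1):  # pages are assumed to start at 1
                products = fetch_page(category, low, high, page)
                if not products:
                    break  # no more pages for this facet combination
                yield from products
```

Price is just one facet; brand, seller or condition filters could be looped over the same way if the price slices are still too big.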
1 comment

We do that with MercadoLibre (a way smaller Amazon for LATAM), and they limit paginated results to 40 pages/2000 products. Never thought of the faceted search approach. Thanks for the idea!
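With a hard cap like that, one way to pick the price ranges is to keep halving a range until each slice fits under the cap. This is only a sketch: `count_results` is a hypothetical callable that returns the total number of results the site reports for a half-open price range [low, high), which assumes the site exposes that total at all, and the default cap of 2000 is taken from the limit mentioned above.

```python
def split_until_under_cap(low, high, count_results, cap=2000):
    """Yield (low, high) price ranges whose result count is at most cap.

    count_results(low, high) is a hypothetical callable returning the
    total number of results for prices in [low, high).
    """
    total = count_results(low, high)
    # Stop splitting when the slice fits, or when it cannot be halved further.
    if total <= cap or high - low <= 1:
        yield (low, high)
        return
    mid = (low + high) // 2
    yield from split_until_under_cap(low, mid, count_results, cap)
    yield from split_until_under_cap(mid, high, count_results, cap)
```

A slice that is one price unit wide and still over the cap is yielded as-is, so a second facet would be needed to break that one down further.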