by lstamour
2078 days ago
Have a listen to https://softwareengineeringdaily.com/2019/05/24/camelcamelca... or browse the transcript at https://softwareengineeringdaily.com/wp-content/uploads/2019... (27-page PDF).

Relevant bits: they continuously poll for new prices, but they have a lot of products to request. They batch as best they can, but there are also request rate limits they have to respect:

> [00:11:04] JM: Let’s talk about the core the process that you have to do. So in order to build these price models for CamelCamelCamel, there is a repeated usage of this Amazon advertising API that gives you some data on the price. Tell me how that scraping infrastructure works.

> [00:11:27] DG: Sure. So essentially what we do is build a queue of – Or multiple queues of products. We split up things in different ways by the Amazon country. Because, of course, we support all of Europe and North America. Well, Canada and the United States. So we split that up.

> We also prioritize based on user interest. Since we have a finite number of API requests, we have to try to make the most of those. So whether a product is being actively tracked by a user or not, it gets higher priority. Then we use Amazon SQS to create these queues and then we just pop things off the queues and make API requests.
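
For anyone who wants to picture the scheme described in the quote, here is a rough sketch of it. This is my own guess at the shape, not their code: the queues are in-memory deques standing in for Amazon SQS, and the names `Product`, `PriceCheckQueues`, `next_batch` and the `budget` parameter (the number of API requests you can afford right now) are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    asin: str      # product identifier (hypothetical field name)
    country: str   # Amazon country the product is listed in
    tracked: bool  # True if at least one user is actively tracking it


class PriceCheckQueues:
    """In-memory stand-in for one queue per (country, priority) pair."""

    def __init__(self):
        self._queues = {}  # (country, priority) -> deque of Product

    def enqueue(self, product):
        # Actively tracked products go to the high-priority queue.
        priority = "high" if product.tracked else "low"
        key = (product.country, priority)
        self._queues.setdefault(key, deque()).append(product)

    def next_batch(self, country, budget):
        """Pop up to `budget` products for one country, tracked ones first."""
        batch = []
        for priority in ("high", "low"):
            queue = self._queues.get((country, priority), deque())
            while queue and len(batch) < budget:
                batch.append(queue.popleft())
        return batch


if __name__ == "__main__":
    queues = PriceCheckQueues()
    for product in [
        Product("A1", "US", tracked=False),
        Product("A2", "US", tracked=True),
        Product("A3", "CA", tracked=True),
        Product("A4", "US", tracked=False),
    ]:
        queues.enqueue(product)

    # With a budget of two requests for the US, the tracked product goes first.
    print([p.asin for p in queues.next_batch("US", budget=2)])
```

Each popped batch would then be turned into API requests. How they actually batch products per request, and what the real rate limits are, isn't covered in the quoted part, so that is left out here.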