by yorhel
5054 days ago
> I managed to crawl [..] more than 300k movies from IMDB in just a few hours

I suppose IMDB already has a pretty good architecture to handle that load, but please, if you're crawling a single site, be careful. I host a similar database myself, and the CPU/load graphs of my server tell me exactly when someone has a crawler active again. That's not fun if your goal is to keep a site responsive while keeping hosting costs low.
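To make "be careful" a bit more concrete, here is a minimal sketch of a per-host throttle that enforces a minimum gap between requests. The `Throttle` class and its parameters are hypothetical names made up for illustration, and the one-second interval in the usage note is an arbitrary assumption, not a recommendation from any particular site.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests to one host.

    Hypothetical helper for illustration; clock and sleep are injectable
    so the behaviour can be checked without actually waiting.
    """

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval  # seconds between requests (assumed value)
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, None before the first

    def wait(self):
        """Sleep until min_interval has passed since the last call.

        Returns the number of seconds slept (0.0 if no wait was needed).
        """
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

A crawler would create one instance per host, for example `throttle = Throttle(1.0)`, and call `throttle.wait()` right before each request to that host. Respecting the site's `robots.txt` and backing off on errors would be reasonable additions on top of this.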
|
For me, it was more a proof of how efficient and fast a crawler can be. Also, responses from IMDB were very fast, arriving in less than 0.4 seconds, so not much time was lost there.