|
|
|
|
|
by marginalia_nu
432 days ago
|
|
The problem with scrapers in general is the asymmetry of compute resources involved in generating versus requesting a website. You can likely make millions of HTTP requests with the compute required in generating the average response. If you make it more expensive to request a documents at scale, you make this type of crawling prohibitively expensive. On a small scale it really doesn't matter, but if you're casting an extremely wide net and re-fetching the same documents hundreds of times, yeah it really does matter. Even if you have a big VC budget. |
|