by renegat0x0
89 days ago
There are many nice HTTP clients:
- httpx
- curl_cffi
- httpmorph
- httpcloak
- stealth crawler

I wrote a framework, linked below, which uses them all. You can compare them against each other to check crawling speed. Some sites can only be crawled cleanly with one particular client.

Having read the article, I am in pain. I do break things during development. I rewrite stuff. Maybe some day I will find a way to develop things in a "stable" way. One thing I try to keep in good shape is the Docker image. I update it once everything seems to be quite stable.

https://github.com/rumca-js/crawler-buddy
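For anyone who wants to try the speed comparison by hand, here is a rough sketch of the idea. It is not how crawler-buddy does it; `compare_fetchers` and `fetch_with_urllib` are hypothetical names made up for this example, and it assumes each client is wrapped in a plain function that takes a URL and raises on failure. Only the standard library is used, so you would add your own wrappers for httpx, curl_cffi, and the rest.

```python
import time
import urllib.request
from typing import Callable, Dict, List


def fetch_with_urllib(url: str) -> bytes:
    """Baseline fetcher using only the standard library (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read()


def compare_fetchers(
    fetchers: Dict[str, Callable[[str], object]],
    urls: List[str],
) -> Dict[str, dict]:
    """Run every fetcher over the same URLs and record time and failures.

    Each fetcher is any callable that takes a URL and raises on error.
    """
    results = {}
    for name, fetch in fetchers.items():
        ok = 0
        failed = 0
        start = time.perf_counter()
        for url in urls:
            try:
                fetch(url)
                ok += 1
            except Exception:
                # A blocked or broken request counts as a failure, not a crash.
                failed += 1
        results[name] = {
            "seconds": time.perf_counter() - start,
            "ok": ok,
            "failed": failed,
        }
    return results


if __name__ == "__main__":
    # Example run against a placeholder URL; swap in the sites you care about.
    report = compare_fetchers({"urllib": fetch_with_urllib}, ["https://example.com"])
    for name, stats in report.items():
        print(name, stats)
```

Counting failures next to the timing matters here, because a client that is fast but gets blocked on a given site is not really the faster one. A single sequential run is also noisy, so treat the numbers as a rough hint.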