by potatosareok 3930 days ago
One question I have about this, and I might have missed it in the article: I'm all for using asyncio to make the HTTP requests, but I see they apparently also use asyncio for "parse_links". Since parse_links should be a CPU-bound operation, would it make sense to use fibers to download the pages and pass them to a thread pool that actually parses them and adds the links to the queue?
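A minimal sketch of that split in Python asyncio, not the article's code: the download stays on the event loop and parsing is handed to a thread pool with `run_in_executor`. The `fetch` stub, the regex-based `parse_links`, and the URLs are all hypothetical stand-ins so it runs without a network.

```python
import asyncio
import re
from concurrent.futures import ThreadPoolExecutor

HREF_RE = re.compile(r'href="([^"]+)"')


def parse_links(html):
    """CPU-bound step: pull href values out of a page body."""
    return HREF_RE.findall(html)


async def fetch(url):
    """Hypothetical stand-in for a real async HTTP request."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return f'<a href="{url}/a">a</a> <a href="{url}/b">b</a>'


async def crawl_one(url, pool):
    html = await fetch(url)
    loop = asyncio.get_running_loop()
    # Parsing runs on a pool thread, so the event loop stays free for I/O.
    return await loop.run_in_executor(pool, parse_links, html)


async def main(urls):
    with ThreadPoolExecutor(max_workers=4) as pool:
        return await asyncio.gather(*(crawl_one(u, pool) for u in urls))


def run_demo():
    return asyncio.run(main(["http://x", "http://y"]))
```

One caveat: in CPython the GIL means a thread pool keeps the event loop responsive but does not run pure-Python parsing in parallel. Swapping in `ProcessPoolExecutor` is the usual way to get real parallelism there.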

I'm messing around with the Parallel Universe Java fiber implementation, and what I do is spam fibers to download pages. Each one sends the String response over a channel to another fiber, which maintains a thread pool that parses response bodies as they come in and creates new fibers to fetch the links it finds.

I'm really just doing this to get more familiar with async programming, and specifically the Parallel Universe Java libs, but one thing I'm struggling with a bit is how best to make it well behaved (e.g., right now there's no bound on the number of outstanding HTTP requests).
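For the unbounded-requests problem, one common approach is a counting semaphore around each request. Here is a minimal sketch in Python asyncio rather than the Java fiber library; the cap `MAX_IN_FLIGHT`, the `fetch` stub, and the `stats` bookkeeping are assumptions made purely for illustration.

```python
import asyncio

MAX_IN_FLIGHT = 3  # assumed cap on concurrent requests


async def fetch(url, sem, stats):
    # Tasks beyond the cap wait here until a slot is released.
    async with sem:
        stats["in_flight"] += 1
        stats["peak"] = max(stats["peak"], stats["in_flight"])
        try:
            await asyncio.sleep(0.01)  # hypothetical stand-in for the HTTP call
            return f"body of {url}"
        finally:
            stats["in_flight"] -= 1


async def main(urls):
    sem = asyncio.Semaphore(MAX_IN_FLIGHT)
    stats = {"in_flight": 0, "peak": 0}
    bodies = await asyncio.gather(*(fetch(u, sem, stats) for u in urls))
    return bodies, stats["peak"]


def run_demo():
    urls = [f"http://example.com/{i}" for i in range(10)]
    return asyncio.run(main(urls))
```

All ten tasks are created up front, but at most `MAX_IN_FLIGHT` of them are inside the request at once. A bounded queue between downloaders and parsers would give similar backpressure.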