by tveita 4385 days ago
I'm curious how this is executed.

Is it like a query engine, where you work with the entire query up-front, apply transforms and build a query plan?

Or is it more like an event loop, where you run as far as you can until the code blocks on IO, batch up and send all the pending IO requests, and run further when the tasks you're blocked on resolve?

1 comments

Part of the beauty is that the actual way IO is scheduled is abstracted away, so we could go with either approach w/o impacting client code. (Note: in this version, 'IO' almost always means 'reads from the network'.)

That said, the way it currently works is more like the first. You can think of the entire Haxl run (the program) as an AST that is handed to the executor. The executor expands as much of the AST as possible (anything that's not IO), and wherever it needs IO it enqueues those requests to be scheduled. Once it has explored as much as it can, it aggressively schedules the IO (deduping, batching, and overlapping the calls). Once everything comes back, it unblocks the AST where it can and repeats the process.
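
To make the round-based idea concrete, here is a rough sketch in Python. It is not Haxl's code (Haxl is a Haskell library), just the shape of the loop described above under some assumptions: each computation is modeled as a generator that yields the keys it needs and is resumed with their values, and `run_rounds`, `friends_of_friends`, `fake_batch_fetch` and the `FRIENDS` table are all made-up names for illustration. Carrying a cache across rounds is also my assumption, and overlapping of calls is not modeled.

```python
# Hypothetical sketch of round-based scheduling; not Haxl's implementation.

def run_rounds(tasks, batch_fetch):
    """Run generator tasks in rounds. Returns (results, number_of_fetches)."""
    results = [None] * len(tasks)
    cache = {}    # values fetched so far (assumption: cached across rounds)
    blocked = {}  # task index -> keys that task is waiting on

    # Explore: run every task until it finishes or blocks on its first request.
    for i, task in enumerate(tasks):
        try:
            blocked[i] = next(task)
        except StopIteration as done:
            results[i] = done.value

    fetches = 0
    while blocked:
        # Dedupe: one set of keys across all blocked tasks, minus cached ones.
        wanted = {k for keys in blocked.values() for k in keys if k not in cache}
        if wanted:
            # Batch: a single call serves every task blocked in this round.
            cache.update(batch_fetch(list(wanted)))
            fetches += 1

        # Unblock: resume each task with its values, then collect new requests.
        still_blocked = {}
        for i, keys in blocked.items():
            try:
                still_blocked[i] = tasks[i].send({k: cache[k] for k in keys})
            except StopIteration as done:
                results[i] = done.value
        blocked = still_blocked

    return results, fetches


# Made-up data source standing in for the network.
FRIENDS = {"alice": ["bob", "carol"], "bob": ["carol"], "carol": []}
calls = []  # records each batched request, for inspection


def fake_batch_fetch(keys):
    calls.append(sorted(keys))
    return {k: FRIENDS[k] for k in keys}


def friends_of_friends(user):
    """Two dependent reads: the user's friends, then each friend's friends."""
    friends = (yield [user])[user]
    their_friends = yield friends
    return sorted({f for friend in friends for f in their_friends[friend]})


if __name__ == "__main__":
    tasks = [friends_of_friends("alice"), friends_of_friends("bob")]
    print(run_rounds(tasks, fake_batch_fetch))
    print(calls)
```

In this toy run the two computations need four reads between them if executed naively one after the other, but the scheduler issues two batched calls: one for `alice` and `bob`, then one for `carol` only, since `bob` is already known. The client code (`friends_of_friends`) never mentions batching or concurrency, which is the property that lets the scheduling strategy be swapped out later.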

This isn't necessarily the optimal scheduling (as you point out, unblocking each part of the tree as each result comes in might be better). The design was specifically meant to make it easy to play with this kind of stuff later. Since the concurrency is entirely implicit, the scheduling implementation is entirely hidden from client code.

Have a look at the SQLTap service written by the guys from DaWanda.com (https://github.com/paulasmuth/sqltap). It does basically exactly that for SQL queries but is implemented as a standalone Java/Scala SQL proxy server.