by the_duke 1987 days ago
The juicy part of the paper:

> The main idea of an architecture-less database system is that it is composed of a single generic type of component where multiple instances of those components “act together” in an optimal manner on a per-query basis. To instrument generic components at run-time and coordinate the overall DBMS execution, each component consumes two streams: an event and a data stream. While the event stream encodes the operations to be executed, the data stream shuffles the state required by these events to the executing component. Through this instrumentation of generic components by event and data streams, a component can act as a query optimizer at one moment for one query but for the next as a worker executing a filter or join operator.
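
To make the quoted idea concrete, here is a minimal toy sketch of one generic component that takes its role from the event stream and its inputs from the data stream. Everything in it (the `Event` and `Component` names, the `plan` and `filter` operations, the queue-based streams) is my own assumption for illustration, not the paper's design or API.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Event:
    """One entry of the event stream: which operation to run, with its arguments."""
    op: str  # hypothetical operation name, e.g. "plan" or "filter"
    args: Dict[str, Any] = field(default_factory=dict)


class Component:
    """A generic component: it has no fixed role, each event selects its behavior."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.events: deque = deque()  # event stream: operations to execute
        self.data: deque = deque()    # data stream: state those operations need
        self.handlers = {"plan": self._plan, "filter": self._filter}

    def step(self) -> Any:
        """Consume one event plus its matching state and run the operation."""
        event = self.events.popleft()
        state = self.data.popleft()
        return self.handlers[event.op](state, **event.args)

    @staticmethod
    def _plan(schema, column, threshold):
        # Toy "optimizer": checks the column exists, then emits the next event.
        if column not in schema:
            raise KeyError(column)
        return Event("filter", {"column": column, "threshold": threshold})

    @staticmethod
    def _filter(rows, column, threshold):
        # Toy "worker": keeps the rows whose column value exceeds the threshold.
        return [row for row in rows if row[column] > threshold]


node = Component("node-0")
rows = [{"id": 1, "price": 5}, {"id": 2, "price": 50}]

# First the component acts as an optimizer ...
node.events.append(Event("plan", {"column": "price", "threshold": 10}))
node.data.append(["id", "price"])
next_event = node.step()

# ... then the same instance acts as a worker running the filter it planned.
node.events.append(next_event)
node.data.append(rows)
result = node.step()
```

The dictionary dispatch inside `step` is also where the abstraction cost shows up: every operation pays for a lookup and a queue pop before doing any real work.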

Doesn't sound too different from current distributed DBMSs, which already specialize query execution and distribute the workload across cores/nodes in a similar manner. This takes it to the next level, making things like different data persistence models or FPGA integration easier to enable.

Seems challenging to implement without losing significant performance to the abstraction layer.

2 comments

> Seems challenging to implement without losing significant performance to the abstraction layer.

Agreed. The tradeoff seems worthwhile only when you need to optimize for throughput and much of the workload is ad hoc. A lot of distributed DBMSs just serve point queries or range scans, where I don't think something like this would be useful.

Troubleshooting production issues here seems challenging too (though that's always the case in distributed systems, so this would be less of a problem).

Still interesting, but it would be more interesting to see the idea get some trial by fire.

It sounds like a great way to have simple queries create tons of CPU cache misses.