| HN Mirror



	by pjot 906 days ago
	Apache Arrow and Substrait have been working towards making this a reality. I see a future where executing a query can/will send plans to many different engines distributed across the cloud, but also locally on your own machine.

2 comments

falcor84 906 days ago

Real-time Bidding on query execution? The more I think about it, I believe you actually have a viable business model here.


FridgeSeal 906 days ago

That’s a wildly interesting idea.

It opens up another market too: compatible, scalable storage. Sell shovels in a gold rush, and what better shovel than the substrate infrastructure that those bidding query engines would probably depend on?


yodon 906 days ago

If the queries can be executed by any provider, you are talking about a commodity product.

The business model of selling a commodity is wildly unlike the business model tech is in today.


FridgeSeal 905 days ago

The query execution might be commodity, but the purchasers will still need to store their data somewhere, and this somewhere will need to be able to service the bandwidth and requirements of the query execution providers.


fifilura 905 days ago

It feels like you could just as well pack the runtime/engine into the job you are requesting? Am I wrong?


pjot 905 days ago

The point is more about creating interoperability between systems, which in turn makes them composable.

When there’s a common intermediate representation, you can pass those compute instructions around and execute them wherever you like. And when there are shared memory formats, data can pass from storage to engine without serialization/deserialization.

So it wouldn’t matter if data is here or there, in this or that format. Because the instructions are the same, the specific interface (Snowflake, MySQL, a local Parquet file, etc.) is irrelevant, which reduces the need for glue code.
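
To make the idea concrete, here is a toy sketch in plain Python. It is not Substrait or Arrow: the `Plan` class, the `orders` table, and both `run_in_*` functions are made up for illustration. It only shows the shape of the argument, which is that one engine-neutral plan can be handed to two unrelated engines and give the same answer.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical, engine-neutral plan: sum `column` over rows where column >= min_value."""
    table: str
    column: str
    min_value: int


# Made-up sample data, keyed by logical table name.
TABLES = {
    "orders": [
        {"id": 1, "amount": 5},
        {"id": 2, "amount": 20},
        {"id": 3, "amount": 30},
    ]
}


def run_in_python(plan, tables):
    """Engine A: interprets the plan directly over in-memory rows."""
    rows = tables[plan.table]
    return sum(r[plan.column] for r in rows if r[plan.column] >= plan.min_value)


def run_in_sqlite(plan, tables):
    """Engine B: translates the same plan to SQL and runs it in SQLite."""
    rows = tables[plan.table]
    cols = list(rows[0])
    col_list = ", ".join(cols)
    placeholders = ", ".join("?" for _ in cols)
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(f'CREATE TABLE "{plan.table}" ({col_list})')
        conn.executemany(
            f'INSERT INTO "{plan.table}" VALUES ({placeholders})',
            [tuple(r[c] for c in cols) for r in rows],
        )
        query = (
            f'SELECT COALESCE(SUM("{plan.column}"), 0) '
            f'FROM "{plan.table}" WHERE "{plan.column}" >= ?'
        )
        (total,) = conn.execute(query, (plan.min_value,)).fetchone()
        return total
    finally:
        conn.close()


PLAN = Plan(table="orders", column="amount", min_value=10)

# The caller only ever builds PLAN; which engine runs it is a detail.
results = {engine.__name__: engine(PLAN, TABLES) for engine in (run_in_python, run_in_sqlite)}
```

Both entries in `results` should agree, since each engine evaluates the same plan over the same rows. A real setup would presumably replace `Plan` with a serialized Substrait plan and the row dicts with Arrow buffers, so the data would not need to be copied into each engine the way this sketch does for SQLite.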
