| HN Mirror

by MR4D 2999 days ago

What you describe would be a super nice utility. Great for prototyping and development, as the code could be copied and pasted from such a tool into application code.

The streaming would make it memory efficient, and possibly able to handle some big data: maybe not true "Big Data", but certainly tens of gigabytes.

Anyone want to take this idea into a GoFundMe site?

2 comments

ams6110 2999 days ago

SQL is set-oriented. How would that work on a potentially indefinite stream, other than as a simple filter, which you could just do with a tool such as awk?

barrkel 2999 days ago

Many relational operations don't require a whole stream to compute, and many of those that do don't need it all at once.

Projection (mapping), filtering, and a join against a fully loaded other side all work on a stream.
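
A minimal sketch of that in Python, assuming a stream of (user_id, amount) rows and a small user table already loaded into a dict. The names `stream_join`, `users` and `min_amount` are made up for illustration; this is not any particular tool's API.

```python
from typing import Dict, Iterable, Iterator, Tuple


def stream_join(
    rows: Iterable[Tuple[int, int]],
    users: Dict[int, str],
    min_amount: int,
) -> Iterator[Tuple[str, int]]:
    """Filter, join and project one row at a time.

    Roughly: SELECT u.name, r.amount FROM rows r JOIN users u
             ON r.user_id = u.id WHERE r.amount >= min_amount
    Only `users` is held in memory; `rows` may be unbounded.
    """
    for user_id, amount in rows:
        if amount < min_amount:      # filter (WHERE)
            continue
        name = users.get(user_id)    # join against the fully loaded side
        if name is None:             # inner join: drop unmatched rows
            continue
        yield (name, amount)         # projection (SELECT list)


users = {1: "ann", 2: "bob"}
rows = iter([(1, 5), (2, 50), (3, 70), (1, 20)])
result = list(stream_join(rows, users, min_amount=10))
```

Each output row depends only on the current input row plus the loaded table, so memory use does not grow with the length of the stream.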

Aggregation can consume an indefinite stream with limited working set if the cardinality of the grouping key isn't large.
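
As a rough illustration (again a sketch with made-up names, not a real engine), a GROUP BY with COUNT and SUM only needs one small accumulator per distinct key, regardless of how many rows go by:

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def stream_group_sum(
    rows: Iterable[Tuple[Hashable, float]],
) -> Dict[Hashable, Tuple[int, float]]:
    """Roughly: SELECT key, COUNT(*), SUM(value) FROM rows GROUP BY key.

    Working set is one [count, total] pair per distinct key, so it stays
    small as long as the cardinality of the grouping key is small.
    """
    totals: Dict[Hashable, List[float]] = {}
    for key, value in rows:
        acc = totals.setdefault(key, [0, 0.0])
        acc[0] += 1        # COUNT(*)
        acc[1] += value    # SUM(value)
    return {k: (int(c), s) for k, (c, s) in totals.items()}


summary = stream_group_sum(iter([("a", 1.0), ("b", 2.0), ("a", 3.0)]))
```

This version returns once the input ends. For a truly indefinite stream you would have to emit running totals periodically or aggregate over windows instead, which is a design choice this sketch leaves out.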

And of course you can combine these in nested and unioned operations, computing across multiple indefinite streams concurrently and with limited working set.

It would be tricky to make work effectively without hinting for things like joins, for sure; join order is one of the hardest bits a query engine optimizes.

luckydata 2999 days ago

I think this might be helpful context:

https://calcite.apache.org/docs/stream.html

orionblastar 2999 days ago

Sort of a good idea: you could play with a small dataset to see how things would work if it were in a SQL database.

I usually have to develop a database in Windows and Access. One more tool to work in Linux is a good idea.

I used to use awk and sed before.
