by timwis 1693 days ago
Interesting. Would this be effective at loading a remote CSV file with a million rows, then running basic GROUP BY counts on it so I can render bar charts?

I’ve been thinking of using absurd-sql for it since I saw https://news.ycombinator.com/item?id=28156831 last week.

2 comments

I contribute to https://perspective.finos.org/, which supports all of this and quite a lot more.

Here's a 1,000,000-row example I just threw together for you:

https://bl.ocks.org/texodus/3802a8671fa77399c7842fd0deffe925

And here's a CSV example where you can try your own file right now:

https://bl.ocks.org/texodus/02d8fd10aef21b19d6165cf92e43e668

It depends.

Querying CSV files is particularly painful over the network, since a full scan still means reading the entire file, even when the query only touches one column.

With Parquet, you would at least only have to read the columns used as GROUP BY keys and aggregate arguments.
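
To make the full-scan point concrete, here is a minimal sketch in plain Python (standard library only, not the API of any tool mentioned in this thread). The function name `group_by_count` and the sample data are made up for illustration. Every row has to be parsed even though only one column is needed for the count:

```python
import csv
import io
from collections import Counter

def group_by_count(lines, key_column):
    """Count rows per distinct value of key_column, streaming one row at a time.

    `lines` is any iterable of text lines (an open file, a decoded HTTP
    response, ...). Memory stays proportional to the number of distinct keys,
    but every byte of the CSV still has to be read.
    """
    counts = Counter()
    for row in csv.DictReader(lines):
        counts[row[key_column]] += 1
    return counts

# Hypothetical sample data standing in for the remote CSV file.
sample = io.StringIO("city,amount\nOslo,3\nLima,5\nOslo,7\n")
counts = group_by_count(sample, "city")

# (key, count) pairs sorted by count, ready to feed into a bar chart.
print(counts.most_common())
```

With a columnar format, a reader could in principle fetch only the `city` column and skip `amount` entirely, which is where the savings over the network would come from.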

Try it out and share your experiences with us!