by philbe77 237 days ago
Hi shinypenguin - the dataset and challenge are detailed here: https://github.com/coiled/1trc

The data is in a publicly accessible bucket, but the requester is responsible for any egress fees...

I suggest linking to that from the article, it is a useful clarification.
Good point - I'll update it...
Hi, thank you for the link and quick response! :)

Do you know if anyone has attempted to run this on the least hardware possible while keeping processing times reasonable?

Yes - I also had GizmoSQL (a single-node DuckDB database engine) take the challenge - with very good performance (2 minutes for $0.10 in cloud compute cost): https://gizmodata.com/blog/gizmosql-one-trillion-row-challen...