by rch 2044 days ago
This could be of interest to you:

Ville Tuulos - How to Build a SQL-based Data Warehouse for 100+ Billion Rows in Python

PyData SV 2014 - In this talk, we show how and why AdRoll built a custom, high-performance data warehouse in Python which can handle hundreds of billions of data points with sub-minute latency on a small cluster of servers. This feat is made possible by a non-trivial combination of compressed data structures, meta-programming, and just-in-time compilation using Numba, a compiler for numerical Python. To enable smooth interoperability with existing tools, the system provides a standard SQL interface using Multicorn and Foreign Data Wrappers in PostgreSQL.

https://www.youtube.com/watch?v=xnfnv6WT1Ng