by rch
2044 days ago
This could be of interest to you: Ville Tuulos, "How to Build a SQL-based Data Warehouse for 100+ Billion Rows in Python" (PyData SV 2014):
In this talk, we show how and why AdRoll built a custom, high-performance data warehouse in Python which can handle hundreds of billions of data points with sub-minute latency on a small cluster of servers. This feat is made possible by a non-trivial combination of compressed data structures, meta-programming, and just-in-time compilation using Numba, a compiler for numerical Python. To enable smooth interoperability with existing tools, the system provides a standard SQL-interface using Multicorn and Foreign Data Wrappers in PostgreSQL. https://www.youtube.com/watch?v=xnfnv6WT1Ng