by simonw 1872 days ago
If you want to try it in SQLite (pros: no need to run a server or install anything since it's in the Python standard library, cons: not nearly as many advanced statistical analysis features as PostgreSQL) my sqlite-utils CLI tool may help here: it can import from CSV/TSV/JSON into a SQLite database: https://sqlite-utils.datasette.io/en/stable/cli.html#inserti...
1 comment

The sqlite3 connection object in Python lets you register callables (via create_function and create_aggregate) that you can then use as scalar or aggregate functions in your SQL queries. That way you can fill some of the gaps compared to PostgreSQL by essentially importing Python libraries. I just found this nice tutorial while looking for relevant docs:

https://wellsr.com/python/create-scalar-and-aggregate-functi...
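
To make the registration step concrete, here is a minimal sketch that uses only the standard library. The SQL names py_log10 and py_median, the Median class, and the samples table are made up for illustration; they are not taken from the tutorial or from sqlite-utils.

```python
import math
import sqlite3
import statistics


class Median:
    """Aggregate: collect the values seen in step(), reduce them in finalize()."""

    def __init__(self):
        self.values = []

    def step(self, value):
        # SQLite calls step() once per row in the group; skip NULLs.
        if value is not None:
            self.values.append(value)

    def finalize(self):
        # Return NULL for an empty group instead of raising.
        return statistics.median(self.values) if self.values else None


conn = sqlite3.connect(":memory:")

# Scalar function: SQL name, number of arguments, Python callable.
conn.create_function("py_log10", 1, math.log10)
# Aggregate function: SQL name, number of arguments, class with step/finalize.
conn.create_aggregate("py_median", 1, Median)

conn.execute("CREATE TABLE samples (x REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?)",
    [(1.0,), (10.0,), (100.0,), (1000.0,)],
)

(median_x,) = conn.execute("SELECT py_median(x) FROM samples").fetchone()
log_rows = [row[0] for row in conn.execute(
    "SELECT py_log10(x) FROM samples ORDER BY x"
)]
```

The same pattern should work for anything importable, e.g. a scipy.stats function in place of statistics.median, as long as the arguments and return value are plain SQLite scalars.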

However, I think the limited type system in SQLite means you would still want to extract more of the data and process it in Python, whether via pandas, numpy, or scipy.stats functions. Rather than introducing new composite types, I think you would be stuck with JSON strings and frequent deserialization/reserialization if you wanted to build up structured results and pass them through layers of user-defined functions.
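
A rough sketch of what that JSON round-tripping looks like, again with only the standard library. The names summarize and summary_range and the table layout are hypothetical; the point is just that the structured result has to cross each function boundary as a string.

```python
import json
import sqlite3
import statistics


class Summarize:
    """Aggregate that returns a structured result serialized as a JSON string."""

    def __init__(self):
        self.values = []

    def step(self, value):
        if value is not None:
            self.values.append(value)

    def finalize(self):
        if not self.values:
            return None
        # No composite type is available, so the dict goes out as text.
        return json.dumps({
            "n": len(self.values),
            "min": min(self.values),
            "max": max(self.values),
            "mean": statistics.fmean(self.values),
        })


def summary_range(summary_json):
    """Second-layer scalar function: deserialize, compute, return a scalar."""
    if summary_json is None:
        return None
    summary = json.loads(summary_json)
    return summary["max"] - summary["min"]


conn = sqlite3.connect(":memory:")
conn.create_aggregate("summarize", 1, Summarize)
conn.create_function("summary_range", 1, summary_range)

conn.execute("CREATE TABLE samples (grp TEXT, x REAL)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [("a", 1.0), ("a", 2.0), ("a", 6.0), ("b", 10.0), ("b", 20.0)],
)

# Each layer pays for a dumps()/loads() pair per group.
ranges = conn.execute(
    "SELECT grp, summary_range(summarize(x)) FROM samples "
    "GROUP BY grp ORDER BY grp"
).fetchall()

(summary_a,) = conn.execute(
    "SELECT summarize(x) FROM samples WHERE grp = 'a'"
).fetchone()
```

Every additional layer repeats the loads/dumps step, which is the overhead I mean above; past a couple of layers it is probably simpler to pull the rows into pandas or numpy and do the work there.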