Hacker News new | ask | show | jobs
by alanvillalobos 1034 days ago
Git based database? Sounds super interesting. Are you guys actively looking for something or is it a nice to have?
2 comments

Many moons and a few jobs ago, I rolled an in-house Wiki in python, that was basically a very thin web UI over mercurial. Each page was a checked-in plain text file (I think some sort of markdown flavor, but this was circa 2010, a totally different era!), each edit was a revision/commit. Things like article history just shelled out to mercurial. "Querying" was down via filesystem operations. Rolled the whole thing in a day and a half or so - less than I'd already spent trying to get a MediaWiki instance stood up!
More like a git-based log than database. We use it for requests and questions that come up on our slack. We figured out how to extract threads from slack, transform them into CSV files and then load them into Airtable.

Airtable has a great UI but the data becomes isolated from our ETL script and it means we have to re-download and export to CSV to use in other contexts.

It'd be great if we could load a CSV into a desktop editor, and then have it store certain views/linked records to other CSV files in a local "db.json" file.

If you want a proper relational database, sqlite would suit better. There are GUI tools to work with sqlite database files and sqlite has csv import and export. But that won't help much if you already have a large CSV you just need to edit a few rows on.

As for versioning sqlite data, just don't ever update or delete any rows but instead only ever insert the updates as a new row. Potentially you could add a 'deleted' boolean and 'inserted' date columns to the tables you want to version. That way you can use '... and deleted = false' to filter out old data and you also know when updates occurred.

The problem with SQLite is that you have binary blobs and you don't have useful text diffs. So you lose a lot easy history reviewing tools based on git and also easy merge/pull reviews.

The closest to a best of both worlds is to use something like JSON to store the data in git and then a tool like Datasette to build an SQLite-powered view on top of that repo.