by koeng 2137 days ago
Yep! The data is for the most part append-only, with massive writes once a month (that's how often the genetic repositories update their data dumps) with a few updates scattered in when annotations change.

What exactly do you mean by splitting it into several databases? It seems to me that would make backup, replication, and so on more difficult, since now I'd have to manage multiple databases. But I don't have experience there, so I'd love to hear if there are easy ways to do that.

1 comment

If you are doing this for analytics, a setup with one database is OK. But imagine that you are running something in production with SQLite, and the database is really big. Maintenance gets hard: running VACUUM, creating indexes, and so on. In that case it's great to shard it, even if that just means several files on one machine (assuming, of course, that your data can be sharded, e.g. different users' data can be stored in different DBs).
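
To make the "several files on one machine" idea concrete, here is a rough sketch in Python using the standard sqlite3 module. Everything specific in it is an assumption for illustration, not something from the setup discussed above: the fixed shard count, the shard_N.db file names, the records table, and routing by a CRC32 of a user id. It only shows the shape of the approach: each key maps to one file, and maintenance such as VACUUM can then run on one small file at a time.

```python
import os
import sqlite3
import zlib

NUM_SHARDS = 4  # assumption: shard count fixed up front; changing it later means moving data


def shard_for(user_id):
    # crc32 is stable across processes, unlike Python's built-in hash() on str
    return zlib.crc32(user_id.encode("utf-8")) % NUM_SHARDS


def shard_path(base_dir, user_id):
    # hypothetical naming scheme: one SQLite file per shard
    return os.path.join(base_dir, f"shard_{shard_for(user_id)}.db")


def connect_for_user(base_dir, user_id):
    conn = sqlite3.connect(shard_path(base_dir, user_id))
    # hypothetical schema, created lazily in whichever shard is touched
    conn.execute(
        "CREATE TABLE IF NOT EXISTS records ("
        "user_id TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def add_record(base_dir, user_id, payload):
    conn = connect_for_user(base_dir, user_id)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO records (user_id, payload) VALUES (?, ?)",
                (user_id, payload),
            )
    finally:
        conn.close()


def get_records(base_dir, user_id):
    conn = connect_for_user(base_dir, user_id)
    try:
        rows = conn.execute(
            "SELECT payload FROM records WHERE user_id = ? ORDER BY rowid",
            (user_id,),
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


def vacuum_shard(base_dir, shard):
    # VACUUM cannot run inside a transaction, so use autocommit mode
    path = os.path.join(base_dir, f"shard_{shard}.db")
    conn = sqlite3.connect(path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()
```

The trade-off raised in the parent comment still applies: backups and replication now have to cover every shard file, so this mostly pays off when per-file maintenance time is the real bottleneck.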