| > Yes, indexing is actually a significant challenge at our scale, as it can take a long time to add and currently, it is a blocking operation. What do you do during index creation: Do you drop write requests on the floor? Do you store them in a persistent queue somewhere? Or do you remove a node from the cluster, create the index, re-add the node to the cluster, and repeat this on every other nodes? I'm asking because at some point I thought of using SQLite, embedded in a process containing the business logic and the network layer, but I abandoned this idea specifically because of blocking operations like index creation, and decided to use PostgreSQL instead. > the sqlite team is amazing and has all sorts of write-concurrency tricks up their sleeve to allow for creating indexes in a parallel thread I'd be very interested in any link related to this new feature of SQLite. > However, we have 6 of them (and honestly, any one of which has enough read capacity to satisfy about all our traffic) so a minute of downtime to upgrade for each independently isn't a problem. So, basically, you're doing a rolling upgrade? Have you evaluated compiling your stored procedures to dynamically loaded libraries, or using an interpreted language like Lua/Python/JavaScript? |
The last of those -- for small indexes, we just replicate them out like normal queries. For large indexes, we take that node down and add offline.
> So, basically, you're doing a rolling upgrade? Have you evaluated compiling your stored procedures to dynamically loaded libraries, or using an interpreted language like Lua/Python/JavaScript?
Correct, rolling upgrades. It's worked well to date, but the idea of putting the plugin into a dynamically loaded library is really interesting. Our plugin system is relatively new (only the past few months) so it hasn't really been considered. Great idea!