Hacker News new | ask | show | jobs
by ngrilly 3534 days ago
> Yes, indexing is actually a significant challenge at our scale, as it can take a long time to add and currently, it is a blocking operation.

What do you do during index creation: Do you drop write requests on the floor? Do you store them in a persistent queue somewhere? Or do you remove a node from the cluster, create the index, re-add the node to the cluster, and repeat this on every other nodes?

I'm asking because at some point I thought of using SQLite, embedded in a process containing the business logic and the network layer, but I abandoned this idea specifically because of blocking operations like index creation, and decided to use PostgreSQL instead.

> the sqlite team is amazing and has all sorts of write-concurrency tricks up their sleeve to allow for creating indexes in a parallel thread

I'd be very interested in any link related to this new feature of SQLite.

> However, we have 6 of them (and honestly, any one of which has enough read capacity to satisfy about all our traffic) so a minute of downtime to upgrade for each independently isn't a problem.

So, basically, you're doing a rolling upgrade? Have you evaluated compiling your stored procedures to dynamically loaded libraries, or using an interpreted language like Lua/Python/JavaScript?

1 comments

> What do you do during index creation: Do you drop write requests on the floor? Do you store them in a persistent queue somewhere? Or do you remove a node from the cluster, create the index, re-add the node to the cluster, and repeat this on every other nodes?

The last of those -- for small indexes, we just replicate them out like normal queries. For large indexes, we take that node down and add offline.

> So, basically, you're doing a rolling upgrade? Have you evaluated compiling your stored procedures to dynamically loaded libraries, or using an interpreted language like Lua/Python/JavaScript?

Correct, rolling upgrades. It's worked well to date, but the idea of putting the plugin into a dynamically loaded library is really interesting. Our plugin system is relatively new (only the past few months) so it hasn't really been considered. Great idea!

Great! Thanks for following-up.

If you have any link about "SQLite write-concurrency tricks to allow for creating indexes in a parallel thread", please share ;-)