| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by ngrilly 3534 days ago

> Yes, indexing is actually a significant challenge at our scale, as it can take a long time to add and currently, it is a blocking operation.

What do you do during index creation: Do you drop write requests on the floor? Do you store them in a persistent queue somewhere? Or do you remove a node from the cluster, create the index, re-add the node to the cluster, and repeat this on every other nodes?

I'm asking because at some point I thought of using SQLite, embedded in a process containing the business logic and the network layer, but I abandoned this idea specifically because of blocking operations like index creation, and decided to use PostgreSQL instead.

> the sqlite team is amazing and has all sorts of write-concurrency tricks up their sleeve to allow for creating indexes in a parallel thread

I'd be very interested in any link related to this new feature of SQLite.

> However, we have 6 of them (and honestly, any one of which has enough read capacity to satisfy about all our traffic) so a minute of downtime to upgrade for each independently isn't a problem.

So, basically, you're doing a rolling upgrade? Have you evaluated compiling your stored procedures to dynamically loaded libraries, or using an interpreted language like Lua/Python/JavaScript?

1 comments

quinthar 3534 days ago

> What do you do during index creation: Do you drop write requests on the floor? Do you store them in a persistent queue somewhere? Or do you remove a node from the cluster, create the index, re-add the node to the cluster, and repeat this on every other nodes?

The last of those -- for small indexes, we just replicate them out like normal queries. For large indexes, we take that node down and add offline.

> So, basically, you're doing a rolling upgrade? Have you evaluated compiling your stored procedures to dynamically loaded libraries, or using an interpreted language like Lua/Python/JavaScript?

Correct, rolling upgrades. It's worked well to date, but the idea of putting the plugin into a dynamically loaded library is really interesting. Our plugin system is relatively new (only the past few months) so it hasn't really been considered. Great idea!

ngrilly 3534 days ago

Great! Thanks for following-up.

If you have any link about "SQLite write-concurrency tricks to allow for creating indexes in a parallel thread", please share ;-)