Hacker News new | ask | show | jobs
by quinthar 3534 days ago
Great questions! Yes, indexing is actually a significant challenge at our scale, as it can take a long time to add and currently, it is a blocking operation. However, the sqlite team is amazing and has all sorts of write-concurrency tricks up their sleeve to allow for creating indexes in a parallel thread. We haven't used it yet, but we plan to.

As for the stored procedures, yes they are written in C++, and thus deploying them requires compiling and upgrading the server itself. However, we have 6 of them (and honestly, any one of which has enough read capacity to satisfy about all our traffic) so a minute of downtime to upgrade for each independently isn't a problem.

The result is zero downtime as perceived by the user, even though each server has occasional downtime for maintenance and upgrades. In general I'd say we upgrade the database about weekly, or more frequent depending on how active we are in the stored procedures.

1 comments

> Yes, indexing is actually a significant challenge at our scale, as it can take a long time to add and currently, it is a blocking operation.

What do you do during index creation: Do you drop write requests on the floor? Do you store them in a persistent queue somewhere? Or do you remove a node from the cluster, create the index, re-add the node to the cluster, and repeat this on every other nodes?

I'm asking because at some point I thought of using SQLite, embedded in a process containing the business logic and the network layer, but I abandoned this idea specifically because of blocking operations like index creation, and decided to use PostgreSQL instead.

> the sqlite team is amazing and has all sorts of write-concurrency tricks up their sleeve to allow for creating indexes in a parallel thread

I'd be very interested in any link related to this new feature of SQLite.

> However, we have 6 of them (and honestly, any one of which has enough read capacity to satisfy about all our traffic) so a minute of downtime to upgrade for each independently isn't a problem.

So, basically, you're doing a rolling upgrade? Have you evaluated compiling your stored procedures to dynamically loaded libraries, or using an interpreted language like Lua/Python/JavaScript?

> What do you do during index creation: Do you drop write requests on the floor? Do you store them in a persistent queue somewhere? Or do you remove a node from the cluster, create the index, re-add the node to the cluster, and repeat this on every other nodes?

The last of those -- for small indexes, we just replicate them out like normal queries. For large indexes, we take that node down and add offline.

> So, basically, you're doing a rolling upgrade? Have you evaluated compiling your stored procedures to dynamically loaded libraries, or using an interpreted language like Lua/Python/JavaScript?

Correct, rolling upgrades. It's worked well to date, but the idea of putting the plugin into a dynamically loaded library is really interesting. Our plugin system is relatively new (only the past few months) so it hasn't really been considered. Great idea!

Great! Thanks for following-up.

If you have any link about "SQLite write-concurrency tricks to allow for creating indexes in a parallel thread", please share ;-)