by scottlamb 960 days ago
> 12 billion queries to process 5 million orders?

2,400 queries per order? That's not that crazy IMHO. There might be significant database fan-out on each click (depending on how they do geographic lookups, search ranking / synonyms / sponsored stuff, the "repeat your last order" features, whether the ranked search returns the full object or a reference that then has to be individually queried, etc.). There might be many clicks per order because people browse a lot (first to find a restaurant, then to find dishes within the restaurant), leave reviews, poll for delivery status updates, etc.
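
For anyone checking the arithmetic, here is a rough sketch. The two totals are the figures quoted above; the split into clicks per order and queries per click is purely hypothetical, picked only to show how modest per-click fan-out multiplies up to that number.

```python
# Figures quoted in the thread.
total_queries = 12_000_000_000
total_orders = 5_000_000

queries_per_order = total_queries // total_orders
print(queries_per_order)  # 2400

# Hypothetical decomposition (assumed numbers, not from the source):
# browsing, status polling, etc. counted as "clicks", each fanning out
# into several database queries.
clicks_per_order = 60
queries_per_click = 40
print(clicks_per_order * queries_per_click)  # 2400
```

Any other pair with the same product works just as well; the point is only that neither factor has to be extreme.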


That's fair, but it also suggests most actions hit the main database directly instead of a caching layer. Possible, but somewhat unusual at this scale.

In quorum systems like CockroachDB, non-leaders provide tons of extra capacity for eventually consistent reads. [edit: maybe a bit less so in a big database because at any instant one machine should be a leader for some shards and non-leader replica for others.] It's not always worth the complexity of having a high-hit-rate cache in front of that. Maybe no cache is needed, or just one to mitigate the worst of the hot spots.

> 2,400 queries per order? That's not that crazy IMHO.

Isn't that off by at least an order of magnitude though? It forces them to operate a much larger cluster than should be necessary.

> Isn't that off by at least an order of magnitude though?

No, for all the reasons I just said?

> It forces them to operate a much larger cluster than should be necessary.

How much machine cost and operational effort do you imagine they would save if they reduced the qps by a factor of 10 without changing the number of regions, number of tables, or size of the data? How much SWE time do you imagine that'd take to do and maintain?

I've run a global Paxos-based database that received two orders of magnitude more qps than this. It cost less than you're probably imagining. I sometimes hunted down silly queries, but mostly leader ops, and mostly to mitigate hot spots or as a quixotic latency reduction effort. Overall, this was the cheapest layer of the system.

A query to a well-implemented OLTP database is not like a request to some Python/PHP/Ruby app.