by aurareturn 5 days ago
Is 100k orders per minute a lot? Even a single Postgres instance should serve that fine?
4 comments

100k(s) orders per minute is several orders of magnitude more than realistic. Amazon does 20k orders per minute.

Instacart doesn't need "100,000s of grocery delivery orders per minute".

There must be some 0s added for the sake of the story.

According to their 2026 Q1 filing they do about 90 million orders per quarter, which is about 12 orders per second, or roughly 720 orders per minute.
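
For anyone who wants to check the arithmetic, here is a rough sketch. It assumes a 90-day quarter and a perfectly even spread of orders, which is only an approximation; the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rates(orders_per_quarter: float, days_in_quarter: int = 90) -> tuple[float, float]:
    """Return (orders per second, orders per minute) averaged over the quarter."""
    seconds = days_in_quarter * SECONDS_PER_DAY  # assumption: 90-day quarter
    per_second = orders_per_quarter / seconds
    return per_second, per_second * 60


# 90 million is the approximate quarterly figure quoted from the filing.
per_second, per_minute = average_rates(90_000_000)
print(f"{per_second:.1f} orders/s, {per_minute:.0f} orders/min")
# prints "11.6 orders/s, 694 orders/min"
```

Rounding up to 12 orders per second first is what gives the 720 per minute figure; either way the average is in the hundreds per minute, not the hundreds of thousands.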

It might make 100k row-level changes per minute, but that’s a different metric.

https://www.sec.gov/Archives/edgar/data/1579091/000157909126...

Instacart has released a public dataset[1] of their orders, so it should be even easier to verify this claim. From what I could find in an analysis[2] of this dataset, around 100k orders per day, not per minute, seems accurate.

I assume they are referring to how many database requests they have due to customers orders or a similar metric and just worded it poorly.

[1] https://www.kaggle.com/datasets/psparks/instacart-market-bas...
[2] https://rstudio-pubs-static.s3.amazonaws.com/284199_5c498037...

This data set was released years before the Covid hypergrowth phase which they are referring to.
That's fair as the Kaggle dataset[1] is from 2017. Even assuming orders scaled with revenue (which grew to $1.5B in 2020[2]), you'd only reach a few hundred orders/minute at the pandemic peak (which lines up with the calculation above via a different method).

So I still assume the original comment isn't referring to actual orders placed.

[1] https://www.kaggle.com/datasets/psparks/instacart-market-bas...
[2] https://fortune.com/2022/05/18/what-to-know-instacart-ipo/

it could be peak orders per second
Going from a 720 average to a 100,000s peak still doesn't sound realistic, especially as they operate in many timezones.
I'd wager on this.
i think this assumes that those orders are distributed evenly over time
And just like that you’ve done more due diligence than the VCs who just threw money at this.
Nope. Completely flawed logic that assumes equal distribution. Dunning Kruger
I thought not assuming uniform distribution was table stakes for senior engineers.

I can't say what the curve looks like, but 100,000 orders per second would reach the official quarterly count in 15 minutes.

Since that's unlikely, this at least gives us some degree of bounds to guess what the curve looks like.
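
A quick sketch of that bound, using the roughly 90 million orders per quarter quoted upthread. The helper name is just for illustration, and the per-minute reading is included because the original claim was stated per minute.

```python
QUARTERLY_ORDERS = 90_000_000  # approximate figure quoted from the filing upthread


def seconds_to_reach(total_orders: int, orders_per_second: float) -> float:
    """Time needed to place total_orders at a constant rate."""
    return total_orders / orders_per_second


# Reading the claim as 100,000 orders per second:
minutes = seconds_to_reach(QUARTERLY_ORDERS, 100_000) / 60
print(minutes)  # 15.0

# Reading the claim as 100,000 orders per minute instead:
hours = seconds_to_reach(QUARTERLY_ORDERS, 100_000 / 60) / 3600
print(round(hours, 1))  # about 15 hours
```

So a sustained peak at either rate can only cover a tiny slice of the quarter, which caps how spiky the curve can be.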

Amazon does 20k peak, or 20k average? Website visitor peaks could easily be two orders of magnitude higher traffic than average for a few minutes.
I worked at a company that had billions of views per year on a single big Postgres instance. Extremely read heavy with many queries needed for a page load. You can cache a lot of things.
Yes, but that's not a shopping cart, or a checkout workflow, nor a web store with heavy analytics.
It was one of the top real estate portals in the world. A lot of geolocation searches. New search every time someone moves the map. A ton of data sent to the client. Analytics in every page view.

No clue how a shopping cart or checkout flow would drastically increase database load. It should just be basic CRUD. A shopping cart is something every student builds. Pages in a web store can be cached relatively easily since items won't change often.

A primary DB with a few replicas and caching can go a really long way.

The composition of the average transaction will be different in a shopping cart (lots of writes and updates) compared to your use case which sounds like it skewed read heavy. With Postgres it’s generally easier to scale reads because it doesn’t really matter which replica the query hits, as long as it contains the data it needs. Whereas write-heavy workloads route through a single-writer bottleneck.

There are challenges scaling read-heavy workloads, for sure, but they’re generally more straightforward than scaling write-heavy workloads. You can get away with more dumb horizontal scaling than with writes.
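
To make the asymmetry concrete, here is a toy sketch of read/write routing. Everything in it (the class name, the host labels, the keyword check) is hypothetical and not how any particular Postgres proxy or driver works; it only shows that reads can fan out while every write lands on the same node.

```python
import itertools


class RoutingSketch:
    """Hypothetical router: writes go to one primary, reads rotate over replicas."""

    def __init__(self, primary: str, replicas: list[str]):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._read_targets = itertools.cycle(replicas or [primary])

    def target_for(self, sql: str) -> str:
        # Naive classification by leading keyword; a real setup also has to
        # deal with transactions, replication lag, and read-your-writes.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._read_targets) if is_read else self.primary


router = RoutingSketch("primary", ["replica-1", "replica-2"])
targets = [router.target_for(q) for q in (
    "SELECT * FROM items",
    "SELECT * FROM carts",
    "UPDATE carts SET qty = 2",
    "INSERT INTO orders VALUES (1)",
)]
print(targets)  # ['replica-1', 'replica-2', 'primary', 'primary']
```

Adding a third replica spreads the SELECTs further, but the UPDATE and INSERT still hit the single primary, which is the bottleneck being described.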

You don't see how adding functionality that requires writing to the database rather than just reading from a cache could "drastically increase database load"?
Scaling (asynchronous) reads is much easier than scaling writes.
That doesn't necessarily mean _new_ orders per minute. Their app or website could poll for updates every 15 seconds

Could just be looking at the "orders" endpoint in their app which might also include incremental updates as shoppers get items from the store. It's a fairly ambiguous statement
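
As a rough illustration of how polling alone could produce a number like that: the 25,000 client count below is a made-up figure chosen to hit 100k, not anything Instacart has published, and the function name is hypothetical.

```python
def polls_per_minute(active_clients: int, poll_interval_seconds: float) -> float:
    """Requests per minute if every active client polls at a fixed interval."""
    return active_clients * 60 / poll_interval_seconds


# Hypothetical: 25,000 clients each polling an "orders" endpoint every 15 seconds.
print(polls_per_minute(25_000, 15))  # 100000.0
```

Under that assumption, 100k "order" requests per minute needs only tens of thousands of open apps, with no new orders being placed at all.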

One assumes they mean 100,000s (plural) of concurrent users actively building carts
Is that still a lot? Feels like a single 64-core, 256GB RDS instance with some caching should handle that fine. RDS has instances up to 192-core and 768GB.
Keep in mind they’re doing real-time logistics and messaging, as well as type-ahead search and managing ads and promotions
I think the real-time logistics is likely the thing that taxes a Postgres database.

Everything else seems like normal DB CRUD that a single beefy instance with a few replicas should handle easily. Type-ahead search is no doubt using a different service and not directly querying Postgres.

It's orders, not queries. Who knows how many requests that actually takes.
Average throughput is one thing, tail latency, quite another.