by lmilcin 1772 days ago
1M requests per minute on 1000 web instances is not an achievement, it is a disaster.

It is ridiculous people brag about it.

Guys, if you have the budget, maybe I can help you up this by a couple of orders of magnitude.

3 comments

Knowing nothing else, it's hard to know if this is good or not. It's about 16 requests per second per instance. Are those requests something like "Render a support article" or are they "Give the user a ranked feed of what they should see on their home screen"? Is most of the logic run by the web server or some combination of app servers / backend services behind it? What kind of hardware does the web server have?

All of those would affect the answer, and would preclude guaranteeing "up this by couple orders of magnitude".
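For reference, the per-instance figure in this comment follows directly from the numbers in the top post. A minimal sketch, assuming the load is spread evenly across instances (the function name is only illustrative):

```python
# Figures quoted in the thread: 1M requests per minute across 1000 web instances.
REQUESTS_PER_MINUTE = 1_000_000
INSTANCES = 1_000


def per_instance_rps(requests_per_minute: int, instances: int) -> float:
    """Average requests per second handled by one instance (even spread assumed)."""
    return requests_per_minute / instances / 60


rate = per_instance_rps(REQUESTS_PER_MINUTE, INSTANCES)
print(f"{rate:.1f} requests/second per instance")  # 16.7
```

This is an average only; it says nothing about peak load or how expensive each request is, which is the point the comment goes on to make.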

Well, to give you an idea, I am working on a service that implements a rather complex business process involving fetching data from multiple sources, parsing binary blobs with market data in a proprietary format, saving results to a database and so on. And it does around 10k requests per second on a single relatively normal node (8 cores, 64 GB RAM, etc.)

And no, it does not require any special tricks. It is regular Java / WebFlux / REST / MongoDB backend service.

CPUs can really do a lot, and if your node processes 16 requests per second on a multi-core machine, then you are using on the order of a billion clock cycles and gigabytes of possible transfer to memory for a single request. Something is not quite right...
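The cycle budget claimed here can be checked with rough arithmetic. A sketch only: the thread does not say what hardware a Clubhouse instance runs on, so the 8 cores and 2.4 GHz below are assumptions, and the calculation ignores instructions per cycle, memory stalls and idle time.

```python
def cycles_per_request(cores: int, clock_hz: float, requests_per_second: float) -> float:
    """Clock cycles available per request if the whole machine serves that rate."""
    return cores * clock_hz / requests_per_second


# Assumed hardware (not stated in the thread): 8 cores at 2.4 GHz.
# Request rate: 1M requests/minute over 1000 instances, as in the top post.
rps = 1_000_000 / 1_000 / 60
budget = cycles_per_request(cores=8, clock_hz=2.4e9, requests_per_second=rps)
print(f"{budget:.2e} cycles per request")  # 1.15e+09
```

If an instance is really a single core or a small container, the budget shrinks proportionally, so the conclusion depends heavily on what "instance" means.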

As a somewhat imprecise example, if a single "request" requires sorting 2.4 billion integers, then a 2.4 GHz CPU with 16 cores will be able to process at most 16 RPS no matter how much you switch from JavaScript to Java or if you write assembly.
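The bound in this example can be written out explicitly. A rough sketch, assuming an optimistic cost of one clock cycle per integer; the second figure assumes roughly log2(n) cycles per element, as a comparison sort would need at least, and is only there to show the bound gets lower, not higher.

```python
import math


def max_rps(cores: int, clock_hz: float, elements: int,
            cycles_per_element: float = 1.0) -> float:
    """Upper bound on requests/second when each request touches `elements` items."""
    cycles_available = cores * clock_hz          # cycles per second, whole machine
    cycles_needed = elements * cycles_per_element  # cycles per request
    return cycles_available / cycles_needed


N = 2_400_000_000  # integers to sort per request, from the comment

optimistic = max_rps(16, 2.4e9, N)  # one cycle per integer (assumption)
comparison_sort = max_rps(16, 2.4e9, N, cycles_per_element=math.log2(N))

print(optimistic)  # 16.0
print(f"{comparison_sort:.2f}")
```

Either way the limit comes from the work per request, not from the language or framework, which is the point of the example.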

At the end of the day efficiency is ultimately a business problem and not a technical problem and is rarely the thing that tips a project (Clubhouse in the article) from being profitable to being unprofitable. It's usually an investing question - I have X engineer-months to spend. I can cut costs by Y by optimizing stuff or get Z more profit by building a feature. I will choose to optimize stuff if and only if Y>Z as it returns more.

Clubhouse's major costs are probably bandwidth and engineer time rather than servers. That is to say, even if compute efficiency were infinite (i.e. server costs magically went to zero) it would probably not change Clubhouse's business proposition that much.

More to the point, I think you are uncharitable at best when you say elsewhere that other frameworks and languages won't require more development work. These frameworks (and the choice of language being implicit in that) are specifically designed to reduce development work. Let's examine for example garbage collection. Garbage collection is undeniably more wasteful than other solutions to memory management, absolutely. But would you really argue that garbage collection does nothing to reduce development time? I find that extremely hard to believe, empirically and subjectively having written programs in many environments including bare metal, reference counted or otherwise semi-managed and garbage collected languages. And so it goes with all of the choices these frameworks like Django and Rails take. And it's getting better with time as things like JRuby are developed, inefficiencies in Rails or Django are removed, etc.

This is a comically bad yet incredibly common engineering take. When you run a company there is only one question to answer, one north star: does it make money?

To be honest the article does realize this, first blaming it on the poor hindsight from original developer (co-founder) and in the conclusion about maybe rewriting the whole thing.

It seemed to be all about how to extract the most performance from the lemon they had to deal with.

I found the linked reference really informative too: https://rachelbythebay.com/w/2020/03/07/costly/

I don't know Python or how complex their domain is but the number of workers suggests to me it is not that complex and their application spends most of its time switching contexts and in inefficient frameworks.

In my experience, most applications that mostly serve documents from databases should be able to take on at least 10k requests per second on a single node. This is 600k requests per minute on one node, compared to their 1M per minute across 1000 nodes.
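The comparison in this comment can be written out as a short calculation. A sketch that takes the commenter's 10k requests per second per node at face value and ignores redundancy, peak load and headroom:

```python
import math


def requests_per_minute(per_node_rps: int) -> int:
    """Convert a per-node requests/second figure to requests/minute."""
    return per_node_rps * 60


def nodes_needed(target_rpm: int, per_node_rps: int) -> int:
    """Minimum whole nodes needed to serve `target_rpm` at the given per-node rate."""
    return math.ceil(target_rpm / requests_per_minute(per_node_rps))


print(requests_per_minute(10_000))      # 600000
print(nodes_needed(1_000_000, 10_000))  # 2
```

The commenter's later estimate of 5-10 servers is higher than this bare minimum, which leaves room for the headroom and redundancy that this sketch ignores.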

This is what I typically get from a simple setup with Java, WebFlux and MongoDB, with a little bit of experience of what stupid things not to do but without spending much time fine-tuning anything.

I think bragging about performance improvements when your design and architecture are already completely broken is at the very least embarrassing.

> poor hindsight from original developer (co-founder)

Well, you have a choice of technologies to write your application in, why choose one that sucks so much when there are so many others that suck less?

It is not a poor choice, it is a lack of competency.

You are a co-founder and want your product to succeed? Don't do stupid shit like choosing a stack that already makes reaching your goal very hard.

(CH employee here)

The job of the cofounder is to create a thing that people want, which has nothing to do with performance. The first goal is capturing lightning in a bottle with social products. Performance doesn’t matter until the lightning is there, and 99%+ of the time you never have to worry about performance, because you don’t get the lightning. So, probably the correct choice is leveraging the tech stack that gives you the best shot at capturing the lightning. Django seemed to help!

Don’t sweat it buddy. People here just want to stand on your toes and feel taller. Classic HN.

Velocity of development is priority #1 and having something that needs to be scaled is a monumental achievement.

Plus, if he could've predicted the pandemic that far in advance there would probably have been plenty of non-Clubhouse ways to monetise that prescience ;)

This is just a silly excuse.

The job of the cofounder is also to anticipate possible risks.

And building your company on an astronomically inefficient technology sounds like a huge risk to me.

Those 1000s of servers are probably a very significant cost with such a small technical staff. Just by choosing the right technology for the problem, most of that cost could have been avoided.

Django has nothing special in it that would allow building applications faster than in a lot of other frameworks that are also much more efficient.

So it is just a matter of simple choice.

Nobody expects people to write webapps in C++ or Rust. Just don't choose a technology that is famous for being inefficient.

Python is not astronomically inefficient. Instagram serves like a billion users with it. The job of a cofounder is to build what people want. You can always scale in Silicon Valley by hiring people like you. You can’t build another viral app like Clubhouse by hiring from the same crowd.

This may hurt you but the truth is scaling and software engineering is highly commoditised. That’s the whole point of being in the valley. You can hire people for such things and forget about it.

Clubhouse is not a tech company. They don’t have to care about being the best at infra.

> Python is not astronomically inefficient.

Well, it is. It is a fact.

https://rachelbythebay.com/w/2020/03/07/costly/

> Clubhouse is not a tech company.

When you spin up 1000s of nodes you need some tech competency.

Or in other words, if it blew up one day and there were a link to a writeup on HN, people would be asking "They had 1000s of servers and nobody competent to maintain it?"

You sound just like the average sports fan commenting after a match about what player X should or shouldn't have done, blaming it on decisions, the style of the trainer, the owner, etc. But you're just that: a fan yapping about how they could do better.

So Django is an "astronomically inefficient technology"? I would just stop if I were you.

So do you think using Django is stupid? I guess you think the same about every product that uses Ruby on Rails?

No, Django is not stupid.

What is stupid is the decision to choose it to run a load that will require 1000s of servers when it could be handled with 5-10 servers in another technology without more development effort.

I doubt they expected that level of request load that early on - I imagine the technology choice was made significantly before the whole pandemic thing started.

You are right if technology is what drives the business. But in most cases it is not.

CPU is much cheaper for scaling a business.