by stblack 1305 days ago
Definition of scalability: (Change in performance) over (the change in something else).

As math:

  Scalability = δ Performance / δ x
  
... where x can be almost anything.

Take note: performance == "throughput" in most cases.
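To make the definition concrete, here is a minimal sketch that approximates δ Performance / δ x with finite differences between consecutive measurements. The function name `scalability` and the sample numbers are made up for illustration; x is assumed to be concurrent users and performance is assumed to be throughput.

```python
def scalability(samples):
    """Finite-difference slopes (delta performance / delta x).

    samples: list of (x, performance) pairs, sorted by x.
    Returns one slope per pair of consecutive samples.
    """
    slopes = []
    for (x0, p0), (x1, p1) in zip(samples, samples[1:]):
        slopes.append((p1 - p0) / (x1 - x0))
    return slopes


# Hypothetical measurements: (concurrent users, throughput in requests/second)
measurements = [(100, 950), (200, 1800), (400, 3000), (800, 3400)]
slopes = scalability(measurements)
print(slopes)  # [8.5, 6.0, 1.0]
```

A shrinking slope means each additional unit of x buys less throughput than the one before it.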

THEREFORE, when scalability questions arise, you should be specific about which dimension of scalability is in question.

Some dimensions of “scalability”

* Number of customers

* Total number of users

* Number of concurrent users

* Number of locations where a business or application runs

* Size of the database

* Transaction volume

* Output volume

* Response time

2 comments

It’s not very constructive to respond to an article which sets out a thesis about thinking of scalability along multiple dimensions by instead just setting out your own different way of thinking about scalability in dimensional terms.

At least maybe address how your thinking about scalability in this way fits with or contradicts the original article.

At what X is that?

It's way more useful to define it as the maximum X such that dPerformance / dX is higher than some minimum viable amount. So you can say things like "this algorithm scales up to 1000 requests per second on our infrastructure."
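A rough sketch of that alternative definition, assuming X is offered load and performance is served load, both in requests per second. The name `max_useful_x`, the threshold, and the data are hypothetical.

```python
def max_useful_x(samples, min_slope):
    """Largest x reached before the marginal gain drops below min_slope.

    samples: list of (x, performance) pairs, sorted by x.
    """
    best = samples[0][0]
    for (x0, p0), (x1, p1) in zip(samples, samples[1:]):
        if (p1 - p0) / (x1 - x0) < min_slope:
            break  # marginal gain is no longer "viable"
        best = x1
    return best


# Hypothetical load test: (offered RPS, served RPS)
load_test = [(250, 250), (500, 500), (1000, 980), (2000, 1100), (4000, 1100)]
limit = max_useful_x(load_test, min_slope=0.5)
print(limit)  # 1000
```

With these made-up numbers the system "scales up to" 1000 RPS: past that point, each extra offered request yields far less than half a served request.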

That is not a statement about scalability. That is a statement about a lack of scalability.

If you multiply your infrastructure by n, what RPS does your algorithm now scale up to? 1000n is good. > 1000n is excellent. < 1000n is not scalable.
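As a sketch of that test: compare the measured RPS after multiplying the infrastructure by n against n times the baseline. The function name, the labels, and the 5% tolerance are assumptions added for illustration, since real measurements never land exactly on 1000n.

```python
import math


def classify_scaling(base_rps, scaled_rps, n, rel_tol=0.05):
    """Compare RPS on n-times the infrastructure against n * base_rps.

    rel_tol is an assumed measurement tolerance, not part of the rule itself.
    """
    expected = base_rps * n
    if math.isclose(scaled_rps, expected, rel_tol=rel_tol):
        return "good"          # roughly 1000n: linear
    if scaled_rps > expected:
        return "excellent"     # more than 1000n: superlinear
    return "not scalable"      # less than 1000n: sublinear


print(classify_scaling(1000, 4000, 4))  # good
print(classify_scaling(1000, 4400, 4))  # excellent
print(classify_scaling(1000, 2500, 4))  # not scalable
```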

> If you multiply your infrastructure by n

You've read it wrong. That number doesn't change with the size of your infrastructure; it changes with its cost/size.

I'm confused.

If someone tells me that an algorithm scales to 1000 RPS on their infrastructure, and we are having a conversation about scalability, I am interested in what I have to do to increase the number of requests per second by some factor.

Unless what you meant by 1000 RPS was 1000 RPS per server, say. In which case you're claiming that your system scales linearly: adding another server adds another 1000 RPS.

Yes, my writing wasn't very clear either.

Your infrastructure imposes a cost structure. Any scalable algorithm will give you sublinear performance gains while you add those costs. At some point, the costs explode, and solving more of the problem isn't economical anymore. Thus, any algorithm will have a maximum capacity on your infrastructure. (1000 RPS is quite low though, it may have been a bad example.)
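One way to read this argument, as a sketch: walk up the spending tiers and stop once the marginal cost of one more RPS exceeds what you are willing to pay. The function name `economic_capacity`, the tiers, and the budget are all hypothetical.

```python
def economic_capacity(tiers, max_cost_per_rps):
    """Highest sustained RPS reachable before marginal cost exceeds the budget.

    tiers: list of (monthly_cost, sustained_rps) pairs, sorted by cost.
    """
    capacity = tiers[0][1]
    for (c0, r0), (c1, r1) in zip(tiers, tiers[1:]):
        gained = r1 - r0
        # Stop if spending more buys nothing, or buys it too expensively.
        if gained <= 0 or (c1 - c0) / gained > max_cost_per_rps:
            break
        capacity = r1
    return capacity


# Hypothetical tiers: (monthly cost, sustained RPS), with sublinear gains
tiers = [(1000, 400), (2000, 750), (4000, 1000), (8000, 1100)]
cap = economic_capacity(tiers, max_cost_per_rps=10)
print(cap)  # 1000
```

Here the last tier doubles the cost for only 100 more RPS, so the economical maximum stays at 1000 RPS.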

Okay, so yes: you're describing something that does not scale.