I am running a top-1000 site in one of the EU countries on a single 4-core machine at 20-30% load. Around 1000 HTTP/HTTPS requests/s. Most of those requests do a couple of Postgres queries (read and write) and a couple of Redis requests.
Elasticsearch - for searching/recommendations
Redis - hot data (certain data is only kept in redis)
Postgres - for the rest of data
Clickhouse - analytics
Most of the system is written in Go.
Whole system was tuned for performance from day one.
As to latency, data from the last 21 million requests today:
Thanks for the reply! May I ask how far you go with respect to reliability/availability? Do you have a live replicated server on standby, or do you just replicate a log stream elsewhere, or something else?
I know, just a good natured poke :) Plus you could probably take that comment at face value - making use of web caching is definitely an important tool when building a large scale system.
HN has the luxury of being able to make few high level changes over years, though. It might be tougher to maintain that single box elegance and performance if they were adding new features every month or two (which is much more applicable to the rest of us).
Why should adding features make an app crumble on a single server?
I think the point is that good software is able to serve a lot of users on a single server.
A great example imho is Blender. Features are added constantly but because the software is modular it doesn't have any impact on the overall performance.
Today the problem is that adding features means: adding the latest and greatest lib while having absolutely no idea about the inner workings.
Yes it takes time to write your own libs. But when performance is an issue you will either have to write your own lib or take one that is good and tested.
> Why should adding features make an app crumble on a single server?
It's not inherent, but obviously as you have more developers working on more and more things independently, each with different needs, tolerances, and deadlines, it becomes increasingly unreasonable to presume it can all be managed well on a single box.
If they went and added chat, or Twitter-like features, or subreddits, or similar, it might be a lot tougher to keep it all on a single box. It's a lot easier when we're all looking at the same top 30 stories, and pretty limited in how we interact with them and each other.
> A great example imho is Blender. Features are added constantly but because the software is modular it doesn't have any impact on the overall performance.
As someone who used to hack on Blender all I can say is it's a big ball of inter-dependent modules all with interlocking dependencies. The only thing that really keeps it manageable is the strict adherence to MVC which, I suppose, does make it modular.
To me the comparison just isn't there. The user experience here could be identical to how it was in the '90s. It's certainly something to marvel at to some extent, but a lot of us could get a pretty high level of elegance in our backend if our user experience had no reason to change for 10+ years.
The user experience is very simple and that's a good thing, too. I much prefer HN's UX over Reddit's, which is painfully slow on mobile and noticeably slower on desktop browsers.
I think the arc codebase is worth studying and understanding, primarily so that you can extend its simplicity into your own projects. The reason HN was such a success is because it handles so many cases in the same way: Stories, comments, and polls are all the same thing: items. If you want to add a new thing, you just create a new item and add whatever fields you want.
These rapid prototyping techniques have downsides, but the cure is to keep in mind what you can't do. (For example, you can't rename item keys without breaking existing items, so be sure to choose good names.)
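To make the "everything is an item" idea concrete, here is a rough Go sketch. It is not the actual Arc code: `Item`, `Store`, `Add`, `Children` and the field names `"type"` and `"parent"` are all made up for illustration.

```go
package main

import "fmt"

// Item is a hypothetical generic record: stories, comments and polls
// are all stored the same way, as a bag of named fields.
type Item struct {
	ID     int
	Fields map[string]any
}

// Store keeps every item in one map, keyed by ID.
type Store struct {
	nextID int
	items  map[int]*Item
}

func NewStore() *Store {
	return &Store{nextID: 1, items: make(map[int]*Item)}
}

// Add stores a new item of any kind with whatever fields the caller wants.
func (s *Store) Add(kind string, fields map[string]any) *Item {
	it := &Item{ID: s.nextID, Fields: map[string]any{"type": kind}}
	for k, v := range fields {
		it.Fields[k] = v
	}
	s.items[it.ID] = it
	s.nextID++
	return it
}

// Children returns the IDs of items whose "parent" field equals parent,
// in ascending ID order.
func (s *Store) Children(parent int) []int {
	var out []int
	for id := 1; id < s.nextID; id++ {
		if p, ok := s.items[id].Fields["parent"].(int); ok && p == parent {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	s := NewStore()
	story := s.Add("story", map[string]any{"title": "Example story", "url": "https://example.com"})
	s.Add("comment", map[string]any{"parent": story.ID, "text": "Nice."})
	s.Add("poll", map[string]any{"title": "Tabs or spaces?"})
	fmt.Println("children of story:", s.Children(story.ID))
}
```

Adding a new kind of thing needs no schema change, which is the upside. The downside mentioned above shows up too: the key `"parent"` is baked into every stored item, so renaming it later would break `Children` for existing data.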
Like emacs, HN's design is born out of simplicity and generality. It's what you get when you write the next most important feature as quickly as possible, then cut as much code as possible, every day. Both halves are equally important.
It's fine to say that our modern applications are so much more complicated that the old lessons don't apply. And in extreme cases, that may be true. I don't think SpaceX has the luxury of rapid prototyping their software.
But the typical app is CRUD. For those, data processing flexibility is perhaps the most important factor in whether you can write new features quickly. And since code is data, a lisp master can write systems with a shocking number of features in shockingly few lines of code. (See Jak and Daxter: https://all-things-andy-gavin.com/2011/03/12/making-crash-ba...)
> I think the arc codebase is worth studying and understanding
Are you talking about the source code for ARC, or for Hacker News?
It would be interesting to see how ARC is being employed on such a high profile site. I expect that some algorithms won't be freely available so as not to enable people to game the site, but the rest would be interesting to see.
I couldn't find any source for Hacker News though.
> And since code is data, a lisp master can write systems with a shocking number of features in shockingly few lines of code
You don't even have to be a master. So much bikeshedding has been spent over the decades. We are still going back and forth on data interchange formats...
So there's a few features that I'd like to see at some point: Some way of marking "new comments" when I revisit a thread. Rescanning HN threads just to see what's changed since I last looked at one is kinda frustrating. (I realize you're not @dang, but posting this here just as an idea)
> runs on a single box via a single racket process
Incomplete. They are fronted by one of the largest CDNs in the world, on whom they rely for most traffic.
Well, I actually don't know the numbers, but I do know that for popular posts HN admins (a) try to break the conversation up over multiple posts and (b) plead with users to log out so as to allow Cloudflare to handle the load.
The biggest lesson HN teaches for designing large scale systems is "use a large scale system someone else has already designed".
HN is relatively easy to optimise though - there are only a few stories with high traffic, so if you have good caching the load on the back end can be very low. It's more difficult to do that with something like github where the users are spread across millions of repos.
Hacker News has been up and running for like 10 years now, hasn't it? Instead of a hypothetical question, I guess we have actual uptime data to answer it.
I think redundancy is much easier if the thing being made redundant is extremely simple? Imagine having to manage 50 different services (and sometimes servers) as opposed to 1 machine.
The figure I saw around maybe 2015 was that AWS (maybe Heroku) is 10x more expensive and bare metal is 10x more efficient/faster, yielding a 100x difference in price-performance. Both factors are probably larger nowadays.
It's the convenience and the enormous ecosystem of plug-and-play services that make AWS so good for point-and-click building of architectures.
You can do all of it locally (except multi-zone and multi-region reliability), but it would take a lot of configuration and skill, and you wouldn't have the same GUI for everything.
AWS is not cheap but companies don't care, because without it, you have to pay to find experts and keep them happy, which is a lot harder.
Most of the system is written in Go. Whole system was tuned for performance from day one. As to latency, data from the last 21 million requests today:
p99: 17.37 ms
p95: 6.86 ms
avg: 2.37 ms