| HN Mirror



	by fotta 1962 days ago
	I'm surprised that a site as big as HN is only hosted in one place.

5 comments

Aperocky 1962 days ago

HN is probably very small. Curious as to the minimum size of the backend that will hold up the website.

There may need to be read replicas, but maybe not even that is needed.

link

dang 1962 days ago

It's about the same as what Scott described here: https://news.ycombinator.com/item?id=16076041

But we get around 6M requests a day now.
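For a sense of scale, here is a back-of-the-envelope conversion of that daily figure into an average per-second rate. It assumes traffic is spread evenly over 24 hours, which it certainly is not, so real peaks would be higher; the function name is just for illustration.

```python
# Back-of-the-envelope: convert requests/day into an average requests/second.
# Assumption: perfectly even traffic over the day, which no real site has.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(requests_per_day: float) -> float:
    """Average requests per second for a given daily total."""
    return requests_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # 6M requests/day is the figure quoted in the comment above.
    print(f"{average_rps(6_000_000):.1f} requests/second on average")
```

That works out to roughly 69 requests per second averaged over the whole day, counting every request the site receives.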

link

skissane 1962 days ago

What was the motivation in choosing FreeBSD?

(Just so nobody misinterprets my question, nothing wrong with FreeBSD, I know other stuff also runs on it like Netflix’s CDN. Still always interested to hear why people choose the road less travelled)

link

tlb 1962 days ago

RTM, PG and I used BSDI (a commercial distribution of 4.4BSD) at Viaweb (starting 1995) and migrated to FreeBSD when that became stable. RTM and I had hacked on BSD networking code in grad school, and it was far ahead of Linux at the time for handling heavy network activity and RAID disks. PG kept using FreeBSD for some early web experiments, and then YC's website, and then for HN.

FreeBSD is still an excellent choice for servers. You may prefer Linux for servers if you're more familiar with it from using it on your laptop. But if you use Mac laptops, FreeBSD sysadmin will seem at least as comfortable as Linux.

link

caslon 1962 days ago

Do you think this influenced early YC companies more generally? For example, reddit's choice in picking FreeBSD over Linux?

It's interesting that they might still be on Lisp if they hadn't picked FreeBSD (a chiefly cited concern was that spez's local dev environment couldn't actually run reddit, which seems like it wouldn't have been a problem with Linux, since Linux & OS X both had OpenMCL (now known as CCL) as a choice for threaded Lisp implementations at the time).

link

tlb 1962 days ago

Lisp was indeed a hassle on FreeBSD. Viaweb used CLisp, which did clever things with the VM system and garbage collection that weren't quite portable (and CLisp's C code was all written in German for extra debugging fahrvergnügen.)

I don't know how Reddit came to use FreeBSD, but if you asked which OS to use around university CS departments in 2005 you'd get that answer pretty often.

link

dang 1962 days ago

I don't know, because that decision dates back to pg and rtm and probably Viaweb days. We like it.

link

ethbr0 1962 days ago

Pragmatic engineering: What will this change enable me to do that I cannot do now? Does being able to do that solve any of my major problems? (If no, spend time elsewhere)

link

bhl 1962 days ago

Can I ask a question that's half facetious half serious (0.5\s): does hackernews use docker or any containers in its backend? With 6M requests per day, if it didn't use containers, HN might be a good counter example against premature optimization (?).

link

dang 1962 days ago

Nope, nothing like that. I don't understand why containers would be relevant here though? I thought they had to do more with things like isolation and deployment than with performance, and it's not obvious to me how an extra layer would speed things up?

link

bhl 1962 days ago

I was trying to point out in my original comment that some people may be prematurely optimizing for scale, and letting tooling drive decision-making rather than the problems at hand. A good logical short circuit to that would be: "if Hacker News serves 6M requests per day without containers, then using docker would be overkill for a small CRUD app".

That being said, if modern websites were rated by utility to the user divided by complexity of tech stack, I must say Hacker News would be one of the top ranked sites compared to something similar like Reddit or Twitter, which at times feel... like a juggling act on top of a unicycle just to read some comments. :)

link

cm2187 1962 days ago

Agree. No one has created anything better than html tables!

link

ksec 1962 days ago

Not even sure any other modern stack could handle this with the same hardware.

link

siquick 1962 days ago

Does anyone know what the AWS instance size equivalent of that would be?

link

maccard 1962 days ago

Very roughly equivalent to an m4.xlarge

link

fotta 1962 days ago

Wow, that's not as big as I thought then. What's the average peak rps?

link

dang 1962 days ago

We use Nginx to cache requests for logged-out users (introduced by the greatly-missed kogir), and I only ever look at the numbers for the app server, i.e. the Arc program that sits behind Nginx and serves logged-in users and regenerates the pages for Nginx. For that program I'd say the average peak rps is maybe 60. What I mean by that is that if I see 50 rps I think "wow, we're smoking right now" and if I see 70 I think "WTF?".
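The split described above (cached pages for logged-out users, the app server for everyone else) can be sketched in a few lines. This is only an illustration of the idea, not HN's actual Nginx configuration: the class name, the 30-second TTL, and the `render` callback are all assumptions.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class LoggedOutPageCache:
    """Hypothetical stand-in for a cache sitting in front of an app server."""

    def __init__(
        self,
        render: Callable[[str, Optional[str]], str],
        ttl_seconds: float = 30.0,  # assumed value, not HN's real setting
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._render = render  # the "app server": expensive page generation
        self._ttl = ttl_seconds
        self._clock = clock
        self._pages: Dict[str, Tuple[float, str]] = {}
        self.app_hits = 0  # number of requests that reached the app server

    def _call_app(self, path: str, user: Optional[str]) -> str:
        self.app_hits += 1
        return self._render(path, user)

    def handle(self, path: str, user: Optional[str] = None) -> str:
        # Logged-in users always go to the app server (pages are personalized).
        if user is not None:
            return self._call_app(path, user)
        # Logged-out users get the cached copy while it is still fresh.
        now = self._clock()
        cached = self._pages.get(path)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        # Cache miss or stale entry: regenerate the page once and store it.
        body = self._call_app(path, None)
        self._pages[path] = (now, body)
        return body


if __name__ == "__main__":
    cache = LoggedOutPageCache(lambda path, user: f"{path} for {user or 'anonymous'}")
    for _ in range(100):
        cache.handle("/news")  # logged-out traffic
    cache.handle("/news", user="alice")  # logged-in traffic
    print("requests that reached the app server:", cache.app_hits)
```

The point of the design is that a burst of anonymous traffic costs the app server one page render per TTL window, which is why the rps seen by the app can stay far below the rps seen by the site as a whole.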

link

fotta 1959 days ago

Lol I can imagine you see 70 and think "oh no what's going on now".

link

bombcar 1962 days ago

Maybe standby should be in another rack, perhaps even another datacenter.

link

dang 1962 days ago

That would be the natural next step, but it's a question of whether it's worth the engineering and maintenance effort, especially compared to other things that need doing.

For failures that don't take down the datacenter, we already have a hot standby. For datacenter failures, we can migrate to a different host (at least, we believe we can—it's been a while since we verified this). But it would take at least a few hours, and probably the inevitable glitches would make it take the better part of a day. Let's say a day. The question is whether the considerable effort to build and maintain a cross-datacenter standby, in order to prevent outages of a few hours like today's, would be a good investment of resources.
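To put the trade-off in numbers, here is a rough availability calculation. It assumes exactly one such incident per year, which is an illustrative assumption rather than a measured figure, and compares an outage of a few hours with the pessimistic "let's say a day" migration case.

```python
# Rough availability arithmetic for one incident per year (an assumption).
HOURS_PER_YEAR = 365 * 24  # 8,760; ignores leap years


def availability(downtime_hours_per_year: float) -> float:
    """Fraction of the year the site is up, given total downtime in hours."""
    return 1.0 - downtime_hours_per_year / HOURS_PER_YEAR


if __name__ == "__main__":
    for label, hours in [("3-hour outage", 3), ("full-day migration", 24)]:
        print(f"{label}: {availability(hours):.4%} availability")
```

Under those assumptions the difference is roughly 99.97% versus 99.73% uptime, which is the gap a cross-datacenter standby would be bought to close.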

link

floatingatoll 1962 days ago

My vote is no. We will all be fine for a day without HN, as today proved. There have to be so many other ways HN can be improved, that will have more of an impact for HN users, in the remaining 364 days of the year.

link

cesarb 1962 days ago

> For failures that don't take down the datacenter, we already have a hot standby. For datacenter failures, we can migrate to a different host (at least, we believe we can—it's been a while since we verified this).

It might be a good idea to verify it; see the recent events at OVH (https://news.ycombinator.com/item?id=26407323).

link

Aperocky 1962 days ago

Question: what are the other things that need doing?

Obviously this does not apply to engineering effort outside of the Hacker News website, which the team might be working on.

But this forum has seen little change over the years and it's pretty awesome as is.

(Though I haven't used the HN API much, so I'm not sure what's going on on that side.)

link

dang 1962 days ago

> Question: what are the other things that need doing?

I'm currently working on fixing a bug where collapsing comments in Firefox jumps you back to the top of the page. I'm taking it as an opportunity to refine my (deliberately) dead-simple implementation from 2016.

> But this forum has seen little change over the years and it's pretty awesome as is.

That's an illusion that we work hard to preserve, because users like it. People may not have seen much change over the years but that's not because change isn't happening, it's because we work mostly behind the scenes. Though I have to say, I really need more time to work on the code. I shouldn't have to wait for 3 hours of network outage to do that (but before anyone gets indignant, it's my own fault).

link

phpnode 1962 days ago

Team is maybe a bit of a generous term to describe dang!

link

amelius 1962 days ago

A read-only copy in a different DC could be a simple and still acceptable option.

And a status page would be nice.

link

_-david-_ 1962 days ago

Can you add any additional information like database or webserver?

link

yamrzou 1962 days ago

How much memory does HN use?

link

dang 1962 days ago

That depends on how much Racket's garbage collector will let us (edit: I mean without eating all our CPU). Right now it's 1.4GB.

Obviously the entire HN dataset could and should be in RAM, but the biggest performance improvements I ever made came from shrinking the working set as much as possible. Yes, we have long-term plans to fix this, but at present the only reliable strategy for getting to work on the code is for HN to go down hard, and we don't. want. that.

link

soegaard 1961 days ago

Are you using Racket BC or CS?

link

brodock 1962 days ago

Funny that they didn't have to build 10 million microservices and host them across a million Kubernetes pods to handle "internet traffic".

link

_joel 1962 days ago

They only have one server, iirc.

link

voxadam 1962 days ago

And, if I'm not mistaken, the site is single threaded.

link

aspectmin 1962 days ago

Would love to see the HN architecture.

link

krapp 1962 days ago

arclanguage.org hosts the current version of Arc Lisp, including an old version of the forum, but HN has made a lot of changes locally that they won't disclose for business reasons.

There's an open source fork at https://github.com/arclanguage/anarki, but it doesn't have any direct relationship with HN.

link

mike_d 1962 days ago

Single threaded LISP application running on a single machine. Ta-da.

link

dang 1962 days ago

The application is multi-threaded. But it runs over a green-thread language runtime, which maps everything to one OS thread.

That's a significant distinction because if you swap the underlying implementation then the same application should magically become multithreaded, which is exactly the plan.
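The distinction can be illustrated with an analogy in Python, where many `asyncio` tasks run concurrently on a single OS thread. This is only an analogy: `asyncio` tasks yield cooperatively, whereas Racket's threads are scheduled by the runtime, and nothing here reflects HN's actual code. The function names are made up for the example.

```python
import asyncio
import threading
from typing import Set


async def handle_request(request_id: int, seen_threads: Set[int]) -> None:
    # Record which OS thread this task is running on.
    seen_threads.add(threading.get_ident())
    # Yield to the scheduler, as a task would while waiting on I/O.
    await asyncio.sleep(0.01)
    seen_threads.add(threading.get_ident())


async def serve(n_requests: int) -> Set[int]:
    seen_threads: Set[int] = set()
    # All tasks are in flight at once, interleaved by the event loop.
    await asyncio.gather(
        *(handle_request(i, seen_threads) for i in range(n_requests))
    )
    return seen_threads


def os_threads_used(n_requests: int = 50) -> int:
    """Number of distinct OS threads that ran the concurrent tasks."""
    return len(asyncio.run(serve(n_requests)))


if __name__ == "__main__":
    print("distinct OS threads used:", os_threads_used())
```

The application code is written as many concurrent tasks, yet only one OS thread ever executes them; a runtime that maps the same tasks onto several OS threads would not require the application to be rewritten.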

link

jsty 1962 days ago

Until 2018 at least it was ... wait for it ... a single server!

https://news.ycombinator.com/item?id=18496344

(Anyone know if that's still the case?)

link

dang 1962 days ago

One production server and one failover (in the same data center, obviously).

link

betamaxthetape 1962 days ago

I assume there are off-site backups, though?

Asking as someone who was impacted by the OVH fire last week, and I didn't have recent backups and therefore lost data.

link

ghotli 1962 days ago

I've been waiting to see a comment like this somewhere. Just a hugops from the internet and a reminder to all who see this to get your backups fire-proof and off-site.

link

dang 1962 days ago

Yes, we've got a good backup system thanks to the greatly-missed sctb.

Sorry to hear that, that sucks.

link

mwcampbell 1962 days ago

Running on a single server is cheaper, and nobody loses money if HN is down (as far as I know), so it makes sense.

link

Sahbak 1962 days ago

Sometimes it pays off to be extremely simple. For HN, it definitely does.

link

mromanuk 1962 days ago

After this event, they should switch to two servers in different DCs.

link

johannes1234321 1962 days ago

When going to two, you probably need to handle split brain somehow, otherwise you end up with a database state that is hard to merge. So you'd better get three, so that two can find consensus, or at least an external arbitration node that decides who is up. At that point you have lots of complexity ... while for HN being down for a bit isn't much of a (business) loss. For other sites that maths is probably different. (I assume they keep off-site backups and could recover from there fairly quickly.)
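The "two can find consensus" argument is the usual strict-majority rule. A minimal sketch, with hypothetical function names, of why two nodes cannot resolve a partition while three can:

```python
from typing import List


def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """True if this side of a partition sees a strict majority of the cluster."""
    return reachable_nodes > cluster_size // 2


def sides_allowed_to_write(partition: List[int]) -> List[bool]:
    """For each side of a network partition, may it keep accepting writes?

    `partition` lists how many nodes ended up on each side, counting the
    node itself (e.g. [2, 1] for a three-node cluster split two versus one).
    """
    cluster_size = sum(partition)
    return [has_quorum(side, cluster_size) for side in partition]


if __name__ == "__main__":
    print("2 nodes split 1/1:", sides_allowed_to_write([1, 1]))
    print("3 nodes split 2/1:", sides_allowed_to_write([2, 1]))
```

With two nodes, neither side holds a majority, so the choice is between both sides writing (split brain) or neither. With three voters, where the third may be just an arbitration node, exactly one side can continue.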

link

ethbr0 1962 days ago

I haven't run a ton of complicated DR architectures, but how complicated is the controller in just hot+cold?

E.g. some periodic replication + an external down detector + a break-before-make failover that brings up the cold, accepting that any unreplicated state will be trashed, and rendering the hot inactive until manual reactivation.
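A minimal sketch of such a controller, to make the moving parts concrete. Everything here is hypothetical: the class, the threshold of three consecutive failed probes, and the log messages are assumptions, and the real work (fencing a machine, promoting a replica) is reduced to state changes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailoverController:
    """Hypothetical hot+cold failover state machine (no real I/O)."""

    failure_threshold: int = 3  # assumed: consecutive failed probes before failover
    active: str = "hot"  # which node currently serves traffic
    hot_enabled: bool = True  # stays False after failover until an operator acts
    consecutive_failures: int = 0
    log: List[str] = field(default_factory=list)

    def observe(self, hot_is_healthy: bool) -> None:
        """Feed in one probe result from the external down detector."""
        if self.active != "hot":
            return  # already failed over; there is no automatic failback
        if hot_is_healthy:
            self.consecutive_failures = 0
            return
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold:
            self._fail_over()

    def _fail_over(self) -> None:
        # Break before make: fence the hot node first, then bring up the cold
        # one, so the two never accept writes at the same time.
        self.hot_enabled = False
        self.log.append("hot fenced")
        # Any state not yet replicated to the cold node is lost at this point.
        self.active = "cold"
        self.log.append("cold activated from last replica")

    def manual_reactivate_hot(self) -> None:
        """Operator action: return traffic to the hot node after repair."""
        self.hot_enabled = True
        self.active = "hot"
        self.consecutive_failures = 0
        self.log.append("hot reactivated by operator")


if __name__ == "__main__":
    controller = FailoverController()
    for healthy in [True, False, False, False]:
        controller.observe(healthy)
    print(controller.active, controller.log)
```

Even in this toy form, the controller is a third thing that has to stay up and be trusted, in addition to the two servers and the replication between them.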

link

johannes1234321 1962 days ago

Well, there you have to keep two systems maintained, plus keep synchronisation/replication working. And you need to keep a system running which decides whether to fail over. This triples the work. At least.

link

bpicolo 1962 days ago

There are plenty of sites where it's acceptable to be down for a bit sometimes.

link

centimeter 1962 days ago

Having two servers is a lot more than 2x as complicated and expensive as having 1 server.

link

jpwgarrison 1962 days ago

A wise colleague recently explained to me that if you build HA things HA from the start, it's only a little more than 2x. If you try to make an _existing_ system HA, it's 3x at best. HN is not a paid service, they can be down for a few hours per year, no problem. We're not all going to walk away in disgust.

link

joshmanders 1962 days ago

Not to mention the HNStatus[0] twitter account has so few tweets[1] I don't think it's even worth it.

[0]: https://twitter.com/HNStatus

[1]: The last tweet before today's incident was 2 years ago, and the one before that was 4 years ago.

link

cm2187 1962 days ago

You should look at stackoverflow's hosting!

link

aspectmin 1962 days ago

Is this described somewhere? :)

link

cm2187 1962 days ago

The most recent resource I found. I think they basically use a rack in one datacentre.

https://meta.stackexchange.com/questions/10369/which-tools-a...

link

giantrobot 1962 days ago

This is the newest version of their architecture I've seen [0]. Compare to an overview from 2009 [1].

tl;dr StackOverflow's architecture is fairly simple and has done mostly vertical scaling (more powerful machines) and bare metal servers rather than virtual servers. They also realize their use patterns are read-heavy so there's a lot of caching and they take advantage of CDNs for static content which completely offloads that traffic off their main servers.

[0] https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...

[1] http://highscalability.com/stack-overflow-architecture

link