HN was down (twitter.com)
468 points by jontro 1915 days ago
36 comments

Founder and CEO of M5 Hosting here. We did have a network outage today that affected Hacker News. As with any outage, we will do an RCA and we will learn and improve as a result.

I'm a big fan of HN and YC in general, we host a number of other YC alumni, and I have taken a few things through YC Startup School. During this incident, I spoke to YC personally when they called this morning.

We have been using M5 Hosting for one of our servers since 2011. They have been extremely reliable up until today. Based on what was posted about the Hacker News server setup, we have something similar. We have a "warm spare" server in a different data center. We use Debian, not FreeBSD.

We are in the process of slowly moving to a distributed system (distributed DB) that is going to make failover easier. However, that kind of setup is orders of magnitude more complex than the current (manual failover) setup. I really wonder if the planned design is going to be more reliable in practice. Complexity is almost always a bad idea, in my experience. Distributed systems are just fundamentally very complicated.

Oh hi! Thank you for the kind words. I can't tell who you are by your name here, but if you've been with us since 2011, we have certainly spoken. Are you using our second San Diego data center for your failover location? If you and I aren't already talking directly, ask to speak with Mike in your ticket.
I had used M5 some years ago to host an online rent payment / property management app. Have nothing but positive things to say about that experience. We once had an outage that was our own fault on our single server and they had someone go in, in the middle of the night, to reboot it for us and we weren't even on an SLA.
Thank you for sharing your positive experience! We can power cycle power outlets remotely and can connect a console (ip kvm)... and we are staffed 24x7.... in case you need another server. Thanks again!
HN is one of the few sites I always keep zoomed-in (around 200%), which led to me finding an interesting bug in Chrome while HN was down: Chrome's internal "This site can't be reached" page uses the zoom level of the site you would be visiting (if it were up), rather than Chrome's default zoom.

Screenshot: https://i.imgur.com/VwFtgQh.png

Chrome used to store 'zoom level' for URLs even if you were in incognito mode, and in plain text. Not sure if it still does... (If you changed the zoom level for a site from the default while in incognito, it would save the value and the associated URL.)
not anymore. it does it the other way 'round though, which can be frustrating.
Whew. That gives the site extra bits of information to connect you to your real identity/account.
I'm incognito for a reason. I don't need to be followed all over, i'm a big boy, i can log in without any help.
I noticed the same behavior in Firefox as well. I wouldn't consider this as a bug though
It would be cool if zooming in / out on the T-Rex game caused it to switch your character to larger / smaller dinosaurs.
The Trex game really needs a meteor animation when the connection is re-established.
They even have the artwork! It's used when the game is disabled by ~~fun-hating sysadmins~~ enterprise policy.
Give this man a PM role at Google Chrome!
Only if he promises to discontinue the game in a year
what can you possibly not like about the dinosaur game?? I love it and i like how chrome has debugged it over the years.
So, the meteor hits the Trex, and it's game over because you need to get back to work?
what's the t-rex game?
We have a version of it we adapted so that the t-rex has to jump over npm packages as they're being published in real-time!

https://www.onegraph.com/docs/subscriptions.html (it'll load in at the top of the page)

minigame built into chrome you can play when you're offline, or alternatively go to chrome://dino
Edge has one too at edge://surf
There's also an extracted version of it on Github for those of us who use Firefox.
You’re one of today’s lucky 10,000!
Don't know what today's lucky 10,000 is? You are also one of today's lucky 10,000! https://xkcd.com/1053/
hit space bar when offline in chrome, ps: addictive.
Game you get when the network is unavailable in chrome
Firefox does the same, as I discovered - I don't know whether it's a bug or intended functionality.

(As an aside, I keep HN at 150% and old reddit at 120% - those are the only 2 sites I have permanently zoomed)

Either a bug or an over-eager member of the Mozilla UX team had actually filed a bug with a feature-parity Chrome tag on it in BMO.
Bug-parity. Not even odd parity.
You could say I punted.
It's part of the charm. Unusable on retina without zooming (at least with my eyes).
I've observed a related issue with much amusement for a few years now: when loading a new resource (specifically: spinner going anticlockwise, waiting for TTFB), Chrome will invisibly switch the renderer over to the font size settings of the to-be-loaded resource, then carefully inhibit repainting the view.

But, if said destination resource is very slow to hit TTFB, you switch to a different tab, then back to the loading tab, you'll see the current page at the destination page's zoom settings.

My guess is that the interstitial system that injects error pages, Safe Browsing warnings, etc, doesn't hit the code path that says "we loaded a new (regular) page, go find its zoom settings".

Demo/PoC:

1. Run $anything that will serve a webpage on an arbitrary port - even an error page or directory listing. eg, python3 -m http.server, php -S 0:8000, etc.

2. Open the resource you just set up in a new tab, zoom in or out as preferred (eg, to a crazy level), copy the URL (for convenience), then close the tab.

3. Stop the server in (1), then run `nc -lp 8000` (or netcat, ncat, or $anything that will listen but never respond).

4. Open a new tab, navigate to a valid website (eg here :), example.com, etc), then once it's loaded, paste the URL you copied. With the page spinning and waiting for netcat (et al), navigate away from the tab, then back to it again.

Think I noticed this for the first time a couple years ago. Seems harmless enough.
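
For anyone without netcat handy, step 3 can be approximated in Python. This is only a sketch of a listener that accepts connections and never answers; port 8000 matches the steps above, and the name `black_hole` is made up for illustration.

```python
import socket
import threading


def black_hole(host="127.0.0.1", port=8000):
    """Accept TCP connections and never send a byte back.

    Returns the listening socket and a list that keeps accepted
    connections open, so clients sit waiting for a first byte.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen()
    held = []  # keep references so the connections are not closed

    def accept_forever():
        while True:
            try:
                conn, _addr = server.accept()
            except OSError:  # listening socket was closed
                return
            held.append(conn)

    # Daemon thread: it will not keep the interpreter alive on exit.
    threading.Thread(target=accept_forever, daemon=True).start()
    return server, held
```

Calling `black_hole()` from an interactive session and leaving the session open should keep a browser tab spinning on http://127.0.0.1:8000/ in the same way as the netcat listener.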

Is that really a bug?
Feels like it to me. I'd expect the zoom to be associated with the site.

Granted, I am probably importing old thoughts of it being a sort of user provided style sheet.

It'd make the zoom level you see jump up and down depending on whether you lose your connection or regain it, this is less jarring.

You could say that Chrome is designed to tie the zoom level to the viewport but I wouldn't count on this behavior springing up from an underlying design and implementation rather than it being a design choice for the user experience.

I'm not sure I follow. If the page is constantly bouncing to the no connection page, it is jarring, period. If the page of my "no connection" changes because of the address I'm trying, that is jarring if the problem is on my end.

That is, consider your network is down. You try to go to an address. It doesn't load, so you try another address, the page changes; but it is the same content.

Is it important that the no connection message be an HTML document treated like others in the web browser? If browsers used to model it that way and you saw behaviors corresponding to the switching of a webpage, it was arbitrary too, and in this case, would cause more disruption.
> I'd expect the zoom to be associated with the site.

That's what the GP comment said happened: the zoom level was the one associated with what they previously had set on HN, and they expected it to be the opposite, the default zoom level for the browser.

But the site didn't load. My browser's not loading page did.

It's easier to see as broken by thinking of "how could I set it so that my browser's error page has a default zoom?"

But your browser's connection-failure page is considered to come from the HTTP Origin of the site. It's like when browsers receive a specific HTTP status-code (e.g. 500) with no body, so they render a default HTML error document.

In both cases, those are the browser supplying a resource representation, while still technically being on the resource specified in the navigation bar. The thing you're seeing is an overridden representation of the server's response. (Which, in this case, just happened to be "no response.")

It's almost exactly the same as how the server sending a 304 gets the browser to load the document from cache. The server's actual response was a 304; but the browser's representation of that response is the cached HTML DOM it had lying around from the last 2xx resource-representation it received "about" the same resource.
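
A toy model of that idea, for what it's worth. This is not how any real browser is implemented; `TinyCache` and `render` are hypothetical names, and the point is only that the thing shown to the user is chosen by the browser while the URL stays the same.

```python
from dataclasses import dataclass, field


@dataclass
class TinyCache:
    """Toy model of a browser cache keyed by URL (hypothetical, for illustration)."""
    store: dict = field(default_factory=dict)

    def render(self, url, status, body=None):
        """Return what the user would see for a response with this status."""
        if 200 <= status < 300:
            self.store[url] = body          # remember the fresh representation
            return body
        if status == 304 and url in self.store:
            return self.store[url]          # "not modified": reuse the stored copy
        # Anything else with no body: the browser supplies its own page,
        # while the address bar still shows `url`.
        return body or f"[browser-supplied page for status {status}]"
```

In this model a 304 and a bodiless 500 are handled by the same function for the same URL, which is the sense in which the error page still "belongs" to the site.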

I would consider the browser's built-in page for "I couldn't load news.ycombinator.com" to be a separate site from "news.ycombinator.com".
I think the zoom level for Chrome is global per window at the least, it's definitely not per tab.
it's per subdomain i think.
oh indeed it is, wow that's subtle, always escaped me where the focused setting applied. In that case yah probably a bug.
FWIW Safari doesn't have this bug (I too keep HN zoomed at 200% for some reason).
Glad to hear I'm not alone in this. Currently at 133%, so not quite as extreme.

Judging from the responses, this is actually a lot more popular than I assumed.

Which begs the question: Does anyone feel the default font is just perfect and wouldn't want it to be bigger even by a tiny bit?

I think, the font size around 105-110% would be perfect but the default one is fine as well. It definitely is the smallest default font I've seen on a popular website but it's workable for me.
> Which begs the question: Does anyone feel the default font is just perfect and wouldn't want it to be bigger even by a tiny bit?

I think it's perfect. What is your screen DPI (or rather angular pixel size from your normal viewing position) and is your browser set up to do any scaling based on that? Maybe it should be.

I really dislike the trend of giant fonts and whitespace.

Same, I don't understand why the font is so small by default. Just use the default browser font size I've defined as a user!
Same with Firefox. I have HN at 190%, and got startled by the error message being so. big. and. weird.
I had to check my own zoom, 200% as well.
The "View page source" pages too.
The font-size on HN is barely readable, I'm working on an accessible skin for the HN frontend that addresses this.

I'm targeting WCAG 2.0. Keep an eye out for the "Show HN" coming soon!

Are you using a high dpi monitor but not using > 100% display scaling in your OS or something? It's roughly the same size as most other sites for me.

(And pretty much all browsers have a zoom function for exactly this, it feels like a totally separate frontend would be more hassle to use than just ctrl + scroll wheel once)

I’ve come to enjoy using a high DPI monitor without display scaling as a way to counteract the huge amount of whitespace in modern UIs, coupled with content zooming so words are still actually readable :) https://addons.mozilla.org/en-US/firefox/addon/zoom-page-we/
As a designer, I've been thinking a lot about this "huge amount of whitespace in modern UIs" thing. I personally hate it, I want most things to be always in reach.

Two of my hypotheses are:

(1) some designers are working on huge screens themselves, and don't test enough in usual resolutions

(2) it's easier to achieve good visual composition by doing a lot of whitespace (at the expense of hiding things below fold or in triggerable containers)

It's the only site I have problems with, tbh. Stylesheet says it's supposed to be 10pt (with comment text dropping down to 9pt), which is even smaller than the too-small 12pt font that gets recommended a lot.
On Win10 with default settings, fonts on most sites are totally comfortable to me.

HN is readable - just - but it's definitely on the small side.

The complete lack of some sort of horizontal constraint doesn't help either. 200 character lines are no bueno for reading.

I've found Linux to handle scaling pretty inconsistently; I've got a 4K television I connect my computer to and if I tell it to scale 200% in the monitor configuration most things get scaled nicely, but random stuff (especially proprietary stuff) doesn't know what to do.

It worked much better to just tell it to output 1080p and let my television scale it... less graphics memory too. I still need to scale HN up relative to other sites in order to read it though.

If I compare the text of your comment to the text of an article on npr.org it seems like about the same as the difference between 9pt and 12pt, and they are using a serif font that seems to be a lot easier to read.

It's a style choice I guess? It seems like it would work best on a large 1080p display, so maybe that's just what the person who designed the layout was using.

No I'm not. The font-size for most text on the site is 10pt and 9pt.

Zoom doesn't fix line lengths of 1500 characters and terrible color contrast.

The link to the site guidelines is 7pt with a contrast that fails WCAG 2.0. No wonder no one reads them.
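
For anyone who wants to check a contrast claim like this themselves, here is a rough sketch of the WCAG 2.0 contrast-ratio calculation. The function names are my own, and no colours from the HN stylesheet are assumed; you would plug in the actual foreground and background values.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.0 formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a colour given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1 (identical colours) to 21 (black on white)."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """WCAG 2.0 level AA: at least 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)
```

Small text like the guidelines link falls under the 4.5:1 threshold, so `passes_aa(link_colour, page_background)` is the check that matters there.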

> Zoom doesn't fix line lengths of 1500 characters

That depends on your browser's zoom implementation. Firefox is able to zoom the text/element sizes while keeping the page width the same on HN.

While you are there, can I make a feature request to move the upvote icon/button to the end of the comment? Right now that triangle is at the beginning of the comment. Sometimes a comment is long and interesting, and when I want to upvote it because it's relevant, interesting & correct, I have to scroll back up.
I consider that a feature, not a bug. I typically do all browsing zoomed in somewhat and I expect the "page can't load" to also be zoomed. Or am I misunderstanding what you're saying?

EDIT: People who disagree, care to explain? I zoomed in, so why would I expect it to zoom out just because it's a different page? What am I missing?

Hacker News is hosted at M5 and they are having a network outage:

http://status.m5hosting.com/pages/incident/5407b8e2b00244251...

edit: Unrelated to the Azure outage.

HN is also available through Cloudflare but that seems to depend on M5.

Don't take my word for it. Test it for yourself:

  printf 'GET / HTTP/1.1\r\nHost: news.ycombinator.com\r\nConnection: close\r\n\r\n' \
   |openssl s_client -connect cloudflare.com:443 -ign_eof -servername news.ycombinator.com
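
If openssl isn't installed, roughly the same test can be sketched in Python. The helper names `build_request` and `fetch_via_edge` are invented for this example. One difference from `openssl s_client`, which by default carries on after a certificate error: the default SSL context here verifies the certificate against the hostname, so the call raises if the edge does not present a valid certificate for that host.

```python
import socket
import ssl


def build_request(host, path="/"):
    """Build the same minimal HTTP/1.1 request the printf command emits."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_via_edge(edge, host, port=443, timeout=10):
    """Connect to `edge`, but present `host` as both SNI and Host header."""
    context = ssl.create_default_context()
    with socket.create_connection((edge, port), timeout=timeout) as raw:
        # server_hostname sets the SNI value, like openssl's -servername.
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:  # server closed the connection
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Usage would be something like `fetch_via_edge("cloudflare.com", "news.ycombinator.com")`, which returns the raw response bytes including headers.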
>The desire to have fewer moving parts

This actually got me thinking. Do we really need a CDN? This is one of those things we take and use without actually thinking whether we could do without it.

Interesting thought experiment.

Cloudflare only proxies dynamic websites.
CloudFlare will proxy whatever site you configure, even if it is static.

Static websites will get the best speed boost from locally served assets (much reduced latency from the local POP) because the page itself can be cached (presuming headers on origin site are correctly set). Especially for page requests from international users.

Sorry, I wasn't clear. What I meant was that since HN is dynamic its dynamic content is not usually cached. I mentioned specifically "dynamic" sites on my parent comment because as of this month Cloudflare can host static pages.
I'm surprised that a site as big as HN is only hosted in one place.
HN is probably very small. Curious as to the minimum size of the backend that will hold up the website.

There may need to be read replicas, but maybe not even that is needed.

It's about the same as what Scott described here: https://news.ycombinator.com/item?id=16076041

But we get around 6M requests a day now.
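
To put that number in per-second terms, a quick back-of-the-envelope calculation. It assumes traffic spread evenly over the day, which real traffic is not, so peaks will be higher than this average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(requests_per_day):
    """Mean requests per second implied by a daily total."""
    return requests_per_day / SECONDS_PER_DAY


print(f"{average_rps(6_000_000):.1f} requests/second on average")
```

That works out to roughly 69 requests per second averaged over the whole day.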

What was the motivation in choosing FreeBSD?

(Just so nobody misinterprets my question, nothing wrong with FreeBSD, I know other stuff also runs on it like Netflix’s CDN. Still always interested to hear why people choose the road less travelled)

RTM, PG and I used BSDI (a commercial distribution of 4.4BSD) at Viaweb (starting 1995) and migrated to FreeBSD when that became stable. RTM and I had hacked on BSD networking code in grad school, and it was far ahead of Linux at the time for handling heavy network activity and RAID disks. PG kept using FreeBSD for some early web experiments, and then YC's website, and then for HN.

FreeBSD is still an excellent choice for servers. You may prefer Linux for servers if you're more familiar with it from using it on your laptop. But if you use Mac laptops, FreeBSD sysadmin will seem at least as comfortable as Linux.

I don't know, because that decision dates back to pg and rtm and probably Viaweb days. We like it.
Can I ask a question that's half facetious half serious (0.5\s): does hackernews use docker or any containers in its backend? With 6M requests per day, if it didn't use containers, HN might be a good counter example against premature optimization (?).
Nope, nothing like that. I don't understand why containers would be relevant here though? I thought they had to do more with things like isolation and deployment than with performance, and it's not obvious to me how an extra layer would speed things up?
Not even sure any other modern stack could handle this with the same Hardware.
Does anyone know what the AWS instance size equivalent of that would be?
Very roughly equivalent to an m4 xlarge
Wow, that's not as big as I thought then. What's the average peak rps?
We use Nginx to cache requests for logged-out users (introduced by the greatly-missed kogir), and I only ever look at the numbers for the app server, i.e. the Arc program that sits behind Nginx and serves logged-in users and regenerates the pages for Nginx. For that program I'd say the average peak rps is maybe 60. What I mean by that is that if I see 50 rps I think "wow, we're smoking right now" and if I see 70 I think "WTF?".
Maybe standby should be in another rack, perhaps even another datacenter.
That would be the natural next step, but it's a question of whether it's worth the engineering and maintenance effort, especially compared to other things that need doing.

For failures that don't take down the datacenter, we already have a hot standby. For datacenter failures, we can migrate to a different host (at least, we believe we can—it's been a while since we verified this). But it would take at least a few hours, and probably the inevitable glitches would make it take the better part of a day. Let's say a day. The question is whether the considerable effort to build and maintain a cross-datacenter standby, in order to prevent outages of a few hours like today's, would be a good investment of resources.

A read-only copy in a different DC could be a simple and still acceptable option.

And a status page would be nice.

Can you add any additional information like database or webserver?
How much memory does HN use?
That depends on how much Racket's garbage collector will let us (edit: I mean without eating all our CPU). Right now it's 1.4GB.

Obviously the entire HN dataset could and should be in RAM, but the biggest performance improvements I ever made came from shrinking the working set as much as possible. Yes, we have long-term plans to fix this, but at present the only reliable strategy for getting to work on the code is for HN to go down hard, and we don't. want. that.

Funny they didn't have to build 10 million microservices and host them across a million Kubernetes pods to handle "internet traffic".
They only have one server, iirc.
And, if I'm not mistaken, the site is single threaded.
Would love to see the HN architecture.
arclanguage.org hosts the current version of Arc Lisp, including an old version of the forum, but HN has made a lot of changes locally that they won't disclose for business reasons.

There's an open source fork at https://github.com/arclanguage/anarki, but it doesn't have any direct relationship with HN.

Single threaded LISP application running on a single machine. Ta-da.
The application is multi-threaded. But it runs over a green-thread language runtime, which maps everything to one OS thread.

That's a significant distinction because if you swap the underlying implementation then the same application should magically become multithreaded, which is exactly the plan.
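
As a loose analogy only: Python's asyncio is cooperative and is not how the HN runtime schedules its threads, but it shows the same shape of many logical tasks sharing one OS thread. All names below are made up for the example.

```python
import asyncio
import threading


async def worker(name, seen):
    """Record which OS thread runs this task, yielding between steps."""
    for _ in range(3):
        seen.append((name, threading.get_ident()))
        await asyncio.sleep(0)  # hand control back to the scheduler


async def run_workers(count=4):
    """Run `count` tasks concurrently and return what they recorded."""
    seen = []
    await asyncio.gather(*(worker(f"task-{i}", seen) for i in range(count)))
    return seen


def os_threads_used(count=4):
    """Number of distinct OS threads that ran the tasks."""
    return len({ident for _name, ident in asyncio.run(run_workers(count))})
```

The tasks interleave, yet `os_threads_used()` reports a single OS thread, which is why the application code can be written as multi-threaded while the process only ever uses one core.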

Until 2018 at least it was ... wait for it ... a single server!

https://news.ycombinator.com/item?id=18496344

(Anyone know if that's still the case?)

One production server and one failover (in the same data center, obviously).
I assume there are off-site backups, though?

Asking as someone who was impacted by the OVH fire last week, and I didn't have recent backups and therefore lost data.

I've been waiting to see a comment like this somewhere. Just a hugops from the internet and a reminder to all who see this to get your backups fire-proof and off-site.
Yes, we've got a good backup system thanks to the greatly-missed sctb.

Sorry to hear that, that sucks.

Running on a single server is cheaper, and nobody loses money if HN is down (as far as I know), so it makes sense.
Sometimes, it pays off being extremely simple. In HN, it definitely does
After this event, they should switch to two servers in different DC.
When going to two you probably need to handle split brain in some way, otherwise you end up with a database state that is hard to merge, so you'd better get three, so that two can find consensus, or at least an external arbitration node deciding on who is up. At that point you have lots of complexity ... while for HN being down for a bit isn't much of a (business) loss. For other sites that maths is probably different. (I assume they keep off-site backups and could recover from there fairly quickly)
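
The two-versus-three argument can be sketched with a strict-majority rule. This is a simplified model, not any particular consensus protocol, and the function names are hypothetical.

```python
def has_quorum(reachable, cluster_size):
    """True if `reachable` nodes (counting yourself) form a strict majority."""
    return reachable > cluster_size // 2


def partition_outcome(cluster_size, side_a):
    """Which sides of a two-way network split may keep accepting writes."""
    side_b = cluster_size - side_a
    return has_quorum(side_a, cluster_size), has_quorum(side_b, cluster_size)
```

With two nodes, `partition_outcome(2, 1)` gives no quorum on either side, so the safe choice is for both to stop accepting writes. With three nodes, `partition_outcome(3, 2)` leaves exactly one side able to continue, which is what avoids the hard-to-merge state.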
I haven't run a ton of complicated DR architectures, but how complicated is the controller in just hot+cold?

E.g. some periodic replication + external down detector + a break-before-make failover that brings up the cold, accepting any unreplicated state will be trashed and rendering the hot inactive until manual reactivation
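
A minimal sketch of that kind of controller, to make the question concrete. Everything here is hypothetical: the three callables stand in for whatever probe, fencing, and promotion mechanisms a real setup would use, and replication is left out entirely.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FailoverController:
    """Break-before-make failover driven by an external health probe."""
    probe_hot: Callable[[], bool]      # returns True while the hot server answers
    fence_hot: Callable[[], None]      # cut the hot server off (the "break")
    promote_cold: Callable[[], None]   # bring the cold server up (the "make")
    max_failures: int = 3              # consecutive failed probes before acting
    failures: int = 0
    active: str = "hot"

    def tick(self):
        """Run one probe cycle; call this on a timer. Returns the active role."""
        if self.active != "hot":
            return self.active         # stay on cold until a human resets us
        if self.probe_hot():
            self.failures = 0
            return self.active
        self.failures += 1
        if self.failures >= self.max_failures:
            self.fence_hot()           # break first, so two primaries never overlap
            self.promote_cold()        # then make
            self.active = "cold"
        return self.active
```

The controller itself is small; the hard parts are the things hidden behind the callables, such as making the fence reliable when the hot server is unreachable and deciding where the controller runs so that it is not cut off by the same failure.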

There are plenty of sites where it's acceptable to be down for a bit sometimes.
Having two servers is a lot more than 2x as complicated and expensive as having 1 server.
A wise colleague recently explained to me that if you build HA things HA from the start, it's only a little more than 2x. If you try to make an _existing_ system HA, it's 3x at best. HN is not a paid service, they can be down for a few hours per year, no problem. We're not all going to walk away in disgust.
You should look at stackoverflow's hosting!
Is this described somewhere? :)
The most recent resource I found. I think they basically use a rack in one datacentre.

https://meta.stackexchange.com/questions/10369/which-tools-a...

This is the newest version of their architecture I've seen [0]. Compare to an overview from 2009 [1].

tl;dr StackOverflow's architecture is fairly simple and has done mostly vertical scaling (more powerful machines) and bare metal servers rather than virtual servers. They also realize their use patterns are read-heavy so there's a lot of caching and they take advantage of CDNs for static content which completely offloads that traffic off their main servers.

[0] https://nickcraver.com/blog/2016/02/17/stack-overflow-the-ar...

[1] http://highscalability.com/stack-overflow-architecture

We had some AWS network issues today too...
Not sure why HN would still be hosted at a 3rd tier provider. A few EC2 instances (multi-zone) behind an application load balancer should do the trick.
They're first-tier to us.
Because it doesn't make any money
YC's companies get free advertising and job listings on the front page.
You just described not making money.
Who cares it’s a few hundred a month to host on AWS
Then where will people go to learn about AWS outages?
I think I have tried to visit HN for more than 10 times in last 2 hours and failed. This made me realize how much I'm addicted to HN
I keep 3 pinned tabs in my browser:

- Reddit (my main source of addiction)

- HackerNews (the second source of addiction)

- Cookie Clicker (a rather recent addition that I'm slightly embarrassed of)

At a point in time I also had facebook, but I've since stopped going there (maybe once a week).

Just cheat. It'll break the spell quickly.

Also, check out universal paperclips if you haven't already. it has a definite end. You likely won't play more than maybe 10-20 hours.

I've played Universal Paperclips from start to finish 4 times! I loved it. In fact, I loved it so much the last time around that I wanted to have another game "somewhat like it" in the background -- that's where the recent Cookie Clicker tab came from.

I always recommend Universal Paperclips to people who don't like cookie clicker games, because I fell in love with it the first time I tried it (heard of it from the Hello Internet podcast)

Universal Paperclips is the cure for all other clicker games.

Once you've played a fair and truly exponential clicker through a few times you can't tolerate the forced linearity of a pay-to-win clicker app.

Alas, I am OP, the one with the Cookie Clicker tab, and I came to Cookie Clicker after Universal Paperclips: https://news.ycombinator.com/item?id=26469366

I have to say that UP was definitely a much better experience.

Not only did I not realize you were OP, I also didn't realize the original Cookie Clicker isn't the one I played.

Universal Paperclips cures/inoculates against Cookie Clicker 2 easily. Cookie Clicker Classic looks too addictive for me to let myself try it.

I played Universal Paperclips, and converted a HectoVerse (100 Universes) to paperclips.

Long covid sucks.

Spaceplan is a pretty fun and slightly comedic play through one as well http://spaceplan.click/
Just wait until you find out Reddit and Cookie Clicker have the same endgame...
I'm not sure if this is still a thing, but at one point you could open up a JS console on cookie clicker and run game.ruinTheFun() to unlock everything. :)
I used HN to quit Reddit and I must say it’s been a change for the better.
I kind of feel like we need a "Year Zero" clicker game where once you get up to 1.7m "clicks" you'll see Pol Pot start dancing in the corner. Then as you accumulate more clicks, you'll see Hitler, Stalin, and Mao make appearances as well. Then, finally, once you've overflown the 32-bit integer, the year resets to 1970 and dennis ritchie and brian kernighan start dancing in the corner as well.
>more than 10 times in last 2 hours

You could utilise the noprocrast option in your HN settings.

Didn't know that this feature existed. I enabled it.
> I think I have tried to visit HN for more than 10 times in last 2 hours and failed.

Me too!

> This made me realize how much I'm addicted to HN

I thought that my IP was shadow-banned by HN...

I was scared Dang had blocked me
Had me quite confused because I'm also having home internet issues. I was trying to get my laptop to switch to my mobile hotspot and HN is one of the sites I used to test connectivity because a) it's almost always available and b) loads very quick.

A bit of a mindfuck trying to assess my actual internet connectivity via a site that was also down : )

The common method of testing connectivity is opening Bing. Because it's guaranteed to not be cached in the browser.
That reminds me of the old IE joke. IE, the most commonly used browser to download another browser.
Ha, I do this with yahoo.com. I never have a reason to visit it otherwise.
Found some use for Bing!
Ditto. HN is so reliable and light on JavaScript that I typically use it to test my connection. I thought my connection was down earlier but guess this was the rare case where it was HN.

(Other comments suggest it was a network outage at M5 where HN is hosted.)

Me too. I was trying to browse HN on my phone earlier and my first instinct was that my WiFi was having a moment. It's a testament to how reliable HN is.
I was trying to read some news while training in the basement, where I don't have very good Wi-Fi. Usually HN is one of the pages that work better down there, haha.
The title should be updated to "Productivity was up".
Not sure that is true...trying to find other info as to why HN was down led to more productivity lost here
Can't sign into Azure Portal, let's check HN, oh that's down too, hmm is my internet up ...

Huge rabbit hole

With Azure also going down, lots of people were probably scrambling to figure out what blew up.
my fingers automatically just start typing in "news.y" when I'm idle, I definitely didn't know what to do when greeted with a 404!

Is there any way to put the HN homepage on an edge cache so at least the homepage shows up? Or am I admitting that I'm addicted to checking HN too many times a day?

It's gotten so bad for me that I'm down to just "n". I think I have a problem.
I used to use a web browser with Emacs keybindings, so visiting a URL was the same keystroke as opening a file. I'd type "C-x C-f news.ycombinator.com" quite regularly, and my fingers still go to that "n" when I visit a file in Emacs.
LOL, same. In fact, every site I visit often is one char + enter in the browser. With the exception of W, being east of the Mississippi every station starts with W.

That got me to thinking about 'first letter advantages.' If a site has a first letter not currently in use, I'm much more likely to visit it more often(mostly out of boredom, sure).

V and X are still available if anyone is wondering. Zillow got Z!

i was gonna say, check out that n00b that has to type all the way to "news.y" for the browser autocomplete! :D

/s

Oh man this hits home so much
Admitting you have a problem is the first step to recovery, right? I'm sure I've heard that. Dunno how it's supposed to help.
Open a 2nd tab. Turn your 2 404s into one 808. Then start making music
Just don't type "news." and hit enter, it'll redirect to some domain squatter crap and it'll be stuck in your autocomplete for a while =)
Yes, that's what you're admitting. :) Not that you're alone in that...
I never expect HN to be down... I asked my wife - hey is the internet down? She said - no, it's working for me. I clicked on another site and my mouth dropped.
It says a lot that @hnstatus has not tweeted since 2018.
@HNStatus tweeted about the outage 3 hours ago.
That's the point...
Ah, my bad. I was wrongly interpreting OP’s comment.
@dang

Would love an updated post on what the current hardware / software stack that’s running HN.

It’s been years since I’ve seen a post/comment on this topic.

Are you still running FreeBSD, on a few high frequency cores (iirc)?

He commented on this a couple hours ago:

https://news.ycombinator.com/item?id=26469566

TIL HN uses "mirrored magnetic for logs (UFS)". Is there a privacy policy posted anywhere? What's in these logs? Magnetic is for long term storage. How far back does it go?
At the bottom of (almost) every HN page is a link titled "Legal" which contains the privacy policy: https://www.ycombinator.com/legal/

(I don't have any relationship to HN)

Thanks!

The section on ("HN Information") - does this include e.g. IP address under "any submissions or comments that you have publicly posted to the Hacker News site"? My naive reading of that would say "no". But is that correct?

"If you create a Hacker News profile, we may collect your username (please note that references to your username in this Privacy Policy include your Hacker News ID or another username that you are permitted to create in connection with the Site, depending on the circumstances), password, email address (only if you choose to provide it), the date you created your account, your karma (HN points accumulated by your account in response to submissions and comments you post), any information you choose to provide in the “about” field, and any submissions or comments that you have publicly posted to the Hacker News site (“HN Information”)."

This specifically calls out IP addresses:

"Log data: Information that your browser automatically sends whenever you visit the Site (“log data”). Log data includes your Internet Protocol address, browser type and settings, the date and time of your request, and how you interacted with the Site."

Then there's also this section:

"Online Tracking and Do Not Track Signals: We and our third party service providers may use cookies or other tracking technologies to collect information about your browsing activities over time and across different websites following your use of the Site."

I would assume that "other tracking technologies" includes IP addresses.

Azure AAD also had an outage at this time - perhaps linked in some domino effect, or perhaps a coincidence?

https://status.azure.com/en-us/status

Looks like it was a coincidence - unless Azure auth going down shut off a rack in San Diego.
Wondering this too. Teams started going down, so I came to HN to get commentary, and it was down too.
Seeing HN unresolved was a bit weird, as it is the best-performing site I ever visit on my low-bandwidth phone, so several times I thought the problem was on my end. But in the end, the outage helped me realize how frequently I dial into HN. I have a bit of a problem, and I think I need to turn on that no-procrastination flag.
I've been on this site 12+ years, and I don't ever remember it being down. I assumed we were under nuclear attack.
HN is my homepage and when it wouldn't load, I told my coworkers the internet was down and took a 3 hour lunch.
> Back now. Got to write some code for a change... - 5:39

https://twitter.com/HNStatus/status/1371576822748487683

It seems an odd coincidence, this and the Azure AD outage. I was trying to get to HN to see what people were saying about it!
Definitely. My thought was "HN is hosted on Azure"? So I went looking into their hosting provider, and lo, they were down too. M5 might be Azure hosted... couldn't confirm that.
Something that I learned from this is that HN has a status Twitter. Rarely used though, which is a testament to the team.

https://twitter.com/HNStatus/status/1371525940656803848?s=20

Only they never posted anything during the outage there
They did. If you click on the link in the post you replied to, you'll see it is a link to their post from today about the outage.
Unsure what you mean there, as the linked tweet was from 4h ago.
I thought pg posted "Memphis".
I didn't notice unfortunately due to the Azure outage blowing everything up :(
So strange that this coincided with Azure authentication eating it.
Unrelated issues, but I did hear from our other clients that O365 was having issues at the same time as our network outage affected HN and many others.
The one time I'm actually reading HN for actually relevant information for actual work, it's down for half a day. Made for a great excuse to take a nap.
I have a massive learning project[1], and I think 2/3 of my "to get through as soon as sensible" content is news.ycombinator links.

Needless to say, this site is my own personal StackOverflow, and I think there's something about ingratitude bouncing around in my mind somewhere.

[1]https://github.com/PhilosAccounting/ts-learning

Wow, looks like a wealth of knowledge. Forked it for myself, only for reference, hope that is ok. Just seems like a ton of great info that I’d love to comb through myself. Cheers.
Totally okay, though I added the PDFs/videos to gitignore. I'm mildly paranoid about IPs[1]!

[1]https://gainedin.site/ip/

Makes sense, well I will be following the TechSplained project, good luck!
Is HN fully back? Looks like this was a little less than 3 hours total, is that right?
Between 3 and 3.5 hours to judge by when PagerDuty stopped bugging me. I was working on code and someone had to tell me it was back up.
Thanks! (And thanks for all your hard work!)
Funny, my first thought was "oh no they've blacklisted VPNs", can't remember when HN was ever down!
It did not seem to affect the Firebase feed.
Curious, what is the uptime of HN - is there some data about that?

My guess is around 99.9% ... but maybe that's too optimistic?

Why too optimistic?

Probably closer to 4 9s.

With this outage of ~2 hours, we are at ~99.97% for this year. (I am not aware of any other downtime during 2021)

Rule of thumb (I strongly prefer minutes/year instead of 9s, to get an immediate sense of how good the availability is):

99.9% : down for 525 minutes / year, or roughly 9 hours

99.99% : down for 52 minutes / year, or roughly ~1 hour

99.999% : down for 5 minutes / year
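
For anyone who wants to check the rule of thumb, here is a minimal sketch of the arithmetic. It assumes a 365-day year (525,600 minutes), and the function names are made up for illustration; nothing here comes from HN's own monitoring. It gives 525.6, 52.56 and 5.256 minutes per year for three, four and five 9s. Measured against a full year, a 2-hour outage works out to about 99.977%, and a 3-hour one to about 99.966%.

```python
# Assumption: a plain 365-day year, ignoring leap years.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(pct: float) -> float:
    """Downtime budget in minutes per year for an availability percentage."""
    return MINUTES_PER_YEAR * (1 - pct / 100)


def availability_percent(downtime_minutes: float,
                         period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Availability percentage given total downtime over a period."""
    return 100 * (1 - downtime_minutes / period_minutes)


if __name__ == "__main__":
    # The three tiers from the rule of thumb above.
    for pct in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% : {minutes:.2f} min/year ({minutes / 60:.2f} hours)")

    # Outage lengths mentioned in this thread, measured against a full year.
    for hours in (2, 3):
        print(f"{hours} h outage : {availability_percent(hours * 60):.3f}%")
```

Note that the result depends heavily on the period: the same outage measured only against the year to date gives a noticeably lower figure than when measured against all twelve months.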

Yes, but it would be nice to read some "official" numbers backed by HN's monitoring (although I'm on HN quite/too often, I would not notice every downtime).
Who needs so many 9's when there's actually interesting content we keep coming back for?
We also had issues with our YC application earlier today, was that related to this issue?
It’s OK. The skies haven’t fallen. (And I was able to accomplish something today.)
There was a noticeable increase in productivity during the last hour or so.
It's 3pm.... do you know where your servers are?
And now it looks like there is an outage at reddit
Is this related to the big Microsoft outage?
If only they used Kubernetes! /s
Since I couldn't get to HN, I wrote up how to make the site resilient to outages: https://gist.github.com/peterwwillis/ce2bfaba7fc72e4af44c281...

tl;dr 1 server x 2 providers, different regions, replicate content

Can't decide if the Gist is tongue-in-cheek or actually serious...
This site is so reliable, I thought my I.P. had gotten banned.