by hardware2win 1117 days ago
I don't understand opinions like this.

It might be dangerous for your Node.js web_app.exe running on Ubuntu behind Apache, fully exposed to the internet,

but there are a billion other ways to use computers, including air-gapped systems.

So, don't try to justify an obvious flaw.


I mean, hardware is cheap enough that any server of importance should be individually disposable.

Yeah, you can do stuff to maximize uptime but if it needs to stay up that badly you have to consider the case of the hardware needing to be turned off at some point.

> So, don't try to justify an obvious flaw

I'm not, it's a bug and should be fixed. But I think if anything is powered for 3 years straight it's a bit concerning.

Otherwise you're liable to find things like somebody having started something by hand 2 years ago, and at a critical moment nobody quite remembers what the command was.

You live in your own world with other people. Please keep in mind there are many other worlds, with other people and other laws of the universe.

I don't know if you're young or just don't know much about history, but what you describe is a fairly recent way of looking at things. It's not the only one, and I guarantee you it will go out of fashion.

Yeah, the "cattle not pets" philosophy is fairly recent, but I don't see it changing any time soon. If anything we're going even more in that direction.

And it makes a lot of sense because if uptime is that important, then no matter how fancy the hardware it can't do anything about disasters or losing internet connectivity.

We might go so far in that direction that we wind up right back on the other side. It always happens; it’s more of a pendulum swinging back and forth than the kind of straight-line progress you are imagining.

The 7002 seems like it could be used in a workstation, where the “cattle vs pets” thing is less of a distinction, right? (I guess a workstation is sort of like a work dog in this analogy.)

That's the EPYC lineup, which is the server model. Support for terabytes of RAM, 128 PCIe lanes, that sort of thing.

I mean you could use it in a workstation, but unless you need 4 video cards locally it's probably overkill for most uses.

And a workstation should have no problem rebooting once in a while.

You have the whole "I don't understand why something is this way, therefore everyone who does understand why it's that way is wrong" shtick down cold. It's not a good look.

As an additional data point -

I have ~1000 7002 cores in my home DC (8 dual-socket R7525s with 48-64 cores each) that run Kubernetes, are connected to a battery backup, and use kexec to perform upgrades. So, while I am very bought into the cattle not pets philosophy, it's rare that any of these machines needs to be turned off, and I could see them staying on for three years continuously without any other problem.

> But I think if anything is powered for 3 years straight it's a bit concerning.

Pretty much why Pawsey has an Annual High Voltage inspection shutdown [1]

> Otherwise you're liable to find things like [..]

TBH that's not really been an issue of note at any of the big iron farms I've been around since the 1980s. Generally there's a disciplined approach to maintaining 24/7/365 operation (including scheduled downtime for equipment checks), part of which is process documentation and justification, plus soft means of freezing or migrating processes and data.

[1] https://status.pawsey.org.au/incidents/tk5n5y965r5j

Individually disposable, yes. But if you have a cluster of those, and you powered them on at the same time -- as often happens -- you're in for an exciting ride when your servers start rebooting almost simultaneously, give or take a few minutes.
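
For what it's worth, here is a minimal sketch of one way to avoid that, assuming you know each host's boot time. It spreads planned reboots out so every host restarts before some uptime limit and no two restart close together. The function name `plan_reboots`, the host names, the 1000-day limit and the 7-day spacing are all made up for illustration; none of them come from this thread or from any vendor guidance.

```python
from datetime import datetime, timedelta


def plan_reboots(boot_times, max_uptime, spacing):
    """Return {host: reboot_time} such that no host exceeds max_uptime
    and consecutive reboots are at least `spacing` apart."""
    plan = {}
    next_slot = None  # the reboot time most recently assigned (the later one)
    # Walk hosts from the most recently booted to the oldest, so deadlines
    # are visited latest-first and each slot is pushed earlier as needed.
    newest_first = sorted(boot_times.items(), key=lambda kv: kv[1], reverse=True)
    for host, booted in newest_first:
        deadline = booted + max_uptime
        if next_slot is None:
            when = deadline
        else:
            # Reboot at the deadline, or `spacing` before the next-later
            # reboot, whichever comes first.
            when = min(deadline, next_slot - spacing)
        plan[host] = when
        next_slot = when
    return plan


if __name__ == "__main__":
    # Hypothetical cluster: three hosts racked and powered on together.
    powered_on = datetime(2020, 1, 1)
    boots = {"node-a": powered_on, "node-b": powered_on, "node-c": powered_on}
    plan = plan_reboots(boots, timedelta(days=1000), timedelta(days=7))
    for host, when in sorted(plan.items(), key=lambda kv: kv[1]):
        print(host, when.date())
```

With enough hosts and a wide enough spacing, the earliest slots can land in the past, which just means those reboots are already overdue.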