


	by blueplanet200 1589 days ago
	I hope they figure out what’s going on every morning. Heard from inside they don’t know why the db dies every day, but restarting it fixes it.

5 comments

exikyut 1589 days ago

What's "the db"? It sounds like something of small to medium scale if you can just restart it like that.

In any case, why not just relocate some vendor engineers on site for a bit? Or, better, why does the vendor not have a small presence in the corner?

Sounds like whatever "the db" is it's probably some (objectively) small but very scary thing that's currently on fire and people are trying to figure out how to put it out without crashing the plane and also making too many waves internally, which is probably even harder. So asking about making vendor noises is (as useful as it may be) probably going down the wrong path - in much the same way this is probably not related to the outages (it may well be, but from the outside it's all coincidence anyway).

fundmondawyaya 1589 days ago

Cock crows. DB crashed.

Systemctl restart

mysqld

(Or mariadb, if you pronounce "SQL" as "sequel")

yebyen 1589 days ago

Sounds like it was a MySQL database:

https://github.blog/2022-03-23-an-update-on-recent-service-d...

shepardrtc 1589 days ago

IIS Server had/has a memory leak in worker threads that, many years ago, forced us to restart the server every few days. Starting in 6.0, they added worker thread recycling and made it mandatory to choose a time period for every thread to be recycled. Why fix the error when you can just restart the service?

djbusby 1589 days ago

Apache prefork has had that since forever. Seems like just a garbage-collection type of pattern.

mst 1589 days ago

For old-school mod_perl apps setting MaxRequestsPerChild was often a much better ROI than actually finding and fixing the leaks.

Speaking as somebody who's done over a decade of large-scale OO Perl applications and is actually really good at finding and fixing the leaks, this has often been intellectually aggravating. But every time I've set that option instead, I rewarded myself with a glass of bourbon for picking the pragmatic choice and then went back to adding (non-leaky) features that were far more useful to the company in question than cleaning up the older code would've been.
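
For anyone who hasn't used it, the idea behind MaxRequestsPerChild can be sketched as a toy model in Python. This is not Apache's or mod_perl's code: `Worker`, `RecyclingPool`, and the 1 KiB-per-request leak are made-up stand-ins, and a real server would replace an OS process rather than a Python object.

```python
class Worker:
    """Hypothetical stand-in for a leaky worker process."""

    def __init__(self):
        self.handled = 0
        self.leaked_bytes = 0

    def handle(self, request):
        self.handled += 1
        self.leaked_bytes += 1024  # assumption: every request leaks 1 KiB
        return f"handled {request}"


class RecyclingPool:
    """Replace the worker once it has served max_requests requests."""

    def __init__(self, max_requests):
        self.max_requests = max_requests
        self.worker = Worker()
        self.recycled = 0

    def handle(self, request):
        result = self.worker.handle(request)
        if self.worker.handled >= self.max_requests:
            # Discarding the worker frees whatever it leaked,
            # without anyone having to find the leak itself.
            self.worker = Worker()
            self.recycled += 1
        return result


pool = RecyclingPool(max_requests=3)
for i in range(7):
    pool.handle(i)
print(pool.recycled, pool.worker.leaked_bytes)
```

The leak is never fixed; it is just capped at `max_requests` requests' worth per worker, which is the trade-off being described.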

shepardrtc 1589 days ago

It's not a bug, it's a pattern.

Seriously though, IIS 5.0 had no worker recycling. There was no way to fix the issue. Threads would eat up GBs of memory until you killed them.

whimsicalism 1589 days ago

I doubt they use IIS

throwra620 1589 days ago

MSer here, yes we do… for some things

prepend 1589 days ago

For GitHub? It seems unbelievable that they would use IIS pre-purchase and why in the world would you mix in a second web server for post-purchase enhancements.

Yuioup 1586 days ago

Why trade an open source solution for the third-rate garbage that is IIS, which runs on a sub-par desktop OS called Windows? I thought that GitHub was supposed to be independent.

whimsicalism 1589 days ago

If GH is around the same level of integration with Microsoft as my employer, which is another Microsoft acquisition, I don't really believe you have a ton of insight into GH processes.

edgyquant 1589 days ago

I dated a girl at GitHub for a while last year who said they weren’t even completely off of AWS yet and that she liked how it didn’t feel like working for Microsoft. Maybe this has changed though.

cube00 1589 days ago

Break out the early morning restart cron job.

gaoshan 1589 days ago

Here you go, Github:

0 4 * * * /etc/init.d/postgresql restart

I'll take an architect position as compensation, but only if there is equity.

rish 1589 days ago

GitHub uses MySQL primarily though.

grumple 1589 days ago

MySQL also has a restart command! I'll take my rsus now ty.

Kostic 1589 days ago

Early morning in which timezone?

afterburner 1589 days ago

GaryOldman.gif

glenneroo 1589 days ago

When the fewest users are online?

MuffinFlavored 1589 days ago

How long does restarting it take?

raffraffraff 1589 days ago

Yuck. Honestly, restarting a database to fix a major outage sounds like "we have no idea what we're doing"

blueplanet200 1589 days ago

It sounds like "they don't know why it's going down." I've worked with plenty of super competent people who have taken time to root cause incidents.

Guide to incidents:
Step 1: Stop the bleeding
Step 2: Prevent it in the future

Doing Step 1 doesn't make you incompetent.

raffraffraff 1588 days ago

I'm not a DBA, and maybe you're not a DBA either, so this question goes to DBAs who may be reading: aren't you always better off killing the bad queries instead of rebooting the whole box, if that's an option? (ie: aside from times when the entire host is screwed, load per core is >50, metrics aren't getting out, you can't ssh in etc)
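
To make the "kill the bad queries" option concrete, here is a rough Python sketch, assuming MySQL as mentioned elsewhere in the thread. It only builds `KILL QUERY` statements from rows shaped like `SHOW FULL PROCESSLIST` output; the sample rows and the 60-second threshold are invented for illustration, and nothing here connects to a server.

```python
def kill_statements(processlist, max_seconds=60):
    """Return KILL QUERY statements for statements running too long.

    processlist: rows shaped like SHOW FULL PROCESSLIST output
    (only the Id, Command and Time columns are used here).
    """
    statements = []
    for row in processlist:
        # Idle connections show Command == "Sleep"; only target
        # connections that are actively executing a statement.
        if row["Command"] == "Query" and row["Time"] > max_seconds:
            # KILL QUERY stops the statement but keeps the connection.
            statements.append(f"KILL QUERY {row['Id']};")
    return statements


# Invented sample rows, not taken from any real server.
rows = [
    {"Id": 11, "Command": "Sleep", "Time": 900},
    {"Id": 12, "Command": "Query", "Time": 340},
    {"Id": 13, "Command": "Query", "Time": 2},
]
print(kill_statements(rows))  # ['KILL QUERY 12;']
```

Whether this beats a restart depends on the failure: it helps when a few runaway statements are the problem, and does nothing if the host itself is unreachable.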

bpicolo 1589 days ago

Sporadic database performance issues can certainly make you feel that way. They are definitely not trivially debugged at scale

vimda 1589 days ago

Would you rather it stay down while they spend a day debugging it?

paulryanrogers 1589 days ago

If that means it won't be down every morning in my time zone then yes.

seanw444 1589 days ago

As long as it's announced in advance so that users/customers can plan ahead, I don't see why not.

karmakaze 1589 days ago

They could use multiple writer hosts and roll the restarts across them. MySQL has had GTIDs since 5.6 and replication groups rather than writer-replicas since some 5.7.x version.
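
A rolling restart along those lines might look roughly like the Python sketch below. The `restart` and `is_healthy` callables are hypothetical hooks you would have to supply (for example, wrappers around your own orchestration tooling); nothing here is MySQL-specific or taken from GitHub's setup.

```python
def rolling_restart(hosts, restart, is_healthy):
    """Restart hosts one at a time, stopping at the first failure.

    restart(host) and is_healthy(host) are caller-supplied,
    hypothetical hooks; this function only enforces the ordering.
    """
    done = []
    for host in hosts:
        restart(host)
        if not is_healthy(host):
            # Stop here so the remaining hosts keep serving traffic.
            raise RuntimeError(
                f"{host} unhealthy after restart; completed: {done}"
            )
        done.append(host)
    return done


# Fake hooks for illustration only.
log = []
result = rolling_restart(
    ["db1", "db2", "db3"],
    restart=log.append,
    is_healthy=lambda host: True,
)
print(result)
```

The point of the ordering is that at most one host is out of service at any moment, so the daily restart stops being a user-visible outage.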