by onetimeuse92304
886 days ago
I have been using Linux since 1999. I have seen lots of kernel panics, but far fewer recently; unfortunately, they have been replaced by more problems in the platform and userspace. I know Linux works more reliably for some people and less reliably for others. It probably has a lot to do with what you do with it: what kind of hardware you run it on, and whether you just install it and use it as it is or you are the kind of person, like me, who likes to change everything to his liking. I also tend not to reinstall my machines. For about 15 years my daily driver was a single Debian unstable installation, continuously updated until I faced too many problems and had to completely replace it. I would have fixed it all, but I just did not have the time and I needed it working.
|
Randomly in my career so far, notable kernel panic causes were:
- when a spark job finishes and deallocates close to a TB of memory, kernel panic. jobs using less than 750GB typically did not see this happen, so it was something in there. this just kind of stopped happening after we updated the kernel and spark in a semi-unrelated push, so we never really got a root cause here.
- bad hardware
- a spark job that was doing simply insane amounts of shuffle output (which goes to disk) was hitting kernel panics which ended up being related to a kernel bug that only impacted ridiculously high-disk-io-using applications, with some additional spin that made me think "ah so this is basically only affecting spark jobs"
- bad hardware
Did I mention bad hardware? I've spent way too much time hunting down "bugs" that ended up just being a bad mobo, and linux was kind enough to inform us of it. It's always "this is the only program that causes the kernel panics!" and yet when we move it to a temp server for a few days the program mysteriously stops crashing. Another reason I do like "the cloud" - I can just cycle out an ec2 box I suspect is bad instead of fighting with the IT guy about whether the 2-year-old expensive server is already busted or not.