Hacker News
by userbinator 519 days ago
I don't get it. The only times I've had problems with filesystem corruption in the past few decades were caused by a hardware problem, and said hardware was quickly replaced. The FAT family has been perfectly fine, while I've encountered corruption on every other FS, including NTFS, exFAT, and the ext* family.

Meanwhile you can read plenty of stories of others having the exact opposite experience.

If you keep losing data to power losses or crashes, perhaps fix the cause of that? It doesn't make sense to try to work around it.

3 comments

> If you keep losing data to power losses or crashes, perhaps fix the cause of that? It doesn't make sense to try to work around it.

Ponder this notion for a moment: there are problems within one's control and problems outside of one's control.

For example, we can't control the weather. If it snows three feet overnight you simply have to deal with the fact that you're not getting to work today.

Since we can't simply stop hardware from failing, we have to deal with the fact that hardware fails. Your seventeen redundant UPSes might experience a one-in-a-trillion cascade failure. It might take the utility ten minutes longer to restore your power than your onsite generation can keep running.

This is not a class of problem we can control or prevent. We fix these problems by building systems which withstand failures. You can't just will electrons out of the wall socket, but you can build a better disk or FS that corrupts less data when the electrons stop.

There was that time (2009 or so?) I wrote 2 million files to a single directory on NTFS and that filesystem was never the same again. It didn't seem to be a hardware problem. I used to be really careful not to put a crazy number of files in a single directory on Linux and Windows, storing them in subdirs like

  b7/b74a/b74a56
where the digits are derived from a hash of the file name, but lately I've had some NTFS volumes with a 1M-file directory that seem to be OK.
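
For anyone who hasn't seen that layout before, here is a rough sketch of how a path like b7/b74a/b74a56 could be derived. This is only an illustration: the comment above doesn't say which hash was used, so SHA-256 is an assumption, and the names `shard_dirs` and `sharded_path` are made up for this example. It also assumes all three components are directories and the file itself goes inside the last one.

```python
import hashlib
from pathlib import Path


def shard_dirs(file_name: str, prefix_lengths=(2, 4, 6)) -> Path:
    """Return nested directory names built from hex-digest prefixes.

    Assumption: SHA-256 of the UTF-8 file name. The original layout only
    says the digits come from "a hash of the file name".
    """
    digest = hashlib.sha256(file_name.encode("utf-8")).hexdigest()
    # Each level is a longer prefix of the same digest, e.g. b7, b74a, b74a56.
    return Path(*(digest[:n] for n in prefix_lengths))


def sharded_path(root, file_name: str) -> Path:
    """Return the full path for file_name under root (hypothetical helper)."""
    return Path(root) / shard_dirs(file_name) / file_name


if __name__ == "__main__":
    target = sharded_path("data", "report.pdf")
    # Create the shard directories before writing the file.
    target.parent.mkdir(parents=True, exist_ok=True)
    print(target)
```

With two hex digits at the first level, files are spread over at most 256 top-level directories, and each extra level splits them by another factor of 256, so no single directory ends up holding millions of entries.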

Hardware problems also manifest in mysterious ways. On both Windows and macOS I had computers that seemed to be OK until I did an OS update, which caused enough IO that a failing HDD was pushed over the edge and the update failed. In one case I was able to roll back the update but not apply it; in another case the machine was trashed. Careful investigation (like taking the disk out and inspecting it on another computer) revealed a hard drive error, although there was no clear indication of this in the UI, and the average person would blame the software update.

> If you keep losing data to power losses or crashes, perhaps fix the cause of that?

I keep telling my users to make sure to plug their phones in before the battery dies, but for some reason they keep forgetting...

Phones shut down when the battery gets close to zero, before it actually hits zero.
Then that's entirely their fault. They deserve all the corruption they get.
Seems like I hit a nerve. Apparently teaching users responsibility is a bad thing?

No wonder things are "hard". Because otherwise many in this godforsaken industry wouldn't need to be employed.