Hacker News
by geocar 427 days ago
> An interesting debate emerged about the necessity of checking every possible error:

> In JS world this could be true, but for Rust (and statically typed compiled languages in general) this is actually not the case… GO pointers are the only exceptions to this. There are no nil check protection at compile level. But Rust, kotlin etc are solid.

Yes, it actually is the case. You cannot check/validate for every error, not even in Rust. I recommend getting over it.

For a stupid-simple example: You can't even check if disk is going to be full!

The disk being full is a real error you have to deal with, and it could happen at any line in your code through no fault of your own. And no, it doesn't always happen at write(); it can also happen when you allocate pages for writing (e.g. SIGSEGV). You cannot really do anything about this with code: aborting or unwinding will only ever annoy users. But you can do something.

We live in a multitasking world, so our users can deal with out-of-disk and out-of-memory errors by deleting files, adding more storage, closing other (lower priority) processes, paging/swapping, and so on. So you can wait: maybe alert the user/operator that there is trouble but then wait for the trouble to clear.
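The "alert, then wait for the trouble to clear" idea might look roughly like the sketch below. It is only an illustration under assumptions: the function name `write_all_waiting`, the `alert` and `sleep` hooks, and the choice of which errno values count as "wait-able" are all made up here, not taken from any particular system.

```python
import errno
import os
import time

# Assumption: these are the errors an operator can clear by freeing resources.
WAITABLE = {errno.ENOSPC, errno.ENOMEM}


def write_all_waiting(fd, data, write=os.write, alert=print,
                      sleep=time.sleep, delay=5.0):
    """Write all of `data` to `fd`; on a wait-able error, alert once and retry."""
    view = memoryview(data)
    alerted = False
    while view:
        try:
            written = write(fd, view)
        except OSError as exc:
            if exc.errno not in WAITABLE:
                raise  # not something waiting will fix
            if not alerted:
                alert(f"write stalled ({exc.strerror}); waiting for it to clear")
                alerted = True
            sleep(delay)
            continue  # retry the same bytes after the pause
        view = view[written:]  # handle short writes
        alerted = False  # progress was made; alert again if it stalls later
```

The `write` and `sleep` parameters exist only so the loop can be exercised without actually filling a disk.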

Also: Dynamic-wind is a useful general-purpose programming technique awkward to emulate, and I personally dislike subclassing BackTrack from Error because of what can only be a lack of imagination.

4 comments

> We live in a multitasking world, so our users can deal with out-of-disk and out-of-memory errors by deleting files, adding more storage, closing other (lower priority) processes, paging/swapping, and so on. So you can wait: maybe alert the user/operator that there is trouble but then wait for the trouble to clear.

That's a weird take. I've been working for multiple decades now with systems that have no UI to speak of; their end-users are barely aware that there's a whole system behind what they can see, and that's a good thing because they only become aware of it when it causes them trouble.

I take from my mentor in programming this stance for many things, including error handling: the best solution to a problem is to avoid it. That's something everybody knows actually, but we can forget that when designing/programming because one has so many things to deal with and worry about. Making the thing barely work can be a challenge in itself.

For errors, this usually means: don't let them happen. E.g. avoid OOM by avoiding dynamic allocation as much as possible; statically pre-allocate everything, even if it means megabytes of unused reserved space. Don't design your serialization format with quotes around your keys just to allow "weird" key names, a feature that nobody will ever use and that creates opportunities for errors.

Of course it is not always possible, but don't miss the opportunity when it is.

> That's a weird take

I appreciate that, but...

> I've been working for multiple decades now with systems that have no UI to speak of; their end-users are barely aware that there's a whole system behind what they can see, and that's a good thing because they only become aware of it when it causes them trouble.

Notice I said "user" not "end-user" or "customer".

This was not an accident.

In your system (as in mine) the "user" is the operator.

> the best solution to a problem is to avoid it.

That's your opinion, man. I don't know if you can avoid everything (I certainly can't).

Something to consider is why Erlang people have been trying to get people to "let it crash" and just deal with that, because enumerating the solutions is sometimes easier than enumerating the problems.

That’s not his opinion, that’s the standard technique in systems programming. It’s why there’s software out there that does in fact never crash and shows consistent performance.

> Something to consider is why Erlang people have been trying to get people to "let it crash" and just deal with that

Yes, if you can afford it, I would say it is a way to avoid the problem of handling errors in a bug-free way. But it is more than yet another error handling tactic, it is a design strategy.
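For readers who haven't seen "let it crash" as a design strategy, here is a toy sketch of the shape of it: the worker does no fine-grained error handling, and a supervisor restarts it from a clean state. This is not how Erlang/OTP supervisors are implemented; `supervise`, `max_restarts`, and `backoff_seconds` are hypothetical names chosen for illustration.

```python
import time


def supervise(worker, max_restarts=3, backoff_seconds=1.0, sleep=time.sleep):
    """Run `worker()`; if it raises, start it again, up to `max_restarts` times."""
    restarts = 0
    while True:
        try:
            return worker()  # the worker handles only its happy path
        except Exception:
            if restarts >= max_restarts:
                raise  # give up and let the next level up decide
            restarts += 1
            sleep(backoff_seconds)  # brief pause before the fresh attempt
```

The point of the design is that one recovery action (restart) covers every failure the worker did not anticipate.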

> For a stupid-simple example: You can't even check if disk is going to be full!

Isn’t this addressed by preallocating data files in advance of writing application data? It’s pretty common practice for databases for both ensuring space and sometimes performance (by ensuring a contiguous extent allocation).
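A minimal sketch of that kind of preallocation, assuming a POSIX system where Python exposes `os.posix_fallocate`; the `preallocate` helper name and the file mode are invented for the example. It only reserves the space up front, so an out-of-space error shows up here rather than at some later write; as the replies point out, it does not promise that later writes to those blocks succeed.

```python
import os


def preallocate(path, size):
    """Reserve `size` bytes for `path` now; raises OSError (e.g. ENOSPC) on failure."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.posix_fallocate(fd, 0, size)  # allocate from offset 0 for `size` bytes
        os.fsync(fd)  # push the allocation to stable storage
    finally:
        os.close(fd)
```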

I don’t think it’s possible to get that to work 100% of the time on typical modern hardware.

As an example, a disk block may be bad, requiring the OS to find another one to store that pre-allocated disk space. If you try to prevent that by writing to the preallocated space after you allocated it, you still can hit a case where the block goes bad after you did that.

> Isn’t this addressed by preallocating data files in advance of writing application data?

Allocation isn't the only thing that can fail: Actually writing to the blocks can fail, and just because you can write zeros doesn't mean you can write anything else.

You really can't know until you try. This is life.

You’re not wrong, but you are moving the goalposts a little; GP is responding to your “disk is going to be full” scenario, and that is well handled, I’d say, by preallocation. Then of course other things can go wrong too.

Yes things can go wrong. That's the point. The problem is what to do about it.

I think if you needed a better example of something you can't defend against in order to get the main idea, that's one thing, but I'm not giving advice in bad faith: Can you say the same?

The person you are replying to has a reasonable position and explained it well, IMHO.

No they really don't.

fallocate() failing is exactly the same as write() failing from the perspective of the user, because the disk is still full, and the user/operator responds exactly the same way (by waiting for cleanup, deleting files, adding storage, etc).

Databases (the example given) actually do exactly as koolba suggests, and ostensibly for the reason of surfacing the error to the application. The point is what to do about the error itself though, not about whether fallocate() always works or is even possible.

This. There are errors and states you cannot predict. As a grandchild comment says: It's easier to provide solutions than to list all the errors. Find your happy path and write code that steers you back onto it. The code will be shorter, less surprising, and actually describable. It's also testable because you treat whole classes of errors consistently, so the number of error combinations is smaller.

There is in fact a common strategy for dealing with those errors: shut the process down. That relies on another strategy: reliable persisted state. Best practice here is to use mechanisms that ensure that at every moment the persisted state is valid. Some databases can guarantee this. You can also write out the new state to a temp file and atomically replace the old state with the new one.
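The temp-file-and-replace approach can be sketched like this, assuming a POSIX filesystem where a rename within one directory is atomic. `save_state_atomically` and the `.state-` prefix are hypothetical names; real systems differ in how much fsync-ing they do.

```python
import os
import tempfile


def save_state_atomically(path, data):
    """Replace the file at `path` with `data` so readers see old or new, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".state-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # new contents are durable before the swap
        os.replace(tmp_path, path)  # atomic swap of old state for new
    except BaseException:
        try:
            os.unlink(tmp_path)  # don't leave a half-written temp file behind
        except FileNotFoundError:
            pass
        raise
    # Assumption: POSIX, where a directory can be opened and fsync'd
    # so that the rename itself survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If the process is shut down at any point, the file at `path` holds either the previous state or the complete new one.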