Hacker News new | ask | show | jobs
by kjgkjhfkjf 125 days ago
The article is a bit dense, but what it's announcing is effectively golang's `defer` (with extra braces) or a limited form of C++'s RAII (with much less boilerplate).

Both RAII and `defer` have proven to be highly useful in real-world code. This seems like a good addition to the C language that I hope makes it into the standard.

5 comments

Probably closer to defer in Zig than in Go, I would imagine. Defer in Go executes when the function deferred within returns; defer in Zig executes when the scope deferred within exits.
This is the crucial difference. Scope-based is much better.

By the way, GCC and Clang have attribute((cleanup)) (which is the same, scope-based clean-up) and have done for over a decade, and this is widely used in open source projects now.

I wonder what the thought process of the Go designers was when coming up with that approach. Function scope is rarely what a user needs, has major pitfalls, and is more complex to implement in the compiler (need to append to an unbounded list).
> I wonder what the thought process of the Go designers was when coming up with that approach.

Sometimes we need block scoped cleanup, other times we need the function one.

You can turn the function scoped defer into a block scoped defer in a function literal.

AFAICT, you cannot turn a block scoped defer into the function one.

So I think the choice was obvious - go with the more general(izable) variant. Picking the alternative, which can do only half of the job, would be IMO a mistake.

> AFAICT, you cannot turn a block scoped defer into the function one.

You kinda-sorta can by creating an array/vector/slice/etc. of thunks (?) in the outer scope and then `defer`ing iterating through/invoking those.

I hate that you can't call defer in a loop.

I hate even more that you can call defer in a loop, and it will appear to work, as long as the loop has relatively few iterations, and is just silently massively wasteful.

The go way of dealing with it is wrapping the block with your defers in a lambda. Looks weird at first, but you can get used to it.
I know. Or in some cases, you can put the loop body in a dedicated function. There are workarounds. It's just bad that the wrong way a) is the most obvious way, and b) is silently wrong in such a way that it appears to work during testing, often becoming a problem only when confronted with real-world data, and often surfacing only as being a hard-to-debug performance or resource usage issue.
What's the use-case for block-level defer?

In a tight loop you'd want your cleanup to happen after the fact. And in, say, an IO loop, you're going to want concurrency anyway, which necessarily introduces new function scope.

I would like to second this.

In Golang if you iterate over a thousand files and

    defer File.close()
your OS will run out of file descriptors
Well, unless you're on Windows :D Even on Windows XP Home Edition I could open a million file handles with no problems.

Seriously, why is default ulimit on file descriptors on Linux measly 1024?

Some system calls like select() will not work if there are more than 1024 FDs open (https://man7.org/linux/man-pages/man2/select.2.html), so it probably (?) makes sense to default to it. Although I don't really think that in 2k26 it makes sense to have such a low limit on desktops, that is true.
defer was invented by Andrei Alexandrescu who spelled it scope(exit)/scope(failure) [Zig's errdefer]/scope(success) ... it first appeared in D 2.0 after Andrei convinced Walter Bright to add it.
Both defer and RAII have proven to be useful, but RAII has also proven to be quite harmful in cases, in the limit introducing a lot of hidden control flow.

I think that defer is actually limited in ways that are good - I don't see it introducing surprising control flow in the same way.

Defer is also hidden control flow. At the end of every block, you need to read backwards in the entire block to see if a defer was declared in order to determine where control will jump to. Please stop pretending that defer isn't hidden control flow.

> RAII has also proven to be quite harmful in cases

The downsides of defer are much worse than the "downsides" of RAII. Defer is manual and error-prone, something that you have to remember to do every single time.

Defer is a restricted form of COMEFROM with automatic labels. You COMEFROM the end of the next `defer` block in the same scope, or from the end of the function (before `return`) if there is no more `defer`. The order of execution of defer-blocks is backwards (bottom-to-top) rather than the typical top-to-bottom.

    puts("foo");
    defer { puts("bar"); }
    puts("baz");
    defer { puts("qux"); }
    puts("corge");
    return;
Will evaluate:

    puts("foo");
    puts("baz");
    puts("corge");
    puts("qux");
    puts("bar");
    return;
That is the most cursed description I have seen on how defer works. Ever.
This is how it would look with explicit labels and comefrom:

    puts("foo");
    before_defer0:
    comefrom after_defer1;
    puts("bar");
    after_defer0:
    comefrom before_defer0;
    puts("baz");
    before_defer1:
    comefrom before_ret;
    puts("qux");
    after_defer1:
    comefrom before_defer1;
    puts("corge");
    before_ret:
    comefrom after_defer0;
    return;
---

`defer` is obviously not implemented in this way, it will re-order the code to flow top-to-bottom and have fewer branches, but the control flow is effectively the same thing.

In theory a compiler could implement `comefrom` by re-ordering the basic blocks like `defer` does, so that the actual runtime evaluation of code is still top-to-bottom.

But of course what you call "surprising" and "hidden" is also RAII's strength.

It allows library authors to take responsibility for cleaning up resources in exactly one place rather than forcing library users to insert a defer call in every single place the library is used.

RAII also composes.
This certainly isn't RAII—the term is quite literal, Resource Acquisition Is Initialization, rather than calling code as the scope exits. This is the latter of course, not the former.
People often say that "RAII" is kind of a misnomer; the real power of RAII is deterministic destruction. And I agree with this sentiment; resource acquisition is the boring part of RAII, deterministic destruction is where the utility comes from. In that sense, there's a clear analogy between RAII and defer.

But yeah, RAII can only provide deterministic destruction because resource acquisition is initialization. As long as resource acquisition is decoupled from initialization, you need to manually track whether a variable has been initialized or not, and make sure to only call a destruction function (be that by putting free() before a return or through 'defer my_type_destroy(my_var)') in the paths where you know that your variable is initialized.

So "A limited form of RAII" is probably the wrong way to think about it.

> and make sure to...call a destruction function

Which removes half the value of RAII as I see it—needing when and to know how to unacquire the resource is half the battle, a burden that using RAII removes.

Of course, calling code as the scope exits is still useful. It just seems silly to call it any form of RAII.

In my opinion, it's the initialization part of RAII which is really powerful and still missing from most other languages. When implemented properly, RAII completely eliminates a whole class of bugs related to uninitialized or partially initialized objects: if all initialization happens during construction, then you either have a fully initialized correct object, or you exit via an exception, no third state. Additionaly, tying resources to constructors makes the correct order of freeing these resources automatic. If you consume all your dependencies during construction, then destructors just walk the dependency graph in the correct order without you even thinking about it. Agreed, that writing your code like this requires some getting used to and isn't even always possible, but it's still a very powerful idea that goes beyond simple automatic destruction
This sounds like a nice theoretical benefit to a theoretical RAII system (or even a practical benefit to RAII in Rust), but in C++, I encounter no end of bugs related to uninitialized or partially initialized objects. All primitive types have a no-op constructor, so objects of those types are uninitialized by default. Structs containing members of primitive types can be in partially initialized states where some members are uninitialized because of a missing '= 0'.

It's not uncommon that I encounter a bug when running some code on new hardware or a new architecture or a new compiler for the first time because the code assumed that an integer member of a class would be 0 right after initialization and that happened to be true before. ASan helps here, but it's not trivial to run in all embedded contexts (and it's completely out of the question on MCUs).

I think you are both right, to some degree.

It's been some since I have used C++, but as far as I understand it RAII is primarily about controlling leaks, rather than strictly defined state (even if the name would imply that) once the constructor runs. The core idea is that if resource allocations are condensed in constructors then destructors gracefully handle deallocations, and as long you don't forget about the object (_ptr helpers help here) the destructors get called and you don't leak resources. You may end up with a bunch of FooManager wrapper classes if acquisition can fail (throw), though. So yes, I agree with your GP comment, it's the deterministic destruction that is the power of RAII.

On the other hand, what you refer to in this* comment and what parent hints at with "When implemented properly" is what I have heard referred to (non English) type totality. Think AbstractFoo vs ConcreteFoo, but used not only for abstracting state and behavior in class hierarchy, but rather to ensure that objects are total. Imagine, dunno, database connection. You create some AbstractDBConnection (bad name), which holds some config data, then the open() method returns OpenDBCOnnection() object. In this case Abstract does not even need to call close() and the total object can safely call close() in the destructor. Maybe not the best example. This avoids resources that are in an undefined state.

You're talking about the part of C++ that was inherited from C. Unfortunately, it was way too late to fix by the time RAII was even invented
And the consequence is that, at least in C++, we don't see the benefit you describe of "objects can never be in an uninitialized or partially-initialized state".

Anyway, I think this could be fixed, if we wanted to. C just describes the objects as being uninitialized and has a bunch of UB around uninitialized objects. Nothing in C says that an implementation can't make every uninitialized object 0. As such, it would not harm C interoperability if C++ just declared that all variable declarations initialize variables to their zero value unless the declaration initializes it to something else.

To be fair, RAII is so much more than just automatic cleanup. It's a shame how misunderstood this idea has become over the years
Can you share some sources that give a more complete overview of it?

I got out my 4e Stroustrup book and checked the index, RAII only comes up when discussing resource management.

Interestingly, the verbatim introduction to RAII given is:

> ... RAII allows us to eliminate "naked new operations," that is, to avoid allocations in general code and keep them buried inside the implementation of well-behaved abstractions. Similarly "naked delete" operations should be avoided. Avoiding naked new and naked delete makes code far less error-prone and far easier to keep free of resource leaks

From the embedded standpoint, and after working with Zig a bit, I'm not convinced about that last line. Hiding heap allocations seems like it make it harder to avoid resource leaks!

> Hiding heap allocations seems like it make it harder to avoid resource leaks!

Because types come in constructor / destructor pairs. When creating variables, you're forced to invoke a constructor, and when an the object's lifetime ends, the compiler will insert a destructor call for you. If you allocate on construction and de-allocate on destruction, it'll be very hard for the leak to happen because you can't forget to call the destructor

> with extra braces

The extra braces appear to be optional according to the examples in https://www.open-std.org/JTC1/SC22/WG14/www/docs/n3734.pdf (see pages 13-14)