by dparoski 4336 days ago
First, I want to thank you, Paul, for writing this critique. It's great to read well thought-out feedback about the draft spec that was announced yesterday.

Before getting into technical nitty gritty, I wanted to clarify that the spec in its current state is a draft offered to the PHP community as a starting point for specifying the PHP language. It now belongs in the php-src repository and can be updated through the standard commit processes for that repository as the community sees fit. Some decisions about specificity were made for the initial draft, but these decisions are by no means final and the hope is that the community will settle on what's right for the PHP ecosystem overall.

Regarding RAII, I'd argue that __destruct methods in PHP are a bit different from stack-allocated variables in C++. Stack-allocated variables in C++ have a strictly defined lifetime based on the scope of the variable. In PHP, on the other hand, objects are heap allocated, and when they are destroyed is not as well defined. For example, you can have a cycle of two or more objects pointing to each other that are unreachable (i.e. cyclical garbage), and in such cases any __destruct methods for these objects are not immediately invoked when the objects become unreachable. I considered requiring refcounting-based automatic memory management for the initial draft of the memory model, but describing cyclical garbage in detail felt really implementation specific, so at a gut level it seemed better to not require RC-based automatic memory management and see how people reacted.
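
To make the cycle case concrete, here is a minimal sketch. It is written in Python rather than PHP, on the assumption that CPython's refcounting plus cycle collector is a close enough stand-in for the behavior described above; the Node class and the log list are hypothetical names, and __del__ plays the role of __destruct.

```python
import gc

log = []  # records which finalizers have run so far


class Node:
    """Hypothetical class; __del__ stands in for PHP's __destruct."""

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        log.append(self.name)


gc.disable()  # keep the cycle collector from running on its own schedule

# Case 1: no cycle. The finalizer runs as soon as the last reference goes away.
a = Node("acyclic")
del a
acyclic_done_immediately = (log == ["acyclic"])

# Case 2: two objects pointing at each other. Dropping the names still leaves
# each object referenced by the other, so nothing is finalized yet.
x, y = Node("x"), Node("y")
x.other, y.other = y, x
del x, y
cycle_pending = (log == ["acyclic"])

gc.collect()  # the cycle collector finds the unreachable pair and finalizes it
cycle_done_after_collect = (sorted(log) == ["acyclic", "x", "y"])

gc.enable()
```

The order in which the two cyclic objects are finalized is not something the sketch relies on, which is why the final check sorts the log.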

Based on my personal programming tastes, I'd argue that try/finally is a cleaner, more robust way to ensure certain cleanup happens when a scope is exited rather than relying on __destruct. However, I understand that there are some PHP programs out there that rely on __destruct being invoked eagerly in non-cyclical-garbage cases, and that such code will probably exist in the wild for quite a while regardless of whether try/finally is "superior" or not. I'm curious to see how this issue settles over time.

For the record, HHVM uses RC-based automatic memory management and will eagerly call __destruct on objects that become unreachable that are not part of cyclical garbage. The initial choice to be a bit more liberal about reclamation was not essential to make sure that HHVM was compliant with the spec.


RAII simply requires that resource allocation is tied to object lifetime. It requires that destructors are called deterministically and immediately once all references to an object no longer exist. (In the case of cycles, a reference remains until otherwise broken -- a non-issue for this definition).

I personally prefer RAII to finally for object-related cleanup because finally is fallible. If you have a File object instance whose destructor calls fclose() automatically then I don't have to remember to call a Close() method inside a finally block. It happens automatically. But if you don't have RAII then you must remember to put in try blocks and finally clauses everywhere you create an instance. Multiply that by every database connection, network connection, and file and that is a lot of work and potential to miss something. And finally doesn't work at all if your object lifetimes aren't tied to scope.
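
A rough side-by-side of the two styles, sketched in Python on the assumption that CPython's refcounting behaves like the eager destruction being discussed. ManagedFile and both helper functions are hypothetical names, not anything from PHP or the spec.

```python
import os
import tempfile


class ManagedFile:
    """Hypothetical wrapper whose finalizer closes the underlying handle."""

    def __init__(self, path):
        self.handle = open(path, "w")

    def write(self, text):
        self.handle.write(text)

    def __del__(self):
        self.handle.close()


def write_with_finalizer(path):
    f = ManagedFile(path)
    f.write("raii\n")
    # No explicit close: under refcounting, f is finalized when the function
    # returns. The raw handle is returned only so the caller can inspect it.
    return f.handle


def write_with_finally(path):
    handle = open(path, "a")
    try:
        handle.write("finally\n")
    finally:
        handle.close()  # the caller of open() has to remember this
    return handle


fd, path = tempfile.mkstemp()
os.close(fd)

h1 = write_with_finalizer(path)
h2 = write_with_finally(path)
closed_by_finalizer = h1.closed
closed_by_finally = h2.closed

with open(path) as check:
    contents = check.read()
os.remove(path)
```

Both handles end up closed; the difference is where the obligation lives. In the first version it sits once in the class, in the second it sits at every place a file is opened.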

Finally is inferior to RAII in almost all cases except where you aren't using nicely defined objects. If you use an fopen() call directly, finally is your only recourse to fclose() it properly. The only criticism of RAII is that it does limit concurrency and garbage collection options.

Hmm, IMHO it feels like you're bending the definition of "lifetime" a bit in a manner that already presumes RC-based automatic memory management, and once one accepts that assumption then naturally one concludes that the lifetime ends when refcounting says it ends. To me, the lifetime of a heap allocated thing effectively ends when it is no longer reachable via any existing variables (though perhaps this definition is biased towards a tracing-GC view of the world).

I agree with your points about "finally". It can definitely be a bit clunky, and this is why some languages have introduced scoped cleanup constructs such as C#'s "using" statement, Python's "with" statement, and D's "scope" statement. I guess what I was getting at with my original comment is that I feel scoped cleanup constructs are a better way to go vs. relying on heap allocated things being reclaimed at a certain time, given that the lifetime of a heap allocated thing can depend on non-local state outside of the current function/method.
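
A small sketch of the non-local-state point, using Python's "with" statement since it is one of the constructs named above. Connection and registry are hypothetical names. The cleanup runs when the block exits, even though another reference keeps the object itself alive.

```python
class Connection:
    """Hypothetical resource with explicit scoped cleanup."""

    def __init__(self):
        self.open = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.open = False
        return False  # do not swallow exceptions


registry = []  # non-local state that may hold extra references


def use_connection():
    with Connection() as conn:
        registry.append(conn)  # some other code keeps a reference
        was_open_inside = conn.open
    return was_open_inside


was_open_inside = use_connection()
still_referenced = (len(registry) == 1)
closed_after_scope = (registry[0].open is False)
```

With destructor-based cleanup the extra reference in registry would keep the connection open until that reference is dropped, which is the dependence on state outside the current function that the comment is worried about.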

I completely agree with your definition. All that RAII needs is that the destructor be called the instant that the object is no longer reachable. With reference counting, that is the case (the object is freed when the refcount reaches zero). But with many other forms of GC (including tracing) that isn't the case -- a process eventually cleans up unreachable objects and the order of that cleanup is indeterminate.

Even scoped cleanup constructs are problematic. I often forget to use them when needed in C#, and it's often hard to tell whether an object needs one. Furthermore, implementers of IDisposable in C# have to write a lot of boilerplate code[1] to do disposal correctly and cascade disposing to every contained object (again determining if it's necessary). It's also not change friendly -- a class might not need to be disposed today but if changed later all the existing instantiating code won't have guards.

I'm not sure why you think the lifetime of heap allocated objects under RAII is an issue -- they'll just clean themselves up when they're not needed. It's much less worry and code. What issue do you think exists?

As for scoped cleanup constructs, they are just a hack for languages that can't do RAII -- they have no other benefit.

[1] http://msdn.microsoft.com/en-us/library/b1yfkh5e%28v=vs.110%...

Well, for PHP's fopen() specifically, fclose() would happen automatically as soon as the variable containing the result of fopen() goes out of scope. Unless you shared it with some other code, "properly" would happen automatically here.