A curiously recurring lifetime issue | HN Mirror

Y	Hacker News new \| ask \| show \| jobs

	A curiously recurring lifetime issue (blog.dureuill.net)
	79 points by irsagent 912 days ago

10 comments

kentonv 910 days ago

Author of Cap'n Proto here.

The main innovation of Cap'n Proto serialization compared to Protobuf is that it doesn't copy anything, it generates a nice API where all the accessor methods are directly backed by the underlying buffer. Hence the generated classes that you use all act as "views" into the single buffer.

C++, meanwhile, is famously an RAII lanugage, not garbage-collected. In such languages, you have to keep track of which things own which other things so that everyone knows who is responsible for freeing memory.

Thus in an RAII language, you generally don't expect view types to own the underlying data -- you must separately ensure that whatever does own the backing data structure stays alive. C++ infamously doesn't really help you with this job -- unlike Rust, which has a type system capable of catching mistakes at compile time.

You might argue that backing buffers should be reference counted and all of Cap'n Proto's view types should hold a refcount on the buffer. However, this creates new footguns. Should the refcounting be atomic? If so, it's really slow. If not, then using a message from multiple threads (even without modifying it) may occasionally blow up. Also, refcounting would have to keep the entire backing buffer alive if any one object is pointing at it. This can lead to hard-to-understand memory bloat.

In short, the design of Cap'n Proto's C++ API is a natural result of what it implements, and the language it is implemented in. It is well-documented that all these types are "pointer-like", behaving as views. This kind of API is very common in C++, especially high-performing C++. New projects should absolutely choose Rust instead of C++ to avoid these kinds of footguns.

In my experience each new developer makes this mistake once, figures it out, and doesn't have much trouble using the API after that.

foxhill 910 days ago

apologies, perhaps i’m missing something here, having not used cap’n proto in any context at all before.

is it not possible to delete the rvalue reference overload of ‘getList’?

as far as i can tell, the error producing code wouldn’t have produced a diagnostic, but failed to build in the first instance, like the rust case?

kentonv 910 days ago

That would catch some legitimate use cases, where you get the list and immediately use it on the same line. Admittedly this is not so common for lists, but very common for struct readers, e.g.:

    int i = call.send().getSomeStruct().getValue();

Here, even though `send()` returns a response that is not saved anywhere, and a struct reader is constructed from it, the struct reader is used immediately in the same line, so there's no use-after-free.

Someone else mentioned using lifetimebound annotations. This will probably work a lot better, avoiding the false positives. It just hadn't been done because the annotations didn't exist at the time that most of Cap'n Proto was originally written.

kentonv 910 days ago

Oh actually there's a much more obvious case where prohibiting getters on rvalues would be a problem. It would prevent you from doing this in general:

    myReader.getFoo().getBar()

Here, `myReader` is already a view type; ownership of the backing buffer lives elsewhere. `getFoo()` returns a reader for some sub-struct, and `getBar()` returns a member of that struct. If we say getters are not permitted to be called on rvalues, this expression is illegal, but there's no actual problem with it and in practice we write code like this all the time.

foxhill 910 days ago

i could be wrong, but i’m reasonably confident that this is UB for even trivial types? someone more knowledgeable with the language lawyering would need to opine one way or the other.

regardless of that outcome, i think i’d prefer to require a value preserving the lifetime of the reader/view. in the cases that it may not be necessary, i'd prefer to lean on the optimiser to take care of it..!

kentonv 910 days ago

What's UB about it? Any temporary objects constructed during the evaluation of a statement live until the end of the statement. The standard is clear on that.

> i think i’d prefer to require a value preserving the lifetime of the reader/view. in the cases that it may not be necessary, i'd prefer to lean on the optimiser to take care of it..!

We'd all prefer APIs that cannot be used unsafely but realistically there's no magic the optimizer can do to make the problems with refcounting go away. You need to use a language like Rust to solve this.

foxhill 910 days ago

ah, sorry, i didn’t read that correctly.

perhaps for values like this you’re fine. i think my point still stands about the reader of a built-in list/sequence type, surely?

and, not to sound facetious, that’s exactly what optimisers do :)

the c++ type system is more than capable about reasoning about lifetimes, the issue is that, with c++, it’s an optional part of the language. also, the lack of non-destructive moves. but to require both of those things in the language would require, essentially, the borrow checker in rust.

IshKebab 911 days ago

> Is this on Cap'n Proto? Honestly, I don't know.

It absolutely is. It's a fairly basic principle that APIs should be difficult to misuse, and that fact that you made the same mistake 2/3 times shows that it is very easy to misuse. In other words it is a badly designed API. At the very least it should be called ListView.

I am not a big fan of CapnProto. It has some neat features but it's very complicated, full of footguns like this, and the API is extremely unergonomic.

dmeybohm 910 days ago

Yeah I agree, the ListView naming is more appropriate.

I haven't used non-owning types like string_view or span too much because I haven't needed that level of performance or memory optimization yet, and so those just seem like footguns as compared to just a reference without those needs. I do like to use a technique in classes that use non-owning references that would work for those too to prevent this particular problem.

For that, there are two methods with the same name, but different access - an lvalue version and an rvalue version. Then, you delete the rvalue method like this:

  class Response {
    auto getListView() & -> ListView {
      return ListView(m_List);
    }
    void getListView() && = delete;
  }

Then you get a compile error like in Rust when you try to call getListView() from a temporary object, but if you call the method from an lvalue it still works at least as long as the object is in scope.

bsder 910 days ago

> In other words it is a badly designed API.

I don't agree. The API is what it is because it is specifically a zero copy API for performance. If you don't care about performance, why are you using C++ (stupid) and a zero-copy API (doubly stupid)?

I absolutely do NOT expect a zero copy API to own things. If I drop the underlying reference that is really an alias, how on earth is that the fault of the zero copy API?

The combination of aliasing and lifetimes are C++ footguns--full stop. This is aptly demonstrated by how quickly Rust kills this cold.

If you use sharp knives, sometimes you cut your fingers. People like you would claim the knife is the problem.

IshKebab 910 days ago

I think you misunderstood. It's not a bad API because it uses non-owning references. It's a bad API because it doesn't make that clear.

> If you use sharp knives, sometimes you cut your fingers. People like you would claim the knife is the problem.

This is more like a cutting yourself on a razor sharp butter knife. If it's a sharp knife it should look like a sharp knife.

nyanpasu64 911 days ago

I'd say that method chaining (referential transparency, etc.) and implicit destructor calls with side effects don't mix.

I have a general rule that "resource" types which own a heap allocation should usually be given a variable name with explicit scope (and likely even an explicit type, rather than `auto response` like in this post). This is a general guideline to avoid holding a reference to a temporary that gets destroyed, but doesn't protect against returning a dangling reference into a resource type from a function.

In other places, where languages make the opposite decision (from this blog post) to extend the lifetime of a temporary variable with a destructor when you call methods on it, you get things like C++'s temporary lifetime extension (not a bug, note that I don't understand it well), and footguns like Rust's `match lock.lock().something {}` (https://github.com/rust-lang/lang-team/blob/master/design-me...).

phendrenad2 911 days ago

Referential transparency be damned, I guess. This feels like a inherent downside to languages where you have to manage lifetimes.

GuB-42 910 days ago

The problem I see here is that one of the functions returns a pointer and it doesn't use the usual pointer syntax.

I see no *, no & and no -> in the code. So I would assume everything to behave as if it was owned or even copied. Had it returned actual pointers, or pointer-like objects like iterators, it would have been more obvious.

amluto 910 days ago

This is C++ we’re talking about.

    auto x = y();

Is x a pointer or reference? There’s no way to tell. Maybe if you then do

    x->foo();

You have some idea that x is pointer-ish, but unique_ptr works like this and isn’t very pointer-ish.

HarHarVeryFunny 910 days ago

Seems like someone trying to be too clever to me, and perhaps a case of premature optimization. Non-owning references are a problem waiting to happen. Even if your language/api allows you to check if the reference is still valid before use, you can obviously forget to do so.

Rather than use a non-owning reference I'd rather use a design that didn't need it, or just use a std::shared_ptr owning reference instead. I realize there are potential cases (i.e. one can choose to design such cases) of circular references where a non-owning reference might be needed to break the circular chain, or where one wants a non-owning view of a data structure, but without very careful API design and code review these are easy to mess up.

kentonv 910 days ago

This sounds nice but it just isn't realistic. If you try to write a complex system in C++ without non-owning references, you're basically heap-allocating every single object and using slow atomic refcounting everywhere. Performance will likely be much worse than just using a garbage collected language to start with.

HarHarVeryFunny 910 days ago

Worrying about the speed of smart pointer reference counting sounds like premature optimization at best. If this is really what's slowing your app down in 2023, then you've got bigger problems than choosing owning vs non-owning references.

kentonv 910 days ago

Again, if you aren't worried about this you might as well be using a garbage collected language. They are generally faster than using atomic reference counting for everything. But people use C++ (or Rust) for a reason.

FWIW I don't work on an "app", I work on a cloud platform that hosts web apps, operating in hundreds of datacenters and serving millions of requests per second. Small optimizations are worth a lot of money here -- and garbage collection is a nonstarter in this kind of systems work, for more reasons than just performance.

HarHarVeryFunny 910 days ago

Well, there are exceptions to everything, but for the vast majority of use cases the emphasis should be on writing code that is easy to get right and maintain, which for C++ means using smart pointers as your default choice. I say this as someone who started programming in assembler in the 1970's and is still doing so today in C++. Counting instruction cycles was fun, but nowadays having the luxury of normally not having to worry about it is a massive productivity gain.

Maybe you're advocating for Rust, but in C++ you can of course always fall back to raw pointers for efficiency, while retaining well defined (e.g. class/scope based) lifetimes to avoid bugs.

kentonv 910 days ago

> in C++ you can of course always fall back to raw pointers for efficiency, while retaining well defined (e.g. class/scope based) lifetimes to avoid bugs.

This is certainly what I'd recommend wherever possible. It's OK to give Foo a reference to Bar as long as Foo is declared after Bar in some common scope, so will be destroyed first. If you find it's overly difficult to prove this relationship then you need to refactor your code or use some reference counting.

And then of course Rust actually checks the relationships for you.

dataflow 911 days ago

I think [[clang::lifetimebound]] would let the compiler detect this at compile time?

jimberlage 911 days ago

Is there a good tutorial on Valgrind for beginners? It’s a tool I’ve only ever seen praised so I’m curious to play with it a bit.

kentonv 910 days ago

In my experience all you really have to do is:

1. Write a test program that exercises your segfault.

2. Build it in debug mode.

3. Run `valgrind <my-program>`

And it just tells you where your problem is. Not much more to it.

gavinhoward 911 days ago

As an avid Valgrind user, I don't know of one.

But if you would like, I could post one to my blog.

If you would like me to, contact me privately. [1]

[1]: https://gavinhoward.com/contact/

plagiarist 911 days ago

My takeaway is just the last paragraph, it sounds like Cap'n Proto is a footgun and I should use anything else.

mungaihaha 911 days ago

Nice read