by 1718627440 265 days ago
I hope for the former since then the functionality might become available as attributes in C too.

Might depend on the specific feature. If Clang is any indication I wouldn't hold out hope that (frontend?) features like lifetimes/borrow checking are easily transferable.

In any case, there is work on a Rust frontend for GCC (gccrs), but I haven't seen any discussion on whether any of the Rust-specific features can benefit other GCC frontends. Might be a bit premature for those discussions anyways since the frontend has yet to reach production readiness.

The compiler already has the concept of scope and variables, at least in the optimizer. Ownership tracking programs for C have existed for 20 years, so it can't be too hard to integrate it in the compiler, once it has been implemented for one language.

> The compiler already has the concept of scope and variables, at least in the optimizer.

My gut feeling is that while the high-level concepts might share a name I'm not actually sure if they're similar enough for useful transfer? The optimizer is working on a very different representation at a later stage of the compilation process, so I'm a bit skeptical about the level of similarity and/or transferability when you get down to details. I guess a more concrete example might be like comparing type inference with optimizer value range analysis - both will analyze a CFG, but beyond that they're working on different-enough representations that transforming work on the latter to useful work on the former seems unlikely to me (though I'm a nobody, so take that with an appropriate grain of salt).

For example, consider work on improved lifetime analysis in Clang [0], which seems to be discussing a from-scratch implementation based on concepts from Polonius and doesn't seem to reference anything from LLVM. And more generally, the fact that neither GCC nor Clang appear to have discussed reusing concepts from their optimizer passes to recreate the borrow checker or borrow checker-like functionality makes it seem more likely to me that there's some fundamental distinction and/or additional considerations that make such a project difficult.

> Ownership tracking programs for C have existed for 20 years

Could you give some examples? Not sure I've heard of anything that would fill the same niche as the borrow checker.

> so it can't be too hard to integrate it in the compiler, once it has been implemented for one language.

Sorry, I'm getting a bit confused here. When you say "integrate it in the compiler", by "it" do you mean the above mentioned "ownership tracking programs for C", or do you mean features implemented in the GCC Rust frontend?

In any case, as far as the borrow checker goes gccrs is currently planning on reusing rustc's borrowck implementation so that's going to be a bit of a hurdle to integrating similar functionality into other frontends. I don't know whether they plan on eventually writing an independent borrow checking implementation. Not sure if you had other features in mind, either.

[0]: https://discourse.llvm.org/t/rfc-intra-procedural-lifetime-a...

> My gut feeling is

You're probably right, I was saying it just can be transferred, with a large "just".

> the fact that neither GCC nor Clang appear to have discussed reusing concepts from their optimizer passes to recreate the borrow checker or borrow checker-like functionality

There is however a commitment from GCC, that every UB the compiler exploits must be reported by -fanalyzer, otherwise it's a compiler bug.

> Ownership tracking programs for C

I think Frama-C is a modern program: https://frama-c.com/.

Where I got to know the concept is SPlint: http://splint.org/ Last I checked the development stopped in 2010 and the implementation shipped in Debian was buggy, but there seems to be newer development on GitHub. The initial commit is from 2000-06-13.

I've given up on using it due to bugs, but I do use the annotation to specify, among other things, ownership semantics in C.

Any C API already documents ownership semantics, otherwise it's underspecified and can't be used. It's just specified in prose instead of code. The semantics are, however, often more complicated than a simple owning pointer. A common case, for example, is that whether ownership was transferred depends on the return value of the called function.
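
For comparison, a minimal sketch of how that return-value-dependent transfer can be written into a signature in Rust instead of into prose. `BoundedQueue` and `try_push` are hypothetical names made up for illustration, not taken from any API discussed here. On success the callee keeps the value; on failure it hands the value back inside the `Err`.

```rust
/// Hypothetical bounded queue, for illustration only.
struct BoundedQueue {
    items: Vec<String>,
    capacity: usize,
}

impl BoundedQueue {
    fn new(capacity: usize) -> Self {
        BoundedQueue { items: Vec::new(), capacity }
    }

    /// Number of items the queue currently owns.
    fn len(&self) -> usize {
        self.items.len()
    }

    /// `Ok(())`: the queue took ownership of `item`.
    /// `Err(item)`: the queue was full and ownership goes back to the caller.
    fn try_push(&mut self, item: String) -> Result<(), String> {
        if self.items.len() < self.capacity {
            self.items.push(item);
            Ok(())
        } else {
            Err(item)
        }
    }
}
```

The standard library uses the same shape in places: `std::sync::mpsc::Sender::send` returns the unsent value inside `SendError<T>` when the send fails.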

There are C APIs out there without the necessary documentation, but you can't actually use them, without either introducing leaks, use-after-free bugs or reading the source code.

> Sorry, I'm getting a bit confused here. When you say "integrate it in the compiler", by "it" do you mean the above mentioned "ownership tracking programs for C", or do you mean features implemented in the GCC Rust frontend?

"it" means ownership tracking implementation intended for Rust in the frontend.

> I was saying it just can be transferred, with a large "just".

I'm not really convinced it can be transferred at all, but I'm not convinced it can't be either. The "just" feels so large as effectively

> There is however a commitment from GCC, that every UB the compiler exploits must be reported by -fanalyzer, otherwise it's a compiler bug.

Huh, don't think I've heard that commitment before. Do you mean that the GCC devs intend for -fanalyzer to (eventually?) guarantee catching all exploitable UB (which would be... ambitious, to say the least), or that -fanalyzer is a best-effort analysis? The docs currently state the latter more or less ("It is neither sound nor complete: it can have false positives and false negatives.") but that doesn't necessarily rule out attempts to make it so later (though that feels like it should run into Rice's theorem and/or false positive rate issues and/or require code alterations).

The closest thing I heard of is something about Clang/LLVM aiming to catch all the UB it exploits using sanitizers, but that's done at runtime so it's a lot easier to be precise about what you catch.

> I think Frama-C is a modern program: https://frama-c.com/.

Ah. I suppose that counts, though I would probably describe Frama-C as more than just an ownership tracking program given its other capabilities. I guess it technically could fill the same niche as the borrow checker, though given its capabilities and what's needed to use it I think there's probably not a lot of practical overlap in use cases.

> Where I got to know the concept is SPlint: http://splint.org/

Haven't heard of that one before. It does look like it can provide (some?) similar capabilities, though perhaps not to the same level of soundness as what the borrow checker provides. From one of the papers linked on the website [0]:

> In real programs it is sometimes necessary to use weaker assumptions about memory use. The `owned` annotation denotes a reference with an obligation to release storage. Unlike `only`, however, other external references (marked with `dependent` annotations) may share this object. It is up to the programmer to ensure that the lifetime of a `dependent` reference is contained within the lifetime of the corresponding `owned` reference.

It's also not quite clear to me whether Splint can cover more "interesting" borrow checker cases like those involving named lifetimes or view structs, but given this is the first time I've heard of it I definitely don't have the experience or knowledge to say for sure.
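
A small sketch of what those two terms usually refer to, assuming "view struct" means a struct that holds borrows of only some fields of a larger struct. `Document`, `TitleAndBody` and `split` are hypothetical names invented for this example; `'doc` is a named lifetime that ties the view to the value it borrows from.

```rust
/// Hypothetical type, for illustration only.
struct Document {
    title: String,
    body: String,
    word_count: usize,
}

/// A "view struct": it borrows `title` (shared) and `body` (exclusive),
/// and says nothing about `word_count`.
struct TitleAndBody<'doc> {
    title: &'doc str,
    body: &'doc mut String,
}

impl Document {
    /// Splits one `&mut Document` into a view plus a separate borrow of
    /// the remaining field. The borrow checker accepts this because the
    /// three fields are disjoint.
    fn split<'doc>(&'doc mut self) -> (TitleAndBody<'doc>, &'doc mut usize) {
        let view = TitleAndBody {
            title: self.title.as_str(),
            body: &mut self.body,
        };
        (view, &mut self.word_count)
    }
}

impl TitleAndBody<'_> {
    /// Appends a space and the title to the body, using only the view.
    fn append_title(&mut self) {
        self.body.push(' ');
        self.body.push_str(self.title);
    }
}
```

The view and the `word_count` borrow can be used side by side, and `doc` becomes usable again as a whole once both are gone.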

> There are C APIs out there without the necessary documentation, but you can't actually use them, without either introducing leaks, use-after-free bugs or reading the source code.

Sure, and that's what makes analysis so practically difficult. Whole program analysis doesn't scale well, standard C doesn't have enough information for cheap inference, etc., etc.

> "it" means ownership tracking implementation intended for Rust in the frontend.

In that case I'm not sure if gccrs would provide the implementation you hope for since they currently plan on integrating rustc's borrow checker implementation as-is. I'm not aware of a desire to write an independent borrow checker implementation at the moment either.

[0]: https://www.cs.virginia.edu/~evans/pubs/pldi96.pdf

> Huh, don't think I've heard that commitment before. Do you mean that the GCC devs intend for -fanalyzer to (eventually?) guarantee catching all exploitable UB (which would be... ambitious, to say the least), or that -fanalyzer is a best-effort analysis? The docs currently state the latter more or less ("It is neither sound nor complete: it can have false positives and false negatives.")

Both actually. Any UB exploits not caught by -fanalyzer would need to be disabled. However, I can't find a reference to this, so maybe my memory is deceiving me.

When writing Frama-C what I was thinking of was actually PVS-Studio (https://pvs-studio.com/), as this can also be used by students. It's also more of a standalone linter.

>> It is up to the programmer to ensure that the lifetime of a `dependent` reference is contained within the lifetime of the corresponding `owned` reference.

Yes, this is an escape hatch, when the pointer shenanigans can't be fully described. But I heard Rust also has those. If you want your program to be described by ownership semantics, you will make use of this less and less.
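
For reference, a minimal sketch of the closest Rust counterpart: raw pointers carry no lifetime, so the borrow checker does not track them, and dereferencing one requires `unsafe`. `Dependent` is a hypothetical name chosen to echo the Splint annotation; it is not a Rust or Splint API.

```rust
/// Hypothetical "dependent" handle: points at a value it does not own,
/// with no lifetime for the compiler to check.
struct Dependent {
    ptr: *const i32,
}

impl Dependent {
    /// Creating the raw pointer is safe; `&i32` coerces to `*const i32`.
    fn new(owner: &i32) -> Self {
        Dependent { ptr: owner }
    }

    /// # Safety
    /// The caller must guarantee that the value passed to `new` is still
    /// alive and has not been moved. The compiler does not verify this.
    unsafe fn get(&self) -> i32 {
        unsafe { *self.ptr }
    }
}
```

As with `dependent`, keeping the pointer's use inside the owner's lifetime is entirely up to the programmer here.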

> named lifetimes support

I don't know enough Rust, but from what I read at https://doc.rust-lang.org/book/ch10-03-lifetime-syntax.html, yes it does. Specifying that the 'lifetime' of the return value corresponds to a parameter happens with the `returned` annotation. The cool thing in SPlint is that you can describe the lifetime of param1.foo.bar[0..42]. It also has several kinds of 'lifetimes': allocated, readable and writable, which is useful for representing uninitialized memory, e.g. that after a function call some memory is newly uninitialized that wasn't before the call. You can also combine this with parameters, so you can say that param1.baz[0..param2] is writable and param1.baz[0..param3] is readable, and also that readable param1.baz[0..X] and writable param1.baz[0..Y] always means that X > Y.
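
A rough sketch of the nearest Rust equivalents for two of these points, under the assumption that `returned` corresponds to tying the result's lifetime to one parameter. `find_from` and `mirror_prefix` are hypothetical helper names. The allocated/initialized distinction is not covered here.

```rust
/// The result borrows from `haystack` only (lifetime `'h`), much like a
/// parameter marked `returned`. `needle` may be dropped right after the call.
fn find_from<'h>(haystack: &'h str, needle: &str) -> Option<&'h str> {
    haystack.find(needle).map(|i| &haystack[i..])
}

/// Treats `buf[..n]` as readable and `buf[n..]` as writable at the same
/// time by splitting one exclusive borrow into two disjoint sub-ranges.
/// Panics if `n > buf.len()`, which is `split_at_mut`'s documented behavior.
fn mirror_prefix(buf: &mut [u8], n: usize) {
    let (readable, writable) = buf.split_at_mut(n);
    // Copy as many bytes as both halves can accommodate.
    for (dst, src) in writable.iter_mut().zip(readable.iter()) {
        *dst = *src;
    }
}
```

The range bounds are runtime values, so relationships such as "X > Y" between ranges are enforced by bounds checks and panics rather than by the type system.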

It doesn't use the term 'lifetime', but talks about owned, allocated, initialized, readable and writable memory. In addition it also supports adding other properties, so much more than 'lifetimes' can be tracked. The manual shows as an example how it can be used to track variables that are tainted by user input (10.1). What I think is missing, though, are conditionals on the return value.

How many of these features can be expressed in Rust? (Honest question)

> view struct support

I don't really know what these are. Maybe I already described that above?