HN Mirror


	by wg0 612 days ago
	Otherwise it is a decent language, but what makes it difficult is the borrow semantics and lifetimes. Lifetimes are the more complicated part to get your head around. But then there's Arc, Ref, Pinning and whatnot. How deep does that rabbit hole go?

5 comments

junon 611 days ago

Context: I'm writing a novel kernel in Rust.

Lifetimes aren't bad, though the learning curve is admittedly a bit steep. Post-v1 Rust significantly reduced the number of places you need to write them, and a recent update lets you elide them even more, if memory serves.
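
To make the elision point concrete, here is a minimal sketch. The function names are made up for illustration; the first two signatures mean the same thing, and only the third actually needs an annotation because there are two reference inputs to choose from.

```rust
// Explicit form: the returned slice borrows from `s`.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

// Same signature with the lifetime elided; the compiler infers it
// because there is exactly one reference parameter.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// Two reference inputs: elision cannot decide which one the result
// borrows from, so the annotation has to be written out.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    println!("{}", first_word_explicit("hello world"));
    println!("{}", first_word("hello world"));
    println!("{}", longest("kernel", "pin"));
}
```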

Arc isn't any different than in other languages. I'm not sure what you're referring to by Ref, but a reference is just a pointer with added semantic guarantees. Pin isn't necessary unless you're doing async (not a single Pin shows up in the kernel thus far, and I can't imagine why I'd have one going forward).

mcherm 611 days ago

Would you not want to use Pin when sharing memory with a driver or extension written in a different language (eg: C)?

bombela 611 days ago

Pin is a pure compile-time abstraction for a single problem: memory safety of self-referential structs.

Pin leverages the type system to tell the programmer who receives a pointer to a pinned object that the object holds a pointer to itself (a self-referential struct). You had better be mindful not to move this object to a different memory location unless you know for sure that it is safe to do so. The Pin abstraction makes this harder to forget and easier to notice during code review, by forcing you to use the keyword unsafe for any operation on the pinned object that could move it.

In C, there is no such way to warn the programmer besides documentation. It is up to the programmer to be very careful.
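
A rough sketch of what that looks like in practice. `Tracked` is a hypothetical stand-in for a self-referential struct (it holds no real self-pointer, to keep the example short); the point is only that getting a plain `&mut` out of the pin takes an `unsafe` block.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type standing in for a self-referential struct.
struct Tracked {
    value: u32,
    // Opts the type out of `Unpin`, marking it as address-sensitive.
    _pin: PhantomPinned,
}

fn bump(t: Pin<&mut Tracked>) -> u32 {
    // The safe `Pin::get_mut` is unavailable here because `Tracked` is `!Unpin`.
    // SAFETY: we only modify a field in place and never move the value.
    let inner = unsafe { t.get_unchecked_mut() };
    inner.value += 1;
    inner.value
}

fn main() {
    let mut boxed: Pin<Box<Tracked>> = Box::pin(Tracked {
        value: 1,
        _pin: PhantomPinned,
    });
    let addr_before = &*boxed as *const Tracked;
    let v = bump(boxed.as_mut());
    let addr_after = &*boxed as *const Tracked;
    println!("value = {v}, same address = {}", addr_before == addr_after);
}
```

Nothing here exists at runtime beyond an ordinary heap allocation, which is why the information is lost once the pointer crosses into C.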

junon 610 days ago

Nah, not really. Pin is typically for self-referential data. It's compile-time only, so that information would get lost in C anyway, and there's no way to distinguish that data at runtime.

The kernel is doing so much anyway with memory maps and flipping in / out pages for scheduling and context switching that Pin doesn't add any value in such cases anyway.

It was also specifically built for async rust. I've never personally seen it in the wild in any other context.

baq 611 days ago

If you’re writing C and don’t track ownership of values, you’re in a world of hurt. Rust makes you do from day one what you could do in C but unless you have years of experience you think it isn’t necessary.

wg0 611 days ago

Okay, I think it is more like TypeScript. You hate it, but one day you write a small JS program and convert it to TypeScript, only to discover that static analysis alone reveals many code paths that would have resulted in uncaught errors, and from then on you always feel very uncomfortable writing plain JavaScript.

But what about tools like valgrind in context of C?

rcxdude 611 days ago

Valgrind can only tell you about issues that your test cases exercise. It doesn't provide the same guarantees as static checking of memory safety invariants. But if you're really concerned (especially about unsafe code), belt-and-braces is a good strategy, and Valgrind will work with Rust binaries as well. Rust also has a tool called Miri which can similarly flag issues in test cases. It's effectively an interpreter for the compiler's intermediate representation, and it can detect undefined behaviour even if the compiled assembly would happen to look OK. It still has the same limitation of needing extensive test cases, though.

tracker1 611 days ago

A few years ago I was working on a Byzantine mess of a JS project. I converted everything to TS for the sole reason of somewhat safely refactoring the project as a whole.

The original was so fragile, and trust in it so low, that it took a few months to convince everyone the refactored TS branch was safe.

After that, feature development was a lot faster in terms of productivity again.

baq 611 days ago

You probably should run your rust programs through valgrind regardless. Rust is safer than C, but any unsafe code drops you to approximately C level of safety and any C FFI calls are obviously outside of rust's control or responsibility.

badmintonbaseba 611 days ago

Valgrind is great, especially if you write extensive tests and you actually run them through it regularly. And even then, it does not prove the absence of any kind of bugs. Safe rust has strong guarantees.

jimbomins 611 days ago

frama-c

junon 610 days ago

Which is terrible for kernel development, and is generally very hard to work with, unfortunately.

metalloid 611 days ago

It was true until LLMs arrived. Future compilers and IDEs can be integrated with LLMs to help programmers.

Rust was a great idea before LLMs, but I don't see the motivation for Rust when LLMs can be the initial solution for C/C++ 'problems'.

smolder 611 days ago

Relying on LLMs to code for you in no way solves the safety problem of C/C++ and probably worsens it.

BrainInAJar 611 days ago

probably?

even if the LLM is trained on flawless C code (which it isn't) it still has no way of reasoning about a complex system, it's just "what token is statistically most likely to come next"

smolder 610 days ago

I said that because it's very possible for someone to write a more flawed program without an LLM's help. The exact probabilities weren't central to my point.

ulbu 611 days ago

Rust compiler checks things for you. People trust the Rust compiler because it enforces rules they want, so people don’t have to be in its place. Your suggestion is to be that checker to LLM-generated code. Back to square one.

baq 611 days ago

On the contrary LLMs make using safe but constraining languages easier - you can just ask it how to do what you want in Rust, perhaps even by asking it to translate C-ish pseudocode.

oersted 611 days ago

I don’t entirely agree, you can get used to the borrow checker relatively quickly and you mostly stop thinking about it.

What tends to make Rust complex is advanced use of traits, generics, iterators, closures, wrapper types, async, error types… You start getting these massive semi-autogenerated nested types, the syntax sugar starts generating complex logic for you in the background that you cannot see but have to keep in mind.

It’s tempting to use the advanced type system to encode and enforce complex API semantics, using Rust almost like a formal verifier / theorem prover. But things can easily become overwhelming down that rabbit hole.

jonathanstrange 611 days ago

It's just overengineered. Many Rust folks don't realize it because they come from C++ and suffer from Stockholm Syndrome.

junon 610 days ago

How is it overengineered?

jonathanstrange 609 days ago

That's my personal opinion after I've learned it and read Klabnik's book. I'm aware that other people's mileage differs. I'm listing a few reasons below.

- Overall too complex

- Wrong philosophy: demanding the user to solve problems instead of solving problems for the user

- Trying to provide infinite backwards compatibility with crates, which leads to hidden bitrot

- Slow compilation times

- Claims to be "safe" but allows arbitrary unsafe code, and it's everywhere.

- Adding features to fix misfeatures (e.g. all that lifetime cruft; arc pointers) instead of fixing the underlying problem

- Hiding implementations with leaky abstractions (traits)

- Going at great length to avoid existing solutions so users re-invent it (e.g. OOP with inheritance; GC), or worse, invent more complex paradigms to work around the lack (e.g. some Rust GUI efforts; all those smart pointer types to work around the lack of GC)

- A horrendous convoluted syntax that encourages bad programming style: lots of unwrap, and_then, etc. that makes programs hard to read and audit.

- Rust's safe code is not safe: "Rust’s safety guarantees do not include a guarantee that destructors will always run. [...] Thus, allowing mem::forget from safe code does not fundamentally change Rust’s safety guarantees."

It already has similar complexity and cognitive demands as C++ and it's going to get worse. IMHO, that's also why it's popular. Programmers love shitty languages that allow them to show off. Boring is good.

selfmodruntime 606 days ago

> Claims to be "safe" but allows arbitrary unsafe code, and it's everywhere.

Sigh. This is not true. Not the first part, and especially not the last part. `Unsafe` doesn't allow arbitrary, unsafe code. It resets the compiler to a level where most manually managed languages are all the time. You still have to uphold all guarantees the compiler provides, just manually. That's why Miri exists.

jonathanstrange 606 days ago

Either it's safe or it's unsafe. If you use the keyword "unsafe" it should definitely not mean "safe" (and it doesn't, but you seem to suggest it).

junon 609 days ago

> Overall too complex

Completely subjective. I've learned all there is to learn about Rust's syntax and most of its standard libraries, I think, and it's really not all that, in my personal opinion. There are certainly much more complex languages out there, even dynamic languages. I'd argue Typescript is more complex than Rust as a language.

> Wrong philosophy: demanding the user to solve problems instead of solving problems for the user

I have no idea what you mean by this. Do you mean you want more magic?

> Trying to provide infinite backwards compatibility with crates, which leads to hidden bitrot

Backwards compatibility reduces bitrot. Bitrot is when the ecosystem has moved on to a point of not supporting features used by stale code, thus making the code partially or completely unusable in newer environments as time progresses and the code doesn't update.

The Rust editions explicitly and definitively solve the bitrot problem, so I'm not sure what you're on about here.

> Slow compilation times

Sure, of course. That's really the biggest complaint most people have, though I've had C++ programs take just as long. Really depends on how the code is structured.

> Claims to be "safe" but allows arbitrary unsafe code, and it's everywhere.

Unsafe isn't a license to kill. It also doesn't allow "arbitrary" code. I suggest reading the Rustonomicon, the book about unsafe Rust and undefined behavior. All `unsafe` code must adhere to the postcondition that no undefined behavior is present. It also doesn't remove borrow checking and the like. Without `unsafe` you couldn't really do the things a systems language needs to do in certain cases - e.g. writing a kernel requires doing inherently unsafe things (e.g. switching out CR3) whose semantics no compiler currently written will understand.

People seem to parrot this same "unsafe nullifies Rust's safety" line without really understanding it. I suppose they could have renamed the `unsafe` keyword `code_that_does_stuff_unverifiable_by_the_compiler_so_must_still_adhere_to_well_formed_postrequisites_at_risk_of_invoking_undefined_behavior` but alas I think it'd be pretty annoying to write that so often.

It's pretty typical to abstract away `unsafe` code into a safe API, as most crates do.
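
As a sketch of that pattern: the `unsafe` block below is confined to one small function whose signature callers can use without writing `unsafe` themselves. `split_halves` is a hypothetical helper written for illustration; the standard library already provides `split_at_mut` for this.

```rust
/// Splits a slice into two non-overlapping mutable halves.
/// Hypothetical helper; `slice::split_at_mut` does this in the standard library.
fn split_halves(data: &mut [u32]) -> (&mut [u32], &mut [u32]) {
    let len = data.len();
    let mid = len / 2;
    let ptr = data.as_mut_ptr();
    // SAFETY: `mid <= len`, so both ranges lie inside the original slice,
    // and they do not overlap, so handing out two `&mut` slices is sound.
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5];
    let (left, right) = split_halves(&mut v);
    left[0] = 10;
    right[0] = 30;
    println!("{v:?}");
}
```

The borrow checker still applies to callers: both halves borrow from `v`, so `v` cannot be touched until they go out of use.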

> Adding features to fix misfeatures (e.g. all that lifetime cruft; arc pointers) instead of fixing the underlying problem

Lifetimes aren't "cruft", not sure what you mean. They've also been elided in a ton of cases.

An "arc pointer" isn't a thing; there's ARC (which is present in every unmanaged language, including C++, Objective-C, Swift, etc). I'm not sure what the "underlying problem" is you're referring to. Rust takes the position that the standard library shouldn't automatically make e.g. Mutexes an atomically reference counted abstraction, but instead allow the user to determine if reference counting is even necessary (Rc<Mutex>) and if it should be atomic so as to be shareable across cores (Arc<Mutex>). This type composition is exactly why Rust's type system is so easy to work with, refactor and optimize.
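
A small sketch of that composition, with made-up function names: the same `Mutex` is wrapped in `Arc` when it has to cross threads and in `Rc` when sharing stays on one thread.

```rust
use std::rc::Rc;
use std::sync::{Arc, Mutex};
use std::thread;

// Shared across threads: atomic reference count around a lock.
fn count_across_threads(workers: usize, per_worker: u64) -> u64 {
    let total = Arc::new(Mutex::new(0u64));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let total = Arc::clone(&total);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *total.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}

// Shared within one thread: a non-atomic count is enough.
// `Rc` is not `Send`, so passing it to `thread::spawn` would not compile.
fn count_owners() -> usize {
    let shared = Rc::new(Mutex::new(0u64));
    let second = Rc::clone(&shared);
    *second.lock().unwrap() += 1;
    Rc::strong_count(&shared)
}

fn main() {
    println!("{}", count_across_threads(4, 1000));
    println!("{}", count_owners());
}
```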

> Hiding implementations with leaky abstractions (traits)

Sorry for being blunt but this is a word salad. Traits aren't leaky abstractions. In my personal experience they compose so, so much better and have better optimization strategies than more rigid OOP class hierarchies. So I'm not sure what you mean here.

> Going at great length to avoid existing solutions so users re-invent it (e.g. OOP with inheritance; GC), or worse, invent more complex paradigms to work around the lack (e.g. some Rust GUI efforts; all those smart pointer types to work around the lack of GC)

Trait theory has been around for ages. GC is not a silver bullet and I wish people would stop pretending it was. There are endless drawbacks to GC. "All those smart pointer types" -- which ones? You just seem to want GC. I'm not sure why you want GC. GC solves few problems and creates many more. It can't be used in a ton of environments, either.

> A horrendous convoluted syntax that encourages bad programming style: lots of unwrap, and_then, etc. that makes programs hard to read and audit.

This is completely subjective. And no, there's not a lot of `and_then`, I don't think you've read much Rust. Sorry if I'm sounding rude, but it's clear to me by this point in my response that you've played with the language only at a very surface level and have come to some pretty strong (and wrong) conclusions about it.

If you don't like it, fine, but don't try to assert it as being a bad language and imply something about the people that use it or work on it.

> Rust's safe code is not safe: "Rust’s safety guarantees do not include a guarantee that destructors will always run. [...] Thus, allowing mem::forget from safe code does not fundamentally change Rust’s safety guarantees."

You misunderstand what it's saying there but I'm honestly tired of rehashing stuff that's very easily researched that you seem to not be willing to do.

jonathanstrange 608 days ago

That's a lengthy and passionate reply. The phrases "in my opinion" and "other people's mileage may differ" should have given away that my take was mostly subjective opinion. Rust is definitely not a language for me and would be a bad choice for the projects I'm working on. I continue to think it's totally overengineered. But, as noted before, other people's mileage may differ.

As long as the Rust fans stick to their favorite language, everybody can be happy.

KingOfCoders 611 days ago

I always feel Arc is the admission that the borrow checker with different/overlapping lifetimes is too difficult, despite what many Rust developers - who liberally use Arc - claim.

jeroenhd 611 days ago

Lifetime tracking and ownership are very difficult. That's why languages like C and C++ don't do it. It's also why those languages need tons of extra validation steps and analysis tools to prevent bugs.

Arc is nothing more than reference counting. C++ can do that too, and I'm sure there are C libraries for it. That's not an admission of anything, it's actually solving the problem rather than ignoring it and hoping it doesn't crash your program in fun and unexpected ways.

Using Arc also comes with a performance hit because validation needs to be done at runtime. You can go back to the faster C/C++ style data exchange by wrapping your code in unsafe {} blocks, though, but the risks of memory corruption, concurrent access, and using deallocated memory are on you if you do it, and those are generally the whole reason people pick Rust over C++ in the first place.

GoblinSlayer 611 days ago

Looking at the code, it consists of long chains of get().unwrap().to_mut().unwrap().get() noise. That looks more like coping with library design than ownership tracking. Also, why Result<Option<T>>? Isn't Result already an Option by itself? I guess that's why you need get().unwrap().to_mut() to get a value out of the Result<Option<T>> returned by an average function call?

thesuperbigfrog 611 days ago

>> Also why Result<Option<T>>? Isn't Result already Option by itself?

No. I've written code that returns Result<Option<T>>. It was a wrapper for a server's query web API.

The Result part determines whether the request succeeded and the response is valid.

The Option part is because the parameter being queried might not exist. For example, if I ask the API for the current state of the user session with a given Session ID, but that Session ID does not exist, then the Rust wrapper could return Ok(None), meaning that the request succeeded but no such session was found.
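
Roughly, as a sketch. `Api`, `Session` and `QueryError` are hypothetical stand-ins, not the actual wrapper; an in-memory map plays the role of the server.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum QueryError {
    Unreachable,
}

#[derive(Debug, Clone, PartialEq)]
struct Session {
    user: String,
}

// Hypothetical stand-in for the web API wrapper.
struct Api {
    online: bool,
    sessions: HashMap<u32, Session>,
}

impl Api {
    // Err: the request itself failed. Ok(None): it worked, nothing found.
    fn session(&self, id: u32) -> Result<Option<Session>, QueryError> {
        if !self.online {
            return Err(QueryError::Unreachable);
        }
        Ok(self.sessions.get(&id).cloned())
    }
}

fn describe(api: &Api, id: u32) -> Result<String, QueryError> {
    // `?` peels off the error layer once; the match only deals with presence.
    Ok(match api.session(id)? {
        Some(s) => format!("session for {}", s.user),
        None => "no such session".to_string(),
    })
}

fn main() {
    let mut sessions = HashMap::new();
    sessions.insert(7, Session { user: "alice".to_string() });
    let api = Api { online: true, sessions };
    println!("{:?}", describe(&api, 7));
    println!("{:?}", describe(&api, 8));
    let down = Api { online: false, sessions: HashMap::new() };
    println!("{:?}", describe(&down, 7));
}
```

Note that no unwrap is needed on the happy path: `?` and a `match` cover both layers.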

oniony 611 days ago

Result is whether an operation returned an error or not. Option is whether you have a value or no value.

thesuperbigfrog 611 days ago

Exactly.

That is why a query that successfully returns no items can be represented as Ok(None).

A successful query with items returned would instead be Ok(Some(items)), where items is a Vec<Item>.

An error in completing the query (for example, a problem with the database) would be Err(DatabaseError) or Err(SomeOtherError).

GoblinSlayer 611 days ago

Presumably missing session is an alternative scenario and thus should be reported as an error, then you match and handle this error. Your design complicates the common scenario: in case of valid session you need double unwrap, cf File::open that could return Result<Option<File>> if file is not found.

thesuperbigfrog 611 days ago

>> Presumably missing session is an alternative scenario and thus should be reported as an error

But in this case, a query using an invalid session ID is not an error. It is asking for details about something that does not exist.

>> cf File::open that could return Result<Option<File>> if file is not found.

This type of query is not like File::open which gets a handle to a resource. Trying to get a handle to a resource that does not exist is an error.

This type of query is read-only and does not allocate any resources or prepare to do anything with the session.

It simplifies the control flow because it distinguishes between errors in completing a query versus the presence or absence of items returned from the query.

LinXitoW 611 days ago

If I ask my repository (backed by an sql db) to get a user, there might be 3 different scenarios I'm interested in:

- Technical problem (like connection problems) means I don't know what's in the db

- No technical problem, but no user entry

- No technical problem, and a user entry

You need the Result for the technical problems, and the Option for whether there's a user entry or not.

GoblinSlayer 611 days ago

Surely Result is supposed to hold both system errors and business errors.

Galanwe 611 days ago

It's not that the borrow checker is too difficult, it's that it's too limiting.

The _static_ borrow checker can only check what is _statically_ verifiable, which is but a subset of valid programs. There are few things more frustrating than doing something you know is correct, but that you cannot express in your language.

acomar 611 days ago

I've found the opposite. every time I attempt to subvert the borrow checker, I eventually discover that I'm attempting to write a bug.

netbsdusers 611 days ago

Kernels are where it gets particularly difficult (and I suspect database engines might be added to the list, since they seem to have similar requirements to be both scalable and deal with massive amounts of shared state, but I'm not overly familiar with them).

Several kernels for example use type-stable memory, memory that is guaranteed to only hold objects of a particular type, though perhaps only providing that guarantee for as long as you hold an RCU read-lock (this is the case in Linux with SLAB_TYPESAFE_BY_RCU). It is possible in some cases to be able to safely deal with references to objects where the "lifetime" of the referent has ended, but where by dint of it being guaranteed to be the same type of object, you can still do what you want to do.

This comes in handy when you have a problem that commonly appears in kernels where you need to invert a typical lock ordering (a classic case is that the page fault codepath might want to lock, say, VM object then page queue, but the page-replacement codepath will want to lock page-queue then VM object.)

Unfortunately it's hard to think of how the preconditions for these tricks could be formally expressed.

andrewflnr 611 days ago

Wouldn't you want the relevant Rust lifetime to be that of the type-stable memory block, not the individual "object" inside? I'm not that familiar with kernel programming, but that sounds a lot like an arena, and (IIRC) that's the approach with an arena.

GolDDranks 611 days ago

It's not just difficult; sometimes it's impossible to statically know the lifetime of a value, so you must track it dynamically. Arc is one such tool.
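
A minimal sketch of what "dynamically tracked" means here. The function name is made up; a `Weak` handle is used only to observe when the value actually goes away.

```rust
use std::sync::{Arc, Weak};

/// Reports whether the value is still alive after the first owner is
/// dropped, and again after the second owner is dropped.
fn liveness_after_drops() -> (bool, bool) {
    let first = Arc::new(String::from("shared state"));
    let second = Arc::clone(&first);
    // A `Weak` handle observes the value without keeping it alive.
    let observer: Weak<String> = Arc::downgrade(&first);

    drop(first);
    let alive_with_one_owner = observer.upgrade().is_some();

    drop(second);
    let alive_with_no_owner = observer.upgrade().is_some();

    (alive_with_one_owner, alive_with_no_owner)
}

fn main() {
    println!("{:?}", liveness_after_drops());
}
```

The value is freed when the last owner goes away, whichever owner that turns out to be at runtime.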

lmm 611 days ago

If tracking lifetimes is simple 90% of the time and complex 10% of the time, maybe a tool that lets you have them automatically managed (with some runtime overhead) that 10% of the time is the right way forward.

tracker1 611 days ago

Then you can use a language and runtime like C# or Java. Or you can use patterns like Go promotes.

There are lots of options if you want.

lmm 611 days ago

> Then you can use a language and runtime like C# or Java. Or you can use patterns like Go promotes.

But in that case you're stuck paying the overhead 100% of the time, even though 90% of the lifetimes are simple. (Perhaps a little less so with escape analysis etc., but doing it at compile time in a way that's understandable in the source feels a lot more reliable)

neonsunset 611 days ago

Java and C# are languages of a different class. C# is perfectly capable of systems programming and manual memory management that is at least more convenient than C (but not C++ with move semantics and operator overloading abuse; otoh C#'s type system and build process are saner, which at least partially pays for this).

surajrmal 611 days ago

Arc usually means you've not structured your code to have clear lifetimes where one object clearly outlives another. Typically I see C++ applications avoid it but actually suffer from bugs due to the same structural deficiencies. That said, I think it's almost always possible to avoid it if you try hard enough. With async you need to use structured concurrency.

oneshtein 611 days ago

A Rust lifetime is just a label for a region of memory holding various data, which is discarded at the end of its lifetime. When execution enters a function, a memory block is created to hold all of the function's variables, and that block is discarded on exit from the function, so these variables are valid only for the lifetime of the function call.