Hacker News
by mswphd 23 days ago
There are a number of things about Rust that help compared to other statically typed languages.

1. The compiler gives very high-quality error messages. This helps humans, and also helps LLMs.

2. Rust reduces memory management to local reasoning (via the borrow checker). This means that it performs well even as context grows, because checks in one function/module are well-encapsulated to that function/module.

3. Rust can more easily obtain this encapsulation for more general properties than many other statically typed languages. In particular, Rust's type system is very strong, so it's easy to take a function `func(x: T)` that relies on some implicit assumption on `x` (say that it is non-zero), and turn it into an explicit requirement. By this, I mean you define `pub struct NonZero<T>(T)` with a private field, and provide a constructor `pub fn try_new(t: T) -> Result<NonZero<T>, _>` that errors if the condition doesn't hold. If you additionally only provide public methods on `NonZero<T>` that uphold the invariant, you can lift runtime assertions to the type level. This is both good practice, and helps out LLMs quite a bit.
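A minimal sketch of the newtype pattern described in point 3, specialized to `u32` to keep it short. The module name `nonzero`, the error type `ZeroError`, and the functions `divide` and `checked_divide` are made up for illustration. The wrapper sits in its own module so that the private field really is unreachable from outside. (The standard library already ships ready-made versions such as `std::num::NonZeroU32`; this hand-rolled one is only there to show the mechanics.)

```rust
mod nonzero {
    /// Hypothetical error type: returned when the input is zero.
    #[derive(Debug, PartialEq, Eq)]
    pub struct ZeroError;

    /// Wrapper whose inner value is guaranteed to be non-zero.
    /// The field is private, so code outside this module cannot
    /// construct a `NonZero` without going through `try_new`.
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub struct NonZero(u32);

    impl NonZero {
        /// The only constructor: checks the invariant once, up front.
        pub fn try_new(t: u32) -> Result<NonZero, ZeroError> {
            if t == 0 {
                Err(ZeroError)
            } else {
                Ok(NonZero(t))
            }
        }

        /// Read-only access, so the invariant cannot be broken later.
        pub fn get(self) -> u32 {
            self.0
        }
    }
}

use nonzero::{NonZero, ZeroError};

/// No zero check needed here: the parameter type carries the requirement.
pub fn divide(numerator: u32, denominator: NonZero) -> u32 {
    numerator / denominator.get()
}

/// Validation happens once, at the boundary where raw input enters.
pub fn checked_divide(numerator: u32, denominator: u32) -> Result<u32, ZeroError> {
    let denominator = NonZero::try_new(denominator)?;
    Ok(divide(numerator, denominator))
}
```

Anything that calls `divide` has to produce a `NonZero` first, so a forgotten check becomes a compile error rather than a runtime panic.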

This is to say that rust makes it quite easy to encapsulate implementation details (both regarding memory management, as well as other details) essentially completely. Sometimes you still have invariants that need care/can't be encapsulated in the type system, but such invariants should be marked `unsafe`, so it can be easier to audit the LLM's output.

Anyway, the "more constraints to balance" is only problematic if all the constraints are inter-dependent. It's definitely possible to get LLMs to generate spaghetti code like this, but the way you fix it is the way you fix similar issues in other languages.

1 comments

> This is both good practice, and helps out LLMs quite a bit.

Don’t get me wrong, I like this aspect of Rust, but I can’t make heads or tails as to whether it helps or if they just have to iterate more to figure out how to make something work. LLMs already do pretty well with a comment “this value can’t be zero” in my experience, so I’m unsure how much value the static typing provides. Maybe it lets you get by with a lower quality model, but that model will likely just spend more tokens on iteration so I can’t discern an obvious win. (shrug) I hope I’m wrong though—if I can have super fast code with the ease of LLM generation then I’m happy.

> Rust reduces memory management to local reasoning (via the borrow checker). This means that it performs well even as context grows, because checks in one function/module are well-encapsulated to that function/module.

I don’t think this is true, right? Changing a single lifetime in a function signature can easily propagate across your entire program. Maybe I’m just a Rust noob, but any time I change a field from owned to borrowed or vice versa I have to propagate that change pretty broadly, which to my mind implies consuming a lot of the context window. Garbage collection (I know, ewww, shame on me, etc) allows for local reasoning in a much more meaningful way however morally impure it may be. :)
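To make the propagation concern concrete, here is a small sketch with made-up types (`OwnedConfig`, `BorrowedConfig`, and so on are not from any real codebase). Switching one field from `String` to `&str` adds a lifetime parameter to the struct, and every type that stores that struct then has to carry the lifetime too.

```rust
/// Before: the field is owned, so no lifetime appears anywhere.
pub struct OwnedConfig {
    pub name: String,
}

pub struct OwnedApp {
    pub config: OwnedConfig,
}

pub fn owned_greeting(app: &OwnedApp) -> String {
    format!("hello, {}", app.config.name)
}

/// After: the field borrows, so the struct needs a lifetime parameter...
pub struct BorrowedConfig<'a> {
    pub name: &'a str,
}

/// ...and so does every type that contains it.
pub struct BorrowedApp<'a> {
    pub config: BorrowedConfig<'a>,
}

/// Signatures that mention the type change as well (here via `'_`).
pub fn borrowed_greeting(app: &BorrowedApp<'_>) -> String {
    format!("hello, {}", app.config.name)
}
```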

LLMs do significantly better when they get reliable feedback on their actions (try to create any non-trivial project in some language without letting the LLM use a compiler. Similarly, talking with a "chat LLM" will produce worse code than an "agentic LLM").

Anyway, making such a (breaking) change in Rust immediately tells you all of the callsites that break. You have to chase it through, but that's mechanical/low context work. Moreover, you can parallelize across files with sub-agents to not pollute your main agent's context window. So it really should be a "zero context window" cost.

Whether or not strict typing is strictly better is really a correctness/velocity tradeoff, the same as it always has been. For most projects something in the middle is right.

As for owned vs borrowed and things propagating quite a bit, sometimes it happens. It's often avoidable with a couple of tricks:

1. always default to borrowed unless you have a good reason to do otherwise

2. make your function signatures more permissive so they can support either way. This can be done by modifying `f<T>(x: T)` (or `f<T>(x: &T)`) to `f<U: AsRef<T>>(x: U)`. The latter can be equivalently written as `f(x: impl AsRef<T>)`.
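A short sketch of trick 2 using standard-library types. The function names `char_count` and `extension_of` are made up for illustration. One caveat worth knowing: `AsRef` has no blanket `impl AsRef<T> for T`, so for your own type `MyType` you would need to write `impl AsRef<MyType> for MyType` yourself (or use `std::borrow::Borrow<T>`, which does have the reflexive impl) before an owned `MyType` is accepted.

```rust
use std::path::Path;

/// Explicit type-parameter form: `f<U: AsRef<T>>(x: U)`.
/// Accepts `&str`, `String`, `&String`, and anything else viewable as `str`.
pub fn char_count<U: AsRef<str>>(text: U) -> usize {
    text.as_ref().chars().count()
}

/// Equivalent `impl Trait` form: `f(x: impl AsRef<T>)`.
/// Accepts `&str`, `String`, `&Path`, `PathBuf`, and so on.
pub fn extension_of(path: impl AsRef<Path>) -> Option<String> {
    path.as_ref()
        .extension()
        .map(|ext| ext.to_string_lossy().into_owned())
}
```

Callers can pass owned or borrowed values without the signature changing, e.g. `char_count("abc")` and `char_count(String::from("abc"))` both compile.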

When you say "change a field from owned to borrowed", I'd generally suggest not doing that. It's generally easier to start with some owned type MyType. You can then have function signatures take &MyType as input. This borrows all the fields, and is often good enough for most functions.

If you have a more esoteric function (that needs a combination of borrowed and owned inputs), it's typically easier to define a struct for that function. The steps are

1. Define a `FunctionInputsRef<'a>`

2. write `impl<'a> From<&'a MyStruct> for FunctionInputsRef<'a>`, then

3. update your function to take as input `FunctionInputsRef<'a>` rather than `&MyStruct`, and

4. update callers with `input -> input.into()`.

It has the benefit of less churn, as you're maintaining the old def (which might be useful elsewhere), and only updating the callers in a fairly trivial way. `FunctionInputsRef<'a>` can also be defined local to the function, so it is modularized better. If you later have other functions with other requirements, it's a relatively easy pattern to duplicate as well.
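The four steps can be sketched roughly as follows. The fields of `MyStruct` and the functions `summarize` and `caller` are invented for the example; only the names `MyStruct` and `FunctionInputsRef<'a>` come from the steps above.

```rust
/// Owned type; its definition does not change at all.
#[derive(Debug, Clone)]
pub struct MyStruct {
    pub name: String,
    pub samples: Vec<u32>,
    pub scale: u32,
}

/// Step 1: a view type holding exactly what the function needs,
/// borrowing the large fields and copying the small one.
pub struct FunctionInputsRef<'a> {
    pub name: &'a str,
    pub samples: &'a [u32],
    pub scale: u32,
}

/// Step 2: conversion from a borrowed `MyStruct`.
impl<'a> From<&'a MyStruct> for FunctionInputsRef<'a> {
    fn from(s: &'a MyStruct) -> Self {
        FunctionInputsRef {
            name: &s.name,
            samples: &s.samples,
            scale: s.scale,
        }
    }
}

/// Step 3: the function takes the view instead of `&MyStruct`.
pub fn summarize(inputs: FunctionInputsRef<'_>) -> String {
    let total: u32 = inputs.samples.iter().map(|s| s * inputs.scale).sum();
    format!("{}: {}", inputs.name, total)
}

/// Step 4: existing callers only change `input` to `input.into()`.
pub fn caller(input: &MyStruct) -> String {
    summarize(input.into())
}
```

Code that has no `MyStruct` at hand can still build a `FunctionInputsRef` directly from whatever pieces it owns or borrows, which is where the extra flexibility comes from.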