by andrewmcwatters 3026 days ago
Could you explain to a non-Rust user why this is a good thing? I'm not familiar with the terms "owned" and "borrowed" in terms of strings. I take it this refers to strings instantiated by a piece of code and managed by a user or vendor code vs strings passed around to be read (pass by copy)?

> I'm not familiar with the terms "owned" and "borrowed" in terms of strings.

A string (in general) is fundamentally a bunch of bytes in memory. In Rust, that's implemented as a contiguous buffer of UTF-8 code units that form valid Unicode data.

Now because Rust aims to be a systems-oriented language, these bytes must have one[0] thing which is fundamentally responsible for them: the owner. If the owner goes away, so do the bytes. "Owned" describes the owner of those bytes. In Rust, that's String: a String maintains a "strong" reference to a bunch of bytes in memory which form a valid string, and if the String disappears, so does the data associated with it (it's deallocated).

A borrowed string, by comparison, holds a "weak" reference to the same buffer: it knows the bytes are there, but when the "borrowing" structure is destroyed nothing happens to the data it refers to, because it was just borrowing it. That's what `&str` is.

Note that `&str` can borrow from a String (that's the common case), but it can also borrow from static data in the binary (all string literals fall in that case) or from an arbitrary byte buffer (using str::from_utf8[1]).
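To make the three sources of a `&str` concrete, here is a minimal sketch. The helper `byte_len` and the variable names are made up for illustration; only `String`, `&str` and `str::from_utf8` come from the explanation above.

```rust
use std::str;

// Hypothetical helper: it takes a borrowed view, so it does not care
// who owns the underlying bytes.
fn byte_len(s: &str) -> usize {
    s.len()
}

fn main() {
    // Owned: this String is responsible for a heap-allocated buffer.
    let owned: String = String::from("hello");

    // Borrowed from a String: a view into part of `owned`'s buffer.
    let from_owned: &str = &owned[1..4];

    // Borrowed from static data in the binary: every string literal.
    let from_static: &'static str = "world";

    // Borrowed from an arbitrary byte buffer, after UTF-8 validation.
    let bytes: [u8; 2] = [0x68, 0x69];
    let from_bytes: &str = str::from_utf8(&bytes).expect("valid UTF-8");

    println!("{} {} {}", from_owned, from_static, from_bytes);

    // A &String coerces to &str, so the same function accepts all of them.
    println!("{}", byte_len(&owned));
}
```

Functions that only need to read a string usually take `&str` for this reason: callers can pass a literal, a slice of a String, or validated bytes without copying anything.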

> Could you explain to a non-Rust user why this is a good thing?

It doesn't matter for high-level languages like Java or Python[2] but it matters a lot for lower-level languages like C, C++ or Rust, because the owner of a piece of data is whoever's supposed to deallocate it (and whoever's allowed to reallocate it to expand it). When there's no difference between owner and borrower (e.g. C's char *), the complete onus of tracking who's responsible for what is on the developer, and failure to do so produces memory unsafety (dangling pointers, double-free, use-after-free, …). And developers being humans, they mess up regularly.

Making a very clear distinction between owned and borrowed types helps in two ways. First, it helps the developer: they know they must free through the owner but mustn't (and normally can't) free through the borrower; that's what C++ is adding. Second, with some additional constraints on the developer, the language can manage all of that on its own, which is what Rust does. (C++ will do the freeing part, but a borrow can still outlive its owner, so it's not memory-safe, just significantly more helpful than C.)
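A rough sketch of "free through the owner, never through the borrower" in Rust. The `Buffer` type, its `freed` counter and the `first_word` function are hypothetical, invented only to make the moment of deallocation observable.

```rust
use std::cell::Cell;

// Hypothetical owner type: it bumps a counter when it is dropped, so the
// point where its data is released can be observed.
struct Buffer<'a> {
    data: String,
    freed: &'a Cell<u32>,
}

impl Drop for Buffer<'_> {
    fn drop(&mut self) {
        self.freed.set(self.freed.get() + 1);
    }
}

// Hypothetical borrower: it reads the data and has no way to free it.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

fn frees_after_scope() -> u32 {
    let freed = Cell::new(0);
    {
        let owner = Buffer { data: String::from("owned text"), freed: &freed };
        {
            let word = first_word(&owner.data);
            assert_eq!(word, "owned");
            // Calling `drop(owner)` here and then using `word` again would
            // be rejected at compile time: the borrow is still live.
        } // the borrow ends here and nothing is freed
        assert_eq!(freed.get(), 0);
    } // the owner goes out of scope here and the data is released once
    freed.get()
}

fn main() {
    println!("{}", frees_after_scope());
}
```

The additional constraint on the developer is visible in the commented-out case: the compiler refuses code in which the owner goes away while a borrow is still in use.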

[0] unless you're in a case where it's unclear who should be responsible, so you just go "everyone!" and punt by using reference counting

[1] https://doc.rust-lang.org/std/str/fn.from_utf8.html

[2] generally, that is; it does matter for substrings and whether a substring copies or points to the base data: the former is more expensive (you allocate on each substring operation) but the latter can keep gargantuan amounts of data "live" and prevent their collection
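As an illustration of footnote [0], a minimal sketch of the "everyone owns it" escape hatch using the standard library's `Rc`. The function name `share_counts` is made up for this example.

```rust
use std::rc::Rc;

// Hypothetical demo: returns the owner count while the string is shared
// and again after one of the owners has gone away.
fn share_counts() -> (usize, usize) {
    // Both handles own the same string data; nothing is copied.
    let first: Rc<str> = Rc::from("shared");
    let second = Rc::clone(&first);
    let while_shared = Rc::strong_count(&first);

    // Dropping one owner only decrements the count. The bytes are
    // deallocated when the last owner is dropped.
    drop(second);
    let after_drop = Rc::strong_count(&first);

    (while_shared, after_drop)
}

fn main() {
    let (while_shared, after_drop) = share_counts();
    println!("{} {}", while_shared, after_drop);
}
```

`Rc` is single-threaded; `std::sync::Arc` plays the same role when the owners live on different threads.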

Thanks, I figured it was something of this sort, but wanted to confirm. I find it interesting the distinction even exists in Rust, whereas in C, authors just tell you to not alter the data, or use a cleanup function they've provided, etc.