by josephg 536 days ago
As someone who’s gone down the rust “native pointers vs pin vs …” rabbit hole many times now, I really recommend just using a Vec for the data and storing indexes into the vec when you need a pointer.

Pin adds a huge amount of weird incidental complexity to your code base - since you need to pin-project your struct fields (but which ones?). You can’t just take an &self or &mut self in functions if your value is pinned, and pin is just generally confusing, hard to use and hard to reason about.

The article ended up with Vec<Box<T>> - but that’s a huge code smell in my book. It’s much less performant than Vec<T> because every object needs to be individually allocated & deallocated. So you have orders of magnitude more calls to malloc & free, more memory fragmentation and way more cache misses while accessing your data. The impact this has on performance is insane.

Vec & indexes is a lovely middle ground. In my experience it’s often (remarkably) slightly more performant than using raw pointers. You don’t have to worry about vec reallocations (since the indexes don’t change). And it’s 100% safe rust. It feels weird at first - indexes are just pointers with more steps. But I find rust’s language affordances just work better if you write your code like that. Code is simple, safe, ergonomic and obvious.
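
As a rough illustration of the Vec-and-indexes layout described above, here is a minimal sketch of a tree whose nodes live in one flat Vec and refer to their children by index. The names `Node`, `Tree`, `add` and `subtree_size` are made up for this example and are not taken from the article.

```rust
/// A tree node stored in a flat Vec; children are indexes, not pointers.
#[derive(Debug)]
struct Node {
    name: String,
    children: Vec<usize>,
}

/// Hypothetical container that owns every node of the tree.
#[derive(Debug, Default)]
struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    /// Adds a node and returns its index. The index stays usable when
    /// `nodes` reallocates, because it is relative to the start of the Vec.
    fn add(&mut self, name: &str, parent: Option<usize>) -> usize {
        let id = self.nodes.len();
        self.nodes.push(Node {
            name: name.to_string(),
            children: Vec::new(),
        });
        if let Some(p) = parent {
            self.nodes[p].children.push(id);
        }
        id
    }

    /// Counts the node at `id` plus all of its descendants.
    fn subtree_size(&self, id: usize) -> usize {
        1 + self.nodes[id]
            .children
            .iter()
            .map(|&child| self.subtree_size(child))
            .sum::<usize>()
    }
}

fn main() {
    let mut tree = Tree::default();
    let root = tree.add("root", None);
    let a = tree.add("a", Some(root));
    tree.add("b", Some(root));
    tree.add("a1", Some(a));
    println!(
        "{} nodes under {}",
        tree.subtree_size(root),
        tree.nodes[root].name
    );
}
```

Nodes are only ever appended here, which matches the load-once, free-at-exit case; removing nodes would need extra care with the stored indexes.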


> Code is simple, safe, ergonomic and obvious.

Dunno about 'safe' -- or at least not in the more general sense that you seem to intend, rather than the more limited sense of rust's safe/unsafe distinction. If you store an index into a Vec<T> as a usize, rather than a &T, very little is stopping you from invalidating that pseudo-pointer without knowing it. (Or from using it as an index into the wrong vector, etc...)

These problems are manageable and I'm not saying 'never do this' -- I've done it myself on occasion. It's just that there are more pitfalls than you're indicating here, and it is actually a meaningful tradeoff of bug potential for ease-of-use.
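
To make the pitfall concrete, here is a small sketch of an index going stale without any memory-unsafety. The `UserId` newtype and the `Users` type are hypothetical; wrapping the index in a newtype is one common way to keep it from being used on the wrong vector, but it does nothing about staleness.

```rust
/// Hypothetical typed index: only `Users` hands these out or accepts them,
/// so it cannot be used to index some unrelated Vec by accident.
#[derive(Clone, Copy, Debug, PartialEq)]
struct UserId(usize);

struct Users {
    names: Vec<String>,
}

impl Users {
    fn add(&mut self, name: &str) -> UserId {
        self.names.push(name.to_string());
        UserId(self.names.len() - 1)
    }

    /// Bounds-checked lookup; returns None instead of panicking.
    fn get(&self, id: UserId) -> Option<&str> {
        self.names.get(id.0).map(String::as_str)
    }

    /// Removing shifts later elements down, so ids handed out earlier
    /// for those elements go stale.
    fn remove(&mut self, id: UserId) {
        self.names.remove(id.0);
    }
}

fn main() {
    let mut users = Users { names: Vec::new() };
    let alice = users.add("alice");
    let bob = users.add("bob");
    let carol = users.add("carol");

    users.remove(alice);

    // Memory safe, but `bob` now silently names a different element.
    assert_eq!(users.get(bob), Some("carol"));
    // `carol` is now out of bounds; `get` reports that as None.
    assert_eq!(users.get(carol), None);
}
```

If elements are never removed or reordered, as in the load-once case discussed in this thread, this particular failure mode does not come up.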

I mean safe in the narrow way that rust intends. It’s memory safe, but as you imply, we’re leaving the door open to logic bugs if you misuse those array indices.

But honestly, I think danger from that is wildly overstated. The author isn’t talking about implementing an ECS or b-tree here. They’re just populating an array from a file when the program launches, then freeing the whole thing when the program terminates. It’s really not rocket science.

The other big advantage of this approach is that you don’t have to deal with unsafe rust. So, no unsafe {} blocks. No wrangling with rust’s frankly awful syntax for following raw pointers. No stressing about whether or not a future version of rust will change some subtle invariant you’re accidentally depending on, or worrying about if you need to use MaybeUninit or something like that. I think the chance of making a mistake while interacting with unsafe code is far higher than the chance of misusing an array index. And the impact is usually worse.

The author details running into exactly that problem while coding - since they assumed memory allocated by vec would be pinned (it isn’t). And the program they ended up with still doesn’t use pin, even though they depend on the memory being pinned. That’s cause for far more concern than a simple array index.
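
A minimal sketch of why a Vec's memory can't be treated as pinned: growing the vector may move its buffer, while an index taken earlier keeps working. The function name `grow` is made up for illustration. Whether the buffer address actually changes depends on the allocator, so the code only reports it and does not assert it.

```rust
/// Pushes `extra` more elements onto a small Vec and returns whether the
/// buffer address changed, plus the value found at an index saved earlier.
fn grow(extra: u32) -> (bool, u32) {
    let mut v: Vec<u32> = Vec::with_capacity(1);
    v.push(10);

    let index = 0; // index-style "pointer" to the first element
    let before = v.as_ptr(); // raw address of the buffer, only compared, never dereferenced

    for i in 0..extra {
        v.push(i); // may reallocate and move every element
    }

    let moved = before != v.as_ptr();
    // A reference or raw pointer taken before the pushes could now dangle;
    // the index is still correct either way.
    (moved, v[index])
}

fn main() {
    let (moved, value) = grow(1_000);
    println!("buffer moved: {}, value at saved index: {}", moved, value);
}
```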

> The author isn’t talking about implementing an ECS or b-tree here.

Do you mean that a b-tree might work better here?

> They’re just populating an array from a file when the program launches, then freeing the whole thing when the program terminates. It’s really not rocket science.

That's exactly why I consider indices.

> since they assumed memory allocated by vec would be pinned (it isn’t)

Could you tell me, please, where you read in the article that I assume it? I wrote in the article "I realized that the problem is related to the fact that vectors of children move in the memory if they don't have enough space to extend." and even made an animation for clarity https://laladrik.xyz/VectorMove.webm. However, if you see the assumption in the article, please, let me know. I will correct it or elaborate.

Yes, in your article you consider indexes then ultimately decide not to use them in favor of Vec<Box<T>> & pointers. I recommend that you use indexes instead. I think they’re the better choice.

> Could you tell me, please, where you read in the article that I assume it?

You assume it in your first attempt at solving this problem. You describe that attempt in detail. That’s what I’m referring to.

The code you ended up with is still dangerous code, because your boxes are still not guaranteed to remain pinned in memory.

Ah, clear. I hid it in my mind. I haven't tried the approach with indices, because... well, I was too lazy to do it. However, I agree that this approach would be better than the current one.

> You describe that attempt in detail.

I would appreciate it if you could provide a quote, because I can't find a detailed description of that attempt. In fact, rather than assuming that a vector is pinned, I wrote this: "I realized that the problem is related to the fact that vectors of children move in the memory if they don't have enough space to extend."

> The code you ended up with is still dangerous code, because your boxes are still not guaranteed to remain pinned in memory.

You are right, boxes are not pinned, but the data, which they point to, is pinned, isn't it? My pointers point to that part of memory.
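
A small sketch of what this question is about, assuming a `Vec<Box<u64>>` as a stand-in for the article's data (the helper `addresses` is hypothetical). Moving a Box, for example when the outer Vec reallocates, does not move the heap allocation it owns, so the addresses stay the same. What differs from `Pin` is that nothing in the types enforces this: safe code can still swap or replace the boxed values through a `&mut`, or drop a Box, while a raw pointer to it is still held.

```rust
/// Collects the heap address that each Box points to.
fn addresses(items: &[Box<u64>]) -> Vec<*const u64> {
    items.iter().map(|b| &**b as *const u64).collect()
}

fn main() {
    let mut items: Vec<Box<u64>> = Vec::with_capacity(2);
    items.push(Box::new(1));
    items.push(Box::new(2));
    let before = addresses(&items);

    // Growing the Vec moves the Box handles, not the allocations they own.
    for i in 3..1_000 {
        items.push(Box::new(i));
    }
    assert_eq!(before, addresses(&items[..2]));

    // The stability is a convention here, not a guarantee of the types:
    // safe code can still move the values out from behind those addresses.
    let (left, right) = items.split_at_mut(1);
    std::mem::swap(&mut *left[0], &mut *right[0]);
    assert_eq!(*items[0], 2);
    assert_eq!(*items[1], 1);
}
```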