Hacker News new | ask | show | jobs
by zeeboo 1880 days ago
What about `&m[x]` where m is some map? Does that heap allocate and create a copy, or is it a pointer to the actual storage slot? If the former, that's a hidden copy/allocation that didn't exist before, and if it's the latter, resizing the map invalidates the pointer, so it must be updated somehow.
2 comments

`&` will "move" something to the heap if it isn't already on the heap.

The simpler way to think about it is that in Golang everything is on the heap. However the optimizer will move things to the stack if they don't have their address taken. I think the point about explicitness is that if you don't use `&` then it will be able to be put on the stack. So `&` doesn't cause a heap allocation but lack of `&` (or new()) confirms that there isn't one. (I don't actually know if that is true but I can't think of any counterexamples)

> So `&` doesn't cause a heap allocation but lack of `&` (or new()) confirms that there isn't one. (I don't actually know if that is true but I can't think of any counterexamples)

I think assigning to a pointer would cause an escape.

Just taking a reference wouldn't though, the reference still has to escape (of course you'd usually take a reference so that it can escape but that's not always the case, especially with inlining).

What do you mean by assigning to a pointer? You can only assign a pointer value to a pointer variable and you need to get that pointer from & IIUC.
> What do you mean by assigning to a pointer?

    *x = y
I don't think that does because IIUC you are copying the bits of y to x. So I guess semantically y has escaped but you aren't doing a new heap allocation, you are reusing the memory of x.
I think I didn't communicate my point clearly. Consider this hypothetical program:

    x := make(map[int]int)
    x[0] = 5
    
    y := &x[0]
    *y = 10
    
    print(x[0]) // 5 or 10?
    
    x[0] = 6
    
    print(*y) // 6 or 10?
    
    // force the map to grow and reallocate the buckets
    for j := 1; j < 100; j++ {
        x[j] = j
    }
    *y = 11
    
    print(x[0]) // 5, 6, 10, or 11?
The crux of the problem is answering what y actually points at: the value in the map bucket, or some freshly allocated value? There are problems with whichever one you pick.

edit: changed the second print to *y instead of x[0]. thanks masklinn for catching this error.

> print(x[0]) // 5, 6 or 10?

Do you mean `print(*y)`? You just assigned to `x[0]` so its value should not be in question.

Also

> if it's the latter, resizing the map invalidates the pointer, so it must be updated somehow.

It doesn't (have to) invalidate the pointer though. When resized the map's content get copied to a new backing buffer, the pointer can keep pointing to the old buffer. That's basically the same behaviour as slices: when a slice resizes, a new backing array is allocated, the contents get copied to the new array, and the slice is retargeted to the new array. There can be other slices pointing to the old array (it's of course a very bad idea to update slices to shared arrays, but Go will let you do it).

> It doesn't (have to) invalidate the pointer though. When resized the map's content get copied to a new backing buffer, the pointer can keep pointing to the old buffer.

That's true, but I don't think it's very comparable to slices. With slices, you have to explicitly reallocate either by creating a whole new slice or using append. Reslicing, indexing, or other operations do not reallocate. On the other hand, maps may end up resizing on any operation that involves them, or even theoretically in the background without any operations (during GC, for example). It would be unfortunate to lose that implementation flexibility, and keeping it means that you're essentially picking the "make a copy" option.

> That's true, but I don't think it's very comparable to slices

It's exactly the same.

> With slices, you have to explicitly reallocate either by creating a whole new slice or using append.

That's a distinction without a difference. `append` does not "explicitly reallocate", it may or may not reallocate, you've no idea. Even if the backing array is full, it might be realloc'd in-place.

> On the other hand, maps may end up resizing on any operation that involves them, or even theoretically in the background without any operations (during GC, for example).

So?

Also technically nothing prevents a GC from reallocating the slice.

> It would be unfortunate to lose that implementation flexibility, and keeping it means that you're essentially picking the "make a copy" option.

I've never heard of a hashmap implementation which would do otherwise.

Trying to extend in-place and attempting to properly redistribute if that works sounds like absolute hell. Likewise trying to shrink in-place, though at least you've got some scratch space which you don't have in the other case: you'd have to segregate everything into one half of the map then insert them in the other half, before shrinking your allocation, which might give you a new allocation anyway, at which point you've moved all your values thrice whereas just creating a new allocation and reinserting your stuff there is a single move.

> That's a distinction without a difference. `append` does not "explicitly reallocate", it may or may not reallocate, you've no idea. Even if the backing array is full, it might be realloc'd in-place.

Maybe to you, but to me, a pointer going from modifying the value inside of the map to no longer modifying the value inside of the map during any operation is quite a bit different than requiring a reassignment of the slice header. In other words:

    x := make([]int, 5)
    y := &x[0]
    x[3] = 8
    *y = 5
    print(x[0]) // always prints 5
as compared to

    x := make(map[int]int)
    y = &x[0] // btw, is this even valid? let's assume it implicitly does x[0] = 0
    x[3] = 8
    *y = 5
    print(x[0]) // maybe sometimes prints 5?
is meaningfully different. For slices, we know that x[0] will always print 5 until the value of x is reassigned in some way.

> Also technically nothing prevents a GC from reallocating the slice.

It would have the same problem the map does: you'd have to update any pointers into the slice to point to the new slice, otherwise the semantics of the program changes. That is not something the GC currently does, and would require an awful lot of metadata and scanning.

> I've never heard of a hashmap implementation which would do otherwise.

I'm not sure what this is referring to. I agree every map implementation has to reallocate the backing store of values periodically. I was trying to say that keeping the flexibility to reallocate the backing store of the map during GC means that you cannot choose the "writes through pointer are observed in the map" option (at least without a lot of complication around updating pointers) because as a programmer, you would not be able to know if it would do that or not, which is a fairly useless primitive.

Since it doesn't seem like this was answered in the other discussion, the answer is that Go does not allow taking the address of a map value. You get a compile-time error: "cannot take the address of m[x]".

https://play.golang.org/p/rX8A6ez9fVx

Indeed. This is in a thread where the original comment was "I think the best approach would be making & work in basically any scenario." I'm trying to demonstrate the complications of making it work on map accesses.