Proposal: expression to create pointer to simple types (github.com)
142 points by jgrimm 1892 days ago
15 comments

I was surprised that you can't apply & to any value. I thought it was just an ordinary operator and it would ensure that the value it was applied to would be put onto the heap.

  s := S{}
  sp := &s // Works
  _ = sp
  
  _ = &S{} // Works
  
  i := int32(1)
  ip := &i // Works!
  _ = ip

  _ = &int32(1) // Doesn't work!
https://play.golang.org/p/fdgvbEwJWgh

It seems odd that you can't apply & to a function's return value. I think the best approach would be making & work in basically any scenario. For example the following also doesn't currently work.

  &(int32(1) + int32(1))
It seems like it should be possible to "desugar" &X to `_tmp = X; &_tmp` and solve this weirdness.
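To make the desugaring idea concrete, here is a minimal sketch of what you have to write by hand today. The names `sum` and `tmpPointer` are made up for illustration; nothing here comes from the proposal itself.

```go
package main

import "fmt"

// sum stands in for any expression whose result is not addressable.
func sum(a, b int32) int32 { return a + b }

// tmpPointer spells out the desugaring by hand:
// &X becomes `tmp := X; &tmp`.
func tmpPointer() *int32 {
	tmp := sum(1, 1) // the temporary the compiler would have to introduce
	return &tmp
}

func main() {
	p := tmpPointer()
	fmt.Println(*p) // prints 2
}
```

Returning `&tmp` is safe in Go because the compiler moves `tmp` to the heap when the pointer outlives the function.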
& doesn't always imply a value is on the heap. Escape analysis will ensure that pointers to the stack are safe.

Here's an example with a bit of explanation: if you pass a value to fmt.Println it will escape. The raw println builtin does not cause values to escape. So calling the first function twice and seeing the same address for the value strongly implies stack allocation while calling the 2nd function twice and getting different addresses implies heap allocation.

https://play.golang.org/p/PSb1wj1-x1c
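A different way to see the same effect, as a rough sketch rather than a copy of the playground link: count heap allocations with `testing.AllocsPerRun` instead of comparing addresses. The function names `localOnly`, `viaFmt` and `allocs` are hypothetical, and the exact counts are up to the compiler version.

```go
package main

import (
	"fmt"
	"io"
	"testing"
)

// localOnly takes the address of x but never lets the pointer leave
// the function, so escape analysis can keep x on the stack.
func localOnly() int {
	x := 1
	p := &x
	*p++
	return x
}

// viaFmt hands the pointer to fmt, which accepts interface values;
// the compiler has to assume the pointer may outlive this frame.
func viaFmt() {
	x := 1
	fmt.Fprintln(io.Discard, &x)
}

// allocs reports the average number of heap allocations per call of f.
func allocs(f func()) float64 {
	return testing.AllocsPerRun(100, f)
}

func main() {
	fmt.Println("localOnly:", allocs(func() { localOnly() }))
	fmt.Println("viaFmt:   ", allocs(viaFmt))
}
```

Building with -gcflags="-m" should print the matching escape decisions for `x` in each function.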

Minutes ago I was wondering about https://play.golang.org/p/9C0puRUstrP (via https://github.com/golang/go/issues/23440).

Thank you for explaining what's going on there! :)

And btw, compiling the above example with -gcflags="-m" (which I learned about earlier today) proves you right.

One of the things I think the Go tutorials don't make a big enough deal of is that Go is relatively explicit about allocations. := isn't just a shortcut for declaring variables, it's an allocation, and an error to use it when it doesn't allocate. var X Sometype isn't just a declaration, it's an allocation.

:= kinda smears the clarity by not allocating if you have a variable on the left that is already allocated, and there's some other places where it kinda smears things up, but at the core, Go makes you explicitly allocate.

> it's an allocation

Most people probably think "heap allocation" when you say this. Go doesn't do dynamic allocations within a stack frame (alloca in C), so when you say "it's an allocation", what does that mean? It could be a stack allocation that occurred at compile time as a reservation in the size of the stack frame for that function. It could be a heap allocation. Only the compiler knows!

The Go compiler is the ultimate authority on what becomes a heap allocation. It tries to make everything into a stack allocation when possible, and stack allocations are "free".

Beyond that, a sufficiently smart compiler can reuse stack "allocations" within a single function as certain values become "dead" (never used again). So there isn't even guaranteed to be a 1:1 correspondence between "stack allocations" and variables that you declared inside the function.

So, I completely disagree with your statement about Go being "relatively explicit about allocations." It's one of the least explicit compiled languages in that regard.

Go makes a distinction between declaration and assignment, which is the syntax you're talking about. It really has nothing to do with allocations.

You file a good complaint, and I should clarify. What I mean is more like Go is secretly quite explicit about its allocations, if I may. It superficially looks like it doesn't really care with a variety of syntax glosses that can make it look like it's more like a scripting language where it doesn't care, but it actually does care quite a lot even at the syntax level, and if you dig past the syntax glosses, it is actually explicit about what gets allocated. It doesn't successfully hide it from you like a scripting language does.

Also, allocations are just... allocations. Go qua Go doesn't have stack vs. heap, and it's a mistake to care except when optimizing. So in Go qua Go, it isn't an issue that it may "reuse" a particular address, because in Go qua Go you can't witness that anyhow. (If you try to keep a pointer around to witness it with, you'll keep the thing pointed to alive.) From Go's perspective, it's still an allocation even if the implementation manages to re-use a particular memory address to do so.

I'm talking about the runtime Go implements here, not the implementation.

This actually took me some years to correctly internalize, for what it's worth. It does a "good" job of glossing over things. However, if you really poke at it, allocations are still explicit. They just may not look like what you are used to from other languages.

I don't see how:

  temp := someFunc()
  p := &temp
is any less explicit than

  p := &someFunc()
It seems that the `&` is still required to put something onto the heap.
What about `&m[x]` where m is some map? Does that heap allocate and create a copy, or is it a pointer to the actual storage slot? If the former, that's a hidden copy/allocation that didn't exist before, and if it's the latter, resizing the map invalidates the pointer, so it must be updated somehow.
`&` will "move" something to the heap if it isn't already on the heap.

The simpler way to think about it is that in Golang everything is on the heap. However the optimizer will move things to the stack if they don't have their address taken. I think the point about explicitness is that if you don't use `&` then it will be able to be put on the stack. So `&` doesn't cause a heap allocation but lack of `&` (or new()) confirms that there isn't one. (I don't actually know if that is true but I can't think of any counterexamples)

> So `&` doesn't cause a heap allocation but lack of `&` (or new()) confirms that there isn't one. (I don't actually know if that is true but I can't think of any counterexamples)

I think assigning to a pointer would cause an escape.

Just taking a reference wouldn't though, the reference still has to escape (of course you'd usually take a reference so that it can escape but that's not always the case, especially with inlining).

I think I didn't communicate my point clearly. Consider this hypothetical program:

    x := make(map[int]int)
    x[0] = 5
    
    y := &x[0]
    *y = 10
    
    print(x[0]) // 5 or 10?
    
    x[0] = 6
    
    print(*y) // 6 or 10?
    
    // force the map to grow and reallocate the buckets
    for j := 1; j < 100; j++ {
        x[j] = j
    }
    *y = 11
    
    print(x[0]) // 5, 6, 10, or 11?
The crux of the problem is answering what y actually points at: the value in the map bucket, or some freshly allocated value? There are problems with whichever one you pick.

edit: changed the second print to *y instead of x[0]. thanks masklinn for catching this error.

Since it doesn't seem like this was answered in the other discussion, the answer is that Go does not allow taking the address of a map value. You get a compile-time error: "cannot take the address of m[x]".

https://play.golang.org/p/rX8A6ez9fVx
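For reference, a small sketch of the two things you can write today instead of `&m[x]`. The function names are hypothetical. Copying the value out gives a pointer that is detached from the map, while storing pointers in the map gives a shared slot that survives map growth.

```go
package main

import "fmt"

// copyOut shows that a pointer to a copied map value is detached from the map.
func copyOut() (inMap, viaPtr int) {
	m := map[int]int{0: 5}
	v := m[0] // explicit copy; &m[0] would not compile
	p := &v
	*p = 10
	return m[0], *p
}

// sharedSlot stores pointers in the map, so the pointed-to int
// stays put even when the map grows and moves its buckets.
func sharedSlot() int {
	m := map[int]*int{}
	n := 5
	m[0] = &n
	p := m[0]
	for j := 1; j < 100; j++ { // force the map to grow
		k := j
		m[j] = &k
	}
	*p = 11
	return *m[0]
}

func main() {
	fmt.Println(copyOut())    // 5 10
	fmt.Println(sharedSlot()) // 11
}
```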

Indeed. This is in a thread where the original comment was "I think the best approach would be making & work in basically any scenario." I'm trying to demonstrate the complications of making it work on map accesses.
I think the way to say it is that Go requires you to declare every allocation, but allows over-declaration in the case of copying.

> := [...] an error to use it when it doesn't allocate.

> := [...] not allocating if you have a variable on the left that is already allocated,

This appears to be a contradiction.

I suppose you mean something like "error to use it when there's no possible context where that line of code would allocate"; what's an example of that?

a, b := 1, 2

If either a or b (but not both) were already defined, this won't re-define (and reallocate space for) them.
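A quick sketch to check that claim (the function name `redeclare` is made up): compare the address of `a` before and after the `:=`.

```go
package main

import "fmt"

// redeclare reports whether a keeps the same storage across `a, b := 1, 2`,
// along with the final values of a and b.
func redeclare() (bool, int, int) {
	a := 0
	before := &a
	a, b := 1, 2 // b is new; a is only assigned, not re-declared
	return before == &a, a, b
}

func main() {
	fmt.Println(redeclare()) // true 1 2
}
```

This only holds when the existing variable was declared in the same scope. In an inner block, `a, b := 1, 2` declares a fresh `a` that shadows the outer one.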

Aha, lossy compression syntax!
I’m not sure I understood this correctly. Does the following allocate (on the heap)?

    foo := MyStruct{}
No, that does not cause a heap allocation on its own. If other lines of code in that function caused a pointer to that value to escape the lifetime of the current function's stack frame, the compiler would determine that it has to be heap allocated instead.

I believe the person you are replying to was making a confusing point about some hand wavy notion of "any kind of allocation", which includes stack allocations... which are determined at compile time, not with "alloca".

Ah, that makes much more sense!
No. In this case foo will live on the stack (unless you take its address later).

  foo := MyStruct{} // Could live on the stack
  _ = &foo // Oh, now foo must live on the heap.
This isn't even true.

No matter what syntax you write inside a function, the Go compiler always has the final say on what is stack allocated and what is heap allocated. Taking the address of foo will not cause foo to be heap allocated unless Go is unable to prove that the pointer will live for less time than the current stack frame. Look up "escape analysis".

Basically the only way to guarantee that something will always be heap allocated is to assign it to a global variable. Even returning a pointer to that object from the current function is not a strong guarantee, since the compiler could inline this function into the caller and determine that everything can live happily inside the newly inlined stack frame without heap allocation.
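Here is a rough sketch of both cases, using `testing.AllocsPerRun` to count heap allocations. All names are hypothetical, and whether `newInt` actually gets inlined is up to the compiler, so treat the second number as something to observe rather than a guarantee.

```go
package main

import (
	"fmt"
	"testing"
)

// global is a package-level pointer: anything stored here outlives every frame.
var global *int

// storeGlobal publishes &x through a global, so x has to be heap allocated.
func storeGlobal() {
	x := 42
	global = &x
}

// newInt returns a pointer to a local; on its own that looks like an escape.
func newInt() *int {
	x := 42
	return &x
}

// useInlined calls newInt. If the compiler inlines the call, x can stay
// in the stack frame of useInlined and no heap allocation is needed.
func useInlined() int {
	p := newInt()
	return *p + 1
}

func allocsGlobal() float64 {
	return testing.AllocsPerRun(100, storeGlobal)
}

func allocsInlined() float64 {
	return testing.AllocsPerRun(100, func() { useInlined() })
}

func main() {
	fmt.Println("storeGlobal:", allocsGlobal())
	fmt.Println("useInlined: ", allocsInlined())
}
```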

Good point. It isn't "must", I should have said "may".
I asked that in the Github issue, response here: https://github.com/golang/go/issues/45624#issuecomment-82259... With my reply (and further exploration) here: https://github.com/golang/go/issues/45624#issuecomment-82263...
Russ Cox has some nice examples of the issues with this:

    Otherwise the meaning of &f().x is different for f() returning pointer-to-struct and f() returning struct.
    Similarly &m["x"] is a compile error today but would silently make a copy tomorrow rather than produce a pointer to the value in a map.
    All of that would be incredibly confusing and the source of many subtle bugs.
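A sketch of the first point, with made-up names (`S`, `shared`, `byPointer`, `byValue`). Today `&f().x` is legal only when f returns a pointer, and it aliases the original struct. For a struct-returning f, the proposal would have to make a copy, which is written out explicitly here.

```go
package main

import "fmt"

type S struct{ x int }

var shared = &S{x: 1}

// byPointer returns a pointer to the shared struct.
func byPointer() *S { return shared }

// byValue returns a copy of the shared struct.
func byValue() S { return *shared }

// demo returns shared.x after writing through each kind of pointer.
func demo() (afterPtr, afterCopy int) {
	// Legal today: byPointer().x means (*byPointer()).x, which is addressable.
	p := &byPointer().x
	*p = 2
	afterPtr = shared.x

	// &byValue().x does not compile today; this is the copy it would imply.
	tmp := byValue()
	q := &tmp.x
	*q = 3
	afterCopy = shared.x
	return
}

func main() {
	fmt.Println(demo()) // 2 2
}
```

The write through `p` reaches the shared struct and the write through `q` silently does not, even though both lines would look the same at the call site.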
>It seems odd that you can't apply & to a function's return value.

Offtopic: Surprisingly I was asking myself this question but if possible in C... Is it?

No, but in C you can't apply `&` to any stack value and "automagically" pop it onto the heap. Or from another point of view everything in Go is logically on the heap, the compiler just optimizes values that don't have their address taken to live on the stack.

In C:

  int *f() {
    int x = 0;
    return &x;
  }
It works, but it is wrong. The C type system isn't smart enough to track the lifetime of x in this case. Applying & to a function's return value is not allowed because C does have the concept of a temporary value, and taking its address is basically always incorrect.

Note that C++ does somewhat allow this with lifetime extension. It is somewhat like what I expected Go to do, except because lifetime extension only extends to the enclosing block it is more of a footgun. With a dynamic tracing garbage collector like Go's it is not a footgun.

Nope. You can only take a reference to an lvalue, which is (essentially) an expression that is legal to use in the form `my_lvalue = ...`. Otherwise, there's nothing to take the reference of.

    int* ref1() {   return &1; }
    
    -> error: lvalue required as unary '&' operand
   
    // 

    #include <stdlib.h>
    int alloc()  {
      return *(int*)malloc(sizeof(int));
    }
    int* ref() {  return &alloc();  }

    ->  error: lvalue required as unary '&' operand
You can still be unsafe though, by making a reference to a stack-allocated object and letting it go out of scope :

    int* make_unsafe_ref() {  int a; return &a;  }
    ->  warning: function returns address of local variable
> You can only take a reference to an lvalue, which is (essentially) an expression that is legal to use in the form `my_lvalue = ...`.

I mean it could implicitly allocate, that's what Rust does for instance.

Your second and third attempts would not compile though, the first would by returning a `&'static T`.

"implicitly allocate" is slightly misleading, imho. It's promoted to a static. There's no malloc involved.
> "implicitly allocate" is slightly misleading

Maybe. I just meant that storage is created implicitly (static or stackframe depending on the case), then a reference is created to that.

Yes, if f() returns the type T then you can write

    &(T[]){ f() }
The type in brackets needs to be an array so that if f() returns a struct then the initializer list has the right shape. If T is a simple type then you can drop the [].
I like the second option (&int(3)) the most personally, as I find myself occasionally defining a bunch of variables before I can use them as pointers in structs. It looks and feels a lot cleaner to use this vs having new everywhere.
That would be my preference too. Great readability, and most users would eventually try this out even before searching for the right way (I have tried it).

I tend to not declare variables when the pointer is used deep into a struct because I find the back-and-forth in the editor to be bad. I usually resort to a pointer to an inline anonymous function, e.g.:

    a := SomeStruct{
      Field: func() *int64 { x := int64(13); return &x }(),
    }
It's ugly and verbose but after seeing it 2 or 3 times you immediately know what it's about the next time.
How exemplary that he filled in the full template questionnaire for language changes, including questions such as "Would you consider yourself a novice, intermediate, or experienced Go programmer?" (he replied "I have some experience").
IIRC Tim Berners-Lee also rather amusingly called himself a "Web Developer" at a conference.
There's a double connotation being played with there, a web developer vs the web developer
Yeah, I think that was the "rather amusingly" part.
He did develop the web.
I found it more amusing that he listed 8 languages that he has experience with and then went out of his way to exclude JavaScript.
When you know JavaScript enough to be able to say that you don't know JavaScript.
That's basically me and C++, and I suspect I am not alone
> Is this about generics?

> No.

Not sure why, but something about this being the (first part of the) last question he had to answer makes it quite funny to me.

When you have two questions about a specific feature that every change proposal must answer, it's a sign that something needs improvement.
Personally, I like the 100% correct answer to

Can you describe a possible implementation? (Yes.)

A couple years ago there was a clear shift in how the highest profile people in go core interact with the community. In the past it felt a bit one way. I have no idea what triggered the change, but it's definitely visible, and hugely positive imo.
There's the General who responds, "You don't need to see my ID. Don't you know who I am? Who's your CO?!"

And the General who responds, "Right. Well done."

Also good:

> What would change in the language spec?

> The new operator would get an optional second argument, and/or conversions would become addressable.

> [...]

> How would the language spec change?

> Answered above. Why is this question here twice?

I have so many BS helpers in my project to do this

    util.StrPtr("hello")
    util.BoolPtr(false)
    util.Int64Ptr(7)
    // etc
This is just a gap in Go's design, so I'm glad this proposal exists :)
Generics would solve the issue.
Not entirely. Someone in the comments of the issue suggests to implement this with generics as:

  func PointerOf[T any](t T) *T {
    return &t
  }
But that has a nasty gotcha:

  func Process(x *int32) {
    if (x != nil) {
      fmt.Println(*x + 5);
    }
  }

  func main() {
    Process(nil);          //ok
    x := int32(5)
    Process(&x);           //ok
    Process(PointerOf(5)); //BOOM: cannot use PointerOf(5) (value of 
                           //type *int) as *int32 value in argument to Process
  }
Go's type coercion is quite primitive. It strictly works inside-out (propagating types strictly upwards in the AST), with the only exception that a numeric literal can be coerced into a specific numeric type by considering the immediate parent in the AST. So when you have `func f(x int32)` and you call it as `f(5)`, the literal 5 gets coerced into int32 to match the context it appears in. (The same strategy is also applied to determine the type of a nil literal.)

However, in `Process(PointerOf(5))`, the immediate surrounding of the literal 5 (the PointerOf call) does not coerce the literal into a specific type, so it takes on its default type, int.

The proposal (or, to be exact, both proposals) avoids this gotcha by requiring a type to be stated explicitly.

    Process(new(int32, 5));
    Process(&int32(5));
You can just write PointerOf(int32(5)) or PointerOf[int32](5). Or go could improve its type inference.
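For completeness, a sketch of both spellings, which needs a Go version with type parameters. `Process` is changed here to return its result instead of printing it, purely so the output is easy to check.

```go
package main

import "fmt"

// PointerOf is the generic helper suggested in the issue comments.
func PointerOf[T any](t T) *T { return &t }

// Process returns *x + 5, or 0 for a nil pointer.
func Process(x *int32) int32 {
	if x == nil {
		return 0
	}
	return *x + 5
}

func main() {
	fmt.Println(Process(PointerOf(int32(5)))) // 10: the argument carries the type
	fmt.Println(Process(PointerOf[int32](5))) // 10: T is instantiated explicitly
	// Process(PointerOf(5)) // does not compile: *int is not *int32
}
```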
Well. It would reduce it to 1 (PtrTo) instead of [however many]. And unless they also add a new top-level func like that, it's still a `package.PtrTo` rather than `&`. And `&`'s special abilities on only composite literals remain.
> What other languages do you have experience with?

> Fortran, C, Forth, Basic, C, C++, Java, Python, and probably more. Just not JavaScript

Forth is a "top-of-mind" language for Rob Pike. That's unexpected and incredibly cool :)

This reminds me a lot of C99 compound literals: one of the Go suggestions looks like &int(3) which in C99 is spelled &(int){ 3 }.

(I was slightly surprised when I learned that C99 compound literals are not just for structs: you can use any complete object type, and the result is an lvalue so you can take its address.)

Why can't &3 work? Rob says 3 does not have a type and that's a problem. Would it be possible to change the Go compiler such that 3 has a type? (I'm guessing no, at least not easily, otherwise he'd be suggesting it, but I'm curious about the reason)
> Would it be possible to change the Go compiler such that 3 has a type? (I'm guessing no, at least not easily, otherwise he'd be suggesting it, but I'm curious about the reason)

Why not? In fact it already kind-of does: Go has "default types" for most untyped constants. When you write

    i := 3
absent an explicit type, Go will fall back to "int".
From the language specification:

> Numeric constants represent exact values of arbitrary precision and do not overflow.

Ok I think I'm not understanding this correctly then. Why does this return an error?

    package main

    import (
        "fmt"
    )
    
    const tst = 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    
    func main() {
        fmt.Println("%v", tst)
    }
Error: ./prog.go:16:17: constant 1000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 overflows int

See: https://play.golang.org/p/47l5qAsXD5r

https://golang.org/ref/spec#Constants

> An untyped constant has a default type which is the type to which the constant is implicitly converted in contexts where a typed value is required.

The default type for a number (integer) is int. If you were to add a period in that long string of zeros, the default for floating-point is float64.
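A small sketch of how the two spec quotes fit together (the names `huge` and `defaultTypes` are made up). The constant itself never overflows; the error only appears at the point where it has to become a typed value.

```go
package main

import "fmt"

// huge is an untyped constant, larger than any int64.
const huge = 1 << 100

// defaultTypes reports the type each untyped constant takes on
// when a typed value is required.
func defaultTypes() []string {
	return []string{
		fmt.Sprintf("%T", 3),
		fmt.Sprintf("%T", 3.0),
		fmt.Sprintf("%T", 'x'),
		fmt.Sprintf("%T", 3i),
		fmt.Sprintf("%T", "3"),
	}
}

// shifted is computed at compile time; only the small result becomes an int.
func shifted() int { return huge >> 98 }

func main() {
	fmt.Println(defaultTypes()) // [int float64 int32 complex128 string]
	fmt.Println(shifted())      // 4
	fmt.Println(float64(huge))  // fine: the value is representable as a float64
	// fmt.Println(huge)        // does not compile: constant overflows int
}
```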

I asked that in the thread, response here: https://github.com/golang/go/issues/45624#issuecomment-82259... With my reply (and further exploration) here: https://github.com/golang/go/issues/45624#issuecomment-82263...

tl;dr: Go uses casts to coerce values to the correct types, which means you couldn't get pointers to number literals for non-default number types.

If the syntax were &3, you could use it to obtain a *int, but not a *int32, *uint64 or such.
> How would we measure it?

> Eyeballing.

Love this one

It's interesting that this can be largely implemented oneself once type parameters are part of the language (as one thread commenter pointed out with `PointerOf(t T) *T`). I'm curious what other syntactical oddities become a thing of the past once we can create more expressive and typesafe functions for common kludges.
Personally, I've only seen this crop up in one place: structs that signal optional fields via pointers. Are there other usecases?
Best part of this proposal from Go's co-creator:

> Would you consider yourself a novice, intermediate, or experienced Go programmer?

I have some experience.

Can I create a pointer to a pointer?
Yes, however the utility of doing so in Go is fairly limited. Go has "pointers" but doesn't have pointer arithmetic, and the big use case of pointer-to-pointer in C is to iterate over an array of pointers via pointer arithmetic. Personally I would call them "references" since I consider pointer arithmetic to be the thing that makes pointers pointers and not just references, but that's a personal opinion, not a universally-agreed-upon definition.
Another big use-case for pointers to pointers in C is pointer-type out parameters. Most of that use-case is handled by MRV, but I'm pretty sure there's the odd situation where a double pointer is either necessary or convenient (I remember seeing the odd one in Rust once in a while).
Pointer-pointers are nice to implement linked data structures in C; they make a lot of logic surrounding re-seating the head pointer far simpler and with fewer edge cases.
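The same trick carries over to Go, for what it's worth. A minimal sketch with made-up names (`node`, `remove`, `values`, `build`): `link` always points at the pointer that leads to the current node, so removing the head needs no special case.

```go
package main

import "fmt"

type node struct {
	val  int
	next *node
}

// remove unlinks the first node holding val by rewriting whichever
// pointer currently leads to it (the head or some node's next field).
func remove(head **node, val int) {
	for link := head; *link != nil; link = &(*link).next {
		if (*link).val == val {
			*link = (*link).next
			return
		}
	}
}

// values collects the list contents for printing.
func values(head *node) []int {
	var out []int
	for n := head; n != nil; n = n.next {
		out = append(out, n.val)
	}
	return out
}

// build returns the list 1 -> 2 -> 3.
func build() *node { return &node{1, &node{2, &node{3, nil}}} }

func main() {
	head := build()
	remove(&head, 1) // removing the head uses the same code path
	remove(&head, 3)
	fmt.Println(values(head)) // [2]
}
```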
This reminds me of the old adage that the level of experience of C developers can be ranked into 1 star, 2 stars, 3 stars and so on, based on the highest number of consecutive stars they use in type expressions.
> (...) that has the nasty problem that 3 does not have a type (...)

How is that possible? At least in Common Lisp, all literal objects have types, and the same is true of C from what I have just checked.

Go supports untyped constants -- https://golang.org/ref/spec#Constants. It's useful for defining a named constant, and then using the name to initialize variable values of any compatible type.
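A minimal sketch of that use, where `limit` and `types` are made-up names: one untyped constant initializes variables of several numeric types without any conversions.

```go
package main

import "fmt"

// limit is untyped: it has no type until it is used somewhere.
const limit = 10

// types reports the types of variables initialized from limit.
func types() string {
	var a int32 = limit
	var b float64 = limit
	var c uint8 = limit
	d := limit // no type in context, so the default type int applies
	return fmt.Sprintf("%T %T %T %T", a, b, c, d)
}

func main() {
	fmt.Println(types()) // int32 float64 uint8 int
}
```

With `const limit int = 10` instead, the first three declarations would each need an explicit conversion.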
It doesn't have a type in Haskell, from a certain point of view. `3` is polymorphic.

Prelude> :t 3

3 :: Num p => p

It does have a type. You just wrote it down!
Hence "from a certain point of view" - I would argue, in fact, that it's extremely similar to the sense in which 3 doesn't have a type in Go. Haskell's type system can express that sense, whereas Go's can't; but it's the same sense.

> It is an error if the constant value cannot be represented as a value of the respective type. An untyped constant has a default type which is the type to which the constant is implicitly converted in contexts where a typed value is required, for instance, in a short variable declaration such as i := 0 where there is no explicit type. The default type of an untyped constant is bool, rune, int, float64, complex128 or string respectively, depending on whether it is a boolean, rune, integer, floating-point, complex, or string constant.

This is why we need to stop inventing new languages.
For a language that has taken extreme measures to exclude generics because they are deemed too complex, this proposal is absolutely surprising to me. And I'm still not sure what practical benefit comes from it.
For a language that has taken extreme measures to exclude generics, Go seems to have an awful lot of accepted design proposals for generics.
Practical benefit in the sense of "you get a free pony" or "before you didn't have closures and now you do"? None.

Practical benefit in the sense of "you get to express something in a shorter, more uniform way"? Some.

I'm not sure why this would be posted to HN, honestly; it's a very "inside baseball" thing. In my personal experience this would save a lot less than one line per module. I've encountered this, but it's infrequent. The benefit is very minimal in practice.
Imagine you want to create an instance of a struct with many string and int pointer fields. This is actually a big pain in the ass in Go (the AWS SDK offers an aws.String helper for this reason).
While true, generics are on track to be released at the end of the year as a first step.