by personZ 4303 days ago
The dogma about preventing the sharing of pointers with C is a bit concerning. I've made use of that mechanism, to fantastic effect, albeit understanding the risks and consequences. The notion that it must be prevented because bad things can happen if you aren't careful is a non-starter, as there are essentially zero people doing that who've had problems. Instead a lot of people have enjoyed fantastic productivity because of the similarities of the implementations, and the synergy that allows.

But the team wants to move to a compacting GC. Fine, add pin/unpin idioms. This ground has been well covered by other languages. Don't completely destroy a very productive mechanism because in some oddball cases it might not work.


The current state of the CGO interface is that you can't safely pass Go pointers to C code. That's all this document is repeating. You may have used this successfully in the past, I know I have, and it works with the GC toolchain, but it is still technically incorrect to do so.

I think there's another design doc being worked on that will document the situation, and possibly have proposals to make sharing memory between Go and C more convenient. There are many projects that make heavy use of this, where copying is slow or infeasible, and malloc is inconvenient.

> The current state of the CGO interface is that you can't safely pass Go pointers to C code. That's all this document is repeating.

Where is this stated? Where is it said that it's technically incorrect?

Just to be clear, I'm not passing it and then running a thread in C that just runs with the pointer to an object that no longer exists in Go. Instead I'm calling a C function synchronously -- there is no possible way the element is going to be collected during that run. It could be compacted (though Go up until now has explicitly had a non-compacting GC, so not a current risk), of course, which is what brought up this issue, but that's the reason such interop languages have pinning (flagging that something shouldn't be moved).

It wasn't by chance -- it was by design, with a deliberate understanding of the lifetime of objects.

There is a bit more background at http://golang.org/issue/8310.

The problem is that we hope to move toward a concurrent GC. That means that GC must be able to run concurrently with a long-running cgo call--otherwise a call to some C function that never returns for perfectly valid reasons will block GC forever. And when we have a concurrent moving GC, it is no longer possible to casually pass a Go pointer to C.

In other words, we want to make the Go garbage collector work much better for Go programs, and we don't want calling C to prevent that.

Although the actual plan is not nailed down, I suspect that we will permit passing a Go pointer to C as long as the memory to which that pointer points does not itself contain any Go pointers. The cgo interface will be permitted to either pin the pointer for the duration of the call, or to copy the data to a pinned memory area--your program won't know which will happen.

If you need something more complex to work, you will have to allocate the memory on the C side. It's always OK to pass C pointers to C.
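To make the "does not itself contain any Go pointers" condition concrete, here is a rough sketch that classifies types with reflection. It only illustrates the rule as described above; `hasGoPointers`, `safeToPass`, `pixel` and `node` are made-up names, and nothing here is how cgo itself would check anything.

```go
package main

import (
	"fmt"
	"reflect"
)

// hasGoPointers reports whether a value of type t could hold a Go pointer,
// directly or inside a nested field or array element.
func hasGoPointers(t reflect.Type) bool {
	switch t.Kind() {
	case reflect.Ptr, reflect.UnsafePointer, reflect.Slice, reflect.String,
		reflect.Map, reflect.Chan, reflect.Func, reflect.Interface:
		// All of these are, or contain, a pointer into Go-managed memory.
		return true
	case reflect.Array:
		return t.Len() > 0 && hasGoPointers(t.Elem())
	case reflect.Struct:
		for i := 0; i < t.NumField(); i++ {
			if hasGoPointers(t.Field(i).Type) {
				return true
			}
		}
	}
	return false
}

// safeToPass applies the rule to the pointed-to value v (hypothetical helper).
func safeToPass(v interface{}) bool {
	return !hasGoPointers(reflect.TypeOf(v))
}

// pixel is flat data: a pointer to it would fit the proposed rule.
type pixel struct {
	X, Y float64
	RGBA [4]byte
}

// node holds a Go pointer and a string, so it would not.
type node struct {
	Name string
	Next *node
}

func main() {
	fmt.Println("pixel:", safeToPass(pixel{}))
	fmt.Println("node: ", safeToPass(node{}))
}
```

Under that reading, a `*pixel` or a `[]byte` buffer's backing array could be handed to C, while anything like `node` would have to be allocated on the C side.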

That's a much more reasonable proposal than the prior suggestions of a strict no-Go-pointers-into-C rule, and it'll keep most interfaces working. Some C packages such as qml will still need to change as they allow custom types to travel through C and back into Go, and these pointers may move, but that's certainly not the common case.
You're right, I can't find where it's explicitly forbidden in the docs, but it also isn't stated that you can pass pointers to C code either. There has been more recent discussion here (https://code.google.com/p/go/issues/detail?id=8310) and on the mailing lists too:

    > +rsc
    > Passing a Go pointer like that to C is problematic because eventually
    > we will want the garbage collector to be able to move things

and later on with a response from Dmitry:

    >> In my opinion, the current behavior should be covered by the 
    >> compatibility guarantee. In Go 1, passing pointers to Go-allocated 
    >> memory into C was not a problem. 
    >
    > This was never fully working. Only some individual cases were working.


Even though your C function is being called synchronously, the GC is running concurrently in another thread (concurrent sweeping was introduced in Go 1.3), and has the ability to alter the memory that was passed to your C function.
The rule was never entirely clear, but there was definitely support towards allowing Go pointers to be visible in C code, as long as they were held referenced inside Go code somewhere. See this thread, for example:

https://code.google.com/p/go/issues/detail?id=6397#c11

That said, the rule is clearly changing now, and it is not even clear yet what it is changing to (see Ian's comment in this HN thread).

>Where is this stated? Where is it said that it's technically incorrect?

If I am not mistaken, it was discussed on the golang-dev group.

Currently on the phone, cannot search properly for it.

EDIT: Now at home. This was the discussion thread

https://groups.google.com/forum/#!topic/golang-dev/pIuOcqAlv...

You can do your own pinning, which is what I do in gobind[1]. The basic idea is to keep a map in Go of the pointers:

    var ptrs = make(map[uintptr]unsafe.Pointer)
When passing a pointer from Go to C:

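    var p unsafe.Pointer = ...
    id := uintptr(p)      // the address right now, used only as an opaque key
    ptrs[id] = p          // keeps the value reachable from Go
    C.Fn(C.uintptr_t(id)) // assuming Fn is declared to take a uintptr_t
When C returns the pointer to Go to be used, run it through the ptrs map again: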

    //export GoFn
    func GoFn(id uintptr) {
        p := ptrs[id]
        // ... use p
    }
Don't forget a cleanup function where you delete(ptrs, id). If you want to pass the same pointer multiple times to C and keep it comparable, you'll need a second map.

All of this requires being very careful, but the hard part is the notion of holding references to memory outside the realm of the GC. I don't believe a runtime-assisted pinning API can do any better than what you can do yourself with a map.

[1] https://godoc.org/code.google.com/p/go.mobile/cmd/gobind
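For anyone who wants to try the map approach, here is a minimal sketch with the cleanup function and the second map included. It makes a few assumptions that are not in the comment above: ids come from a counter instead of the address, values are stored as `interface{}` rather than `unsafe.Pointer`, a mutex guards the maps, and all names (`handleTable`, `register`, `lookup`, `release`) are made up. The cgo calls themselves are left out so it runs on its own.

```go
package main

import (
	"fmt"
	"sync"
)

// handleTable keeps Go values reachable while C holds only an integer id.
type handleTable struct {
	mu   sync.Mutex
	next uintptr
	byID map[uintptr]interface{}
	ids  map[interface{}]uintptr // the "second map": same pointer, same id
}

func newHandleTable() *handleTable {
	return &handleTable{
		byID: make(map[uintptr]interface{}),
		ids:  make(map[interface{}]uintptr),
	}
}

// register returns a stable id for p, reusing the id if p is already known.
// p must be comparable (a pointer is), since it is used as a map key.
func (t *handleTable) register(p interface{}) uintptr {
	t.mu.Lock()
	defer t.mu.Unlock()
	if id, ok := t.ids[p]; ok {
		return id
	}
	t.next++
	t.byID[t.next] = p
	t.ids[p] = t.next
	return t.next
}

// lookup is what an exported callback would do with the id C hands back.
func (t *handleTable) lookup(id uintptr) (interface{}, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	p, ok := t.byID[id]
	return p, ok
}

// release drops both entries so the value can be collected again.
func (t *handleTable) release(id uintptr) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if p, ok := t.byID[id]; ok {
		delete(t.ids, p)
		delete(t.byID, id)
	}
}

func main() {
	table := newHandleTable()
	v := 42
	id := table.register(&v) // this id is what would be passed to C
	if p, ok := table.lookup(id); ok {
		fmt.Println(*(p.(*int)))
	}
	table.release(id)
}
```

Because the id is just a number, C can store and compare it freely, but it still cannot dereference it.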

That's a fine model when you want to hand a reference to a Go value, just to be used in Go itself, but it's not actual pinning. The data can still move, which means you cannot use the data in C, for example to share a buffer without copying into another temporary buffer managed by the C allocator.
That's right, you cannot dereference the pointer from C. If you want to use allocated memory from both C and Go, you'll need to allocate it in C.
Please note that the details aren't defined yet, and Ian indicates in a comment in this HN thread that some sort of pinning may exist. I also expect that to happen, given that a strict rule would break a lot of packages and turn what is today trivial into boring and slow code.
What's the use-case for passing a pointer to C that can't be de-referenced?
You can pass it back into Go for dereferencing, at which point you can use the uintptr => unsafe.Pointer map, and it will be correct since the moving GC will have preserved the unsafe.Pointer address at the right location.

The pattern is useful. It just doesn't handle the most common case, which is passing a buffer or a simple output parameter into a C function.

I don't think this would really handle the pinning aspect if a compacting GC is implemented, would it? It does keep the pointer alive, but it technically doesn't do anything to prevent the GC from changing the pointer.
The uintptr version of the pointer is not moved by the GC, so it acts as a stable identifier.
This does not work if the GC updates the pointer concurrently.
The GC will not update a pointer once it is converted to the type uintptr. It becomes invisible to the precise collector, so the uintptr acts as an id value.
Worse, it might be collected then as the collector might think it is no longer in use.
But it's in the map, so it's in use.
This remark is what triggered my comment

> It becomes invisible to the precise collector

So does the conversion to uintptr remove it from the root lists the GC searches?

I agree with the decision to forbid pointer sharing with C code.

This is the type of issue that readily leads to security bugs in the hands of the typical enterprise developer.

Having a pin/unpin idiom might be good enough, though.
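For reference, much later Go releases (1.21 onward) ship a `runtime.Pinner` type with `Pin` and `Unpin` methods. A small sketch of what the idiom looks like with it follows. The `fill` function is a hypothetical stand-in for a C function that writes into a caller-supplied buffer, so the example runs without cgo, and `withPinned` is a made-up helper name.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// fill stands in for a C function that writes n bytes starting at p.
// It is a placeholder, not a real C binding.
func fill(p *byte, n int) {
	dst := unsafe.Slice(p, n)
	for i := range dst {
		dst[i] = 0xAB
	}
}

// withPinned pins the backing array of buf for the duration of call.
func withPinned(buf []byte, call func(p *byte, n int)) {
	if len(buf) == 0 {
		return
	}
	var pinner runtime.Pinner
	pinner.Pin(&buf[0])  // the object is neither moved nor freed until Unpin
	defer pinner.Unpin() // always release the pin when the call returns
	call(&buf[0], len(buf))
}

func main() {
	buf := make([]byte, 8)
	withPinned(buf, fill)
	fmt.Println(buf)
}
```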

> This is the type of issue that readily leads to security bugs

It's also the type of functionality that enables the efficient use of high performance C libraries and existing code (I use it to great effect with the Intel compiler for extremely high performance SIMD code).

The "someone might not use it right, therefore it's anathema" notion is misguided, and runs counter to the philosophy of Go. Even C# and Java -- the pinnacle of "enterprise" languages -- understand and respect the notion of pointer sharing, and early on built in mechanisms to support it. Go is a league above, though, given that it follows C struct layout rules.

But I see where the person proposing banning it got their notion from -- it's a classic hubris of "if it isn't important to me, it isn't important to anyone". A compacting GC has benefits, but the lazy notion of wholesale blocking broad and powerful uses because one doesn't personally benefit from them is how languages die.

Java and .NET offer GC free allocation for such purposes, via their interop APIs. This is also one area that is being improved in Java 9 (Arrays 2.0, Value Types and the new FFI).

But as I mentioned, thus agreeing partially with you, pin/unpin could be enough.

Although I would prefer pin/unpin, you could easily implement GC-free allocation in Go, and people do. It's not hard to write a library for explicit memory management:

1. mmap some memory
2. Write or port malloc- and free-like functions

This implementation could probably even use sync.Pool to good effect.
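A rough sketch of those two steps, assuming a Unix-like system where `syscall.Mmap` is available. It is deliberately simplistic: fixed-size blocks with a free list instead of a real malloc, no locking, and made-up names (`blockPool`, `alloc`, `release`, `close`).

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
	"unsafe"
)

// blockPool hands out fixed-size blocks carved from one anonymous mmap
// region. The Go GC does not manage this memory.
type blockPool struct {
	region    []byte
	blockSize int
	free      []int // offsets of blocks available for (re)use
}

// newBlockPool is step 1: mmap some memory.
func newBlockPool(blockSize, count int) (*blockPool, error) {
	if blockSize <= 0 || count <= 0 {
		return nil, errors.New("blockPool: blockSize and count must be positive")
	}
	region, err := syscall.Mmap(-1, 0, blockSize*count,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, err
	}
	p := &blockPool{region: region, blockSize: blockSize}
	for i := count - 1; i >= 0; i-- {
		p.free = append(p.free, i*blockSize)
	}
	return p, nil
}

// alloc is the malloc-like half of step 2.
func (p *blockPool) alloc() ([]byte, error) {
	if len(p.free) == 0 {
		return nil, errors.New("blockPool: out of blocks")
	}
	off := p.free[len(p.free)-1]
	p.free = p.free[:len(p.free)-1]
	return p.region[off : off+p.blockSize : off+p.blockSize], nil
}

// release is the free-like half. b must be a block returned by alloc,
// not resliced from the front.
func (p *blockPool) release(b []byte) {
	off := int(uintptr(unsafe.Pointer(&b[0])) - uintptr(unsafe.Pointer(&p.region[0])))
	p.free = append(p.free, off)
}

// close unmaps the whole region; no block may be used afterwards.
func (p *blockPool) close() error {
	return syscall.Munmap(p.region)
}

func main() {
	pool, err := newBlockPool(64, 4)
	if err != nil {
		fmt.Println("mmap failed:", err)
		return
	}
	defer pool.close()

	buf, _ := pool.alloc()
	copy(buf, "hello")
	fmt.Println(string(buf[:5]))
	pool.release(buf)
}
```

Since the blocks live outside the Go heap, their addresses stay fixed, which is the property the pointer-sharing discussion is after. The cost is that lifetime bugs are entirely the caller's problem.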