Hacker News new | ask | show | jobs
by masklinn 1474 days ago
> The "Slices" example is just nasty!

I've got to say I'm not entirely clear on what they talk about specifically.

Is it simply that the `results` inside the goroutine will be desync'd from `myResults` (and so the call to myAppend will interact oddly with additional manipulations of results), or is it that the copy can be made mid-update, and `result` itself could be incoherent?

3 comments

So a slice consists of a pointer to a backing array, a length and a capacity. If you don't use a pointer and pass this slice around you will copy it.

This is problematic because even though you copy it, you're still pointing at the same backing array.

Therefore, a backing array with data like [1,2,3,4,5] could be pointed at by 2 slice headers (slice metadata) looking like

A: {len: 2, cap: 10} [1,2] B: {len:5, cap: 10} [1,2,3,4,5]

So any append operations on slice A will mess up the data in that backing array.

Now, sometimes your append will resize the slice, in which case the data is copied and a slice with a new larger backing array is returned. If this was happening concurrently then you'd lose the data in racing appends.

If the append doesn't need to resize the slice, then you'll overwrite the data in the backing array. And so you'll corrupt the data in the slice.

Here's an example I threw together: https://go.dev/play/p/qRUKUwIf3vx

Although the code in the post doesn't actually look like it has an issue. Their tooling just flagged it up as it potentially has an issue if the copy was actually used in the function. But the `safeAppend` function targets the correct slice each time.

I’m a bit doubtful as what you talk about is definitely a slice issue but it’s already an issue in completely sequential code if you reuse appended-to slices.

So while it’s also an issue in concurrent code, it’s really no more so.

It is an issue in sequential code because as you say, that's just how slices work. But if you're always using the same variable you'll never encounter it because that slice can't change between you reading that variable and writing to it.

Once concurrency is introduced you can now read from the same variable, but another goroutine may have written to the same slice in the meantime. That's why you must protect the read and writes and synchronise them.

It's fundamentally just a race condition issue with unprotected reads. But people often overlook it in the case of slices because they think they're just taking a reference to the slice, which is safe to do concurrently IF slices were reference types. But they're not, they are copied.

> Once concurrency is introduced you can now read from the same variable, but another goroutine may have written to the same slice in the meantime. That's why you must protect the read and writes and synchronise them.

But, again, this can easily occur in sequential code as well: you call a function passing it the slice, it mutates the slice internally, it doesn't document that, or maybe the documentation is even wrong, you now hit this issue.

I believe they made a mistake with that example. It doesn't look unsafe to me because the myResults sliced passed to the goroutine is not used. Or perhaps the racy part was left out of their snippet.

Below is what might be what they have meant. This code snippet is racy because an unsafe read of myResults is done to pass it to the goroutine and then that version of myResults is passed to safeAppend:

  func ProcessAll(uuids []string) {
    var myResults []string
    var mutex sync.Mutex
    safeAppend := func(results []string, res string) {
      mutex.Lock()
      myResults = append(myResults, res)
      mutex.Unlock()
    }

    for _, uuid := range uuids {
      go func(id string, results []string) {
        res := Foo(id)
        safeAppend(myResults, id)
      }(uuid, myResults) # <<< unsafe read of myResults
    } 
  }
EDIT: Formatting and clarity
Yea I think your read on the intent here is correct. I re-read that part five times wondering what I wasn't getting.
I have the same question.

They talk about the "meta fields" of a slice. Is the problem that these "meta fields" (e.g. slice length and capacity) are passed by value, and that by copying them, they can get out of sync between coroutines?