by sinhpham 3193 days ago
They implemented what looks like the Rust ownership model: SE-0176 Enforce Exclusive Access to Memory (https://github.com/apple/swift-evolution/blob/master/proposa...), but I'm having a hard time understanding the proposal. Can anyone shed some light on this?
4 comments

You can think of inout parameters in Swift as something analogous to a mutable borrow in Rust. Until Swift 4 we allowed overlapping inout access, for example:

    var counter = 0
    func foo(x: inout Int) {
      x += 1
      print(counter)
    }
    foo(x: &counter)
Note how 'counter' is read by 'foo(x:)' during an inout access of the same value. This is now prohibited by Swift 4, using a combination of static and dynamic checks.

This fixes some undefined behavior and will also enable more aggressive compiler optimizations to be added in the future.
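
As a minimal sketch of code that stays within the new rule: the function below reads the value only through its inout parameter and never touches the global during the call. The name `increment` and its return value are hypothetical, made up for this illustration.

```swift
var counter = 0

// Hypothetical variant of foo(x:) that does not touch `counter` directly.
func increment(_ x: inout Int) -> Int {
    x += 1
    return x  // read through the parameter, not through the global
}

let seen = increment(&counter)
print(seen, counter)  // prints "1 1"
```

Here the only access to `counter` during the call is the inout access itself, so nothing overlaps.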

Doesn't the mutable borrow end after the x += 1 line, leaving you free to read the contents of the pointer?

I don't know Swift at all, but from their document on swapAt() it looks like they are trying to prevent calls like fn(&p, &p), where fn is declared as func fn(a: inout Type, b: inout Type).
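
A small sketch of the swapAt() case, assuming a plain array of Ints (the variable name is made up). The commented-out line is the pattern that needs two overlapping inout accesses to the same array; `swapAt(_:_:)` does the same job with a single mutating access.

```swift
var numbers = [10, 20, 30]

// Two inout accesses to `numbers` at once, which violates the exclusivity rule:
// swap(&numbers[0], &numbers[2])

// One mutating access to the whole array instead.
numbers.swapAt(0, 2)
print(numbers)  // prints "[30, 20, 10]"
```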

> Isn't the mutable borrow ended after the end of the x += 1 line?

No, see Mike's comment here.

> Note how 'counter' is read by 'foo(x:)' during an inout access of the same value

It's not clear to me in your example why reading the value of counter after mutating it is bad; why is this now prohibited?

Swift specifies inout parameters as copying the value that's passed in, giving the copy to the callee as a mutable value, and then writing back the value to the original storage after the function returns. & is not an "address of" operator.

Of course, it would be inefficient to do this all the time, so Swift will optimize copy-then-writeback to just passing a pointer to the original storage whenever it can. But this is an optimization operating under the "as if" rule: as long as it works as if it does a copy-then-writeback, the compiler can make it actually do whatever it wants.

If the example code were legal, then it would have to print `0`, because the writeback to `counter` doesn't happen until the end of the function. That means the compiler couldn't just pass in a pointer to `counter`, but would have to actually go through the copy-then-writeback procedure it's supposed to do, so you'd lose out on optimizations.

Instead, Swift makes it illegal. You can't access a value while this call is happening. That allows the language semantics to coexist with optimizations.
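
To make the copy-then-writeback model concrete, here is a rough sketch that spells it out by hand using an ordinary by-value parameter and an explicit assignment. The function name and the `observed` array are hypothetical; this only illustrates the specified semantics, not what the compiler actually emits.

```swift
var counter = 0
var observed: [Int] = []

// Hand-written copy-in / copy-out version of foo(x:).
func fooCopyInCopyOut(_ initial: Int) -> Int {
    var x = initial            // copy in
    x += 1
    observed.append(counter)   // original storage has not been written back yet
    return x                   // the caller performs the writeback
}

counter = fooCopyInCopyOut(counter)
print(observed, counter)  // prints "[0] 1"
```

The read inside the function sees `0`, which is the value the original example would have had to print if it were legal.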

What does & do in Swift, then? And what's the purpose of even having inout and & if you can't count on the argument actually being an address?
& is just a sigil saying "I acknowledge that I am passing this as an inout parameter, and therefore the value may be modified by the function I am calling." Note that & is only legal on a function parameter. You cannot write, for example, `let b = &a`.

The purpose is to allow for out-parameters. A classic example would be the `+=` operator. (Swift operators are just normal functions with special call syntax.) It takes its first parameter as `inout` so that it can mutate the value.
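
A quick sketch of that, using a made-up `Vec2` type: the `+=` implementation is an ordinary static function whose first parameter is `inout`. Operator calls are the one place where you don't write & at the call site.

```swift
struct Vec2: Equatable {
    var x: Int
    var y: Int

    // An ordinary function with operator call syntax; lhs is mutated in place.
    static func += (lhs: inout Vec2, rhs: Vec2) {
        lhs.x += rhs.x
        lhs.y += rhs.y
    }
}

var position = Vec2(x: 1, y: 2)
position += Vec2(x: 10, y: 20)
print(position.x, position.y)  // prints "11 22"
```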

Note that inout parameters work with expressions where it would be impossible to take the address. For example, you can use & on a computed property that has a setter. In that case it has to read the initial value, pass that to the function, then write back the new value, because it has no idea where the computed property actually stores the value, if anywhere.

Edit: because I'm obsessive and weird, I made a quick example of this computed property stuff:

http://swift.sandbox.bluemix.net/#/repl/59c284376cbea87f72c4...

Click the play triangle at the bottom to see the output.
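
In case the sandbox link is unavailable, here is a separate minimal sketch of the computed-property case described above. This is not the linked code; `Thermostat`, `warmUp`, and `accessLog` are names invented for the illustration.

```swift
final class Thermostat {
    var celsius = 20
    var accessLog: [String] = []

    // Computed property: there is no storage of its own to take the address of.
    var fahrenheit: Int {
        get {
            accessLog.append("get")
            return celsius * 9 / 5 + 32
        }
        set {
            accessLog.append("set")
            celsius = (newValue - 32) * 5 / 9
        }
    }
}

func warmUp(_ degrees: inout Int) {
    degrees += 9
}

let thermostat = Thermostat()
warmUp(&thermostat.fahrenheit)  // getter runs, function runs, setter runs
print(thermostat.celsius, thermostat.accessLog)  // prints "25 ["get", "set"]"
```

The log shows the getter running before the call and the setter running after it, which is the copy-then-writeback sequence happening for real.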

> (Swift operators are just normal functions with special call syntax.)

Coming from Haskell and Rust it's nice to see this trend catching on.

Is Swift planning to introduce a distinction between borrows and mutable borrows to the user? From what you describe it seems like right now syntax-wise a borrow and a mutable borrow look the same, and the runtime makes some decision about it.

edit: Or I guess it could be the opposite. Since Swift passes by value always unless the runtime can optimize (right?), you could just not write & and inout and cross your fingers it gets optimized to a borrow rather than a copy?

Stuff like this makes me prefer the explicitness of Rust. It seems like here on the surface it's abstracted from you, but really you need to know the rules anyway or you could get into trouble.

Still new to Swift, but I believe & is explicit syntax that makes clear in the code that the function being called may mutate the argument, and thus that the variable passed in could change. It must be used wherever an argument is inout. It makes mutation explicit both in the function declaration and at the call site. Which is nice!

It's good for code readability but also prevents accidentally passing a variable to a function that could mutate it when you weren't expecting that, and vice-versa.

Values are not guaranteed to even have an address, IIRC.
Thanks Mike, very clear.
Note: I like to read about Rust but don't work with it seriously, and don't follow Swift at all. Corrections welcome. That said, these seemed like the key passages:

"Swift has always considered read/write and write/write races on the same variable to be undefined behavior. It is the programmer's responsibility to avoid such races in their code by using appropriate thread-safe programming techniques."

"The assumptions we want to make about value types depend on having unique access to the variable holding the value; there's no way to make a similar assumption about reference types without knowing that we have a unique reference to the object, which would radically change the programming model of classes and make them unacceptable for the concurrent patterns described above."

Sounds like a system in the vein of Rust but more limited, with more runtime checks and no lifetime parameters, falling back to "programmer's responsibility" when things get hard. The last paragraph makes it sound like one of the motivations is enabling specific categories of optimizations, as opposed to eliminating races at the language level.

One of my biggest questions as a reader is how a language like C handles these cases that Swift can't handle without these guarantees. Is this a move to get faster-than-C performance? Does C do these optimizations unsafely? Is there some other characteristic of Swift that makes this harder than C? Closures get a lot of focus in the article...

The other responses to your comment are correct: C generally can't do those optimizations, unless you manually write `restrict`, and this can hinder optimization. But to complete the picture -

> Is there some other characteristic of Swift that makes this harder than C?

Yes:

1. Swift doesn't have pointers.

Instead, you have a lot of copying of value types, and the compiler has to do its best to elide those copies where it can. For instance, at one point the document mentions:

> For example, the Array type has an optimization in its subscript operator which allows callers to directly access the storage of array elements.

In C, C++, or Rust, you can "directly access the storage" without relying on any optimizations: just write &array[i] and you get a pointer to it. The downsides are (a) more complicated semantics and (b) the problem of what happens if array is deallocated/resized while you have a pointer to it. In C and C++, this results in memory unsafety; in Rust, the borrow checker statically rules it out at the cost of somewhat cumbersome restrictions on code.

2. Swift guarantees memory safety; C and C++ don't.

This goes beyond pointers. For instance, some of the examples in the document talk about potentially unsafe behavior if a collection is mutated while it's being iterated over. In Swift, the implementation has to watch out for this case and behave correctly in spite of it. In C++, if you, say, append to a std::vector while holding an iterator to it, further use of the iterator is specified as undefined behavior; the implementation can just assume you won't do that, and woe to you if you do. (In Rust, see above about the borrow checker. Iterator invalidation is in fact one of the most common examples Rust evangelists use to demonstrate that C++ is unsafe, even when using 'modern C++' style.)
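
For a concrete picture of the Swift side, a small sketch assuming a plain Array (variable names are made up): appending to the array inside a for-in loop over it is well-defined, because the loop iterates over the value the array had when the loop started.

```swift
var items = [1, 2, 3]
var visited: [Int] = []

for item in items {
    visited.append(item)
    if item == 1 {
        items.append(99)  // mutates `items`, not the value being iterated
    }
}

print(visited, items)  // prints "[1, 2, 3] [1, 2, 3, 99]"
```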

> 1. Swift doesn't have pointers.

Sure it does:

    func modify(_ x: UnsafeMutablePointer<Int>) {
        x.pointee = 12
    }

    func main() {
        var x = 23
        print("Before \(x)")
        modify(&x)  // & implicitly converts the inout access to a pointer for this call
        print("After \(x)")
    }

    main()
> Swift guarantees memory safety

But memory leaks are quite easy to create with reference counting, which is why Swift has extra syntax (weak and unowned) for avoiding strong reference cycles. It can take skill to know when to use those techniques, and the compiler doesn't always find these problems, so there really isn't the guarantee you mentioned.
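
A rough sketch of the kind of cycle in question, with made-up `Node` and `DeinitLog` classes. The back-reference is declared `weak`, so both objects are freed when the scope ends; if `previous` were a strong reference, the two nodes would keep each other alive and neither `deinit` would run.

```swift
final class DeinitLog {
    var names: [String] = []
}

final class Node {
    let name: String
    let log: DeinitLog
    var next: Node?            // strong reference forward
    weak var previous: Node?   // weak reference back, so there is no strong cycle

    init(name: String, log: DeinitLog) {
        self.name = name
        self.log = log
    }

    deinit {
        log.names.append(name)  // record that this node was actually freed
    }
}

let deinitLog = DeinitLog()
do {
    let first = Node(name: "first", log: deinitLog)
    let second = Node(name: "second", log: deinitLog)
    first.next = second
    second.previous = first
}

print(deinitLog.names.sorted())  // prints "["first", "second"]"
```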

Memory leaks are not the same as memory safety problems.
Those Rust evangelists are only partially correct. It's the STL that's unsafe, not the language itself.

Iterators could be implemented in C++ in a safer way with some performance loss, but it doesn't seem to be a priority for anyone except the safercpp guy that posts here every now and then. STLs can enable iterator validation in a special debug mode.

That is incorrect.

The language includes memory unsafe constructs without marking them in any way, since it must be compatible with C.

We're discussing iterators here, and the STL iterators implemented as class templates can't even be compatible with C.

To clarify: safe containers, iterators and algorithms can be designed, but they don't seem to be a priority of the C++ community. Personally I'm quite scared of accidentally passing the wrong iterator to some function, but OTOH I can't recall it ever happening. I don't use the debug STL either, haven't needed it.

The examples that pcwalton keeps bringing up seem artificial to me. It's true that you can't have perfect safety in C++, but with some effort and custom libraries, many errors can be caught at compile or run-time. The advantage of Rust is that it's safe by default, not necessarily that there's a major safety difference between quality C++ and quality Rust.

You’ve probably heard this before, but the security angle is important. “I haven’t had these problems in my code” really means “I haven’t triggered these problems in my code”… that is, unless you’ve had a security code audit done. Testing isn’t enough: even well-tested codebases can and do have vulnerabilities. In practice, they’re usually triggered by input that’s so nonsensical or insane from a semantic perspective that not only would it never happen in ‘legitimate’ use, the code author doesn’t even think to test it. For a simple example, if some binary data has a count field that’s usually 1 or 2 or 10, what happens if someone passes 0x40000000 or -1?

As a security researcher myself, I think it’s actually easier to audit code with less knowledge of how the design is supposed to work, up to a point, because it leaves my mind more open. Rather than making assumptions about how different pieces are supposed to fit together, I have to look it up, and as part of looking it up I might find that the author’s assumptions were subtly wrong… For this reason, it’s really hard to audit your own code, at least in my experience. I mean, you can definitely keep reviewing it, building more and more assurance that it’s correct, but if your codebase is large enough, there may well be ‘that one thing’ you just never thought of.

I’m not actually sure how frequent iterator invalidation is as a source of vulnerabilities; I don’t think I’ve ever found one of that type myself. However, use-after-frees in general (of which iterator invalidation is a special case) are very common, usually with raw pointers. In theory you can prevent many use-after-frees by eschewing raw pointers altogether in favor of shared_ptr, but nobody actually does that – that’s important, because there’s a big difference between something being theoretically possible in a language and it being done in practice. (After all, modern C++ recommendations generally prefer unique_ptr or nothing, not shared_ptr!). And even if you do that, you can’t make the `this` pointer anything but raw, and same for the implicit raw pointer behind accesses to captured-by-reference variables in lambdas.

You can definitely greatly reduce the prevalence of vulnerabilities with both best practices for memory handling and just general code quality (that helps a lot). But if you can actually do that well enough - at scale - to get to no “major safety difference”, well, I haven’t seen the evidence for it, in the form of large frequently-targeted codebases with ‘zero memory safety bugs’ records. Maybe it’s just that C++’s backwards compatibility encourages people to build on old codebases rather than start new ones. Maybe. It’s certainly part of the story. But for now, I’m pretty sure it’s not the whole story.

> STLs can enable iterator validation in a special debug mode.

And they do actually, in the MSVC debug mode and with libstdc++'s -D_GLIBCXX_DEBUG. But nobody ever wants to use them.

C compilers must assume that a call through a function pointer (that is, a function passed as an argument, or stored as a member of an object) may write to any global variable.

C compilers must also assume that any two pointers to the same type may alias (refer to the same object). The programmer can assert to the compiler that a pointer does not alias any others used in the same scope by declaring it with the `restrict` keyword.

For most functions this won't have much effect on the generated code. Writing equivalent functions to the ones in the swift-evolution doc in C, both with and without `restrict` everywhere possible, it looks like `restrict` only has an effect on the generated code for `increaseByGlobal`: https://godbolt.org/g/W8s3BA

> One of my biggest questions as a reader is how a language like C handles these cases that Swift can't handle without these guarantees.

C doesn't address these issues at all as far as I know.

This is the first step toward adopting more Rust-like memory safety, but not at the expense of productivity.

Basically Swift will keep using reference counting as its GC algorithm, but for high-performance situations it will be possible to have somewhat more fine-grained control over ownership.

However, they want to avoid any design that might result in a "fighting with the borrow checker" feeling.

Some info from WWDC 2017:

https://developer.apple.com/videos/play/wwdc2017/402/

There is also a transcript.

Here's the 'Ownership Manifesto' that tries to clarify some of the differences between the ownership models of Rust and Swift [1]. The main point raised in that document is how 'shared values' are being implemented in Swift in a less strict way compared to Rust.

[1] https://github.com/apple/swift/blob/master/docs/OwnershipMan...