The Safety of Unsafe Swift

 

Swift protects you from undefined behavior by not allowing direct memory access by default. The Swift unsafe APIs help you construct code that is highly readable and only unsafe where it has to be.


Undefined Behavior

I’ll be going over the safety of unsafe Swift today, and we begin with undefined behavior. Undefined behavior is universally bad. It’s utterly bogus. Developers hate undefined behavior because the program might crash. It might work. Or it might not crash right away, but an hour later something undeniably bizarre might happen. In short, undefined behavior destroys software schedules. Swift goes to great lengths to ensure safety for your code, and it does that by not allowing, by default, direct access to memory or access to uninitialized variables.

Swift Pointers

But sometimes you need that access, such as when you’re working with an unsafe language like C, or when you need some performance or access to some low-level stuff. And Swift gives you a lot of tools to do this. Instead of just one pointer type, there are over half a dozen of them. Mutable versus immutable. Raw versus typed. Buffers, which are collections, versus single pointers, which are strideable.
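To make those distinctions concrete, here is a minimal sketch, assuming a recent Swift toolchain (Swift 5 or later). The function name `pointerTour` and its return labels are made up for illustration; the pointer types and calls come from the standard library.

```swift
// Hypothetical helper that touches one pointer type from each family.
func pointerTour() -> (typedValue: Int, byteCount: Int, sum: Int) {
    // Typed and mutable: UnsafeMutablePointer<Int> knows its pointee is an Int.
    let typed = UnsafeMutablePointer<Int>.allocate(capacity: 1)
    typed.initialize(to: 42)
    defer {
        // Manual memory management: we must clean up what we allocated.
        typed.deinitialize(count: 1)
        typed.deallocate()
    }

    // Raw and immutable: UnsafeRawPointer sees the same memory as untyped bytes.
    let raw = UnsafeRawPointer(typed)
    let reloaded = raw.load(as: Int.self)

    // Typed buffer: a pointer plus a count, usable as a Collection of Int.
    let numbers = [1, 2, 3, 4]
    let sum = numbers.withUnsafeBufferPointer { buffer in
        buffer.reduce(0, +)
    }

    // Raw buffer: a Collection of UInt8 covering a value's storage.
    var value: UInt32 = 0xDEADBEEF
    let byteCount = withUnsafeBytes(of: &value) { $0.count }

    return (reloaded, byteCount, sum)
}

let tour = pointerTour()
print(tour.typedValue, tour.byteCount, tour.sum)  // prints "42 4 10"
```

Note that the closure-based calls (`withUnsafeBufferPointer`, `withUnsafeBytes`) only hand out the pointer for the duration of the closure, which keeps the unsafe region small.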

And actually earlier this year, I produced a free tutorial on raywenderlich.com that explains these pointer types. It also gives a couple examples, a streaming data compression algorithm and a typesafe random number generator.

Dictionaries and Sets

But I wanna talk about another aspect today or another application with regard to dictionaries and sets. And these are great because they give you constant time lookup, but that all depends on having a good hash value. If you have a bad hash value, all of those guarantees go away, and you’re suddenly left with linear time lookup, and linear time lookup is just one loop away from quadratic time.
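Why a bad hash value costs you constant time lookup can be sketched with a toy model. The function `longestChain` below is hypothetical and is not how Swift's Dictionary or Set are actually implemented; it only counts how many keys end up sharing the fullest bucket, which is roughly how many entries a lookup would have to walk in the worst case.

```swift
/// Toy model of a hash table: returns the size of the fullest bucket.
/// Illustration only, not the standard library's implementation.
func longestChain(hashes: [Int], bucketCount: Int) -> Int {
    precondition(bucketCount > 0, "need at least one bucket")
    var buckets = [Int](repeating: 0, count: bucketCount)
    for hash in hashes {
        // Reinterpret as unsigned so the bucket index is never negative.
        let index = Int(UInt(bitPattern: hash) % UInt(bucketCount))
        buckets[index] += 1
    }
    return buckets.max() ?? 0
}

// A hash that spreads keys out: every key gets its own bucket.
let spread = longestChain(hashes: Array(0..<1000), bucketCount: 1024)

// A hash that returns the same value for every key: one bucket holds
// everything, so a lookup has to walk all 1000 entries.
let clumped = longestChain(hashes: [Int](repeating: 7, count: 1000),
                           bucketCount: 1024)

print(spread, clumped)  // prints "1 1000"
```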

And actually hackers have used flaws in hashing algorithms to launch denial of service attacks. So the Swift Standard Library supplies the hash value for a bunch of types, and you can leverage those in your own code.

So here’s a good example using hash value. Yay.


    struct Angle: Hashable {
        var radians: Double

        var hashValue: Int {
            return radians.hashValue
        }
    }

But what about larger types? Well, we could use an exclusive OR operation.

    struct Point: Hashable {
        var x, y: Double

        var hashValue: Int {
            return x.hashValue ^ y.hashValue
        }
    }

But if x and y happen to be the same value, this is gonna hash to zero. That means you’re gonna get lots of collisions at zero if your data set has lots of points where x equals y. So depending on your data set, this might not be a great hashing algorithm. So you could fake it, and I’ve seen this in actual code, where someone will render their object out as a string, and then call hashValue on that string.
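The collision at zero is easy to check. This is a small sketch; the type `XorPoint` and its `xorHash` property are made-up names that reproduce the XOR combination as a plain property, so the snippet does not depend on how your Swift version spells the Hashable requirement.

```swift
// Hypothetical stand-in for Point that exposes the XOR combination directly.
struct XorPoint {
    var x, y: Double

    var xorHash: Int {
        return x.hashValue ^ y.hashValue
    }
}

// One hundred different points on the diagonal x == y.
let diagonal = (0..<100).map { XorPoint(x: Double($0), y: Double($0)) }
let distinctHashes = Set(diagonal.map { $0.xorHash })
print(distinctHashes)  // prints "[0]"

// XOR is also symmetric, so swapping the coordinates collides as well.
let mirrored = XorPoint(x: 1, y: 2).xorHash == XorPoint(x: 2, y: 1).xorHash
print(mirrored)  // prints "true"
```

Any value XORed with itself is zero, so this holds no matter what the individual `hashValue` results are.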


    struct Point: Hashable {
        var x, y: Double

        var hashValue: Int {
            return "\(x),\(y)".hashValue
        }
    }
    

While this fulfills the requirements of the Hashable protocol, it’s not great because it requires a heap allocation, and we’re trying to be fast, remember? Heap allocations are expensive. So we can do better.


    protocol HashAlgorithm {
        init()                                        // 1
        mutating func consume<S: Sequence>(bytes: S)  // 2
            where S.Iterator.Element == UInt8
        var finalValue: Int { get }                   // 3
    }

And it turns out that hashing algorithms have this basic form of initializing themselves, consuming bytes, and then spitting out a final value.

So a hash algorithm author just needs to consume bytes. And while this isn’t the friendliest code, there’s actually no real unsafe code here.


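    // FNV-1a, 64-bit variant
    struct FVN1AHash: HashAlgorithm {
        private var hash: UInt64 = 0xcbf29ce484222325  // FNV offset basis
        private let prime: UInt64 = 0x100000001b3      // FNV prime

        mutating func consume<S: Sequence>(bytes: S)
                where S.Iterator.Element == UInt8 {
            for byte in bytes {
                // XOR in the byte, then multiply; &* wraps on overflow
                hash = (hash ^ UInt64(byte)) &* prime
            }
        }

        var finalValue: Int {
            // Swift 4 and later spell this Int(truncatingIfNeeded:)
            return Int(truncatingBitPattern: hash)
        }
    }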
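A quick way to gain confidence in a hash loop like this is to check it against known values. The sketch below assumes Swift 5 or later and rewrites the same XOR-then-multiply loop as a free function, `fnv1a64`, which is a name made up for this example. The empty input must return the offset basis unchanged, and the single byte for "a" gives the published FNV-1a 64-bit result.

```swift
/// 64-bit FNV-1a over a sequence of bytes.
/// Hypothetical free-function form of the same loop, for testing.
func fnv1a64<S: Sequence>(_ bytes: S) -> UInt64 where S.Element == UInt8 {
    var hash: UInt64 = 0xcbf29ce484222325   // FNV offset basis
    let prime: UInt64 = 0x100000001b3        // FNV prime
    for byte in bytes {
        // XOR first, then multiply: that order is what makes it the "1a" variant.
        hash = (hash ^ UInt64(byte)) &* prime
    }
    return hash
}

let emptyHash = fnv1a64([UInt8]())
let letterAHash = fnv1a64("a".utf8)

print(String(emptyHash, radix: 16))    // prints "cbf29ce484222325"
print(String(letterAHash, radix: 16))  // prints "af63dc4c8601ec8c"
```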

And on the client side, hashValue just uses FVN1AHash.


    var hashValue: Int {
        var hash = FVN1AHash()
        // Double is not an Integer, so pass its bitPattern (a UInt64)
        hash.consume(x.bitPattern)
        hash.consume(y.bitPattern)
        return hash.finalValue
    }

The client just declares the hasher on the stack, consumes x and y, and returns the final value. The unsafe code that gets at the bytes of those values is hidden away in protocol extensions on the hash algorithm.


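    extension HashAlgorithm {
        // Integer is the Swift 3 protocol; Swift 4 and later call it BinaryInteger
        mutating func consume<I: Integer>(_ value: I) {
            var temp = value
            // The only unsafe call: view temp's storage as raw bytes
            withUnsafeBytes(of: &temp) { rawBufferPointer in
                consume(bytes: rawBufferPointer)
            }
        }
    }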
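Putting the pieces above together, here is a self-contained sketch that assumes Swift 5 or later rather than the Swift 3 syntax used in the talk. Several things in it are assumptions for illustration and not part of the original code: the type is spelled `FNV1aHash`, the integer overload is constrained to `FixedWidthInteger`, a `Double` overload is added that treats -0.0 like 0.0 so that equal values hash equally, and the result is exposed through a plain `fnvHash` property instead of `hashValue`.

```swift
protocol HashAlgorithm {
    init()
    mutating func consume<S: Sequence>(bytes: S) where S.Element == UInt8
    var finalValue: Int { get }
}

// Safe code only: the algorithm just sees a sequence of bytes.
struct FNV1aHash: HashAlgorithm {
    private var hash: UInt64 = 0xcbf29ce484222325
    private let prime: UInt64 = 0x100000001b3

    init() {}

    mutating func consume<S: Sequence>(bytes: S) where S.Element == UInt8 {
        for byte in bytes {
            hash = (hash ^ UInt64(byte)) &* prime
        }
    }

    var finalValue: Int {
        return Int(truncatingIfNeeded: hash)
    }
}

// The unsafe byte access is confined to this one extension.
extension HashAlgorithm {
    mutating func consume<I: FixedWidthInteger>(_ value: I) {
        var temp = value
        withUnsafeBytes(of: &temp) { rawBufferPointer in
            self.consume(bytes: rawBufferPointer)
        }
    }

    // Assumption: -0.0 == 0.0, so both are fed in as the same bit pattern.
    mutating func consume(_ value: Double) {
        consume(value == 0 ? UInt64(0) : value.bitPattern)
    }
}

// Client code: no pointers in sight.
struct Point {
    var x, y: Double

    var fnvHash: Int {
        var hash = FNV1aHash()
        hash.consume(x)
        hash.consume(y)
        return hash.finalValue
    }
}

let samePoint = Point(x: 1, y: 2).fnvHash == Point(x: 1, y: 2).fnvHash
print(samePoint)  // prints "true"
```

The bytes are consumed in the platform's native byte order, so the resulting values are only meant to be compared within one process, which is all a dictionary or set needs.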

And this is the point: while Swift gives you this unsafe access, by constructing your APIs in this kind of way you can really isolate your unsafe code behind nice, easy-to-use APIs. The material for this talk comes from a Ray Wenderlich video tutorial series that I’m working on, which will hopefully come out later this year, and I hope you have a chance to watch it. Thank you!


Ray Fix


Cofounder and lead developer at Pelfunc, Inc., a small startup in San Diego, CA. Before that he worked as a software engineer at Cognex Corporation, based in Natick, MA.