This post will dive into a PR I made to the Swift Core Libs project that improves the performance of NSIndexSet.mutableCopy. Instruments is used to profile performance. If you are not familiar with Instruments (part of Xcode) check out this Instruments Overview.

Previous mutableCopy Implementation

NSIndexSet.mutableCopy is responsible for taking an existing NSIndexSet and returning a mutable copy. The existing Swift Foundation implementation calls add(in:) on the mutable copy for each internal range of the NSIndexSet:

open func mutableCopy(with zone: NSZone? = nil) -> Any {
    let set = NSMutableIndexSet()
    enumerateWithRanges(options: []) { (range, _) in 
        set.add(in: range)
    }
    return set
}

Note: there are often many considerations to evaluate when looking at performance improvements. For example, there is an implementation difference of NSIndexSet on Darwin as explained by @parkera:

I think this [PR] has the same problem as we were discussion in another PR (which I'm having trouble locating) about subclassing correctness. On Darwin, NSMutableSet is abstract, so it doesn't even have the ivars in the first place.
That said, we don't have a way yet in Swift to implement our factory method pattern, so this [PR] is probably a reasonable way to preserve the performance we need.

Identifying The Performance Improvement

The following code performed slower when built using Swift Foundation versus the Swift standard library, prompting further inspection of NSIndexSet.mutableCopy:

let a = IndexSet()
var b = a
b.insert(1)

In Swift, structs are copied on write. In the example, a copy is triggered on line 3 when b is written to. Perserving correctly, the copy ensures a is empty and b contains 1 at the end of script execution.

IndexSet is a struct backed by the NSIndexSet reference type (learn more about references types from Apple). Since b is a var and therefore mutable, line 3 triggers NSIndexSet.mutableCopy before the insert is called.

Diving into the performance of the copy-on-write, here is a focused portion of the Instruments profile looking at NSIndexSet.mutableCopy showing a significant amount of computation spent within enumerateWithRanges(options:):

Time Weight Symbol
1.64s 83.4% specialized NSIndexSet.enumerateWithOptions<A,B>(:range:paramType:returnType:block:)
1.27s 64.5%     specialized closure #1 in NSIndexSet.enumerateWithOptions<A,B>(:range:paramType:returnType:block:)
376.0ms 19.1%         _RandomAccessCollectionBox.index(:offsetBy:)
207.0ms 10.5%         partial apply for specialized
180.0ms 9.1%             partial apply for closure #1 in NSIndexSet.mutableCopy(with:)
179.0ms 9.61         NSMutableIndexSet.add(_:)
126.0ms 6.4% swift_release(swift::HeapObject*)
55.00ms 2.7% swift_retain(swift::HeapObject*)

Internally, an NSIndexSet uses the following variables to represent the set:

internal var _ranges = [NSRange]()
internal var _count = 0

Faster mutableCopy Implementation

Knowing that Swift contains copy-on-write mechanics, a faster mutableCopy implementation uses assignment:

open func mutableCopy(with zone: NSZone? = nil) -> Any {
    let setCopy = NSMutableIndexSet()
    setCopy._ranges = _ranges
    setCopy._count = _count
    return setCopy
}

For index sets with some number of ranges, using assignment over enumerateWithRanges(options:) has large performance impact. Copying a 100 range IndexSet is 400% faster and copying a 1,000 range IndexSet is 2,500% faster on the machine used for profiling.

This speedup occurs because the eventual and expensive add(in:) call for each range is replace with a direct copy of the entire range collection.