Advanced Concurrency in Swift with HoneyBee

Handling concurrency in Swift can cause headaches and pyramids of doom. HoneyBee is a futures/promises library that makes concurrent programming easy, expressive, and safe.

Join Toptal Swift Developer Alex Lynch in exploring the performance and readability advantage of using this library.

Authors are vetted experts in their fields and write on topics in which they have demonstrated experience. All of our content is peer reviewed and validated by Toptal experts in the same field.

Alex Lynch

Verified Expert in Engineering

11 Years of Experience

Alex is an expert iOS and full-stack developer with over 11 years of experience in iOS and 20 years developing applications.

Expertise

Swift iOS

Designing, testing, and maintaining concurrent algorithms in Swift is hard, and getting the details right is critical to the success of your app. A concurrent algorithm (the practice is also called parallel programming) is an algorithm designed to perform multiple (perhaps many) operations at the same time, taking advantage of more hardware resources to reduce overall execution time.

On Apple’s platforms, the traditional way to write concurrent algorithms is NSOperation. The design of NSOperation invites the programmer to subdivide a concurrent algorithm into individual long-running, asynchronous tasks. Each task would be defined in its own subclass of NSOperation, and instances of those classes would be combined via an object-oriented API to create a partial order of tasks at runtime. For years, this was the state of the art for designing concurrent algorithms on Apple’s platforms.

In 2009, Apple introduced Grand Central Dispatch (GCD) with Mac OS X 10.6 as a dramatic step forward in the expression of concurrent operations. GCD, along with blocks, the new language feature that accompanied and powered it, provided a way to compactly describe an asynchronous response handler immediately after the initiating async request. No longer were programmers encouraged to spread the definition of concurrent tasks across multiple files in numerous NSOperation subclasses. Now, an entire concurrent algorithm could feasibly be written within a single method. This increase in expressiveness and type safety was a significant conceptual shift forward. An algorithm typical of this way of writing might look like the following:

func processImageData(completion: @escaping (_ result: Image?, _ error: Error?) -> Void) {
  loadWebResource("dataprofile.txt") { (dataResource, error) in
    guard let dataResource = dataResource else {
      completion(nil, error)
      return
    }
    loadWebResource("imagedata.dat") { (imageResource, error) in
      guard let imageResource = imageResource else {
        completion(nil, error)
        return
      }
      decodeImage(dataResource, imageResource) { (imageTmp, error) in
        guard let imageTmp = imageTmp else {
          completion(nil, error)
          return
        }
        dewarpAndCleanupImage(imageTmp) { (imageResult, error) in
          guard let imageResult = imageResult else {
            completion(nil, error)
            return
          }
          completion(imageResult, nil)
        }
      }
    }
  }
}

Let’s break this algorithm down a little bit. The function processImageData is an asynchronous function that makes four asynchronous calls of its own to complete its work. The four async invocations are nested one inside the other in the way that is most natural to block-based async handling. The result blocks each have an optional Error parameter and all but one contain an additional optional parameter signifying the result of the async operation.

The shape of the above code block probably appears familiar to most Swift developers. But what’s wrong with this approach? The following list of pain points will probably be equally familiar.

1. This “pyramid of doom” shape of nested code blocks can quickly become unwieldy. What happens if we add two more async operations? Four? What about conditional operations? How about retry behavior or protections for resource limits? Real world code is never as clean and simple as examples in blog posts. The “pyramid of doom” effect can easily result in code that is hard to read, hard to maintain, and prone to bugs.
2. The attempt at error handling in the above example, although Swifty, is in fact incomplete. The programmer has assumed that the two-parameter, Objective-C style async callback blocks will always provide one of the two parameters; they will never both be nil at the same time. This is not a safe assumption. Concurrent algorithms are renowned for being hard to write and debug, and unfounded assumptions are part of the reason. Complete and correct error handling is an inescapable necessity for any concurrent algorithm that intends to operate in the real world.
3. Taking this thought even further, perhaps the programmer who wrote the called async functions was not as principled as you. What if there are conditions under which the called functions fail to call back? Or call back more than once? What happens to the correctness of processImageData under these circumstances? Pros don’t take chances. Mission-critical functions need to be correct even when they rely on functions written by third parties.
4. Perhaps most compelling, the considered async algorithm is suboptimally constructed. The first two async operations are both downloads of remote resources. Even though they have no interdependency, the above algorithm executes the downloads sequentially and not in parallel. The reasons for this are obvious; the nested block syntax encourages such wastefulness. Competitive markets don’t tolerate needless lagginess. If your app doesn’t perform its asynchronous operations as quickly as possible, another app will.

How can we do better? HoneyBee is a futures/promises library that makes Swift concurrent programming easy, expressive, and safe. Let’s rewrite the above async algorithm with HoneyBee and examine the result:

func processImageData(completion: @escaping (_ result: Image?, _ error: Error?) -> Void) {
  HoneyBee.start()
    .setErrorHandler { completion(nil, $0) }
    .branch { stem in
      stem.chain(loadWebResource =<< "dataprofile.txt")
       +
       stem.chain(loadWebResource =<< "imagedata.dat")
    }
    .chain(decodeImage)
    .chain(dewarpAndCleanupImage)
    .chain { completion($0, nil) }
}

The first line of this implementation starts a new HoneyBee recipe. The second line establishes the default error handler. Error handling is not optional in HoneyBee recipes. If something can go wrong, the algorithm must handle it. The third line opens a branch which allows for parallel execution. The two chains of loadWebResource will execute in parallel and their results will be combined (line 5). The combined values of the two loaded resources are forwarded to decodeImage and so on down the chain until completion is invoked.

Let’s walk through the above list of pain points and see how HoneyBee has improved this code. Maintaining this function is now significantly easier. The HoneyBee recipe looks like the algorithm that it expresses. The code is readable, understandable, and quickly modifiable. HoneyBee’s design ensures that any mis-ordering of instructions results in a compile-time error, not a runtime error. The function is now much less susceptible to bugs and human error.

All possible runtime errors have been fully handled. Every function signature that HoneyBee supports (there are 38 of them) is assured to be fully handled. In our example, the Objective-C style two-parameter callback will either produce a non-nil error which will be routed to the error handler, or it will produce a non-nil value which will progress down the chain, or else if both values are nil HoneyBee will generate an error explaining that the function callback is not fulfilling its contract.
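The three outcomes described above can be illustrated in plain Swift. This is a minimal sketch and not HoneyBee's actual implementation; the names normalize and ContractViolation are hypothetical and introduced only for this example.

```swift
/// Hypothetical error used when a callback supplies neither a value nor an error.
struct ContractViolation: Error {
  let message: String
}

/// Collapses an Objective-C style `(value?, error?)` pair into a `Result`.
/// An error takes precedence; a missing value without an error is reported
/// as a contract violation instead of being passed along silently.
func normalize<T>(_ value: T?, _ error: Error?) -> Result<T, Error> {
  if let error = error {
    return .failure(error)
  }
  if let value = value {
    return .success(value)
  }
  return .failure(ContractViolation(message: "callback provided neither a value nor an error"))
}
```

A callback body can then switch over normalize(value, error) and never has to consider the case in which both parameters are nil.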

HoneyBee also handles contractual correctness for the number of times function callbacks are invoked. If a function fails to invoke its callback, HoneyBee produces a descriptive failure. If the function invokes its callback more than once, HoneyBee will suppress the ancillary invocations and log warnings. Both of these fault responses (and others) can be customized for the programmer’s individual needs.
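To make the "more than once" half of this contract concrete, here is a minimal sketch of a guard that forwards only the first invocation of a callback. It is an illustration in plain Swift, not HoneyBee's code; callOnce and its onExtraCall hook are hypothetical names. Detecting a callback that is never invoked needs a different mechanism (for example a timeout) and is not covered here.

```swift
import Foundation

/// Wraps a one-argument callback so that only its first invocation is forwarded.
/// `onExtraCall` is a hypothetical hook for logging surplus invocations.
func callOnce<T>(_ callback: @escaping (T) -> Void,
                 onExtraCall: @escaping () -> Void = {}) -> (T) -> Void {
  let lock = NSLock()
  var hasFired = false
  return { value in
    // Decide under the lock whether this is the first call.
    lock.lock()
    let isFirstCall = !hasFired
    hasFired = true
    lock.unlock()

    if isFirstCall {
      callback(value)
    } else {
      onExtraCall()
    }
  }
}
```

Passing callOnce(completion) to a third-party async function instead of completion itself means a misbehaving callee can trigger the downstream work at most once.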

Hopefully, it should already be apparent that this form of processImageData properly parallelizes the resource downloads to provide optimum performance. One of HoneyBee’s strongest design goals is that the recipe should look like the algorithm that it expresses.
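For comparison, the same "start both, continue when both are done" shape can be sketched with plain GCD. This is only an illustration of the idea behind the branch, not how HoneyBee implements it. loadResource is a hypothetical stand-in for loadWebResource, and PairBox and loadBoth are names invented for this example.

```swift
import Foundation

/// Holds the two results. All access is confined to `resultQueue` in `loadBoth`.
final class PairBox: @unchecked Sendable {
  var first = ""
  var second = ""
}

/// Hypothetical stand-in for an async loader; it answers on a background queue.
func loadResource(_ name: String, completion: @escaping @Sendable (String) -> Void) {
  DispatchQueue.global().async {
    completion("contents of \(name)")
  }
}

/// Starts both loads at once and calls `completion` after both have answered.
func loadBoth(_ first: String, _ second: String,
              completion: @escaping @Sendable (String, String) -> Void) {
  let group = DispatchGroup()
  let resultQueue = DispatchQueue(label: "loadBoth.results") // serial queue
  let box = PairBox()

  group.enter()
  loadResource(first) { value in
    resultQueue.async {
      box.first = value
      group.leave()
    }
  }

  group.enter()
  loadResource(second) { value in
    resultQueue.async {
      box.second = value
      group.leave()
    }
  }

  // Runs once every enter() has been matched by a leave().
  group.notify(queue: resultQueue) {
    completion(box.first, box.second)
  }
}
```

Even this small join needs a group, a queue to protect the shared results, and matching enter/leave calls, which is the bookkeeping that the two-line branch in the recipe replaces.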

Much better. Right? But HoneyBee has much more to offer.

Be warned: The next case study is not for the faint of heart. Consider the following problem description: Your mobile app uses CoreData to persist its state. You have an NSManagedObject model called Media, which represents a media asset uploaded to your back-end server. The user is to be allowed to select dozens of media items at once and upload them in a batch to the backend system. The media are first represented via a reference String, which must be converted to a Media object. Fortunately, your app already contains a helper method that does just that:

func export(_ mediaRef: String, completion: @escaping (Media?, Error?) -> Void) {
  // transcoding stuff
  completion(Media(context: managedObjectContext), nil)
}

After the media reference is converted to a Media object, you must upload the media item to the back-end. Again you have a helper function ready to do the network stuff.

func upload(_ media: Media, completion: @escaping (Error?) -> Void) {
  // network stuff
  completion(nil)
}

Because the user is allowed to select dozens of media items at once, the UX designer has specified a fairly robust amount of feedback on upload progress. The requirements have been distilled into the following four functions:

/// Called if anything goes wrong in the upload
func errorHandler(_ error: Error) {
  // do the right thing
}

/// Called once per mediaRef, after either a successful or unsuccessful upload
func singleUploadCompletion(_ mediaRef: String) {
  // update a progress indicator
}

/// Called once per successful upload
func singleUploadSuccess(_ media: Media) {
  // do celebratory things
}

/// Called if the entire batch was considered to be uploaded successfully.
func totalProcessSuccess() {
  // declare victory
}

However, because your app sources media references that are sometimes expired, the business managers have decided to send the user a “success” message if at least half of the uploads are successful. That is to say, the concurrent process should declare victory, and call totalProcessSuccess, if no more than half of the attempted uploads fail. This is the specification handed to you as the developer. But as an experienced programmer, you realize that there are more requirements that must be applied.

Of course, Business wants the batch upload to happen as quickly as possible, so serial uploading is out of the question. The uploads must be performed in parallel.

But not too much. If you just indiscriminately async the entire batch, the dozens of concurrent uploads will flood the mobile NIC (network interface card), and the uploads will actually proceed slower than serially, not faster.

Mobile network connections are not considered to be stable. Even short transactions might fail due only to changes in network connectivity. In order to truly declare that an upload has failed, we’ll need to retry the upload at least once.

The retry policy should not include the export operation because it is not subject to transient failures.

The export process is compute-bound and therefore must be performed off of the main thread.

Because the export is compute-bound, it should have a smaller number of concurrent instances than the rest of the upload process to avoid thrashing the processor.

The four callback functions described above all update the UI, and so must all be called on the main thread.

Media is an NSManagedObject, which comes from an NSManagedObjectContext and has its own threading requirements which must be respected.

Does this problem specification seem a bit obscure? Don’t be surprised if you find problems like this lurking in your future. I encountered one like this in my own work. Let’s first try to solve this problem with traditional tools. Buckle up, this will not be pretty.

/// An enum describing specific problems that the algorithm might encounter. 
enum UploadingError : Error {
  case invalidResponse
  case tooManyFailures
}

/// A semaphore to prevent flooding the NIC
let outerLimit = DispatchSemaphore(value: 4)
/// A semaphore to prevent thrashing the processor
let exportLimit = DispatchSemaphore(value: 1)
/// The number of times to retry the upload if it fails
let uploadRetries = 1
/// Dispatch group to keep track of when the entire process is finished
let fullProcessDispatchGroup = DispatchGroup()
/// How many of the uploads fully completed. 
var uploadSuccesses = 0

// this notify block is called when the full process has completed.
fullProcessDispatchGroup.notify(queue: DispatchQueue.main) {
  let successRate = Float(uploadSuccesses) / Float(mediaReferences.count)
  if successRate >= 0.5 {
    totalProcessSuccess()
  } else {
    errorHandler(UploadingError.tooManyFailures)
  }
}

// start in the background
DispatchQueue.global().async {
  for mediaRef in mediaReferences {
    // alert the group that we're starting a process
    fullProcessDispatchGroup.enter()
    // wait until it's safe to start uploading
    outerLimit.wait()
    
    /// common cleanup operations needed later
    func finalizeMediaRef() {
      singleUploadCompletion(mediaRef)
      fullProcessDispatchGroup.leave()
      outerLimit.signal()
    }
    
    // wait until it's safe to start exporting
    exportLimit.wait()
    export(mediaRef) { (media, error) in
      // allow another export to begin
      exportLimit.signal() 
      if let error = error {
        DispatchQueue.main.async {
          errorHandler(error)
          finalizeMediaRef()
        }
      } else {
        guard let media = media else {
          DispatchQueue.main.async {
            errorHandler(UploadingError.invalidResponse)
            finalizeMediaRef()
          }
          return
        }
        // the export was successful
        
        var uploadAttempts = 0
        /// define the upload process and its retry behavior
        func doUpload() {
          // respect Media's threading requirements
          managedObjectContext.perform {
            upload(media) { error in
              if let error = error {
                if uploadAttempts < uploadRetries {
                  uploadAttempts += 1
                  doUpload() // retry
                } else {
                  DispatchQueue.main.async {
                    // too many upload failures
                    errorHandler(error)
                    finalizeMediaRef()
                  }
                }
              } else {
                DispatchQueue.main.async {
                  uploadSuccesses += 1
                  singleUploadSuccess(media)
                  finalizeMediaRef()
                }
              }
            }
          }
        }
        // kick off the first upload
        doUpload()
      }
    }
  }
}

Whoa! Without comments, that’s about 75 lines. Did you follow the reasoning all the way through? How would you feel if you encountered this monster your first week on a new job? Would you feel ready to maintain it, or modify it? Would you know if it contained errors? Does it contain errors?

Now, consider the HoneyBee alternative:

HoneyBee.start(on: DispatchQueue.main)
  .setErrorHandler(errorHandler)
  .insert(mediaReferences)
  .setBlockPerformer(DispatchQueue.global())
  .each(limit: 4, acceptableFailure: .ratio(0.5)) { elem in
    elem.finally { link in
      link.setBlockPerformer(DispatchQueue.main)
        .chain(singleUploadCompletion)
    }
    .limit(1) { link in
      link.chain(export)
    }
    .setBlockPerformer(managedObjectContext)
    .retry(1) { link in
      link.chain(upload) // subject to transient failure
    }
    .setBlockPerformer(DispatchQueue.main)
    .chain(singleUploadSuccess)
  }
  .setBlockPerformer(DispatchQueue.main)
  .drop()
  .chain(totalProcessSuccess)

How does this form strike you? Let’s work through it piece by piece. On the first line, we start the HoneyBee recipe, beginning on the main thread. By beginning on the main thread we ensure that all errors will be passed to errorHandler (line 2) on the main thread. Line 3 inserts the mediaReferences array into the process chain. Next, we switch to the global background queue in preparation for some parallelism. On line 5, we begin a parallel iteration over each of the mediaReferences. We limit this parallelism to a max of 4 concurrent operations. We also declare that the full iteration will be considered successful if at least half of the subchains succeed (do not error).

Line 6 declares a finally link which will be called whether the subchain below succeeds or fails. On the finally link, we switch to the main thread (line 7) and call singleUploadCompletion (line 8). On line 10, we set a maximum parallelization of 1 (single execution) around the export operation (line 11). Line 13 switches to the private queue owned by our managedObjectContext instance. Line 14 declares a single retry attempt for the upload operation (line 15). Line 17 switches to the main thread once again and line 18 invokes singleUploadSuccess.

By the time line 20 would be executed, all of the parallel iterations have completed. If the iteration as a whole was successful, then line 20 switches to the main queue one last time (recall the each was run on the background queue), line 21 drops the inbound value (still mediaReferences), and line 22 invokes totalProcessSuccess.
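To show what the single retry step stands for, here is a minimal sketch of a retry helper in plain Swift. It is not HoneyBee's implementation. withRetry is a hypothetical name, and the operation is assumed to report its outcome through an (Error?) -> Void callback, like the upload function above.

```swift
/// Runs `operation`, and re-runs it up to `retries` more times while it keeps
/// reporting an error. `completion` receives nil on success, or the last error.
func withRetry(_ retries: Int,
               operation: @escaping (@escaping (Error?) -> Void) -> Void,
               completion: @escaping (Error?) -> Void) {
  operation { error in
    guard let error = error else {
      completion(nil)      // success, stop here
      return
    }
    if retries > 0 {
      // Try again with one fewer retry remaining.
      withRetry(retries - 1, operation: operation, completion: completion)
    } else {
      completion(error)    // out of attempts, surface the last error
    }
  }
}
```

With this helper, withRetry(1, operation: { done in upload(media, completion: done) }, completion: handler) would attempt the upload at most twice. Keeping the export call outside the operation closure is what excludes it from the retry policy.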

The HoneyBee form is clearer, cleaner, and easier to read, not to mention easier to maintain. What would happen to the long form of this algorithm if the loop was required to reintegrate the Media objects into an array like a map function? After you had made the change, how confident would you be that all of the algorithm’s requirements were still being met? In the HoneyBee form, this change would be to replace each with map to employ a parallel map function. (Yes, it has reduce too.)
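To give a sense of what a parallel map involves when written by hand, here is a minimal sketch built on GCD's concurrentPerform. It is an illustration only and differs from HoneyBee's map: parallelMap is a hypothetical name, it blocks the calling thread until all elements are processed, and it has no concurrency limit or error handling.

```swift
import Foundation

/// Applies `transform` to every element in parallel and returns the results
/// in the original order. Blocks the calling thread until all work is done.
func parallelMap<T, U>(_ items: [T], _ transform: (T) -> U) -> [U] {
  var results = [U?](repeating: nil, count: items.count)
  let lock = NSLock()
  DispatchQueue.concurrentPerform(iterations: items.count) { index in
    let value = transform(items[index])   // runs concurrently
    lock.lock()
    results[index] = value                // writes are serialized by the lock
    lock.unlock()
  }
  // Every slot has been filled once concurrentPerform returns.
  return results.map { $0! }
}
```

Writing each result into the slot that matches its input index is what preserves ordering, regardless of which iteration finishes first.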

HoneyBee is a powerful futures library for Swift that makes writing asynchronous and concurrent algorithms easier, safer and more expressive. In this article, we’ve seen how HoneyBee can make your algorithms easier to maintain, more correct, and faster. HoneyBee also has support for other key async paradigms like retry support, multiple error handlers, resource guarding, and collection processing (async forms of map, filter, and reduce). For a full list of features refer to the website. To learn more or ask questions see the brand new community forums.

Appendix: Ensuring Contractual Correctness of Async Functions

Ensuring the contractual correctness of functions is a fundamental tenet of computer science. So much so that virtually all modern compilers have checks to ensure that a function which declares to return a value returns exactly once. Returning zero times or more than once is treated as an error and appropriately prevents a full compilation.

But this compiler assistance usually does not apply to asynchronous functions. Consider the following (playful) example:

func generateIcecream(from int: Int, completion: (String) -> Void) {
  if int > 5 {
    if int < 20 {
      completion("Chocolate")
    } else if int < 10 {
      completion("Strawberry")
    }
    completion("Pistachio")
  } else if int < 2 {
    completion("Vanilla")
  }
}

The generateIcecream function accepts an Int and asynchronously returns a String. The Swift compiler happily accepts the above form as correct, even though it contains some obvious problems. Given certain inputs, this function might call completion zero, one, or two times. Programmers who have worked with async functions often will recall examples of this problem in their own work. What can we do? Certainly, we could refactor the code to be neater (a switch with range cases would work here). But sometimes functional complexity is hard to reduce. Wouldn’t it be better if the compiler could assist us in verifying correctness just like it does with regularly returning functions?

It turns out there is a way. Observe the following Swifty incantation:

func generateIcecream(from int: Int, completion: (String) -> Void) {
  let finalResult: String
  defer { completion(finalResult) }
  let completion: Void = Void()
  defer { completion }

  if int > 5 {
    if int < 20 {
      completion("Chocolate")
    } else if int < 10 {
      completion("Strawberry")
    } // else
    completion("Pistachio")
  } else if int < 2 {
    completion("Vanilla")
  }
}

The four lines inserted at the top of this function force the compiler to verify that the completion callback is invoked exactly once, which means that this function no longer compiles. What’s going on? In the first line, we declare but do not initialize the result which we ultimately want this function to produce. By leaving it undefined we ensure that it must be assigned to once before it can be used, and by declaring it let we ensure that it can never be assigned to twice. The second line is a defer which will execute as the final action of this function. It invokes the completion block with finalResult, after it has been assigned to by the rest of the function. Line 3 creates a new constant called completion which shadows the callback parameter. The new completion is of type Void which declares no public API. This line ensures that any use of completion after this line will be a compiler error. The defer on line 2 is the only permitted use of the completion block. Line 4 removes a compiler warning that would otherwise be present about the new completion constant being unused.

So we’ve successfully forced the Swift compiler to report that this asynchronous function is not fulfilling its contract. Let’s walk through the steps to make it correct. First, let’s replace all direct access to the callback with an assignment to finalResult.

func generateIcecream(from int: Int, completion: (String) -> Void) {
  let finalResult: String
  defer { completion(finalResult) }
  let completion: Void = Void()
  defer { completion }
 
  if int > 5 {
    if int < 20 {
      finalResult = "Chocolate"
    } else if int < 10 {
      finalResult =  "Strawberry"
    } // else
    finalResult = "Pistachio"
  } else if int < 2 {
    finalResult = "Vanilla"
  }
}

Now the compiler is reporting two problems:

error: AsyncCorrectness.playground:1:8: error: constant 'finalResult' used before being initialized
        defer { completion(finalResult) }
              ^
 
error: AsyncCorrectness.playground:11:3: error: immutable value 'finalResult' may only be initialized once
                finalResult = "Pistachio"

As expected, the function has a pathway where finalResult is assigned zero times and also a pathway where it is assigned more than once. We resolve these problems as follows:

func generateIcecream(from int: Int, completion: (String) -> Void) {
  let finalResult: String
  defer { completion(finalResult) }
  let completion: Void = Void()
  defer { completion }
 
  if int > 5 {
    if int < 20 {
      finalResult = "Chocolate"
    } else if int < 10 {
      finalResult =  "Strawberry"
    } else {
      finalResult = "Pistachio"
    }
  } else if int < 2 {
    finalResult = "Vanilla"
  } else {
    finalResult = "Neapolitan"
  }
}

The “Pistachio” has been moved to a proper else clause and we realize that we failed to cover the general case—which of course is “Neapolitan.”
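The neater refactoring mentioned earlier, a switch with range cases, might look like the sketch below. It assumes the branching of the corrected function above is the intended behavior; the helper name flavor(for:) is hypothetical. Writing the ranges out also exposes that "Strawberry" can never be produced, because every value below 10 that is greater than 5 is already covered by the "Chocolate" case.

```swift
/// Synchronous helper holding the branching logic. A switch must be
/// exhaustive and each path returns exactly once, so the compiler checks it.
func flavor(for int: Int) -> String {
  switch int {
  case ..<2:
    return "Vanilla"
  case 6..<20:
    return "Chocolate"
  case 20...:
    return "Pistachio"
  default:            // 2 through 5
    return "Neapolitan"
  }
}

/// The completion block is now invoked in exactly one place.
func generateIcecream(from int: Int, completion: (String) -> Void) {
  completion(flavor(for: int))
}
```

Moving the decision into a regularly returning function lets the ordinary return checks do the work, at the cost of requiring that the logic can be separated from the callback.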

The patterns just described can easily be adjusted to return optional values, optional errors, or complex types like the common Result enum. By coercing the compiler to verify that callbacks are invoked exactly once, we can assert the correctness and completeness of asynchronous functions.

Understanding the basics

What is concurrency in programming?

A concurrent algorithm (the practice is also called parallel programming) is an algorithm designed to perform multiple (perhaps many) operations at the same time, taking advantage of more hardware resources to reduce overall execution time.

What are the problems with concurrency?

The “pyramid of doom” shape of nested code blocks can quickly become unwieldy, asynchronous error handling can be unintuitive or incomplete, issues with third-party integrations are exacerbated, and code that was meant to improve performance may instead be wasteful and slow.

Alex Lynch

Atlanta, GA, United States

Member since August 27, 2018
