Audio Manipulation Using AVAudioEngine

Audio manipulation in iOS can be very daunting. The documentation for low-level Core Audio doesn’t seem to be up to par. There are, of course, frameworks out there that can make things a little less painful when dealing with low-level Core Audio. However, those may not fit your needs, or they may be excessive for what you’re doing. So let’s look at something a little simpler that is still quite powerful: AVAudioEngine.

AVAudioEngine is not a new API, but it’s one that doesn’t seem to have gained a whole lot of exposure. It was introduced by Apple at WWDC 2014. It’s part of AVFoundation, a framework used for playing, recording, and editing media. AVAudioEngine simplifies low-latency, real-time audio. It can’t do everything that Core Audio can, but for playing, recording, mixing, effects, and even working with MIDI and samplers, it can be quite powerful.

The AVAudioEngine class manages a graph of audio nodes. The engine is used to connect the nodes into active chains. The graph can be reconfigured, and nodes can be created and attached, as needed. There are three types of nodes – source, processing, and destination. These are pretty self-explanatory, but just for the sake of clarity: source nodes are the player and the microphone, processing nodes are mixers and effects, and destination nodes are the speaker and headphones. The engine creates three nodes implicitly – an input node, an output node, and the main mixer node. Originally each node could only have a single output; however, thanks to

AVAudioConnectionPoint(node: AVAudioNode, bus: AVAudioNodeBus),

multiple outputs are now possible. Connections between nodes are made via busses, each of which has an audio format. One of the gotchas with AVAudioEngine seems to be these formats. If the formats are not compatible, you will most likely end up with a broken graph. This can be quite frustrating, as the error messages are not very helpful in some cases. If you need to change formats, make sure you use a mixer node to avoid any problems.
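To illustrate the format gotcha, here is a minimal sketch (the engine and node names here are illustrative, not from the walkthrough below): a source whose format differs from the output’s can be routed through a mixer node, which lets the engine convert between the two formats instead of breaking the graph.

  let engine = AVAudioEngine()
  let player = AVAudioPlayerNode()
  let converterMixer = AVAudioMixerNode()

  engine.attach(player)
  engine.attach(converterMixer)

  // Suppose the player's file is 44.1 kHz stereo...
  let fileFormat = AVAudioFormat(standardFormatWithSampleRate: 44100, channels: 2)

  // ...while the output runs at the hardware rate. Connecting the player to the
  // mixer using the file's format, and the mixer to the main mixer using the
  // output format, lets the mixer node perform the sample rate conversion.
  engine.connect(player, to: converterMixer, format: fileFormat)
  engine.connect(converterMixer, to: engine.mainMixerNode,
                 format: engine.outputNode.outputFormat(forBus: 0))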

So let’s see what it looks like in action. Your basic steps are:

1) Set up an audio session

do {
    // Here recordingSession is just a shared instance of AVAudioSession

    try recordingSession.setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers, .defaultToSpeaker]) // There are several options here - choose what best suits your needs
    try recordingSession.setActive(true)

    // I suggest adding notifications here for route and configuration changes
} catch {
    // Handle the error
}
2) Create the engine

  let audioEngine = AVAudioEngine()

3) Create the nodes

  let audioMixer = AVAudioMixerNode()
  let micMixer = AVAudioMixerNode()
  let reverb = AVAudioUnitReverb()
  let echo = AVAudioUnitDelay()
  let audioPlayerNode = AVAudioPlayerNode()

4) Attach the nodes

  audioEngine.attach(audioMixer)
  audioEngine.attach(micMixer)
  audioEngine.attach(reverb)
  audioEngine.attach(echo)
  audioEngine.attach(audioPlayerNode)

5) Connect the nodes

 // Sound effect connections

  audioEngine.connect(audioMixer, to: audioEngine.mainMixerNode, format: audioFormat)
  audioEngine.connect(echo, to: audioMixer, fromBus: 0, toBus: 0, format: audioFormat)
  audioEngine.connect(reverb, to: echo, fromBus: 0, toBus: 0, format: audioFormat)
  audioEngine.connect(micMixer, to: reverb, format: audioFormat)

  // Here we're making multiple output connections from the player node 1) to the main mixer and 2) to another mixer node we're using for adding effects.

    let playerConnectionPoints = [
        AVAudioConnectionPoint(node: audioEngine.mainMixerNode, bus: 0),
        AVAudioConnectionPoint(node: audioMixer, bus: 1)
    ]

    audioEngine.connect(audioPlayerNode, to: playerConnectionPoints, fromBus: 0, format: audioFormat)

    // Finally making the connection for the mic input

    guard let micInput = audioEngine.inputNode else { return }

    let micFormat = micInput.inputFormat(forBus: 0)
    audioEngine.connect(micInput, to: micMixer, format: micFormat)

6) Prepare the files for playing/recording

    do {
        // Here trackURL is our audio track
        if let trackURL = trackURL {
            audioPlayerFile = try AVAudioFile(forReading: trackURL)
        }
    } catch {
        // Handle the error
    }

    // Schedule track audio immediately if the read is successful

    guard let audioPlayerFile = audioPlayerFile else { return }

    audioPlayerNode.scheduleFile(audioPlayerFile, at: nil, completionHandler: nil)

    audioURL = URL(fileURLWithPath: <YOUR_OUTPUT_URL>)

    if let audioURL = audioURL {
        do {
            self.recordedOutputFile = try AVAudioFile(forWriting: audioURL, settings: audioMixer.outputFormat(forBus: 0).settings)
        } catch {
            // Handle the error
        }
    }
7) Start the engine

    // Prepare and start audioEngine
    audioEngine.prepare()
    do {
        try audioEngine.start()
    } catch {
        // Handle the error
    }
8) Add tap to record the output

    guard let recordedOutputFile = recordedOutputFile,
        let audioPlayerFile = audioPlayerFile else { return }

    let tapFormat = audioMixer.outputFormat(forBus: 0)

    // The data is collected from the buffer using a block
    audioMixer.installTap(onBus: 0, bufferSize: AVAudioFrameCount(recordedOutputFile.length), format: tapFormat) { buffer, _ in
        do {
            if recordedOutputFile.length < audioPlayerFile.length {
                try self.recordedOutputFile?.write(from: buffer)
            } else {
                self.audioMixer.removeTap(onBus: 0)
                // Handle your UI changes here
            }
        } catch {
            // Handle the error
        }
    }
That’s it – you’re playing, processing, and recording! There is a whole lot more that can be done with AVAudioEngine, and I suggest diving in. The documentation isn’t always the greatest, and some trial and error is necessary, but it’s a little friendlier to use than low-level Core Audio!


Christina McIntyre