July 19, 2018 by Shahan Nercessian

iZotope and Assistive Audio Technology

With the recent release of Nectar Elements, the newest assistive audio production tool from iZotope, research engineer Shahan Nercessian overviews our evolution in assistive audio technology, why we care about it so much, and interesting avenues for its future.

This article references previous versions of Ozone. Learn about the latest Ozone and its powerful new features like Master Rebalance, Low End Focus, and improved Tonal Balance Control by clicking here.

What is assistive audio technology, and how does it work?

Since the 2016 release of Track Assistant in Neutron, iZotope has been developing assistive audio technology designed to remove the guesswork from audio production and make it more efficient. One of our major goals as a company is to find solutions to eliminate time-consuming audio production tasks for our users so they can focus on their creative visions.

Our assistive audio technology intelligently analyzes your audio and provides custom presets that are tailored to the sound you’re going for. We do this by combining years of intelligent digital signal processing (DSP) algorithm development with modern machine learning techniques to analyze your audio signal and make suggestions accordingly.

Generally speaking, our assistive audio technology consists of three pieces:

  1. High-level user preference: before running our assistive tech, we ask you to broadly characterize the sound you are going for and the amount of processing you wish to apply. This way, the assistant can get a sense of your desired aesthetic and how drastic a change you are looking to make to your audio.
  2. Machine learning: a machine learning algorithm characterizes your audio in some task-specific way (take the instrument classifier in Neutron, for example). This information allows us to line up the processing steps that your track will undergo, and potentially dial in settings for some of them.
  3. Intelligent DSP: we further analyze specific properties of your audio to set parameters of different DSP modules in your processing chain. We do all of this taking into consideration your specified user preferences. For example, we may analyze the dynamic range of your audio in certain ways to select parameter values for a compressor. The parameters we come up with should hopefully provide the desired amount of consistency for your track.
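
To make the three pieces concrete, here is a minimal sketch of how they could fit together. It is not iZotope's implementation: the `Preferences` fields, the `BASELINE_CHAINS` table, the percentile choices, and the threshold and ratio formulas are all assumptions made for illustration. The classifier itself is left out, so its output arrives as a plain `label` string.

```python
import math
from dataclasses import dataclass


@dataclass
class Preferences:
    style: str        # hypothetical aesthetic label, e.g. "modern"
    intensity: float  # 0.0 (subtle) to 1.0 (heavy processing)


# Hypothetical baseline chains, keyed by the label a classifier would return.
BASELINE_CHAINS = {
    "vocals": ["eq", "compressor", "exciter"],
    "drums": ["eq", "compressor"],
}


def frame_levels_db(samples, frame_size=1024):
    """Return the RMS level of each full frame in dBFS."""
    levels = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / frame_size)
        levels.append(20.0 * math.log10(max(rms, 1e-9)))
    return levels


def suggest_compressor(samples, prefs):
    """Pick a threshold and ratio from the spread of frame levels (assumed rule)."""
    levels = sorted(frame_levels_db(samples))
    if not levels:
        raise ValueError("need at least one full frame of audio")
    loud = levels[int(0.95 * (len(levels) - 1))]
    quiet = levels[int(0.10 * (len(levels) - 1))]
    dynamic_range = loud - quiet
    # Higher intensity pushes the threshold further below the loud passages.
    threshold = loud - prefs.intensity * dynamic_range / 2.0
    ratio = 1.0 + 3.0 * prefs.intensity
    return {"threshold_db": round(threshold, 1), "ratio": round(ratio, 2)}


def run_assistant(samples, label, prefs):
    """Combine user preferences, a classifier label, and DSP analysis."""
    chain = BASELINE_CHAINS.get(label, ["eq"])
    settings = {}
    if "compressor" in chain:
        settings["compressor"] = suggest_compressor(samples, prefs)
    return chain, settings
```

In this sketch, a track with a wide gap between its quiet and loud passages gets a lower threshold as `intensity` rises, which mirrors the idea of asking for a more drastic change to your audio.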

Who is assistive audio technology for?

Assistive audio technology can be beneficial for amateurs and professionals alike. For the audio amateur, it lowers the daunting barrier to entry in mixing and mastering, getting your tracks sounding great in a few clicks. It is also an invaluable educational resource: budding producers can learn to make informed decisions by studying the choices our assistants make on different source material. For the seasoned veteran, our assistive technology minimizes time-consuming cleanup work so that you can home in on the creative side of things and add your signature expertise.

How is assistive audio technology represented in iZotope products?

Examples of our assistive audio technology include Track Assistant in Neutron, Master Assistant in Ozone, and Vocal Assistant in Nectar Elements. These assistants all generally adhere to the three-stage assistant pipeline we highlighted earlier, but operate slightly differently according to the task at hand.

Track Assistant

The first assistant to see the light of day in iZotope products was the Track Assistant feature in Neutron, which has since been refined in Neutron 2. Track Assistant uses machine learning to identify the instrument type of a given track, and uses this information to load a baseline processing chain and module presets.

Next, each module in the chain is adapted based on further analysis of your audio. Intelligent EQ technology learns the positions of EQ nodes, while intelligent exciter technology learns where to place exciter bands. Our intelligent dynamics technology learns not only where to place compressor bands, but also how to set compressor threshold, attack, and release times.

Master Assistant

The Master Assistant in Ozone 8 expands on the Track Assistant concept specifically for a mastering workflow. Master Assistant begins by using machine learning to characterize your audio content and generate a suitable target EQ curve. It leverages our intelligent dynamics technology to determine whether dynamics processing is necessary, and sets compressor threshold and attack/release times accordingly.

Based on your desired distribution format, the assistant intelligently sets a maximizer to hit a target loudness, dynamically adjusting for problematic peak resonances it discovers and reducing the impact the limiter has on the overall sound.
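
The idea of setting a maximizer to hit a target loudness can be sketched roughly as follows. This is an illustration under stated assumptions, not how Master Assistant works: it measures plain RMS level in dBFS as a stand-in for a proper loudness measurement (in practice loudness is usually measured in LUFS per ITU-R BS.1770), and the `TARGETS_DB` values and the `ceiling_db` default are hypothetical.

```python
import math


def rms_db(samples):
    """RMS level of the whole signal in dBFS."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    return 20.0 * math.log10(max(rms, 1e-9))


def peak_db(samples):
    """Sample peak level in dBFS."""
    return 20.0 * math.log10(max(max(abs(x) for x in samples), 1e-9))


# Hypothetical targets per distribution format (RMS dBFS, not LUFS).
TARGETS_DB = {"streaming": -14.0, "cd": -9.0}


def maximizer_settings(samples, fmt, ceiling_db=-1.0):
    """Return the gain needed to reach the target and the limiting it implies."""
    gain = TARGETS_DB[fmt] - rms_db(samples)
    # Peaks that would exceed the ceiling after the gain have to be limited.
    limiting = max(0.0, peak_db(samples) + gain - ceiling_db)
    return {"gain_db": round(gain, 2), "limiting_db": round(limiting, 2)}
```

The `limiting_db` figure shows why peaks matter: material with a few large peaks needs much more limiting to reach the same target, which is where taming those peaks beforehand reduces the limiter's impact on the sound.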

Vocal Assistant

The Vocal Assistant in Nectar Elements is the latest innovation in iZotope’s assistive audio technology. Vocal Assistant uses machine learning to determine your vocal timbre and dial in the appropriate type of character for your processing. It also analyzes the pitch contour of the vocal and uses this information to optimize pitch tracking performance. Next, it uses our intelligent EQ and dynamics processing to remove harsh peak resonances and provide a consistent level for your vocal. It even detects the presence of sibilance and sets de-esser parameters to tame it.

Lastly, Vocal Assistant adds just a dash of reverb to give your vocal space and life. All of this adaptive intelligence is paired with a stripped-down set of intuitive controls, which allow the user to alter different aspects of its processing to taste quickly and easily.
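
As a rough illustration of the sibilance detection mentioned above, the sketch below flags frames with a high zero-crossing rate, a classic indicator of noisy, high-frequency content such as "s" sounds. This is a swapped-in textbook technique, not Vocal Assistant's method. The `frame_size` and `zcr_threshold` values are assumptions, and a real de-esser would more likely compare energy in a high-frequency band against the overall level.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(frame) - 1)


def sibilant_frames(samples, frame_size=512, zcr_threshold=0.25):
    """Return indices of full frames whose zero-crossing rate exceeds the threshold."""
    flagged = []
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    for index, start in enumerate(starts):
        frame = samples[start:start + frame_size]
        if zero_crossing_rate(frame) > zcr_threshold:
            flagged.append(index)
    return flagged
```

The flagged frame indices could then drive where a de-esser engages, leaving the rest of the vocal untouched.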

Why do I need Vocal Assistant if Voice Mode in Neutron’s Track Assistant exists?

We’re glad you asked! While Neutron’s Track Assistant is a general mixing workhorse, mixing vocals is a highly specific workflow that requires dedicated focus and form-fit tools. The distinctions between Vocal Assistant in Nectar Elements and Track Assistant in Neutron applied to vocals can be summarized as follows:

  1. The high-level user preferences in Vocal Assistant cater more specifically to achieving certain vocal aesthetics. Bring that undeniably vintage or modern quality to your vocal, or dial in just the right amount of crispness for your dialogue.
  2. Vocal Assistant instantiates multiple EQs at distinct points in the processing chain. Separate subtractive and tone EQs engaged by Vocal Assistant are more in line with modern vocal production workflows.
  3. Nectar Elements contains vocal-specific DSP modules, and Vocal Assistant automatically sets their parameters. Real-time pitch correction and de-esser modules are tuned to optimize pitch tracking and reduce harsh sibilance.
  4. Vocal Assistant’s machine learning technology analyzes your vocal timbre. Having a deeper understanding of your vocal allows us to add character while maintaining transparency.

What are some future opportunities for assistive audio technology in mixing?

Artificial intelligence and machine learning open up new horizons for us. We can think about creating tools that help us navigate complex tasks and understand relationships between signals in new and exciting ways. They open up possibilities to improve the way music producers and engineers interact with machines, and they help us learn about ourselves and our work.

Track Assistant in Neutron and Vocal Assistant in Nectar Elements are great advances toward enhancing your mixing workflow, making it easier and faster. Moving forward, we would like to analyze interactions between different tracks, and more holistically address all the tracks in your mix. We’re also interested in leveraging machine learning to further expand our referencing capabilities, allowing our users to learn more and apply settings from the music they know and love.