Behind the Magic of iZotope RX 10: an Interview with Alexey Lukin, Principal DSP Engineer
Go behind the scenes of creating iZotope RX 10 and the technological advances that power the latest version of the industry standard for audio repair.
Tell us more about yourself and your role at iZotope.
My engineering for iZotope began with an effects processor called Spectron, which we released in 2003. Around the same time I began early prototyping of noise reduction algorithms that four years later became iZotope RX. My education combines audio processing and image processing, so my Ph.D. thesis explored similarities between these domains. RX is trying to make audio editing more visual, letting you touch up the sound.
Although RX is now my favorite child (after my own children), being a DSP engineer at iZotope, I have designed algorithms for other products, like iZotope Ozone IRC maximizers, Radius time/pitch modification, dither, SRC, EQs, and many others.
Over the years our research team has grown and I’ve been lucky to work alongside people who expanded my skill set in areas like machine learning (ML), embedded DSP, spatial audio, hardcore math, or just beautiful C++ programming. And now that iZotope has joined forces with Native Instruments and Brainworx, I’ve got even more opportunities to learn.
How has the landscape of audio cleanup, repair, and restoration changed over the past few years, and how is RX 10 situated to tackle those changes?
During the pandemic, a good deal of audio production shifted to home studios, which created new challenges in audio repair. Issues like hum, noise, frequency loss, or early reflections, virtually nonexistent in the studio, have returned and require accurate cleanup.
We are witnessing an explosion of neural nets used for audio processing, even in real-time contexts. Things like dialogue enhancement or source separation, barely imaginable just a decade ago, are becoming widespread and affordable.
"We are witnessing an explosion of neural nets used for audio processing, even in real time contexts."
RX 10 offers several new tools to tackle cleanup of difficult samples. The Adaptive Mode of RX Dynamic De-hum is able to attenuate hum (steady-state tonal noises like buzz or interference) without prior training, even when the frequency is slowly drifting. And thanks to the gated action of the notch filters, the amount of ringing is minimized, compared with the old Static De-hum.
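To make the notch-filter idea concrete, here is a minimal sketch of a single static notch at a fixed hum frequency. It is not the RX algorithm: it does not track drifting hum and has no gating. The coefficient formulas are the widely used RBJ "Audio EQ Cookbook" biquad notch, and the names `notch_coefficients`, `biquad`, and `rms`, the 8 kHz sample rate, and the Q of 10 are illustrative assumptions.

```python
import math

def notch_coefficients(f0, fs, q):
    """RBJ-cookbook biquad notch centred on f0 Hz, normalised so a[0] == 1."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = [1.0 / a0, -2.0 * math.cos(w0) / a0, 1.0 / a0]
    a = [1.0, -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0]
    return b, a

def biquad(x, b, a):
    """Filter the sequence x with a direct-form I biquad."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def rms(x):
    """Root-mean-square level of a sequence."""
    return math.sqrt(sum(v * v for v in x) / len(x))

fs = 8000          # assumed sample rate, Hz
n = 2 * fs         # two seconds of audio
hum = [math.sin(2.0 * math.pi * 60.0 * i / fs) for i in range(n)]
tone = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(n)]

b, a = notch_coefficients(60.0, fs, q=10.0)
hum_out = biquad(hum, b, a)
tone_out = biquad(tone, b, a)

# Compare levels over the second half, after the filter's start-up ringing has decayed.
print("60 Hz hum RMS after notch:", rms(hum_out[fs:]))
print("1 kHz tone RMS after notch:", rms(tone_out[fs:]))
```

A higher Q makes the notch narrower but lengthens its ringing, which is the trade-off that gating the notch is meant to soften.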
The updated ML algorithm in RX Spectral Recovery (developed by my colleague Shahan Nercessian) is able to recover missing upper frequencies of speech more realistically than RX 9. And now it can also synthesize lower frequencies, like a missing fundamental tone, which is quite helpful for repair of Zoom or cell phone recordings.
For novice users, or those on a tight schedule, a redesigned RX Repair Assistant is able to tackle more problems than before. It builds custom signal chains for repair of different classes of source material, like speech, music, or drums. Repair Assistant is now also available as a plug-in, in addition to the module in the RX app.
One of my favorite RX 10 features is the new Selection Feathering. It lets me apply repair modules to “softer” frequency bands, so that the processed and unprocessed material blend together better. The Frequency Feathering slider adjusts the width of the crossover.
Last, but not least, is a feature that will be appreciated by podcasters and anyone working with long speech files. RX Text Navigation analyzes dialogue and displays a searchable transcription above the spectrogram that supports text-based editing. And automatic Multiple Speaker Detection finds and tags the sections of speech associated with each individual speaker.
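The blending idea behind feathering can be sketched as a per-frequency weight that fades from fully processed to fully unprocessed at the edges of the selection. This is only an illustration under stated assumptions: the raised-cosine fade, placing the fade outside the selected band, and the names `feather_weight` and `blend_bins` are hypothetical and not taken from RX.

```python
import math

def feather_weight(freq, f_lo, f_hi, feather):
    """Blend weight: 1 inside [f_lo, f_hi], raised-cosine fade to 0 over `feather` Hz outside."""
    if f_lo <= freq <= f_hi:
        return 1.0
    dist = f_lo - freq if freq < f_lo else freq - f_hi
    if feather <= 0.0 or dist >= feather:
        return 0.0  # hard edge, or beyond the crossover region
    return 0.5 * (1.0 + math.cos(math.pi * dist / feather))

def blend_bins(original, processed, freqs, f_lo, f_hi, feather):
    """Crossfade processed and original values (e.g. spectrogram bins) per frequency."""
    out = []
    for o, p, f in zip(original, processed, freqs):
        w = feather_weight(f, f_lo, f_hi, feather)
        out.append(w * p + (1.0 - w) * o)
    return out

# Toy data: the "repair" zeroes every bin; the selection is 1000-2000 Hz with 400 Hz feathering.
freqs = [700.0, 800.0, 1500.0, 2200.0, 2400.0]
original = [1.0] * 5
processed = [0.0] * 5
blended = blend_bins(original, processed, freqs, 1000.0, 2000.0, 400.0)
print(blended)
```

With a feather width of zero the weight becomes a hard edge, which is the abrupt transition that feathering is meant to avoid.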
What were some of the challenges you faced in creating this new technology?
One challenge was to fit some of the complex new algorithms into the product, enabling them to run locally, without access to the cloud (which is not allowed in many post production studios). Algorithm latency was a challenge in developing the Adaptive De-hum. It needs a couple of seconds of lookahead to be able to reliably distinguish between hum and speech. This may create a problem when running the plug-in in some DAWs.
A hard problem that we have not solved yet is elimination of early reflections. Our RX Dialogue De-reverb algorithm is effective on longer reverbs, but the short reverb times often seen in home studios challenge its abilities. Such reverbs not only create decaying tails in the spectrogram, but also introduce comb filtering and change the phase of the signal. We are hoping to tackle this problem better one day.
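The comb filtering mentioned here can be illustrated with the simplest possible model: the direct sound plus one delayed, attenuated copy. This is a textbook sketch, not a model of any RX module; the 2 ms delay, the gain of 0.5, and the function name `reflection_response` are assumptions chosen for the example.

```python
import cmath
import math

def reflection_response(freq, delay_s, gain):
    """Magnitude of H(f) = 1 + gain * exp(-j*2*pi*f*delay) at frequency freq (Hz)."""
    return abs(1.0 + gain * cmath.exp(-2j * math.pi * freq * delay_s))

delay_s = 0.002   # assumed reflection delay: 2 ms (roughly 0.7 m of extra path at 343 m/s)
gain = 0.5        # assumed reflection level relative to the direct sound

first_notch = 1.0 / (2.0 * delay_s)   # reflection arrives in anti-phase: cancellation
first_peak = 1.0 / delay_s            # reflection arrives in phase: reinforcement

notch_mag = reflection_response(first_notch, delay_s, gain)
peak_mag = reflection_response(first_peak, delay_s, gain)
ripple_db = 20.0 * math.log10(peak_mag / notch_mag)

print("first notch at", first_notch, "Hz, magnitude", notch_mag)
print("first peak at", first_peak, "Hz, magnitude", peak_mag)
print("peak-to-notch ripple in dB:", ripple_db)
```

In this model the notches repeat at every odd multiple of the first notch frequency, so a short delay colors the whole speech band instead of adding a tail that can be faded out.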
What is something everyone should know you can do with RX that is maybe lesser known?
I’ll begin with a joke: many people don’t realize that RX has a standalone app (or a suite of plug-ins, depending on who you ask).
All jokes aside, we want RX to be able to seamlessly fit into your workflow, however it might look.
Now here are my favorite serious features:
View ► Show Channels Separately mode [Ctrl + Shift + C] allows you to view a stereo file as mono and gain 2× frequency resolution in the spectrogram. You can still select individual channels for editing using channel selector buttons.
The RX Deconstruct module is quite helpful for semi-manual cleanup of residual distortion, crackle, or cicada noise. In RX 10 it can be used together with frequency feathering for gentle repair of target selections.
When working with stereo files, I often use the “M/S encoder-decoder” preset of the RX Mixing module. It allows me to test how similar the L and R channels are and whether there is any time shift between them that could be compensated with the RX Azimuth module. When the recording is close to mono, I often reach for the Center Extract module for additional noise reduction.
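The two checks described here, channel similarity and inter-channel time shift, can be sketched in a few lines. This is a generic illustration and not the RX Mixing or Azimuth implementation: the half-sum/half-difference scaling is one common M/S convention, and the names `ms_encode`, `ms_decode`, and `best_lag`, along with the brute-force cross-correlation search, are assumptions for the example.

```python
import random

def ms_encode(left, right):
    """Mid = (L + R) / 2, Side = (L - R) / 2. A near-silent side means a near-mono file."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Inverse of ms_encode: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

def best_lag(left, right, max_lag):
    """Lag in samples that maximises cross-correlation; positive means right is delayed."""
    n = len(left)  # assumes both channels have the same length
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = sum(left[i] * right[i + lag]
                    for i in range(max(0, -lag), min(n, n - lag)))
        if score > best_score:
            best, best_score = lag, score
    return best

# Toy stereo pair: the right channel is the left channel delayed by 3 samples.
rng = random.Random(0)
left = [rng.uniform(-1.0, 1.0) for _ in range(200)]
right = [0.0] * 3 + left[:-3]

mid, side = ms_encode(left, right)
lag = best_lag(left, right, max_lag=10)
print("estimated inter-channel delay in samples:", lag)
```

Once the lag is known, shifting one channel by that amount before encoding to M/S shows how much of the side signal was caused by the time offset alone.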
When it comes to exporting the results, selecting only one stereo channel and applying File ► Export Selection [Ctrl + Shift + E] will export a mono file with just this channel. If you are exporting in lossy formats, like MP3 or OGG, RX has a unique Prevent Clipping feature in the export window that makes sure your exported files do not clip when they are decoded and played back. This is more than just a true peak limiter, because it also eliminates codec clipping that could occur even for files limited to 0 dBTP.
One additional export option that I frequently use is File ► Export Screenshot. It is helpful for online demos and automatically crops the screen to just the spectrogram window. Plus, it can save your animated selection if you choose the GIF format.
For more stuff like this, I invite you to watch this video by my colleague Geoff Manchester, who describes his own ten hidden features of RX.
In your eyes, what does the future of audio repair look like?
I can definitely see how in the future machines could repair problems that are considered impossible today. Through the use of machine learning, they will be able to understand more context of the signal: whether it is speech or music, what is being said, the harmony, the instruments, etc. With this knowledge, the repair will step up in quality. In some cases, repair will be replaced by resynthesis that stays close to the source. With better understanding of the source signal, machines will offer greater assistance with audio repair: they will automatically identify more types of noise and provide more customized ways to fix them.
Another trend we are seeing is the democratization of audio repair: the emergence of cheaper tools with simpler controls that are available to a wider audience. What previously required specialized hardware now runs on a fast PC, and in the future it will run in your browser, phone, or hearing aid. Imagine what kind of processing will then be available in a studio!
Start cleaning up your audio with RX 10
Teammates like Alexey have made RX 10 the award-winning audio repair suite that helps restore, clean up, and improve recordings in post production, music, and content creation.
Get your copy of RX 10 and see the magic of audio cleanup and background noise removal for yourself.
Already own a copy of RX? Log into your iZotope account for special loyalty pricing.