De-essing is the process of attenuating sibilance: the harsh high-frequency sounds that dialogue or vocals produce on S, F, X, SH, and soft C sounds.
It’s often a necessary process when mixing audio, but rarely is it easy—especially when you’re just getting started. Many factors contribute to the complex nature of de-essing, from the way split-band processors can impact the character of a sound, to the manner in which the human voice can change from sibilance to sibilance.
So I found it necessary for my practice to develop a list of dos and don’ts. It’s my pleasure to share it with you now.
Do use manual de-essing
Manual de-essing is the act of grabbing every sibilant part of a signal “by hand” and clip-gaining it down. It is a very cumbersome and annoying process, but if you’re mixing music, it’s worth it. The process can often sound more natural than other forms of de-essing. Basically, you look for the recognizable “ess” in the waveform (it often resembles a solid football), separate the ess into its own region, and clip-gain that region down.
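If it helps to see the idea outside a DAW, here is a minimal Python sketch of clip-gaining one region. The sample values, the region boundaries, and the -6 dB amount are all made up for illustration; in practice you would find the ess by eye and ear, and your DAW would crossfade the region edges for you.

```python
def db_to_gain(db):
    """Convert a decibel change to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


def clip_gain(samples, start, end, gain_db):
    """Return a copy of `samples` with samples[start:end] scaled by gain_db."""
    factor = db_to_gain(gain_db)
    out = list(samples)
    for i in range(start, end):
        out[i] *= factor
    return out


# Hypothetical audio: pretend samples 4..7 are the "solid football" ess.
vocal = [0.1, 0.2, -0.2, 0.1, 0.8, -0.9, 0.85, -0.8, 0.2, -0.1]
tamed = clip_gain(vocal, start=4, end=8, gain_db=-6.0)
```

Only the selected region changes level; everything before and after it is left exactly as it was.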
Manual de-essing also lets you tailor how hard the offensive sibilance hits further downstream processing. This enables you to get the most out of a single plug-in chain: Nothing’s worse than breaking out a phrase into its own track during a verse because it’s not playing nicely with the plug-ins. With the help of tricks like manual de-essing, you don’t necessarily have to!
The drawback is time; manual de-essing requires oodles of time. However, it is time well spent.
Do use wide-band de-essing more often than you’d think
Wide-band de-essing pulls down the entire signal when it detects sibilance. In a way, you can think of it like automated manual de-essing.
Split-band de-essing, on the other hand, splits the signal into two or three bands, and only pulls down a selected range of frequencies when sibilance triggers the compressor. This makes the process a momentary dynamic EQ (or multiband compressor). So, for a split second, your split-band de-esser is affecting the timbre of the signal in a way that you must now account for, as it is suddenly equalizing the signal rather than decreasing its overall level.
But chances are you’re going to want to use other processes to equalize your signal; these processes could very well be muddled by a split-band de-esser: it’s harder to get an idea of what to do consistently when a specific element—say, the frequency band of an ess—is constantly changing in response to a threshold. It’s one thing if the totality of the signal changes in amplitude, it’s another if only a small band of its harmonic makeup shifts. For this reason, I tend to favor wide-band de-essers over split-band, especially if they’re the first processors in the chain.
This isn’t to say that I avoid split-band de-essers. I don’t, and neither should you. They make wonderful additions to high-shelf boosts on a vocal, either before or after the EQ. But I tend to use them as a secondary de-esser, and if brightness is not something I’m trying to add overall, I tend to favor the wide-band processes alone.
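To make the wide-band versus split-band distinction concrete, here is a rough Python sketch of both behaviors sharing one detector. It is a toy under stated assumptions, not how any particular plug-in works: the one-pole filter, the 6 kHz crossover, the 0.1 threshold, the fixed -6 dB reduction, and the hard on/off switching (no attack smoothing, so a real signal would click) are all simplifications I chose for illustration.

```python
import math


def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """First-order high-pass: y[n] = a * (y[n-1] + x[n] - x[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = rc / (rc + 1.0 / sample_rate)
    out, prev_x, prev_y = [], 0.0, 0.0
    for x in samples:
        prev_y = a * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out


def peak_envelope(samples, release_ms, sample_rate):
    """Instant-attack peak follower with an exponential release."""
    decay = math.exp(-1.0 / (release_ms * 0.001 * sample_rate))
    env, out = 0.0, []
    for x in samples:
        env = max(abs(x), env * decay)
        out.append(env)
    return out


def deess(samples, sample_rate, mode, cutoff_hz=6000.0,
          threshold=0.1, reduction_db=-6.0):
    """mode='wide' turns the whole signal down; mode='split' only the highs."""
    highs = one_pole_highpass(samples, cutoff_hz, sample_rate)
    lows = [x - h for x, h in zip(samples, highs)]  # lows + highs == input
    detector = peak_envelope(highs, 5.0, sample_rate)
    gain = 10.0 ** (reduction_db / 20.0)
    out = []
    for x, lo, hi, level in zip(samples, lows, highs, detector):
        if level <= threshold:
            out.append(x)               # no sibilance detected: pass through
        elif mode == "wide":
            out.append(x * gain)        # level change only, timbre intact
        else:
            out.append(lo + hi * gain)  # momentary EQ: only the highs drop
    return out


RATE = 44100
HALF = 2000
# Toy "vocal": a 200 Hz tone, with an 8 kHz "ess" added in the second half.
vocal = [0.5 * math.sin(2 * math.pi * 200 * n / RATE)
         + (0.3 * math.sin(2 * math.pi * 8000 * n / RATE) if n >= HALF else 0.0)
         for n in range(2 * HALF)]
wide = deess(vocal, RATE, "wide")
split = deess(vocal, RATE, "split")
```

Both modes leave the first half alone. Once the ess arrives, the wide-band output is simply the input at a lower level, while the split-band output keeps the 200 Hz body at full level and shaves only the top, which is exactly the timbre shift you then have to account for downstream.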
Don’t hit the process hard, all at once, with one de-esser
We’ve all seen it, folks—an online tutorial where someone de-esses a vocal with all the grace of a sledgehammer. In doing so, the mixing engineer invariably causes the singer to sound like he or she is spitting out sibilance. I call it the Sylvester effect, in honor of the famous cartoon cat.
Friends, don’t do this. We know it’s often better to move the effectual needle one small process at a time, using many tools in series for a cumulative, more natural-sounding result. Often this approach beats using one plug-in in a heavy-handed manner.
In my experience, this principle is especially true when de-essing. A little manual de-essing here (half a dB, maybe a dB), followed by some further de-essing of another decibel or two (with other processes between) is far more likely to get you there without overkilling the vocal—without taking away its natural presence.
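As a quick sanity check on the arithmetic, reductions made in series add up in decibels because their linear gains multiply. The sketch below uses stage depths I picked from the ranges mentioned above, and it assumes both stages act on the same ess at their full depth, which real processors will only approximate.

```python
import math


def db_to_gain(db):
    """Convert a decibel change to a linear amplitude multiplier."""
    return 10.0 ** (db / 20.0)


# Hypothetical stage depths: manual clip gain first, then a de-esser.
stage_reductions_db = [-1.0, -2.0]

combined_gain = math.prod(db_to_gain(db) for db in stage_reductions_db)
combined_db = 20.0 * math.log10(combined_gain)
print(f"combined: {combined_db:.1f} dB (x{combined_gain:.3f} amplitude)")
# prints "combined: -3.0 dB (x0.708 amplitude)"
```

So two gentle passes get you to roughly 3 dB of total reduction without any single process having to work hard.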
Don’t set it and forget it
Because of the unique variance of the human voice (unique, even from syllable to syllable), don’t think you can set a de-esser and walk away. You can’t expect it to act consistently across all esses. Maybe the singer stepped away from the mic, turning her head off axis; maybe the singer put his tongue in a different mouth position; whatever the case, there will be different esses from time to time.
Here, the answer is automation. Either you change your de-esser’s sidechain parameters, or you set up an entirely new de-esser, automating it to engage on a specific part of the song. Whichever method you employ, expect to change parameters every once in a while, as the human voice is not a one-stop-shop situation.
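Conceptually, an automation lane is just a list of breakpoints that says which setting is in force at which point in the song. Here is a small Python sketch of that idea; the times and threshold values are invented, and it models step automation only (no ramps between breakpoints).

```python
from bisect import bisect_right

# Hypothetical lane: (start time in seconds, de-esser threshold in dB).
THRESHOLD_LANE = [(0.0, -24.0), (42.5, -30.0), (58.0, -24.0)]


def threshold_at(time_s, lane=THRESHOLD_LANE):
    """Return the threshold in force at time_s (step automation, no ramps)."""
    starts = [start for start, _ in lane]
    index = max(bisect_right(starts, time_s) - 1, 0)
    return lane[index][1]
```

In this made-up lane, the de-esser digs in harder (a lower threshold) between 42.5 and 58 seconds, say for one phrase with especially sharp esses, then returns to its usual setting.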
Don’t give up on a pesky vocal without trying the following trick
There are so many times that I can’t tamp down the specific, aggravating timbre of an ess with a conventional de-esser. Years ago, before I realized I was using the wrong tool for the job, it made me want to pull what’s left of my hair out.
I’m talking about nasty bunches of frequencies that hover down lower than you’d expect—in the 4–6 kHz region—frequencies that sometimes sound horrible on certain sibilated phrases. A conventional de-esser might not work, and here’s why:
While a sideband selector usually can analyze this band, this band might not actually be appropriate for triggering the de-esser. In other words, it will pull down more than the esses. The detecting frequencies, the ones that carry the majority of the esses (and are therefore better for “tuning” the de-esser), often lie higher up the spectrum, in the 10–12 kHz region.
Here you have two choices: You can separate the sibilance to a new track and EQ it, making sure it goes into the same chain for further downstream processing. This can cause artifacts due to switching between regions if you’re not careful. Enter the other solution:
You can use something like Neutron 2’s EQ in dynamic mode. It works as follows: call up a node in the upper, more ess-located frequencies of 10 or 12 kHz, then set the gain to 0 dB (so the band is not actually doing anything). Next, set up a dynamic cut lower down, at 4–6 kHz, where the offensive frequency is. After that, you assign the internal side-chain of that 4–6 kHz band to the higher node (the one with 0 dB of gain). Yes, it’s a bit tough to explain on digital paper, but the upshot is simple: the upper node listens for the ess, and the lower band does the cutting.
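For the curious, the routing can be sketched in a few lines of Python. To be clear, this is not Neutron’s algorithm, just a toy version of the same idea: a “listen” band around 11 kHz drives the detector, and whenever it crosses the threshold, part of a 5 kHz band is subtracted from the signal. The band-pass coefficients follow the widely used Audio EQ Cookbook form; the Q values, threshold, cut depth, hard on/off switching, and test tones are all my own assumptions.

```python
import math


def biquad_bandpass(samples, center_hz, q, sample_rate):
    """Band-pass biquad with unity gain at center_hz (Audio EQ Cookbook form)."""
    w0 = 2.0 * math.pi * center_hz / sample_rate
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0, b2 = alpha / a0, -alpha / a0          # b1 is zero for this filter
    a1, a2 = -2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def peak_envelope(samples, release_ms, sample_rate):
    """Instant-attack peak follower with an exponential release."""
    decay = math.exp(-1.0 / (release_ms * 0.001 * sample_rate))
    env, out = 0.0, []
    for x in samples:
        env = max(abs(x), env * decay)
        out.append(env)
    return out


def sidechained_dynamic_cut(samples, sample_rate, listen_hz=11000.0,
                            cut_hz=5000.0, threshold=0.1, cut_db=-6.0):
    """Cut around cut_hz only while the band around listen_hz is loud."""
    listen = peak_envelope(
        biquad_bandpass(samples, listen_hz, 4.0, sample_rate), 5.0, sample_rate)
    cut_band = biquad_bandpass(samples, cut_hz, 2.5, sample_rate)
    depth = 1.0 - 10.0 ** (cut_db / 20.0)     # share of the band to remove
    return [x - depth * b if level > threshold else x
            for x, b, level in zip(samples, cut_band, listen)]


def tone_level(samples, freq_hz, sample_rate):
    """Amplitude of one frequency, measured with a single-bin DFT."""
    step = 2.0 * math.pi * freq_hz / sample_rate
    re = sum(x * math.cos(step * n) for n, x in enumerate(samples))
    im = sum(x * math.sin(step * n) for n, x in enumerate(samples))
    return 2.0 * math.hypot(re, im) / len(samples)


RATE = 44100
N = 3 * 441  # 441 samples hold whole cycles of both test tones
# Toy signals: 5 kHz "nasty" content alone, then with an 11 kHz "ess" on top.
body = [0.2 * math.sin(2 * math.pi * 5000 * n / RATE) for n in range(N)]
ess = [b + 0.4 * math.sin(2 * math.pi * 11000 * n / RATE)
       for n, b in enumerate(body)]

untouched = sidechained_dynamic_cut(body, RATE)  # no ess up top: nothing happens
tamed = sidechained_dynamic_cut(ess, RATE)       # ess present: 5 kHz gets cut
before = tone_level(ess[-441:], 5000, RATE)
after = tone_level(tamed[-441:], 5000, RATE)
```

The point of the routing shows up in the two calls: the 5 kHz content on its own never triggers the cut, but as soon as the ess energy appears higher up, the 5 kHz band drops by roughly the requested 6 dB.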
In our efforts to master a technique, sometimes we apply de-essing where it’s not needed. Sometimes we overdo the process just to prove that we have a handle on it.
De-essing must be carefully employed; not all vocals require it, and sometimes, a prolonged experience editing the voice before the mix makes us overly sensitive. Mixing immediately after editing can lead us to de-essing too much.
Thus, along with this list of dos and don’ts, always remember the context of what you’re de-essing. If you’ve spent an hour editing a vocal, give your ears a rest before mixing it, even if it’s just five minutes. If your vocal sounds good against a reference without the de-esser, maybe you don’t need it. Always remember that context is key in achieving your best ess.