Lossy compression formats like mp3 and AAC are known to create artifacts such as clipping. In order to avoid clipping, it is typically recommended to lower the signal peak levels before compression. This article explains how peak levels are affected by lossy compression and shows how to avoid clipping in compressed files.
How lossy encoding changes peak levels
Most music nowadays is distributed in compressed formats: either mp3 (MPEG-1 Layer III) or AAC (Advanced Audio Coding). These compression algorithms reduce the size of CD-quality audio by a factor of 5–10, depending on the chosen bitrate. This is far stronger compression than the typical 2× ratio achievable by lossless codecs such as FLAC or ALAC. Therefore, a signal encoded by mp3 or AAC cannot be preserved exactly. Instead, these algorithms create an approximation of the signal that sounds as close to the original as possible.
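The compression factor quoted above can be checked with simple arithmetic. The sketch below assumes CD-quality audio means 44.1 kHz, 16-bit stereo PCM, and compares it against a few commonly used lossy bitrates; the list of bitrates is an illustrative assumption, not something prescribed by either codec.

```python
# CD-quality PCM: 44,100 samples/s * 16 bits * 2 channels = 1411.2 kbps.
CD_BITRATE_KBPS = 44_100 * 16 * 2 / 1000


def compression_ratio(lossy_kbps):
    """Return how many times smaller a lossy stream is than CD-quality PCM."""
    return CD_BITRATE_KBPS / lossy_kbps


# Illustrative bitrates (an assumption; encoders support many others).
for kbps in (128, 192, 256, 320):
    print(f"{kbps} kbps -> about {compression_ratio(kbps):.1f}x smaller than CD audio")
```

At these bitrates the ratio runs from roughly 4× to 11×, which is in line with the 5–10 range given above.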
Lossy encoding increases peak levels of the waveform.
This increase in levels is often wrongly attributed to intersample peaks (ISPs, also called true peaks). In fact, it has little to do with ISPs. In the waveform above, true peaks have been limited to −1 dBTP, but after lossy compression, both sample peaks and true peaks are significantly higher. The cause of this increase is the quantization that happens during lossy compression.
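A toy model can show why frequency-domain quantization raises time-domain peaks. The sketch below is not how mp3 or AAC work internally (real codecs quantize MDCT coefficients under a psychoacoustic model). It assumes a signal made of a fundamental plus a small third harmonic that flattens the waveform's crest, much as a limiter would, and a hypothetical uniform quantizer coarse enough to round that harmonic to zero. All function names are illustrative.

```python
import math


def synthesize(harmonics, n_samples=360):
    """One period of sum_k harmonics[k] * sin((k + 1) * t), sampled n_samples times."""
    return [
        sum(a * math.sin((k + 1) * 2 * math.pi * n / n_samples)
            for k, a in enumerate(harmonics))
        for n in range(n_samples)
    ]


def quantize(values, step):
    """Uniform quantizer: snap each coefficient to the nearest multiple of step."""
    return [round(v / step) * step for v in values]


def peak_db(samples):
    """Sample peak in dB relative to full scale (1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))


# Fundamental plus a 1/6-amplitude third harmonic: the harmonic flattens the
# crest, so the peak is about 0.866 (around -1.25 dBFS) instead of 1.0.
original = [1.0, 0.0, 1 / 6]

# Coarse quantization (toy step size) rounds the third harmonic away entirely.
coarse = quantize(original, step=0.5)

peak_before = peak_db(synthesize(original))
peak_after = peak_db(synthesize(coarse))
print(f"peak before: {peak_before:.2f} dBFS, after: {peak_after:.2f} dBFS")
```

In this toy case the quantized signal is simpler than the original, yet its peak is more than 1 dB higher: discarding the component that kept the crest flat lets the waveform swing back up to full scale.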
Lossy compression is often easy to identify by looking at the spectrogram. The upper frequencies are completely cut (a psychoacoustic model finds them inaudible) and the cutoff line is serrated, with occasional “black holes” below the cutoff. Signals at middle frequencies are typically preserved much better because they matter more for perception. The goal of a psychoacoustic model is to allocate more bits to the spectrogram bins that have a higher chance of being audible and to shape quantization noise so that it stays below the masking threshold.
Lossy encoding often creates a serrated cutoff at higher frequencies
Here is a pathological case of peak level increase after lossy compression.
Interestingly, any clipping caused by the peak level increase during lossy encoding is reversible! It happens during file decoding; the internal representation of an mp3 (or AAC) file is not clipped. This is very much like a floating-point sample format, which may contain valid signals above 0 dBFS and is only clipped upon playback. Some decoders are smart and able to apply some negative gain to prevent clipping. Others, like RX 7.01, can decode to a non-clipping 32-bit float format, where you can manually take care of any overshoots. Unfortunately, most decoders are dumb: they decode to a 16-bit sample format and clip. So, the safest way to prevent clipping of mp3 or AAC files is to leave some headroom below 0 dBFS. Even half a decibel of headroom will eliminate most of the audible clipping.
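The difference between a decoder that clips and one that preserves overshoots can be sketched in a few lines. The example assumes the decoder produces floating-point samples where ±1.0 is full scale; the sample values and helper names are made up for illustration and do not come from any particular decoder.

```python
def to_int16(samples):
    """Convert float samples to 16-bit integers, clipping out-of-range values."""
    return [max(-32768, min(32767, round(s * 32768))) for s in samples]


def db_to_gain(db):
    """Convert a level change in decibels to a linear gain factor."""
    return 10 ** (db / 20)


# Hypothetical float output of a lossy decoder; two samples overshoot full scale.
decoded = [0.5, 1.04, -1.02, 0.9]

# A "dumb" decoder converts straight to 16 bits: the overshoots are flattened.
clipped = to_int16(decoded)

# A float-aware workflow scales the signal first, here so that the peak lands
# at -0.1 dBFS, and only then converts. No sample reaches the 16-bit limits.
peak = max(abs(s) for s in decoded)
safe = to_int16([s / peak * db_to_gain(-0.1) for s in decoded])

print("clipped:", clipped)
print("safe:   ", safe)
```

The same `db_to_gain` helper shows what headroom means in linear terms: half a decibel of headroom corresponds to a gain of about 0.944, that is, peaks at roughly 94% of full scale before encoding.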
Clipping-free export in RX 7.01
The new update of RX introduces a unique feature for automatic prevention of clipping during lossy encoding. It is available directly in the File – Export window and has two new modes of operation:
- Normalize mode attenuates the whole file so that decoded levels do not exceed 0 dBTP
- Limiter mode only attenuates parts of the file that could become clipped by the codec. This retains the level of non-clipping sections, while overall true peak levels are limited to 0 dBTP.
When clipping prevention is enabled, RX automatically finds the correct level adjustment for the file depending on the amount of clipping occurring in the codec. This guarantees that your encoded file does not clip upon decoding. When clipping prevention is off, lossy formats are encoded in the old way that does not protect from codec clipping.
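The difference between the two modes can be illustrated with a simplified sketch. This is not RX's algorithm, which is not described here. It assumes we already have the codec's decoded output as float samples (±1.0 is full scale), uses sample peaks rather than true peaks, and replaces a real limiter's smoothed gain curve with one gain per fixed-size block. All names and values are hypothetical.

```python
def gain_for_peak(samples):
    """Gain that brings the peak of samples down to full scale, never boosting."""
    peak = max(abs(s) for s in samples)
    return min(1.0, 1.0 / peak) if peak > 0 else 1.0


def normalize_gain(decoded):
    """Normalize-style: a single gain applied to the whole file."""
    return gain_for_peak(decoded)


def limiter_gains(decoded, block_size):
    """Limiter-style: one gain per block; only overshooting blocks are attenuated."""
    return [
        gain_for_peak(decoded[start:start + block_size])
        for start in range(0, len(decoded), block_size)
    ]


# Hypothetical decoded output: the second half overshoots full scale.
decoded = [0.2, 0.5, 0.4, 0.3, 0.9, 1.25, 1.1, 0.8]

print("normalize:", normalize_gain(decoded))                 # whole file attenuated
print("limiter:  ", limiter_gains(decoded, block_size=4))    # first block untouched
```

In this sketch the normalize-style gain lowers every sample by the same amount, while the limiter-style gains leave the first block at its original level. A practical implementation would also have to re-encode and re-check, since attenuating the input changes what the codec produces.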
Which option should you go with?
The Limiter will leave larger sections of the file unchanged in level and will only attenuate sections that would experience clipping. However, like any dynamic processing, this may create pumping.
The Normalize mode can completely avoid pumping at the expense of slightly reducing the overall level of the file. Both of these options run the encoding slower than the old way because they are dynamically adjusting file levels to ensure that no clipping occurs in the codec.
In addition, RX can fix codec clipping that has occurred in files compressed elsewhere. Upon decoding, all lossy files are decompressed to a 32-bit floating-point format without clipping. This means peaks above 0 dBFS will be automatically recovered and brought under 0 dBFS, either by normalization or with a peak limiter like the one in Ozone.
When encoding in lossy formats, care has to be taken to avoid clipping, because lossy codecs do increase the peak levels of the signal. A traditional way of addressing the problem is leaving around 1 dB of headroom. A better way, offered in RX since version 7.01, is to use the “Prevent clipping” option, which guarantees no overs during lossy encoding with the minimum possible amount of headroom.