What do RX's Spectrogram Settings do?

April 29, 2016

iZotope RX offers advanced spectrogram options under the View | Spectrogram Settings menu. Choosing different settings can allow you to see more detail in audio. Here is an in-depth description of the Spectrogram Settings controls:

 

Spectrogram Type - RX offers several methods for displaying time and frequency information in the spectrogram. Its advanced spectrogram modes let you see sharper time (horizontal) and frequency (vertical) resolution at the same time. There is always a trade-off between display quality and processing time, so keep in mind that some modes take longer to draw on the screen than others.

  • STFT (short-time Fourier transform) - This refers to the method that's used to transform the audio data into the spectrogram display. This type of spectrogram is the most common one and can be found in other editors. It has a fixed uniform time-frequency resolution. This is the simplest and fastest drawing mode in RX.
  • Auto-Adjustable STFT - This mode automatically adjusts FFT size (i.e. time and frequency resolution of a spectrogram) according to the zoom level. For example, if you zoom in horizontally (time) you'll see that percussive sounds and transients will be more clearly defined. When you zoom in vertically (frequency), you'll see individual musical notes and frequency events will appear more clearly defined.
  • Multi-resolution - This mode calculates the spectrogram with better frequency resolution at low frequencies and better time resolution at high frequencies. This mimics psychoacoustic properties of our perception, allowing the spectrogram display to show you the most important information clearly.
  • Adaptively Sparse - This mode automatically varies time and frequency resolution of a spectrogram to achieve the best spectrogram sharpness in every area of a time-frequency plane. This often lets you see the most details for a thorough analysis, but it's the slowest mode to calculate.
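
As a rough illustration of the plain STFT mode described above, the sketch below slices a signal into overlapping windowed frames and takes the magnitude spectrum of each one, so every frame becomes one column of a spectrogram. It is not RX's implementation: the function names, the Hann window, and the test signal are assumptions made for the example, and a naive DFT stands in for a real FFT to keep the code dependency-free.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitudes(samples, fft_size, hop):
    """Return one magnitude spectrum (bins 0..fft_size/2) per frame.

    Uses a naive O(N^2) DFT for clarity; a real implementation would use an FFT.
    """
    window = hann(fft_size)
    columns = []
    for start in range(0, len(samples) - fft_size + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(fft_size)]
        spectrum = []
        for k in range(fft_size // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / fft_size)
                for n in range(fft_size)
            )
            spectrum.append(abs(acc))
        columns.append(spectrum)
    return columns


# Hypothetical test signal: a sine that sits exactly on bin 8 of a 64-point frame.
FFT_SIZE = 64
signal = [math.sin(2 * math.pi * 8 * n / FFT_SIZE) for n in range(256)]
columns = stft_magnitudes(signal, FFT_SIZE, hop=16)

# The strongest bin in every column should be the bin the sine was placed on.
peak_bins = [column.index(max(column)) for column in columns]
print(len(columns), "columns, peak bins:", peak_bins)
```

Because the FFT size is the same for every frame, every column has the same time and frequency resolution, which is the "fixed uniform" behavior the STFT mode refers to.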


FFT size - FFT stands for fast Fourier transform, a procedure for calculating a signal's frequency spectrum. The higher the FFT size, the greater the frequency resolution, i.e. notes and tonal events will be clearer. However, a larger FFT size makes time events less sharply defined, because each analysis frame then covers a longer stretch of audio. Choosing the "Auto-Adjustable" or "Multi-resolution" mode lets you get a good combination of frequency and time resolution without having to change this setting as you work.
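
The trade-off can be put into numbers with a small sketch. For a plain STFT, the spacing between frequency bins is the sample rate divided by the FFT size, while the length of audio covered by one frame is the FFT size divided by the sample rate. The function name and the 44.1 kHz sample rate below are assumptions chosen for illustration.

```python
def stft_resolution(sample_rate, fft_size):
    """Return (bin spacing in Hz, frame length in milliseconds) for one FFT size."""
    bin_spacing_hz = sample_rate / fft_size
    frame_ms = 1000.0 * fft_size / sample_rate
    return bin_spacing_hz, frame_ms


for size in (512, 2048, 8192):
    hz, ms = stft_resolution(44100, size)
    print(f"FFT size {size}: {hz:.2f} Hz per bin, {ms:.1f} ms per frame")
```

At 44.1 kHz, an FFT size of 2048 gives bins about 21.5 Hz apart and frames about 46 ms long. Quadrupling the size to 8192 narrows the bins to about 5.4 Hz but stretches each frame to about 186 ms, which is why transients smear.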

Enable reassignment - This control enables a special technique for spectrogram calculation that allows very precise pitch tracking for any harmonic components of the signal. When used together with Frequency overlap / Time overlap controls, this option can provide virtually unlimited time and frequency resolution simultaneously for signals consisting of tones.

Window - The Window control lets you choose the weighting window that is used for the FFT analysis. The mathematics behind this is complex, but in practice each window smooths the spectrum differently and suppresses "spectral leakage" to a different degree, so it is worth trying each one.
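
To make "spectral leakage" concrete, the sketch below analyzes a tone that falls between two FFT bins and measures how much of its energy shows up at a bin far away from it. The three windows are common textbook choices and are not a statement of which windows RX offers; the function names, frame size, and bin positions are assumptions, and a naive DFT is used to avoid dependencies.

```python
import cmath
import math


def rectangular(n):
    """No weighting at all: every sample counts equally."""
    return [1.0] * n


def hann(n):
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def blackman(n):
    return [
        0.42
        - 0.5 * math.cos(2 * math.pi * i / n)
        + 0.08 * math.cos(4 * math.pi * i / n)
        for i in range(n)
    ]


def leakage_db(window_fn, n=256, tone_bin=10.5, probe_bin=40):
    """Level at a bin far from the tone, relative to the strongest bin, in dB."""
    window = window_fn(n)
    frame = [window[i] * math.sin(2 * math.pi * tone_bin * i / n) for i in range(n)]
    mags = [
        abs(sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n)))
        for k in range(n // 2 + 1)
    ]
    return 20 * math.log10(mags[probe_bin] / max(mags))


for fn in (rectangular, hann, blackman):
    print(f"{fn.__name__}: {leakage_db(fn):.1f} dB at the probe bin")
```

A more negative number means less leakage far from the tone. The tapered windows pay for this with a wider main peak, which is the smoothing mentioned above.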

Frequency Scale - Using different frequency scales can help you see useful information more easily. Different scales have different characteristics for displaying the vertical (frequency) information in the spectrogram display.

  • Linear - this simply shows frequencies spread out in a uniform way. This is most useful when you want to analyze higher frequencies.
  • Logarithmic - this scale gives more display space to lower frequencies, so each octave takes up the same vertical distance.
  • Mel - the Mel scale (derived from the word "melody") is a frequency scale based on how humans perceive sound. This selection is one of the more intuitive choices because it corresponds to how we hear differences in pitch.
  • Bark - the Bark scale is also based on how we perceive sound, and corresponds to a series of critical bands.
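
The scales above can be compared by asking where a given frequency lands on the vertical axis. The sketch below uses widely cited textbook approximations for the Mel and Bark scales; the exact formulas RX uses are not documented here, so these conversions, the function names, and the 20 Hz to 22050 Hz axis range are assumptions.

```python
import math


def hz_to_mel(f_hz):
    """Common Mel approximation: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def hz_to_bark(f_hz):
    """Common Bark approximation based on two arctangent terms."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def axis_position(f_hz, scale, f_min=20.0, f_max=22050.0):
    """Fraction of the vertical axis (0 = bottom, 1 = top) where f_hz lands."""
    if scale == "linear":
        return f_hz / f_max
    if scale == "logarithmic":
        # Only meaningful for f_hz >= f_min, since log scales cannot start at 0 Hz.
        return math.log(f_hz / f_min) / math.log(f_max / f_min)
    if scale == "mel":
        return hz_to_mel(f_hz) / hz_to_mel(f_max)
    if scale == "bark":
        return hz_to_bark(f_hz) / hz_to_bark(f_max)
    raise ValueError(f"unknown scale: {scale}")


for scale in ("linear", "logarithmic", "mel", "bark"):
    print(f"{scale}: 1 kHz sits at {axis_position(1000.0, scale):.2f} of the axis height")
```

On the linear scale 1 kHz sits very close to the bottom of the display, while the other three scales lift it noticeably higher, leaving more room for low and mid frequencies.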


Frequency overlap (a.k.a. zero padding factor) - This controls the amount of oversampling on the frequency scale of the spectrogram. When used together with the Reassignment option, it will increase the resolution of the spectrogram vertically (by frequency).
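
Zero padding can be sketched as follows: the analysis frame is extended with zeros before the transform, so the same spectrum is sampled on a finer frequency grid. The function name, the 6400 Hz sample rate, and the 1325 Hz test tone are assumptions picked so that the tone falls between two unpadded bins; a naive DFT is used to avoid dependencies.

```python
import cmath
import math


def peak_frequency(frame, sample_rate, pad_factor):
    """Frequency of the strongest bin after zero padding the frame pad_factor times."""
    n_fft = len(frame) * pad_factor
    # Zeros appended to the frame contribute nothing to the sum, so only the
    # original samples need to be visited; the finer grid comes from n_fft.
    mags = [
        abs(
            sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / n_fft)
                for n in range(len(frame))
            )
        )
        for k in range(n_fft // 2 + 1)
    ]
    return mags.index(max(mags)) * sample_rate / n_fft


SAMPLE_RATE = 6400  # hypothetical rate, giving 100 Hz bins for a 64-sample frame
N = 64
frame = [
    (0.5 - 0.5 * math.cos(2 * math.pi * i / N))  # Hann window
    * math.sin(2 * math.pi * 1325 * i / SAMPLE_RATE)
    for i in range(N)
]

print(peak_frequency(frame, SAMPLE_RATE, 1))  # nearest 100 Hz grid point
print(peak_frequency(frame, SAMPLE_RATE, 4))  # nearest 25 Hz grid point
```

Note that padding only interpolates the spectrum that the frame already contains. It places peaks more precisely, but on its own it does not separate two tones that the original frame length could not resolve.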

Time overlap - This controls the time oversampling of the spectrogram. In most cases, an overlap of 4x or 8x is a good setting to start with. However, using a higher overlap together with the Reassignment option will increase the time resolution of the spectrogram, letting you see transient events clearly.
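
A short sketch shows what an overlap factor means for the number of spectrogram columns. It assumes the usual STFT convention that an overlap of Nx moves each successive frame forward by the FFT size divided by N; whether RX defines the setting in exactly this way is an assumption, as are the function names and the 44.1 kHz sample rate.

```python
def hop_size(fft_size, overlap):
    """Samples between the starts of successive frames (assumed: fft_size / overlap)."""
    return fft_size // overlap


def columns_per_second(sample_rate, fft_size, overlap):
    """How many spectrogram columns are computed for each second of audio."""
    return sample_rate / hop_size(fft_size, overlap)


for overlap in (1, 4, 8):
    hop = hop_size(2048, overlap)
    cols = columns_per_second(44100, 2048, overlap)
    print(f"{overlap}x overlap: hop of {hop} samples, {cols:.1f} columns per second")
```

Under this assumption, going from 4x to 8x overlap doubles the number of columns to calculate, which is the cost of the smoother time axis.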

Color Map - RX's spectrogram display allows you to choose between several different color schemes. There is no right or wrong color setting to use; we recommend you try them all. Sometimes certain color modes will make different types of noise stand out more clearly. Experiment!

High quality rendering - Turning this control off makes spectrogram rendering slightly faster, but you'll lose some detail and clarity in the spectrogram image.

Reduce Quality Above - RX's spectrogram uses very accurate rendering that lets you see audio problems, such as clicks, even when zoomed out. However, rendering long files this way can be somewhat slow. So, when the length of the visible spectrogram exceeds the specified number of seconds, the spectrogram calculation switches to a faster, less accurate preview mode. When you zoom in, the spectrogram calculation becomes accurate again.

©Copyright 2001-2018, iZotope, Inc. All Rights reserved.