Psychoacoustics: How Perception Influences Music Production

by Daniel Dixon, iZotope Contributor | March 28, 2019

Though we can measure sound with meters and spectrum analyzers, how we experience it is a matter of human perception, which is the subject of the field known as psychoacoustics.

Even if you aren’t aware of the term, you likely engage with psychoacoustic principles in music production on a regular basis—for example, manipulating an unusual sound source to be heard as a conventional instrument. With a basic understanding of how humans interpret and react to sound, you can create more satisfying mixes that play on the experience of listening, independent of software and equipment. Let’s take a look at some examples.

Limits of hearing

The best place to start with psychoacoustics is to get familiar with the limits of human hearing. You probably already know that we can hear sounds within a range of 20 Hz to 20 kHz (20,000 Hz), with the upper limit decreasing to around 16 kHz with age. Noise-induced hearing loss and tinnitus will impact the perception of sound too, and for producers with these conditions, workarounds need to be developed to achieve balanced mixes.

Given these hearing limits, you may find that applying a high-pass filter at around 30 Hz brightens a mix by removing low-end information that is hard to perceive, although this is not always the case. Raising that filter to 50–60 Hz while low-passing the high end at 10–12 kHz will make mixes and instruments sound “lo-fi,” replicating the limited frequency response of old recording technology.
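To make the band-limiting idea concrete, here is a minimal Python sketch of a "lo-fi" filter built from two one-pole filters. The function names and default cutoffs are illustrative assumptions, and the 6 dB/octave slopes are far gentler than the filters in a typical EQ plugin, so treat it as a demonstration of the principle rather than a production tool.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """Gentle (6 dB/octave) low-pass: attenuates content above cutoff_hz."""
    coeff = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += coeff * (x - state)  # move the state toward the input
        out.append(state)
    return out

def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """High-pass made by subtracting the low-passed signal from the input."""
    low = one_pole_lowpass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]

def lofi_band_limit(samples, sample_rate, low_cut_hz=60.0, high_cut_hz=10000.0):
    """Hypothetical helper: high-pass at low_cut_hz, then low-pass at high_cut_hz."""
    high_passed = one_pole_highpass(samples, low_cut_hz, sample_rate)
    return one_pole_lowpass(high_passed, high_cut_hz, sample_rate)

def sine(freq_hz, sample_rate, seconds=1.0):
    """Generate a test tone with a peak amplitude of 1.0."""
    count = int(sample_rate * seconds)
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate) for n in range(count)]

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))
```

Running a 20 Hz tone and a 1 kHz tone through lofi_band_limit and comparing their rms values before and after shows the sub-bass being pulled down while the midrange passes almost untouched.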

Read about the evolution of frequency response in popular records from the 1920s to today here.

Sound sensitivity

In modern music, the key to achieving a pleasing mix is an even balance of frequencies across the spectrum. While simple in theory, this is often a challenge to pull off because our ears do not perceive all frequencies equally: we are most sensitive in the high mid-range, between roughly 2,500 and 5,000 Hz.

This is the main reason behind the commonly used “smiley face” EQ curve, where the mids are scooped out and the lows and highs are boosted. At low playback levels, a broad bass and treble boost makes a mix sound more balanced and powerful, a classic case of louder sounding better, but it can also eat into dynamic range and even introduce distortion.

As we crank up playback level, our ears’ frequency response across the spectrum begins to even out, and some of the issues introduced by our selective ears are eliminated. But, as you know, producing at high levels for extended periods of time is both damaging to our ears and misleading, since it makes us think every instrument is upfront in the mix. This becomes evident when levels are turned down and what we’re working on seems out of control.
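One rough way to put numbers on this uneven sensitivity is the standard A-weighting curve used in sound level meters, which is loosely based on equal-loudness contours at quiet listening levels. The sketch below is only a proxy: it is not mentioned in this article's workflow, it does not capture how sensitivity flattens out as playback level rises, and the function name is an illustrative choice.

```python
import math

def a_weighting_db(freq_hz):
    """Approximate A-weighting gain in dB (0 dB at 1 kHz).

    Negative values mark frequencies the curve treats as harder to hear;
    positive values mark frequencies it treats as easier to hear.
    """
    f2 = freq_hz ** 2
    numerator = (12194.0 ** 2) * f2 ** 2
    denominator = (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(numerator / denominator) + 2.00

# Print the weighting at a few spot frequencies across the spectrum.
for freq in (50, 100, 1000, 3000, 10000):
    print(f"{freq:>6} Hz: {a_weighting_db(freq):+.1f} dB")
```

The curve dips steeply in the low end and peaks slightly above 0 dB in the 2–4 kHz region, which lines up with the high mid-range sensitivity described above.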

So how do we make sense of all of this in a music production setting? What EQ and level settings do we use in order to remain neutral?

One solution is Tonal Balance Control, which analyzes and visualizes your audio against a custom target (a genre, song, or collection of songs) so you can get a better idea of how well frequencies are being distributed across the spectrum. Capable of reflecting spectral and level changes in real time, Tonal Balance Control takes the guesswork out of balancing a mix by providing you with an objective quality review of your mix. It can also integrate with the EQ in Neutron, allowing you to make adjustments to instrument curves and see the results within the same window.

The acoustic environment of the rooms in which we produce music poses yet another obstacle to getting an accurate representation of our sound, at times making it quite difficult to know just what a mix needs to shine—is the high end overwhelming? Or is it the bass that’s lacking?

If you regularly find yourself using an EQ to cut or boost a specific set of frequencies, it is possible the wacky acoustics of your studio space are the culprit. Tonal Balance Control confirms or denies these suspicions so you can make quick and informed decisions about what needs attention in your mix.

Unmasking instruments

The more sounds we add to a mix, the harder it is to separate them, and the more frequency masking occurs. This is particularly noticeable between instruments that share similar frequencies—if a kick and bass note occur at the same time, one will mask parts of the other, sometimes to the point of being inaudible.

Masking is one of the most common psychoacoustic phenomena, and is present in all mixes, demos, and polished masters. But too much of it may be undesirable. To reduce masking in a production or engineering context, we use EQ to carve out a unique space in the spectrum for each element in a mix. Some of this work can also be taken care of at the writing and arrangement stage by choosing instruments and notes that don’t reach over one another.

Even with these precautions, parts are often added and removed throughout the course of a mix, shifting the harmonic structure and causing new masking issues to surface. For an efficient workflow, you need a quick way to resolve masking. That’s why we came up with the Unmask feature in Neutron, which reveals competing frequencies between two tracks in a single window and allows you to EQ them independently or with an inverse curve, meaning a cut in one EQ will have a complementary boost in the other.
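As a toy illustration of the idea (not how Neutron works internally, which this article does not describe), the sketch below flags frequency bands where two tracks are both audible and close in level, and mirrors an EQ move the way an inverse curve would. The band levels, thresholds, and function names are all made-up assumptions.

```python
def find_masking_bands(levels_a_db, levels_b_db, floor_db=-30.0, window_db=6.0):
    """Return the bands where both tracks sit above floor_db and are
    within window_db of each other, i.e. likely to compete."""
    flagged = []
    for band, level_a in levels_a_db.items():
        level_b = levels_b_db.get(band)
        if level_b is None:
            continue  # band not measured on the second track
        both_audible = level_a > floor_db and level_b > floor_db
        if both_audible and abs(level_a - level_b) <= window_db:
            flagged.append(band)
    return flagged

def inverse_eq_move(gain_db):
    """Mirror an EQ move: a cut on one track becomes an equal boost on the other."""
    return -gain_db

# Hypothetical per-band levels (dB) for a kick and a bass part.
kick = {"60 Hz": -8.0, "120 Hz": -14.0, "2 kHz": -28.0, "8 kHz": -45.0}
bass = {"60 Hz": -10.0, "120 Hz": -9.0, "2 kHz": -40.0, "8 kHz": -50.0}

for band in find_masking_bands(kick, bass):
    cut = -3.0  # cut the bass in the competing band...
    print(band, "bass:", cut, "dB, kick:", inverse_eq_move(cut), "dB")
```

With these example numbers, only the two low bands are flagged, which matches the kick and bass scenario described earlier.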

Like Tonal Balance Control, the Unmask feature in Neutron gives you a big picture view of your mix with an unbiased perspective, along with control over the smallest details, so you can more easily resolve common audio perception problems and stay focused on the music.

Spatial location

Having two ears instead of one allows us to more accurately determine the location of sound. At a crowded party, localization tells us which direction people are speaking to us from. It also provides cues as to whether traffic is moving toward us or away from us, and where our keys are hiding in our jacket. In a music production context, we place sounds at various spatial locations to achieve a sense of mix width and depth.  

Width is the stereo field from left to right. A key psychoacoustic principle used to achieve the illusion of width is the Haas effect: when two identical sounds occur within about 30 milliseconds of one another, we perceive them as a single event. Depending on the source material, this window can extend to 40 ms.

I wager the most common application for this effect is on vocals. To create a stereo vocal in a pop chorus, we duplicate the mono lead, add a slight delay to the copy, then pan each part in opposite directions. In addition to opening up the center of a mix for other sounds, this move allows the listener (and producer) to perceive anthemic vocal width from a single mono source. Our free Vocal Doubler plugin, released this past October, is based on this exact concept. 
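The duplicate, delay, and pan move described above can be sketched in a few lines of Python. This is a bare-bones illustration with hard-panned channels and a hypothetical function name, not the processing used in any particular plugin.

```python
def haas_widen(mono, sample_rate, delay_ms=15.0):
    """Turn a mono signal into a (left, right) pair using a short delay.

    The left channel carries the original signal and the right channel
    carries a copy delayed by delay_ms. Both are padded to equal length.
    """
    delay_samples = int(round(sample_rate * delay_ms / 1000.0))
    left = list(mono) + [0.0] * delay_samples   # original, padded at the end
    right = [0.0] * delay_samples + list(mono)  # copy, shifted later in time
    return left, right

# A 15 ms delay at 48 kHz shifts the copy by 720 samples.
left, right = haas_widen([1.0, 0.5, 0.25], sample_rate=48000, delay_ms=15.0)
print(len(left), right.index(1.0))
```

Keeping delay_ms inside the 30–40 ms window mentioned above is what lets the two channels fuse into one wide sound instead of being heard as separate events.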

With short delay times of 5–15 ms between two identical sounds, you will notice some funny metallic sounds that occur as a result of the signals going in and out of phase with each other. This comb filtering effect is the underlying concept for audio processors like choruses, flangers, and phasers, explored in depth in the linked article.
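The metallic coloration has a simple arithmetic explanation when the two copies are summed at equal level: frequencies whose half-period matches the delay cancel, leaving evenly spaced notches. The sketch below assumes an equal-level sum of a signal and its delayed copy, and the function names are illustrative.

```python
import math

def comb_gain(freq_hz, delay_ms):
    """Linear gain at freq_hz when a signal is summed with an equal-level
    copy delayed by delay_ms (2.0 = fully in phase, 0.0 = cancelled)."""
    return 2.0 * abs(math.cos(math.pi * freq_hz * delay_ms / 1000.0))

def comb_notch_frequencies(delay_ms, max_hz):
    """List the cancelled frequencies up to max_hz: odd multiples of
    1 / (2 * delay), written here in milliseconds."""
    notches = []
    k = 0
    while True:
        freq = (2 * k + 1) * 500.0 / delay_ms
        if freq > max_hz:
            break
        notches.append(freq)
        k += 1
    return notches

# A 10 ms delay places notches at 50 Hz, 150 Hz, 250 Hz, and so on.
print(comb_notch_frequencies(10.0, 300.0))
```

Shorter delays spread the notches further apart and push them higher, which is why sweeping the delay time produces the moving, whooshing character of a flanger.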

Longer delays in the range of 50–80 ms break the illusion, and the second sound will be perceived as an echo. This is rarely a desired effect for pop, but can produce some psychedelic and disorienting moments in more experimental music.

Depth, the front-back space in a mix, is a trickier concept to navigate, but once again, we have psychoacoustic principles as a guide. If loud and bright sounds appear closer, we can push sounds further away by rolling off low and high frequencies—essentially flipping the “smiley face” curve upside down.

This trick works because it mimics how a sound wave travels in the natural world: the further it goes, the more its high frequencies are absorbed by the air, until it disappears completely. Removing some low-end enhances this illusion in a DAW. Think of this next time you shout into a canyon or large open space.

Conclusion

Based on our hearing capabilities and acoustic environment (among other factors), how we perceive sound changes. With a basic understanding of psychoacoustics, you can more easily shape music to provoke a specific response in the listener. The examples in this article provide a few starting points, and with a bit of research you’re sure to find more.

Copyright © 2001–2019 iZotope, Inc. All rights reserved.