I remember how eager I was to get into music production. The arrangement possibilities were endless, and I could learn how to mix music to sound like what I heard. Unfortunately, in the chaos of beginning to produce, I didn’t learn the basics of how a computer actually handles audio, so the whole concept of making music on a laptop felt a bit abstract.
Even bouncing my first track was confusing. What does each of the options do? How was I supposed to know what would sound best?
In this article, we’ll cover some basic aspects of digital audio, and how they affect the production process. Today, we’ll focus on audio sample rate and audio bit depth, as well as a few topics related to them. It’s a bit of theory and a bit of math, but hopefully it will peel away some of the mystery behind how digital audio works.
What is digital audio?
Digital audio is a representation of sound recorded or converted into a digital signal. During the analog-to-digital conversion process, amplitudes of an analog sound wave are captured at a specified sample rate and bit depth and converted into data that computer software can read.
The main difference between analog sound and digital audio is that digital audio is a series of discrete amplitude values used to reconstruct the original analog sound wave, whereas analog sound is a continuous signal that can take any amplitude value at any point in time. Digital audio is like playing connect-the-dots, whereas real sound is the full original image.
Quantization: analog-to-digital conversion
Analog-to-digital conversion has two parts: sampling, which measures the signal at regular points in time, and quantization, which rounds each measurement to the nearest available amplitude value. The process is very similar to the way cameras capture video. A video camera reconstructs a continuous moment in time by capturing many consecutive images per second, called frames. The higher the frame rate, the smoother the movie. In digital audio, an analog-to-digital converter captures thousands of audio samples per second at a specified sample rate and bit depth to reconstruct the original signal. The higher the sample rate and bit depth, the higher the audio resolution.
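To make the snapshot idea concrete, here is a minimal Python sketch of sampling and quantizing a sine wave. The function name `sample_and_quantize` and the choice of a full-scale 1 kHz test tone are assumptions made for illustration. A real converter does this in hardware and is more involved.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, duration_s):
    """Sample a full-scale sine wave and round each sample to a signed integer code."""
    max_code = 2 ** (bit_depth - 1) - 1          # largest positive code, e.g. 32767 for 16-bit
    num_samples = int(round(sample_rate_hz * duration_s))
    codes = []
    for n in range(num_samples):
        t = n / sample_rate_hz                   # time of the n-th snapshot, in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # continuous value in [-1, 1]
        codes.append(round(amplitude * max_code))        # nearest available integer code
    return codes

# One millisecond of a 1 kHz tone at 48 kHz / 16-bit gives 48 snapshots.
samples = sample_and_quantize(1000, 48000, 16, 0.001)
print(len(samples), samples[:4])
```

Raising `sample_rate_hz` adds more dots along the time axis, while raising `bit_depth` makes each dot land closer to the true amplitude.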
What is an audio sample rate?
Sample rate is the number of samples per second that are taken of a waveform to create a discrete digital signal. The higher the sample rate, the more snapshots you capture of the audio signal. The audio sample rate is measured in kilohertz (kHz) and it determines the range of frequencies captured in digital audio. In most DAWs, you’ll find an adjustable sample rate in your audio preferences. This controls the sample rate for audio in your project.
The options you see in the average DAW—44.1 kHz, 48 kHz—may seem a bit random, but they aren’t!
Sample rates aren't arbitrary numbers. Humans hear frequencies between roughly 20 Hz and 20 kHz, so the computer should be able to recreate waves with frequencies up to 20 kHz. To do that, it has to use a sample rate of at least double the highest frequency. So a sample rate of 40 kHz should technically do the trick, right?
This is true, but you need a pretty powerful (and at one time, expensive) low-pass filter to prevent audible aliasing. A sample rate of 44.1 kHz technically allows audio at frequencies up to 22.05 kHz to be recorded. By placing this upper limit, the Nyquist frequency, outside of our hearing range, we can use more moderate filters to eliminate aliasing without much audible effect. Most people lose their ability to hear upper frequencies over the course of their lives and can only hear frequencies up to 15 kHz to 18 kHz. However, this “20-to-20” rule is still accepted as the standard range for everything we could hear.
According to the Nyquist-Shannon sampling theorem, we can capture and reconstruct a sine wave’s frequency with an audio sample rate at least twice its frequency, a rate called the Nyquist rate. Conversely, a system can capture and recreate frequencies up to half the audio sample rate, a limit called the Nyquist frequency.
Signals above the Nyquist frequency are not recorded properly by analog-to-digital converters (ADCs), becoming mirrored back across the Nyquist frequency and introducing artificial frequencies in a process called aliasing.
To prevent aliasing, analog-to-digital converters are often preceded by low-pass filters that eliminate frequencies above the Nyquist frequency before audio reaches the converter. This will prevent unwanted super-high frequencies in the original audio from causing aliasing. Early filters could taint the audio, but this problem is being minimized as better technology is introduced.
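The mirroring described above can be worked out with a little arithmetic. The sketch below is only an illustration: the helper names `nyquist_frequency` and `aliased_frequency` are made up for this example, and it assumes an ideal sampler with no anti-aliasing filter in front of it.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency a system can capture: half the sample rate."""
    return sample_rate_hz / 2

def aliased_frequency(signal_hz, sample_rate_hz):
    """Frequency at which an unfiltered tone shows up after sampling."""
    folded = signal_hz % sample_rate_hz      # sampling cannot tell apart tones one sample rate apart
    if folded > nyquist_frequency(sample_rate_hz):
        folded = sample_rate_hz - folded     # mirror back across the Nyquist frequency
    return folded

print(nyquist_frequency(44100))              # 22050.0
print(aliased_frequency(1000, 44100))        # 1000: below Nyquist, recorded as-is
print(aliased_frequency(30000, 44100))       # 14100: a 30 kHz tone folds down into the audible range
```

This is why the low-pass filter matters: a 30 kHz tone that nobody could hear would otherwise turn into a clearly audible 14.1 kHz artifact.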
What sample rate should I record at?
When recording, mixing, and mastering, it's generally advantageous to work at a high sample rate and bit depth: 48 kHz, 96 kHz, or even 192 kHz. This allows for greater resolution in all mixing and effects. Once it comes to bouncing down your audio, however, you'll have to choose a bit depth and sample rate that's compatible with your medium of distribution.
The standard sample rate for CDs, streaming, and consumer audio is 44.1 kHz. 48 kHz is often used in audio for video, and 96 kHz or 192 kHz is used for archival audio.
44.1 kHz vs. 48 kHz
If you’re recording music, a standard sample rate is 44.1 kHz, or 44,100 samples per second. This is the standard for most consumer audio, used for formats like CDs. 48 kHz, or 48,000 samples per second, is another common audio sample rate and is the usual choice in audio for video. The higher sample rate technically leads to more measurements per second and a slightly higher Nyquist frequency (24 kHz instead of 22.05 kHz). Dynamic range, by contrast, is determined by bit depth rather than sample rate.
96 kHz vs. 192 kHz
Given that 192 kHz takes twice as many samples per second as 96 kHz, it requires double the amount of hard-drive space to store. While using high sample rates like 96 kHz and 192 kHz will give you the highest resolution audio, it takes a lot of processing power and the difference is rarely noticeable to the human ear. For most musical applications, recording at 48 kHz through a good audio interface will yield excellent results.
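To get a feel for the storage cost, here is a rough Python estimate. It assumes uncompressed PCM audio and ignores file headers and metadata. The function name `uncompressed_size_bytes` is just an illustrative choice.

```python
def uncompressed_size_bytes(sample_rate_hz, bit_depth, channels, duration_s):
    """Approximate size of raw PCM audio: samples x bytes per sample x channels."""
    bytes_per_sample = bit_depth // 8
    return sample_rate_hz * bytes_per_sample * channels * duration_s

# One minute of stereo audio at a few common settings.
for rate, bits in ((44100, 16), (96000, 24), (192000, 24)):
    size_mb = uncompressed_size_bytes(rate, bits, 2, 60) / 1_000_000
    print(f"{rate} Hz / {bits}-bit: {size_mb:.1f} MB per minute")
```

Doubling the sample rate doubles the size, so a long multitrack session at 192 kHz adds up quickly.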
Can you hear the difference between audio sample rates?
Some experienced engineers may be able to hear the differences between sample rates. However, as filtering and analog/digital conversion technologies improve, it becomes more difficult to hear these differences.
Is a higher audio sample rate better?
In theory, it’s not a bad idea to work at a higher audio sample rate, like 176.4 kHz or 192 kHz. The files will be larger, but it can be nice to maximize the sound quality until the final bounce. In the end, however, the audio will likely be converted to either 44.1 kHz or 48 kHz. It is mathematically much easier to convert 88.2 kHz to 44.1 kHz and 96 kHz to 48 kHz, because each is an exact factor of two, so it’s best to stay in one sample-rate family for the whole project. That said, a common practice is simply to work in 44.1 kHz or 48 kHz.
If the system were set to a sample rate of 48 kHz and we played a 44.1 kHz audio file without converting its sample rate, the system would read the samples faster than it should. As a result, the audio would sound sped up and slightly higher-pitched. The inverse happens if the system sample rate is on the 44.1 kHz scale and audio files are on the 48 kHz scale; audio sounds slowed down and slightly lower-pitched.
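How much faster and higher? A quick calculation gives a sense of scale. This sketch assumes the file is played back with no sample-rate conversion at all, and `mismatch_effect` is a hypothetical helper name.

```python
import math

def mismatch_effect(file_rate_hz, system_rate_hz):
    """Return (speed factor, pitch shift in semitones) for unconverted playback."""
    speed = system_rate_hz / file_rate_hz    # above 1 means faster, below 1 means slower
    semitones = 12 * math.log2(speed)        # 12 semitones per doubling of frequency
    return speed, semitones

speed, semitones = mismatch_effect(44100, 48000)
print(f"{speed:.3f}x speed, {semitones:+.2f} semitones")

speed, semitones = mismatch_effect(48000, 44100)
print(f"{speed:.3f}x speed, {semitones:+.2f} semitones")
```

The shift works out to roughly a semitone and a half in either direction, which is enough to put a track noticeably out of tune with everything else in the session.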
Super-high audio sample rates also have an interesting creative use. If you’ve ever lowered the pitch of a standard 44.1 kHz audio file, you’ve probably noticed the highs become somewhat empty. Frequencies above 22.05 kHz were filtered out before conversion, so there is no frequency content to pitch down, resulting in a gaping hole in the highs.
However, if this audio were recorded at 192 kHz, for example, frequencies of up to 96 kHz in the original audio would be recorded. This is obviously way outside of what humans can hear, but pitching the audio down causes these inaudible frequencies to become audible. As a result, you can greatly drop a recording’s pitch while preserving high-frequency content.
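The effect is easy to estimate. The sketch below assumes the simplest kind of pitch drop, where the audio is slowed down so that every frequency is divided by the same ratio. The name `top_frequency_after_pitch_down` is invented for this example.

```python
def top_frequency_after_pitch_down(sample_rate_hz, octaves_down):
    """Highest frequency left in the audio after lowering the pitch by whole octaves."""
    nyquist_hz = sample_rate_hz / 2          # highest frequency that was recorded
    return nyquist_hz / (2 ** octaves_down)  # each octave down halves every frequency

for rate in (44100, 192000):
    top = top_frequency_after_pitch_down(rate, 2)
    print(f"{rate} Hz recording, two octaves down: content up to {top} Hz")
```

Two octaves down, the 44.1 kHz recording has nothing above about 5.5 kHz, while the 192 kHz recording still reaches 24 kHz and covers the whole audible range.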
What is audio bit depth?
The audio bit depth determines the number of possible amplitude values we can record for each audio sample. The higher the bit depth, the more amplitude values per sample are captured to recreate the original audio signal.
The most common audio bit depths are 16-bit, 24-bit, and 32-bit. Each is a binary term, representing a number of possible values. Systems of higher audio bit depths are able to express more possible values:
- 16-bit: 65,536 values
- 24-bit: 16,777,216 values
- 32-bit: 4,294,967,296 values
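These numbers come straight from binary counting: each extra bit doubles the number of values that can be expressed. A short Python check, assuming plain integer encoding at each bit depth:

```python
def possible_values(bit_depth):
    """Number of distinct values an integer of the given bit depth can express."""
    return 2 ** bit_depth

for bits in (16, 24, 32):
    print(f"{bits}-bit: {possible_values(bits):,} values")
```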
Higher bit depths mean higher resolution audio; if the bit depth is too low, some information of the original audio signal will be lost. With a higher audio bit depth—and therefore a higher resolution—more amplitude values are available for us to record. As a result, the continuous analog wave’s exact amplitude is closer to an available value when sampled. Therefore, a digital approximation of the amplitude becomes closer to the original fluid analog wave.
Increasing the audio bit depth, along with increasing the audio sample rate, creates more total points to reconstruct the analog wave.
However, the fluid analog wave does not always perfectly line up with a possible value, regardless of the resolution. As a result, the last bit in the data denoting the amplitude is rounded to either 0 or 1, in a process called quantization. This means there is an essentially randomized part of the signal.
In digital audio, we hear this randomization as a low white noise, which we call the noise floor. Like the mechanical noise introduced in an analog context or background noise in a live acoustic setting, digital quantization error introduces noise into our audio.
Harmonic relationships between the sample rate and audio, along with the bit depth, can cause certain patterns in quantization. This is known as correlated noise, which we hear as resonances in the noise floor at certain frequencies. Here, our noise floor is actually higher, taking up potential amplitude values for a recorded signal.
However, we can perform artificial randomization to make sure these patterns don’t occur. In a process called dithering, we add a tiny amount of random noise so that the way this last bit gets rounded is randomized. Because no patterns form, the result is more randomized “uncorrelated noise” that leaves more potential amplitude values.
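Here is a small Python experiment that shows the idea. It is a sketch under stated assumptions, not how any particular DAW implements dithering: it uses triangular (TPDF) noise, which is one common choice, and it measures everything in units of one quantization step. The function names are hypothetical.

```python
import math
import random

def quantize(value_in_steps):
    """Round to the nearest whole quantization step (the 'last bit')."""
    return math.floor(value_in_steps + 0.5)

def quantize_with_dither(value_in_steps, rng):
    """Add triangular (TPDF) noise spanning +/-1 step, then round."""
    dither = rng.uniform(-0.5, 0.5) + rng.uniform(-0.5, 0.5)
    return quantize(value_in_steps + dither)

rng = random.Random(0)   # fixed seed so the run is repeatable
level = 0.3              # a very quiet signal sitting 0.3 of a step above zero

plain = [quantize(level) for _ in range(20000)]
dithered = [quantize_with_dither(level, rng) for _ in range(20000)]

print(sum(plain) / len(plain))        # 0.0: the signal is rounded away every time
print(sum(dithered) / len(dithered))  # close to 0.3 on average
```

Without dither the rounding error is identical on every sample, so it is completely tied to the signal. With dither the individual samples are noisier, but the error is random and the average still tracks the original level.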
The amplitude of the noise floor becomes the bottom of our possible dynamic range. On the other side of the spectrum, a digital system distorts when a signal exceeds the maximum value the binary system can represent. This level is referred to as 0 dBFS.
In the end, our audio bit depth determines the number of possible amplitude values between the noise floor and 0 dBFS.
Can you hear the difference between audio bit depths?
You may be thinking, “Can human ears really tell the difference between 65,536 and 4,294,967,296 amplitude levels?”
This is a valid question. The noise floor, even in a 16-bit system, is incredibly low. Unless you need more than 96 dB of effective dynamic range, 16-bit is viable for the final bounce of a project.
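The 96 dB figure follows from the number of available values. A common rule of thumb is about 6 dB of dynamic range per bit, which the sketch below computes as 20 times the base-10 logarithm of the number of values. It assumes an ideal integer system and ignores dither and real-world converter noise.

```python
import math

def dynamic_range_db(bit_depth):
    """Theoretical ratio between the largest and smallest step, in decibels."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (16, 24, 32):
    print(f"{bits}-bit: about {dynamic_range_db(bits):.1f} dB")
```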
However, while working on a project, it’s not a bad idea to work with a higher audio bit depth. Because the noise floor drops, you essentially have more room before distortion occurs—also known as headroom. Having this extra buffer space before distortion is a good failsafe while working and provides more flexibility.
What should my sample rate and bit depth be?
For music production, try a sample rate of 48 kHz at 24 bits. This strikes a nice balance between quality, file size, and processing power. However, the right sample rate and bit depth will ultimately depend on what medium of distribution you're mastering your audio for.
Summary: sample rate vs bit depth
In summary, sample rate determines the number of snapshots taken per second to recreate the original sound wave, while bit depth determines how many possible amplitude values each of those snapshots can take. Together, bit depth and sample rate determine audio resolution. You should try producing at the highest values practical and later bounce your high-fidelity master to a bit depth and sample rate suited for the intended medium of distribution.