The compact disc format stores 16 bits per sample. In theory, that caps dynamic range at about 96 dB, roughly 6 dB per bit. Yet mastering engineers have been extracting close to 120 dB of perceived dynamic range from those same 16 bits since the 1990s. The mechanism is not magic and it is not marketing. It is an error feedback loop that trades noise where you can hear it for noise where you cannot.
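The 96 dB figure comes from the ratio between full scale and one quantization step. A minimal sketch of that arithmetic follows; the function names are illustrative, not taken from any library.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, expressed in dB."""
    return 20.0 * math.log10(2 ** bits)

def equivalent_bits(db: float) -> float:
    """Convert a dynamic-range figure in dB to an equivalent word length."""
    return db / (20.0 * math.log10(2))

print(round(dynamic_range_db(16), 2))    # 96.33
print(round(dynamic_range_db(24), 2))    # 144.49
print(round(equivalent_bits(120.0), 1))  # 19.9
```

By this measure each bit is worth about 6.02 dB, so 120 dB of perceived range corresponds to the theoretical range of a word length of nearly 20 bits.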
The Problem with Rounding
When a 24-bit mix gets reduced to 16-bit for CD distribution, the bottom 8 bits have to go, and every sample is rounded to the nearest 16-bit value. The rounding error is bounded between -0.5 and +0.5 LSB (least significant bit). For a signal that spans many quantization steps it is close to uniformly distributed, which gives an RMS value of LSB/√12, approximately 0.289 LSB. That is the textbook figure. The problem is not the amplitude; it is the correlation.
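The LSB/√12 figure can be checked numerically. This sketch assumes the textbook model, in which the input is spread evenly over many quantization steps; values are expressed in LSB units and the function name is illustrative.

```python
import math
import random

def rounding_error_rms(num_samples: int = 100_000, seed: int = 1) -> float:
    """Estimate the RMS round-to-nearest error, in LSB, for a busy input."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.uniform(-1000.0, 1000.0)   # input value in LSB units
        err = x - math.floor(x + 0.5)      # round-to-nearest error
        total += err * err
    return math.sqrt(total / num_samples)

rms = rounding_error_rms()
print(rms, 1.0 / math.sqrt(12.0))  # both close to 0.289
```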
Without dither, quantization error is a deterministic function of the input signal. Feed in a 440 Hz sine wave and the rounding error repeats at 440 Hz and its harmonics. The ear registers this as harmonic distortion: a set of periodic artifacts that are, in psychoacoustic terms, far more objectionable than broadband noise of equal or even higher power. A noise floor you cannot identify as musical is tolerable. Distortion products that track the melody are not.
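To see the correlation directly, the sketch below rounds a very quiet tone without dither and checks that the error pattern repeats every cycle. The tone is set to 441 Hz rather than 440 Hz so that one cycle is exactly 100 samples at 44.1 kHz; that choice and the 2.3 LSB amplitude are illustrative assumptions.

```python
import math

PERIOD = 100          # samples per cycle: 441 Hz at 44.1 kHz (illustrative values)
AMPLITUDE_LSB = 2.3   # a very quiet tone, only a few quantization steps tall

def rounding_error(n: int) -> float:
    """Undithered rounding error, in LSB, for sample n of the tone."""
    x = AMPLITUDE_LSB * math.sin(2.0 * math.pi * n / PERIOD)
    return x - math.floor(x + 0.5)

errors = [rounding_error(n) for n in range(4 * PERIOD)]

# Compare every sample with the one exactly one cycle later.
repeats = all(
    math.isclose(errors[n], errors[n + PERIOD], abs_tol=1e-9)
    for n in range(3 * PERIOD)
)
print(repeats)  # True
```

An error signal that repeats at the period of the tone can only contain energy at the tone's frequency and its harmonics, which is why it is heard as distortion rather than as noise.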
TPDF: Breaking the Correlation
The solution was formalized by Stanley Lipshitz and John Vanderkooy at the University of Waterloo in their 1987 Journal of the Audio Engineering Society paper "Dither in Digital Audio." Adding noise before quantization breaks the statistical link between input signal and rounding error. The specific distribution of that noise determines how well.
Rectangular Probability Density Function (RPDF) dither, a single uniform random value spanning one LSB added to each sample, eliminates harmonic distortion but introduces a secondary artifact: noise modulation, where the noise floor rises and falls with signal level. On quiet passages the result sounds like breathing.
Triangular Probability Density Function (TPDF) dither, formed by summing two independent RPDF sources for a span of two LSB peak to peak, eliminates both problems simultaneously. With a triangular distribution, both the mean and the variance of the total error become independent of the input, so periodic distortion products give way to spectrally flat noise at a constant level. The cost is a higher noise floor: about 4.8 dB above undithered quantization noise and somewhat above RPDF. Because that floor is steady and uncorrelated with the music, it is noticeable only in silence, not during program material.
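Noise modulation is easiest to see with a constant input. The sketch below measures the variance of the total error (output minus input, in LSB squared) at three DC levels under each dither type. The function name, sample count, and seed are illustrative assumptions.

```python
import math
import random

def error_variance(dc_level_lsb: float, dither: str,
                   n: int = 50_000, seed: int = 7) -> float:
    """Variance (LSB^2) of output-minus-input for a constant input level."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        d = rng.random() - 0.5           # RPDF: uniform, 1 LSB peak to peak
        if dither == "tpdf":
            d += rng.random() - 0.5      # TPDF: sum of two independent RPDF draws
        quantized = math.floor(dc_level_lsb + d + 0.5)
        err = quantized - dc_level_lsb
        total += err
        total_sq += err * err
    mean = total / n
    return total_sq / n - mean * mean

for level in (0.0, 0.25, 0.5):
    print(level,
          round(error_variance(level, "rpdf"), 3),
          round(error_variance(level, "tpdf"), 3))
```

With RPDF the variance swings from zero (input sitting exactly on a quantization step) to about 0.25 LSB² (input halfway between steps), which is the modulation described above. With TPDF it stays near 0.25 LSB² at every level.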
The Noise Shaping Loop
TPDF dither solves the distortion problem while leaving the question of where noise energy sits spectrally. Noise shaping addresses this directly.
The core mechanism is an error feedback loop:
- Quantize the current sample to the target bit depth
- Measure the quantization error (input minus quantized output)
- Feed that error back into the next input sample before quantization
- Repeat for every sample
# First-order noise shaping (samples normalized to -1.0 .. +1.0)
target_bits = 16                      # target word length
input_signal = [0.0, 0.25, -0.5]      # placeholder input; substitute real samples
scale = 2 ** (target_bits - 1)
error = 0.0
output = []
for sample in input_signal:
    # Inject previous quantization error before rounding
    pre_quantize = sample + error
    quantized = round(pre_quantize * scale) / scale
    error = pre_quantize - quantized  # error fed into next iteration
    output.append(quantized)
This first-order feedback loop is equivalent to applying a high-pass filter to the error signal. The noise transfer function (NTF) becomes a high-pass response; the signal transfer function (STF) stays flat. Total noise power increases — the feedback loop redistributes energy, it does not destroy it — but the spectral shape shifts dramatically. Low-frequency noise drops. High-frequency noise rises.
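The high-pass behaviour can be derived in a few lines. In this sketch, x is the input, v the value presented to the quantizer, y the output, e the error as defined in the loop (v minus y), and f_s the sample rate; the symbols are chosen here for illustration. The total-power figure assumes e is white, which dither makes a reasonable approximation.

```latex
\begin{aligned}
v[n] &= x[n] + e[n-1], \qquad y[n] = Q(v[n]), \qquad e[n] = v[n] - y[n] \\
y[n] &= x[n] - \bigl(e[n] - e[n-1]\bigr) \\
Y(z) &= X(z) - \bigl(1 - z^{-1}\bigr)\,E(z)
  \quad\Rightarrow\quad \mathrm{STF}(z) = 1, \quad \mathrm{NTF}(z) = 1 - z^{-1} \\
\bigl|\mathrm{NTF}(e^{j\omega})\bigr| &= 2\,\bigl|\sin(\omega/2)\bigr|,
  \qquad \omega = 2\pi f / f_s \\
\frac{1}{\pi}\int_0^{\pi} 4\sin^2(\omega/2)\,d\omega &= 2
  \quad\Rightarrow\quad \text{total noise power rises by } 10\log_{10} 2 \approx 3\ \text{dB}
\end{aligned}
```

At a 44.1 kHz sample rate the noise gain is about -16.9 dB at 1 kHz and +6 dB at the Nyquist frequency.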
Higher-order noise shapers use multiple delayed taps in the feedback path, deepening the suppression across the midrange and concentrating the noise peak close to the Nyquist frequency (22.05 kHz at the CD sample rate). Each additional order allows a deeper, more carefully placed dip in the most audible band, at the cost of a more pronounced noise peak at the top of the spectrum.
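The effect of adding a tap can be sketched with a generic error-feedback requantizer. The coefficient sets below are the plain textbook differentiators, (1 - z^-1) and (1 - z^-1)^2, not the psychoacoustically tuned filters used in commercial products. The function names, the test tone, and the naive DFT band measurement are illustrative assumptions.

```python
import cmath
import math
import random

FS = 44_100  # sample rate in Hz (assumed CD rate)

def shape(signal, coeffs, bits=16, seed=3):
    """Error-feedback requantizer with TPDF dither.

    coeffs[k] weights the error from k+1 samples ago:
    [1.0] gives NTF 1 - z^-1, [2.0, -1.0] gives (1 - z^-1)^2.
    Returns the output error (output minus input) in LSB units.
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    history = [0.0] * len(coeffs)          # most recent error first
    out_error = []
    for x in signal:
        target = x * scale + sum(c * e for c, e in zip(coeffs, history))
        dither = (rng.random() - 0.5) + (rng.random() - 0.5)
        quantized = math.floor(target + dither + 0.5)
        history = [target - quantized] + history[:-1]
        out_error.append(quantized - x * scale)
    return out_error

def band_power(err, f_lo, f_hi):
    """Sum of squared DFT magnitudes for bins between f_lo and f_hi (naive DFT)."""
    n = len(err)
    k_lo = max(1, math.ceil(f_lo * n / FS))
    k_hi = math.floor(f_hi * n / FS)
    power = 0.0
    for k in range(k_lo, k_hi + 1):
        w = -2j * math.pi * k / n
        bin_value = sum(e * cmath.exp(w * i) for i, e in enumerate(err))
        power += abs(bin_value) ** 2
    return power / n

tone = [0.25 * math.sin(2 * math.pi * 1000 * i / FS) for i in range(2048)]
first = shape(tone, [1.0])
second = shape(tone, [2.0, -1.0])

low1, low2 = band_power(first, 0, 2000), band_power(second, 0, 2000)
high1, high2 = band_power(first, 18000, 22000), band_power(second, 18000, 22000)
print("0-2 kHz   first vs second order:", low1, low2)
print("18-22 kHz first vs second order:", high1, high2)
```

The second-order loop should show markedly less error energy below 2 kHz and markedly more between 18 and 22 kHz than the first-order loop, which is the trade described above.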
Why High Frequencies Work
Fletcher and Munson established in 1933 that human hearing sensitivity varies enormously with frequency; their equal-loudness contours were later refined and standardized as ISO 226. The ear is most sensitive between 2 and 5 kHz due to ear canal resonance and middle ear mechanics. A tone at 15 kHz needs roughly 30 dB more SPL to sound as loud as a tone at 3 kHz. Above 18 kHz, most adults hear nothing at all.
Noise shaping exploits this asymmetry. By pushing quantization error into the 15-20 kHz band, it places the noise energy exactly where the ear's effective gain is lowest. The total noise power in the system has increased. The perceived noise has dropped substantially.
The result: well-designed noise shaping raises the perceived dynamic range of 16-bit audio from 96 dB to approximately 120 dB. That is 24 dB recovered from a format with a fixed mathematical ceiling, without changing the bit depth.
Three Different Approaches
Commercial implementations differ primarily in how they shape the noise spectrum — the feedback filter coefficients determine where the noise notch falls and how steep it is.
POW-r (Psychoacoustically Optimized Wordlength Reduction) was developed between 1997 and 1998 by a consortium of four companies: Lake Technology, Weiss Engineering, Millennia Media and Z-Systems. Available in products from 1999, it offers three algorithm variants tuned for different material: POW-r 1 for spoken word and wide dynamic range material, POW-r 2 for dense mixes, POW-r 3 for orchestral and audiophile content. The consortium formed specifically because a license change on the leading bit-depth reduction algorithm of the time made alternatives commercially necessary.
iZotope MBIT+ uses a psychoacoustic noise distribution model and offers a noise shaping range from flat to maximum, with higher settings providing approximately 14 dB of additional audible noise suppression at the cost of a more pronounced high-frequency noise shelf.
Apogee UV22 takes a different approach entirely. Rather than a conventional feedback noise shaper, it modulates the least significant bits onto a high-frequency bias signal near 22 kHz, concentrating all quantization energy at a single narrow band. The result is a consistent noise floor at inaudible frequencies, with no low-level noise modulation across the spectrum.
When to Apply Dither
Dither and noise shaping belong at one place in the signal chain: the final wordlength reduction before delivery. Applying dither earlier — at plugin boundaries, during internal processing — adds noise without benefit. Most modern DAWs process internally at 32-bit or 64-bit floating point, where the noise floor sits far below any audible threshold. Dithering at that stage is pointless. Dithering twice adds the noise floors together.
The workflow is: process in floating point, apply dither once when bouncing the 16-bit master. If you are delivering 24-bit files for streaming (which handles its own downstream processing), dither is optional since the 24-bit noise floor at -144 dBFS is below any audible threshold under normal conditions.
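The final step of that workflow can be sketched as a single requantization pass. This is a minimal illustration that uses flat TPDF dither with no noise shaping; the function name and the choice of 32768 as the full-scale factor are assumptions, and conventions differ between tools.

```python
import math
import random

def to_int16_with_tpdf(samples, seed=None):
    """Requantize normalized float samples (-1.0 .. +1.0) to 16-bit integers.

    Adds TPDF dither of +/-1 LSB peak, rounds, then clips to the int16 range.
    """
    rng = random.Random(seed)
    scale = 32768.0                      # assumed full-scale convention
    out = []
    for x in samples:
        dither = (rng.random() - 0.5) + (rng.random() - 0.5)
        value = math.floor(x * scale + dither + 0.5)
        out.append(max(-32768, min(32767, value)))
    return out

# Usage: dither exactly once, at the moment the float mix becomes a 16-bit master.
mix = [0.5 * math.sin(2 * math.pi * 1000 * n / 44_100) for n in range(441)]
master = to_int16_with_tpdf(mix, seed=0)
print(len(master), min(master), max(master))
```

Digital silence passed through this function comes out as a mix of -1, 0, and +1, which is the constant dither floor rather than a defect.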
The 16-bit format has carried commercial audio for four decades. It continues to do so not because 96 dB is sufficient, but because careful noise shaping has consistently delivered perceived dynamic range beyond what the math implies: the 24 dB gain described above is worth roughly four bits. The engineering behind that headroom is one of the less visible examples of digital audio theory producing a tangible result in the physical world.