"The Uncertainty at the Heart of Every Granular Synth"

The blurriness you hear when grain size drops below 20 milliseconds is not a design flaw. It is physics. In 1947, physicist Dennis Gabor derived a mathematical boundary between time precision and frequency precision that applies to all signals, not just synthesized ones. Every grain-size knob you have ever touched is a direct encounter with that boundary.

The Logon: Gabor's Indivisible Unit of Sound

Gabor was not building a synthesizer. He was trying to solve a telephone engineering problem: how much information does a speech signal actually carry, and how efficiently can it be transmitted? His 1947 paper "Acoustical Quanta and the Theory of Hearing" proposed that sound could be decomposed into elementary units he called logons, each a windowed burst of oscillation lasting no longer than 21 milliseconds. Each logon carried both a time coordinate (when it occurred) and a frequency coordinate (its pitch center).

The catch was fundamental: a logon could not be arbitrarily precise in both dimensions simultaneously. Make it shorter in time, and its frequency spread widens. Narrow the frequency spread by making it longer, and you lose temporal precision. This is the Gabor Uncertainty Principle, and it is the acoustic analogue of Heisenberg's uncertainty relation in quantum mechanics. Mathematically, the product of a signal's time duration and its bandwidth has a hard lower bound. A Gaussian window reaches that bound exactly, which is why granular synthesis literature so often treats the Gaussian envelope as the theoretically ideal grain shape.
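For readers who want the bound in symbols, one common way to write it is sketched below. The notation is an assumption of this sketch rather than Gabor's own: σ_t is the standard deviation of the signal's energy over time (in seconds) and σ_f is the standard deviation of its energy over frequency (in hertz).

```latex
\sigma_t \, \sigma_f \;\ge\; \frac{1}{4\pi},
\qquad
\text{with equality when } s(t) = \exp\!\left(-\frac{t^{2}}{4\sigma_t^{2}}\right)
```

Under these definitions, a grain with σ_t = 5 ms cannot have σ_f below 1 / (4π × 0.005 s), or about 15.9 Hz. Other definitions of duration and bandwidth change the constant on the right but not the shape of the tradeoff.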

Gabor was describing a constraint, not a technique. The technique came later.

From Physics to Music: Thirty Years of Catching Up

Iannis Xenakis read Gabor's work and grasped its compositional implications. In his 1971 book "Formalized Music," Xenakis proposed that musical texture could be built from clouds of elementary sound particles distributed according to stochastic probability functions. He called this "granular synthesis." He had already explored the concept by hand in his 1959 tape piece "Analogique B," for which he manually spliced magnetic tape into grain-length segments. The idea was clear. What was missing was a computer fast enough to execute it.

Curtis Roads provided the first digital implementation. In 1978 his paper "Automated Granular Synthesis of Sound" appeared in the Computer Music Journal, describing a system that generated clouds of grains on a mainframe. The composition was abstract and the computational overhead enormous, but the technique worked. Roads spent the next two decades refining the theory, eventually publishing "Microsound" in 2001 (MIT Press), still the most comprehensive technical account of time-domain sound particles.

The breakthrough that made granular synthesis musically practical came in 1986, when Barry Truax developed the GSX system at Simon Fraser University. For the first time, grains could be generated and mixed in real time. His composition "Riverrun" pushed the system to a peak density of 2,375 grains per second, layering them into textures that moved from isolated droplets to solid sheets of sound. The piece received the Magisterium prize at the International Electroacoustic Music Competition in Bourges in 1991.

What the Uncertainty Principle Actually Means for Your Synth

When you turn a grain-size knob, you are trading time resolution for frequency resolution or vice versa. The effects are concrete and predictable.

At grain durations below roughly 10 milliseconds, a full cycle of a low musical fundamental no longer fits inside a single grain. A 100 Hz tone has a period of 10 ms, so a shorter grain cannot contain even one complete cycle of it, and the auditory system has little to lock onto as a pitch center. What you hear instead is a click, a transient, or a textural smear. This is not aliasing or a plugin artifact. It follows from the uncertainty principle: a grain that short has a bandwidth too wide to define a clear pitch.

Between approximately 10 and 25 milliseconds, rudimentary pitch perception begins to emerge, but inconsistently. The threshold depends on frequency content: higher frequencies have shorter periods and can establish pitch in shorter grains, while lower fundamentals need longer windows. A 40 Hz sub bass has a 25 ms period and needs grains longer than that to sound pitched at all.
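The period arithmetic in the last two paragraphs is easy to check. The sketch below is only an illustration: the function names are made up for this post, and "one full cycle" is used as a rough yardstick, not as a claim about where pitch perception actually begins.

```python
def cycles_in_grain(frequency_hz: float, grain_ms: float) -> float:
    """Number of waveform cycles (possibly fractional) that fit in one grain."""
    return frequency_hz * grain_ms / 1000.0


def min_grain_ms_for_cycles(frequency_hz: float, cycles: float = 1.0) -> float:
    """Shortest grain, in milliseconds, that holds the requested number of cycles."""
    return 1000.0 * cycles / frequency_hz


for freq in (40.0, 100.0, 440.0):
    print(
        f"{freq:6.1f} Hz: {cycles_in_grain(freq, 10.0):.2f} cycles in 10 ms, "
        f"one cycle needs {min_grain_ms_for_cycles(freq):.2f} ms"
    )
```

A 10 ms grain holds exactly one cycle at 100 Hz and less than half a cycle at 40 Hz, while a 440 Hz tone fits more than four cycles into the same grain.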

Above roughly 50 milliseconds, grains behave like recognizable sound fragments. Pitch is clear, and the source material becomes audible. This is the range where granular synthesis sounds most like time-stretching or pitch-shifting, because it effectively is.

Window Functions: Implementing the Tradeoff

Each grain needs an amplitude envelope, or window, applied to it. The window determines how a grain's edges taper, which directly affects its spectral character.

The Gaussian window achieves the minimum uncertainty product from Gabor's original formulation. It is the theoretically optimal shape for a joint time-frequency representation. Its limitation is practical: a Gaussian window never reaches exactly zero amplitude at its edges, which can cause low-level discontinuities and clicking artifacts, particularly at aggressive overlap ratios.
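To see the non-zero edge in numbers, here is a minimal sketch that builds a Gaussian window and prints its first sample. The `sigma_ratio` parameter and the 441-sample length (about 10 ms at 44.1 kHz) are assumptions chosen for illustration, not values taken from any particular synth.

```python
import math


def gaussian_window(length: int, sigma_ratio: float = 0.25) -> list[float]:
    """Gaussian window whose sigma is sigma_ratio times half the window length."""
    centre = (length - 1) / 2.0
    sigma = sigma_ratio * centre
    return [math.exp(-0.5 * ((n - centre) / sigma) ** 2) for n in range(length)]


window = gaussian_window(441)  # about 10 ms at 44.1 kHz (assumed rate)
edge = window[0]
print(f"edge amplitude: {edge:.6f} ({20 * math.log10(edge):.1f} dB)")
```

With this ratio the edge sits four standard deviations from the centre, so its amplitude is exp(-8): small, but not zero. A wider Gaussian (a larger `sigma_ratio`) leaves a bigger step at the grain boundary.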

The Hann window (a raised cosine, often called Hanning) is the more common choice in plugin implementations. It reaches zero at both endpoints, eliminating discontinuities, and satisfies the Constant Overlap-Add property at 50% overlap. This means that when grains overlap by half their duration and are summed, the result is a constant amplitude, preventing the rippling volume fluctuations that occur when overlapping windows do not sum to a constant. A Hann window has slightly more frequency spread than a Gaussian of comparable duration, but in practice the difference is rarely audible.
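The constant-sum behaviour at 50% overlap can be verified directly. The sketch below assumes the periodic form of the raised-cosine window (zero at the first sample, with the matching zero falling on the first sample of the next period), which is the variant that sums to exactly 1 at a hop of half the window length. The function names are illustrative.

```python
import math


def hann_periodic(length: int) -> list[float]:
    """Periodic raised-cosine window: zero at n = 0, period equal to `length`."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def overlap_add_envelope(window: list[float], hop: int, grains: int) -> list[float]:
    """Sum `grains` copies of `window`, each shifted by `hop` samples."""
    total = [0.0] * (hop * (grains - 1) + len(window))
    for g in range(grains):
        for n, w in enumerate(window):
            total[g * hop + n] += w
    return total


length = 512
envelope = overlap_add_envelope(hann_periodic(length), hop=length // 2, grains=8)
steady = envelope[length // 2 : -(length // 2)]  # skip the fade-in and fade-out
print(min(steady), max(steady))
```

Away from the first and last half-window, every sample of the summed envelope equals 1 to within floating-point rounding. Swapping in a window that lacks the overlap-add property at this hop makes the minimum and maximum drift apart.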

The tradeoff between window shapes is a smaller version of the same tradeoff that governs grain duration: smoothness in one domain costs you something in the other.

Synchronous vs. Asynchronous Granular Synthesis

Grain density and timing distribution produce distinctly different sonic results, and the underlying reason connects back to the uncertainty principle.

Synchronous granular synthesis spaces grains at regular intervals. When grains are short, dense, and evenly spaced, their periodic arrival creates formant-like resonances in the output spectrum. The regularity imposes frequency structure on the cloud. This mode produces the pitched, bell-like tones associated with formant synthesis.

Asynchronous granular synthesis distributes grains according to a probability function, typically with randomized timing, position within the source material, pitch offset, and duration. Truax observed that irregular grain placement creates a "thickening effect" by smearing formant structures rather than reinforcing them. The result is the diffuse, cloudy texture that most people picture when they hear "granular synthesis." The randomness is not simply an aesthetic choice: it breaks the periodic regularity that would otherwise impose spectral coloring on the output.
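The difference between the two modes comes down to how grain onset times are scheduled. The sketch below is one possible illustration, not a description of any particular system: the synchronous version uses a fixed interval, and the asynchronous version assumes a Poisson process (exponentially distributed gaps). That is only one of many probability functions that could be used, and all names here are made up for the example.

```python
import random


def synchronous_onsets(rate_hz: float, duration_s: float) -> list[float]:
    """Grain onset times spaced exactly 1 / rate_hz seconds apart."""
    period = 1.0 / rate_hz
    count = int(duration_s * rate_hz)
    return [i * period for i in range(count)]


def asynchronous_onsets(rate_hz: float, duration_s: float, seed: int = 0) -> list[float]:
    """Onsets with exponentially distributed gaps (an assumed Poisson process)."""
    rng = random.Random(seed)
    onsets = []
    t = rng.expovariate(rate_hz)
    while t < duration_s:
        onsets.append(t)
        t += rng.expovariate(rate_hz)
    return onsets


def gap_spread(onsets: list[float]) -> float:
    """Largest minus smallest gap between consecutive onsets."""
    gaps = [b - a for a, b in zip(onsets, onsets[1:])]
    return max(gaps) - min(gaps)


sync = synchronous_onsets(200.0, 1.0)
scattered = asynchronous_onsets(200.0, 1.0)
print(f"sync gap spread:  {gap_spread(sync):.6f} s")
print(f"async gap spread: {gap_spread(scattered):.6f} s")
```

Both schedules average about 200 grains per second, but only the synchronous one has a single repeating interval that can impose periodic structure on the output.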

Practical Implications

Understanding the uncertainty principle makes several common granular synthesis behaviors less mysterious.

The harsh, metallic edge you hear when grain size drops below 15 milliseconds while pitch transposition is high is a frequency-spread artifact. Short grains have wide spectral envelopes, and transposing them upward pushes that wide envelope into a range where it creates dissonant beating with other grains. Increasing grain size smooths it out at the cost of temporal precision.

The "frozen" quality of extreme time-stretch at small grain sizes happens because the grains are too short to carry recognizable sonic texture. You hear the grain boundaries and the window shape more than the source material.

The scatter or jitter parameter found on most granular synthesizers, which randomizes grain position within the source buffer, effectively converts synchronous synthesis toward asynchronous. A small scatter value adds the organic irregularity Truax described. Too much collapses into undifferentiated noise.
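As a rough picture of what a scatter control does, the sketch below offsets a grain's read position by a random amount scaled by the scatter value. The scaling (full scatter allows an offset of up to half the buffer in either direction) and the clamping at the buffer edges are assumptions made for this example. Real instruments define the parameter in their own ways.

```python
import random


def scattered_position(base_s: float, scatter: float, buffer_s: float,
                       rng: random.Random) -> float:
    """Offset a grain's read position by a random amount scaled by `scatter`.

    scatter = 0.0 keeps the position fixed; scatter = 1.0 allows an offset of
    up to half the buffer length in either direction (assumed scaling).
    """
    offset = rng.uniform(-0.5, 0.5) * scatter * buffer_s
    return min(max(base_s + offset, 0.0), buffer_s)  # clamp to the buffer


rng = random.Random(1)
for scatter in (0.0, 0.1, 1.0):
    positions = [scattered_position(0.5, scatter, 1.0, rng) for _ in range(5)]
    print(scatter, [round(p, 3) for p in positions])
```

At zero scatter every grain reads from the same spot. Small values keep grains clustered around it, and large values spread them across the whole buffer.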

Modern implementations like Ableton's Granulator III (Robert Henke), Arturia's Pigments granular engine, and the Mutable Instruments Clouds module all expose these same tradeoffs under different labels. Clouds supports roughly 40 to 60 concurrent grains, positions each within a recording buffer that runs from 1 second at the highest quality setting to 8 seconds at the lowest, and applies a diffusion network to the output. The knobs have different names, but the physics underneath them is the same physics Gabor described in 1947 while trying to save telephone bandwidth.

The uncertainty is not a bug in the design. It is the feature.