AUDITORY SCENE ANALYSIS

Bregman's ASA · Stream segregation · Gestalt grouping in audition

Gestalt grouping principles in audition (Bregman 1990):
Auditory stream segregation occurs when tones interleaved in time are perceived as separate melodic streams rather than a unified pattern.

Streaming threshold factors:
· Frequency separation (primary driver)
· Tempo (faster → easier segregation)
· Timbre differences
· Spatial location differences

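At separations of ~4 semitones or more and fast tempos, two streams are heard. Below ~2 semitones, a single integrated stream is heard. In between, either percept can occur, and the boundary is perceptually hysteretic: the switch point depends on the direction from which it is approached.
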
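
The semitone figures above can be turned into a rough calculation. The sketch below is only illustrative: it converts two tone frequencies into a semitone separation and labels the likely percept using the approximate ~2 and ~4 semitone boundaries quoted above, which apply at fast tempos. The function names and the hard cut-offs are assumptions made for the example, since the real boundary shifts with tempo, timbre, and listening history.

```python
import math

def semitone_separation(f1_hz: float, f2_hz: float) -> float:
    """Distance between two frequencies in equal-tempered semitones."""
    return abs(12.0 * math.log2(f2_hz / f1_hz))

def likely_percept(separation_st: float,
                   lower_st: float = 2.0,
                   upper_st: float = 4.0) -> str:
    """Rough label for a fast sequence; thresholds are the approximate
    values from the notes, not precise psychophysical constants."""
    if separation_st < lower_st:
        return "one stream"
    if separation_st >= upper_st:
        return "two streams"
    return "ambiguous"  # region where either percept can occur

if __name__ == "__main__":
    low_hz = 440.0
    high_hz = low_hz * 2 ** (7 / 12)  # a tone 7 semitones higher
    sep = semitone_separation(low_hz, high_hz)
    print(f"{sep:.1f} semitones: {likely_percept(sep)}")
```

Keeping the two boundaries as parameters makes it easy to move them, for example to model a slower tempo where a larger separation is needed before the streams split.
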
Peripheral basis — cochlear tonotopy:
High-frequency tones activate the base of the cochlea; low-frequency tones activate the apex. Tones that excite nearby, overlapping cochlear locations tend to be grouped into one stream, while tones that excite well-separated locations tend to segregate (frequency proximity = cochlear proximity).
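
To make "frequency proximity = cochlear proximity" concrete, the sketch below maps frequency to a relative place along the basilar membrane. It assumes Greenwood's (1990) frequency-position function with his human parameters, which these notes do not mention; treat it as one convenient approximation rather than the model the notes rely on. The function names are hypothetical.

```python
import math

# Assumed model: Greenwood (1990) human fit, f = A * (10**(a * x) - k),
# where x is the relative distance from the apex (0 = apex, 1 = base).
A_HZ = 165.4
SLOPE = 2.1
K = 0.88

def place_from_apex(freq_hz: float) -> float:
    """Relative cochlear place for a frequency (0 = apex, 1 = base)."""
    return math.log10(freq_hz / A_HZ + K) / SLOPE

def place_distance(f1_hz: float, f2_hz: float) -> float:
    """Separation of two tones along the cochlea, as a fraction of its length."""
    return abs(place_from_apex(f1_hz) - place_from_apex(f2_hz))

if __name__ == "__main__":
    for f_hz in (200.0, 1000.0, 8000.0):
        print(f"{f_hz:7.0f} Hz -> place {place_from_apex(f_hz):.2f}")
    # A one-semitone step versus a one-octave step from 1 kHz
    print(place_distance(1000.0, 1000.0 * 2 ** (1 / 12)))
    print(place_distance(1000.0, 2000.0))
```

Under this assumed mapping, a small frequency step corresponds to a small distance along the cochlea and an octave step to a much larger one, which is the intuition behind the peripheral account of streaming.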

The auditory system uses "harmonic templates": a set of tones whose frequencies are integer multiples of a common fundamental is grouped as one sound source (fundamental + harmonics).
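
The template idea can be sketched as a simple "harmonic sieve": each partial is compared with the nearest integer multiple of a candidate fundamental and is grouped only if it lands close enough. The 3% tolerance and the function name below are illustrative assumptions, not values taken from these notes.

```python
def split_by_template(partials_hz, f0_hz, tolerance=0.03):
    """Split partials into those that fit a harmonic template and those that do not.

    A partial fits if it lies within `tolerance` (relative error) of the
    nearest integer multiple of f0_hz. The tolerance is an assumed value.
    """
    grouped, outliers = [], []
    for f_hz in partials_hz:
        n = max(1, round(f_hz / f0_hz))   # nearest harmonic number
        target_hz = n * f0_hz
        rel_error = abs(f_hz - target_hz) / target_hz
        if rel_error <= tolerance:
            grouped.append(f_hz)
        else:
            outliers.append(f_hz)
    return grouped, outliers

if __name__ == "__main__":
    # Harmonics of 200 Hz, with the third one mistuned from 600 to 648 Hz
    grouped, outliers = split_by_template([200.0, 400.0, 648.0, 800.0], 200.0)
    print("grouped with the 200 Hz source:", grouped)
    print("left out of the template:", outliers)
```

In this example the mistuned partial is the only one rejected by the template, mirroring the idea that a component that breaks the integer relationship is less likely to be grouped with the rest.
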

Auditory spectrogram of a streaming sequence, showing the two interleaved tone sequences. When the sequences segregate perceptually, they form two distinct "auditory objects." With a large frequency separation, the galloping rhythm of the integrated pattern breaks into two isochronous streams: the high stream and the low stream each carry their own rhythm.
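
The rhythm change described in the caption can be checked with onset times alone. The sketch below assumes the sequence is the standard ABA- triplet pattern (tone A, tone B, tone A, silent slot), which the "galloping rhythm" suggests but the notes do not state; the 100 ms slot length and the function names are likewise hypothetical.

```python
def gallop_onsets(n_triplets: int, slot_ms: int = 100):
    """Onset times in ms for the A and B tones of an assumed ABA- sequence."""
    a_onsets, b_onsets = [], []
    for i in range(n_triplets):
        t0 = i * 4 * slot_ms              # each triplet spans four slots
        a_onsets.append(t0)               # first A
        b_onsets.append(t0 + slot_ms)     # B
        a_onsets.append(t0 + 2 * slot_ms) # second A; fourth slot is silent
    return a_onsets, b_onsets

def intervals(onsets):
    """Inter-onset intervals of a sorted list of onset times."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]

if __name__ == "__main__":
    a, b = gallop_onsets(3)
    print("integrated:", intervals(sorted(a + b)))  # uneven, galloping
    print("A stream:  ", intervals(a))              # [200, 200, 200, 200, 200]
    print("B stream:  ", intervals(b))              # [400, 400]
```

Heard as one stream, the intervals alternate short-short-long, which is the gallop. Taken separately, each tone's intervals are constant, with the A stream running at twice the rate of the B stream.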