Cochlear Implant Atlas
CI Atlas · Auditory Physiology · Module 02

2 Sound & acoustics

Hearing begins not in the ear but in the air. Before any physiology happens, there is a physical signal — a travelling disturbance of air pressure — with properties the ear has evolved to measure: how fast it oscillates, how far, and how those oscillations are mixed across frequency and time. Get these few acoustic ideas straight and the rest of the chapter falls into place, because everything the cochlea, the nerve, and ultimately a cochlear implant does is an attempt to capture and re-represent this signal.

F · What sound is

Sound is a pressure wave. When something vibrates — a vocal fold, a guitar string, a loudspeaker cone — it pushes and pulls on the air next to it, alternately compressing and rarefying it. Each air molecule only jostles a tiny distance back and forth, but it collides with its neighbours, and the disturbance propagates outward as a wave. What travels is the pattern of pressure change, not the air itself; the molecules stay roughly where they were.[2012]

Two consequences follow immediately. First, sound needs a medium — there is no sound in a vacuum. Second, what the ear ultimately has to measure is a fluctuating pressure at one point in space: the eardrum sits in the path of the wave and moves in and out as the pressure rises and falls. Everything downstream is the nervous system's reading of that one wiggling membrane.

[Figure: A sound wave is a travelling pattern of pressure. Compressions (dark bands) chase rarefactions, left to right. Readouts: period 2.27 ms; wavelength in air 78 cm; pitch mid.]

Each molecule oscillates in place; what moves across the field is the pattern. Raise the frequency and the compressions pack closer (shorter wavelength); raise the amplitude and each molecule swings further (a louder sound). This is the signal the eardrum reads off as it wiggles in and out.
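The figure readouts can be reproduced with two one-line relations: period = 1/frequency and wavelength = speed of sound / frequency. The sketch below is illustrative only. It assumes a 440 Hz tone (the frequency whose period is 2.27 ms) and a speed of sound of 343 m/s, a typical value for dry air near 20 °C that this module does not itself state.

```python
# Assumed value: speed of sound in dry air at roughly 20 degrees C.
SPEED_OF_SOUND_M_S = 343.0


def period_s(frequency_hz: float) -> float:
    """Time for one pressure cycle, in seconds."""
    return 1.0 / frequency_hz


def wavelength_m(frequency_hz: float, speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Distance between successive compressions, in metres."""
    return speed_m_s / frequency_hz


tone_hz = 440.0
print(f"period:     {period_s(tone_hz) * 1e3:.2f} ms")
print(f"wavelength: {wavelength_m(tone_hz) * 100:.0f} cm")
# Doubling the frequency halves the wavelength: compressions pack closer.
print(f"wavelength at 880 Hz: {wavelength_m(880.0) * 100:.0f} cm")
```

Changing the assumed speed of sound (for example, for warmer air) shifts the wavelength but leaves the period untouched, since the period depends only on the source.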

F · Frequency & amplitude: the roots of pitch and loudness

The simplest sound is a pure tone: a single sinusoidal oscillation, like an idealised tuning fork. It has just two properties, and they map onto the two most basic perceptual qualities:

  • Frequency — how many pressure cycles occur each second, measured in hertz (Hz). Frequency is the main physical correlate of pitch: more cycles per second, higher pitch. The healthy young human ear responds from roughly 20 Hz to 20,000 Hz.
  • Amplitude — how large the pressure swing is. Amplitude is the main physical correlate of loudness: bigger swings, louder sound.

Raising frequency packs more cycles into the same window of time; raising amplitude makes the wave taller. The period, the time for one cycle, is simply the inverse of frequency.

[Figure: Pure-tone waveform, frequency & amplitude. Horizontal axis: time, 0 to 10 ms. Readouts: frequency (pitch) 440 Hz; period 2.27 ms; amplitude (loudness) -1.9 dB.]

Two independent properties. Frequency — the number of pressure cycles per second (hertz) — sets pitch: more cycles in the window, higher pitch. Amplitude — the height of the wave — sets loudness. The period is simply 1/frequency. Natural sounds are not single sinusoids but sums of many, which is the next idea: spectrum.
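A pure tone is simple enough to generate directly. The sketch below samples a sinusoid and counts how many cycles begin inside a 10 ms window. The 48 kHz sample rate, the peak amplitude of 0.8 (in arbitrary units), and the function names are illustrative assumptions, not values taken from this module.

```python
import math


def pure_tone(frequency_hz, amplitude, duration_s, sample_rate_hz=48_000):
    """Return samples of amplitude * sin(2*pi*f*t) over duration_s seconds."""
    n_samples = round(duration_s * sample_rate_hz)
    return [
        amplitude * math.sin(2 * math.pi * frequency_hz * i / sample_rate_hz)
        for i in range(n_samples)
    ]


def upward_zero_crossings(samples):
    """Count negative-to-nonnegative transitions (one per new cycle)."""
    return sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)


window = pure_tone(frequency_hz=440.0, amplitude=0.8, duration_s=0.010)
print("cycles per window (f * T):", 440.0 * 0.010)
print("new cycles started in window:", upward_zero_crossings(window))
print("peak sample:", round(max(abs(x) for x in window), 3))
```

Frequency and amplitude enter the formula independently: changing `frequency_hz` alters only how often the waveform repeats, and changing `amplitude` alters only its height.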

FT · Measuring level: the decibel

The ear copes with an astonishing range of sound pressures: the loudest tolerable sound has around a million times the sound pressure of the faintest audible one. A linear scale would be unwieldy, so sound level is expressed logarithmically in decibels of sound pressure level (dB SPL), referenced to 20 micropascals, about the quietest sound a healthy young ear can detect, which is therefore defined as 0 dB SPL.[2012]

Because the scale is logarithmic, equal steps in decibels are equal ratios in pressure: every 20 dB is a tenfold change in sound pressure. This compresses the enormous physical range into the familiar 0–120 dB span.

The decibel ladder (dB SPL)

  • 0 dB: Threshold of hearing
  • 20 dB: A whisper
  • 40 dB: Quiet library
  • 60 dB: Conversation
  • 80 dB: Busy traffic / alarm clock
  • 100 dB: Nightclub
  • 120 dB: Jet at takeoff, pain begins

Marker shown at 60 dB SPL (zone: Conversational).

Conversational: The range most speech lives in. The decibel is logarithmic — every 20 dB is a tenfold change in sound pressure, so the scale compresses an enormous physical range into manageable numbers. 0 dB SPL is the reference (≈ 20 µPa), and the usable range of human hearing runs to about 120 dB before discomfort — a dynamic range of roughly a million-to-one in pressure that the cochlea handles routinely.
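The conversion between sound pressure and dB SPL is a single logarithm: level = 20 · log10(p / p_ref), with p_ref = 20 µPa. The sketch below implements it in both directions. The function names are illustrative, and pressures are taken to be RMS values in pascals.

```python
import math

P_REF_PA = 20e-6  # reference pressure for dB SPL: 20 micropascals


def pressure_to_db_spl(pressure_pa: float) -> float:
    """Convert an RMS sound pressure in pascals to dB SPL."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF_PA)


def db_spl_to_pressure(level_db_spl: float) -> float:
    """Convert a level in dB SPL back to RMS pressure in pascals."""
    return P_REF_PA * 10.0 ** (level_db_spl / 20.0)


for level in (0, 20, 60, 120):
    pressure = db_spl_to_pressure(level)
    print(f"{level:>3} dB SPL = {pressure:.6f} Pa "
          f"({pressure / P_REF_PA:,.0f} x reference)")
```

Each 20 dB step multiplies the pressure by ten, so the 120 dB span from threshold to discomfort corresponds to six tenfold steps, the million-to-one pressure ratio described above.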

Why audiology lives in decibels

The whole audiogram — the map of a person's hearing thresholds — is plotted in decibels (dB HL, a clinically referenced cousin of dB SPL). A “profound” hearing loss of 90 dB does not mean sound is 90 units quieter; it means thresholds are elevated by a factor of tens of thousands in pressure. The logarithmic scale is why small-looking numbers on an audiogram describe very large losses.
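As a worked check of the "tens of thousands" figure, a 90 dB threshold elevation can be converted to a pressure ratio with the same 20 dB per tenfold rule. The symbols below are introduced here for illustration only:

```latex
\[
\frac{p_{\text{elevated}}}{p_{\text{normal}}}
  = 10^{90/20}
  = 10^{4.5}
  \approx 3.2 \times 10^{4}
\]
```

So the quietest sound this listener can detect has roughly 32,000 times the pressure of the quietest sound a normal-hearing listener can detect at the same frequency.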

FT · Spectrum & timbre

Almost no real sound is a pure tone. The decisive idea of acoustics — Fourier's — is that any sound can be described as a sum of pure tones, each with its own frequency and amplitude. That recipe is the sound's spectrum: a plot of how much energy sits at each frequency.[2012]

A periodic sound, such as a sustained musical note or a vowel, has a special spectrum: energy only at integer multiples of a single fundamental frequency, called the harmonics. The fundamental sets the perceived pitch; the relative strengths of the harmonics give the sound its timbre, the quality that lets you tell a violin from a flute playing the same note.

[Figure: Spectrum of a pure tone, harmonics, and formants. Horizontal axis: frequency (Hz), 0 to 3 kHz.]

A spoken vowel is a harmonic series (multiples of the voice fundamental, here ≈ 128 Hz) whose amplitudes are sculpted by vocal-tract resonances — the formants (dashed envelope, peaks ≈ 512, 1792, 2432 Hz for /ɛ/). The cochlea, and any hearing device, must capture this spectral shape to convey speech.
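Fourier's idea can be tried in both directions: build a periodic sound by summing harmonics, then recover each harmonic's amplitude from the waveform. The sketch below is a minimal illustration using a single-bin discrete Fourier transform. The harmonic amplitudes (1.0, 0.5, 0.25), the 16,384 Hz sample rate, and the 1,024-sample window are assumptions chosen so the window holds a whole number of cycles of every harmonic of 128 Hz.

```python
import cmath
import math

SAMPLE_RATE_HZ = 16_384  # assumed; gives exactly 128 samples per 128 Hz cycle
F0_HZ = 128.0            # voice fundamental used in the vowel example
N_SAMPLES = 1_024        # eight full periods of the fundamental


def synthesize(harmonic_amplitudes, f0_hz, n_samples, sample_rate_hz):
    """Sum sinusoids at k * f0_hz with amplitude harmonic_amplitudes[k - 1]."""
    return [
        sum(
            amp * math.sin(2 * math.pi * (k + 1) * f0_hz * n / sample_rate_hz)
            for k, amp in enumerate(harmonic_amplitudes)
        )
        for n in range(n_samples)
    ]


def amplitude_at(signal, frequency_hz, sample_rate_hz):
    """Estimate the amplitude of one frequency component (single-bin DFT)."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * frequency_hz * i / sample_rate_hz)
        for i, x in enumerate(signal)
    )
    return 2 * abs(acc) / len(signal)


signal = synthesize([1.0, 0.5, 0.25], F0_HZ, N_SAMPLES, SAMPLE_RATE_HZ)
for freq in (128.0, 192.0, 256.0, 384.0):
    print(f"{freq:6.0f} Hz: amplitude {amplitude_at(signal, freq, SAMPLE_RATE_HZ):.3f}")
```

Energy appears at 128, 256, and 384 Hz and nowhere between, which is the line spectrum of a periodic sound. Changing the three amplitudes changes the timbre while leaving the fundamental, and hence the pitch, unchanged.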

T · The speech signal

Speech is the sound that matters most for a hearing device, and it has a characteristic structure. Voiced sounds (vowels and sounds like /m/ or /z/) are produced by the vocal folds opening and closing rhythmically, generating a harmonic series whose fundamental frequency (F0) is heard as the speaker's pitch. Heavier folds give a lower F0 with closely spaced harmonics (typical adult male voice); lighter folds give a higher F0 with wider spacing (typical female or child voice).[2009]

The vocal tract above the folds then acts as a resonator, emphasising certain frequency bands — the formants. The pattern of formant peaks is what distinguishes one vowel from another: the vowel in “bet” (/ɛ/), for example, has formant peaks near 512, 1792, and 2432 Hz. Consonants, by contrast, are brief and spectrally dynamic — bursts, hisses, and rapid transitions — and carry much of the information that makes speech intelligible.[1952, 2009]
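The relationship between F0 and the formants can be made concrete by listing the harmonics a voice produces and finding which one falls nearest each formant peak. In the sketch below the formant values are the /ɛ/ peaks quoted above and 128 Hz is the fundamental used in the spectrum example. The 220 Hz comparison voice and the 3 kHz upper limit are illustrative assumptions.

```python
FORMANTS_HZ = (512.0, 1792.0, 2432.0)  # /ɛ/ formant peaks quoted in the text


def harmonics(f0_hz, max_hz):
    """All harmonic frequencies k * f0_hz that do not exceed max_hz."""
    count = int(max_hz // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]


def nearest_harmonic(f0_hz, target_hz):
    """Harmonic number and frequency closest to target_hz."""
    k = max(1, round(target_hz / f0_hz))
    return k, k * f0_hz


for f0 in (128.0, 220.0):  # 220 Hz is a hypothetical higher-pitched voice
    print(f"F0 = {f0:.0f} Hz: {len(harmonics(f0, 3000.0))} harmonics below 3 kHz")
    for formant in FORMANTS_HZ:
        k, freq = nearest_harmonic(f0, formant)
        print(f"  formant {formant:6.0f} Hz -> harmonic {k:2d} at {freq:6.0f} Hz")
```

With the lower F0 the harmonics are closely spaced and sample the formant envelope densely; with the higher F0 there are fewer harmonics, so each formant peak is outlined by fewer spectral components.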

Where the speech information lives

Vowels are loud, low-frequency, and steady; consonants are soft, higher-frequency, and fleeting — yet consonants do much of the work of intelligibility. This is why a hearing loss that spares the low frequencies but takes the highs can leave speech audible but unclear: the vowels come through, the consonants do not. The same logic shapes how a cochlear implant allocates frequency across its electrodes.

F · Why this matters for hearing devices

Every property in this module becomes a design constraint for a hearing device. A cochlear implant has to capture the incoming pressure wave with a microphone, measure its frequency content (to decide which electrodes to stimulate), track its level (to set how much current to deliver), and follow its changes over time (to convey the dynamics of speech), all while squeezing the ear's ~120 dB acoustic range into the much narrower range of comfortable electrical stimulation.[2009]
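Two of these constraints, deciding which electrode a frequency belongs to and squeezing acoustic level into a narrow electrical range, can be illustrated with a toy model. The sketch below is not any manufacturer's processing strategy. The five octave-wide bands from 250 Hz to 8 kHz, the 25 to 65 dB SPL input window, and the stimulation limits of 100 and 200 (arbitrary units) are all hypothetical values chosen for readability.

```python
def band_edges(low_hz, high_hz, n_bands):
    """Log-spaced band edges: every band spans the same frequency ratio."""
    ratio = (high_hz / low_hz) ** (1.0 / n_bands)
    return [low_hz * ratio ** i for i in range(n_bands + 1)]


def band_index(frequency_hz, edges):
    """Index of the band containing frequency_hz, or None if out of range."""
    if not edges[0] <= frequency_hz < edges[-1]:
        return None
    for i in range(len(edges) - 1):
        if frequency_hz < edges[i + 1]:
            return i
    return None


def map_level(level_db_spl, floor_db, ceiling_db, lower_limit, upper_limit):
    """Clip the level to the input window, then map it linearly (in dB)
    onto the electrical range between lower_limit and upper_limit."""
    clipped = min(max(level_db_spl, floor_db), ceiling_db)
    fraction = (clipped - floor_db) / (ceiling_db - floor_db)
    return lower_limit + fraction * (upper_limit - lower_limit)


# Hypothetical configuration: 5 bands, 250 Hz to 8 kHz, one band per electrode.
edges = band_edges(250.0, 8000.0, 5)
for formant_hz in (512.0, 1792.0, 2432.0):
    print(f"{formant_hz:6.0f} Hz -> band {band_index(formant_hz, edges)}")

# Hypothetical 25-65 dB SPL input window mapped onto limits 100..200.
for level in (10, 45, 120):
    print(f"{level:>3} dB SPL -> stimulation level "
          f"{map_level(level, 25.0, 65.0, 100.0, 200.0):.0f}")
```

In this toy configuration the three /ɛ/ formants land in three different bands, so the vowel's spectral shape survives as a pattern across electrodes, while any sound below the input floor or above the ceiling is pinned to a limit rather than reproduced faithfully.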

[Figure: The audible field, where hearing and speech live. Horizontal axis: frequency (Hz), 100 to 10k; vertical axis: level (dB SPL), 0 to 120; the speech region is marked.]

  • 20 Hz–20 kHz: frequency range of healthy young hearing
  • ≈120 dB: dynamic range from threshold to discomfort
  • Most acute near 2–4 kHz, the speech-critical band

Hearing is most sensitive (lowest threshold) in the mid frequencies and speech sits in the middle of both axes — which is exactly the region a hearing device must protect. The ear is least sensitive at the extremes, where the threshold curve climbs.

Hold on to three numbers as you go: hearing spans roughly 20 Hz–20 kHz in frequency and about 120 dB in level, and speech lives mostly in the middle of both ranges. The next module follows this airborne wave to the threshold of the inner ear, where the outer and middle ear hand it across to fluid.

Case 2.1 · Audible but unclear
A patient with a sloping high-frequency hearing loss says she can hear that people are talking — she can tell when someone is speaking and roughly how loud — but she cannot make out the words, especially fast speech or unfamiliar names. Her low-frequency hearing is near normal.

Which acoustic feature of speech best explains why she hears that speech is present but cannot understand it?

Self-assessment: Chapter 1, Module 2 · 3 questions
Question 1 · Foundation

On a pure tone, which physical property is the main correlate of pitch?

Question 2 · Foundation

The decibel scale for sound level is logarithmic. Roughly what does every 20 dB represent?

Question 3 · Trainee

What are the formants of a vowel?
