CI Atlas · Audiological Evaluation · Module 05

5 · Speech audiometry: SRT, word recognition & rollover

Pure tones tell you what a person can detect; speech tests tell you what they can understand — and understanding is, in the end, what hearing is for. Speech audiometry begins with a simple cross-check, the speech reception threshold, which should agree with the pure-tone average and so validates the audiogram. It then moves above threshold to ask a harder question: of the words that are clearly audible, how many are correctly identified? The answer, the maximum word score, characterises the clarity of the ear, and the way that score behaves at still-higher levels — whether it holds or rolls over — is one of the battery's classic pointers beyond the cochlea. This module covers how speech is tested and why the seemingly mundane choices of material, talker and level decide whether the number means anything.

The speech reception threshold

The speech reception threshold (SRT) is the lowest level at which a listener correctly repeats 50% of two-syllable spondee words. Its main job is a cross-check: it should agree with the pure-tone average within ~10 dB, validating the audiogram (and flagging a non-organic loss when it does not).
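
The cross-check can be sketched in a few lines. This is a minimal illustration, not a clinical protocol: it assumes the common three-frequency pure-tone average (500, 1000 and 2000 Hz), which the text above does not specify, and the function names and the example audiogram are hypothetical.

```python
def pure_tone_average(thresholds_db_hl):
    """Three-frequency PTA (assumed definition): mean of 500, 1000, 2000 Hz in dB HL."""
    return sum(thresholds_db_hl[f] for f in (500, 1000, 2000)) / 3


def srt_agrees_with_pta(srt_db_hl, thresholds_db_hl, tolerance_db=10):
    """Return (agrees, difference), where difference = SRT - PTA in dB."""
    difference = srt_db_hl - pure_tone_average(thresholds_db_hl)
    return abs(difference) <= tolerance_db, difference


# Hypothetical ear: thresholds in dB HL keyed by frequency in Hz.
audiogram = {500: 30, 1000: 40, 2000: 50}

print(srt_agrees_with_pta(40, audiogram))  # (True, 0.0): SRT matches the PTA
print(srt_agrees_with_pta(15, audiogram))  # (False, -25.0): disagreement, flag for review
```

The 10 dB tolerance is kept as a parameter because the agreement criterion is approximate; a failed check only says the audiogram and the SRT disagree, not why.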

Word recognition, PB-max & rollover

Suprathreshold word recognition uses phonetically-balanced monosyllables (CNC) presented well above threshold; the peak score is PB-max. A normal ear reaches ~100%, a cochlear loss plateaus lower, and a fall in score at still-higher levels — rollover — is a retrocochlear / neural sign.[2009]
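
Rollover is often quantified with a rollover index, (PB-max − PB-min) / PB-max, where PB-min is the lowest score obtained at any level above the one that gave PB-max. The index is not defined in the text above, so the sketch below treats it as an assumption; the function name and the score lists are hypothetical, and the cutoff for calling rollover significant depends on the word list used, so none is hard-coded here.

```python
def pb_max_and_rollover(pi_function):
    """pi_function: list of (level_db, percent_correct) pairs, in any order.

    Returns (pb_max, pb_min, rollover_index). pb_min is the lowest score at
    any level above the level that produced pb_max.
    """
    points = sorted(pi_function)  # order by presentation level
    # With tied peak scores, max() keeps the first one, i.e. the lowest level.
    peak_level, pb_max = max(points, key=lambda p: p[1])
    above_peak = [score for level, score in points if level > peak_level]
    pb_min = min(above_peak, default=pb_max)  # no higher level tested: no rollover measurable
    rollover_index = (pb_max - pb_min) / pb_max if pb_max > 0 else 0.0
    return pb_max, pb_min, rollover_index


# Hypothetical ear whose score falls at high levels.
rolling = [(50, 40), (60, 72), (70, 80), (80, 60), (90, 48)]
print(pb_max_and_rollover(rolling))   # (80, 48, 0.4)

# Hypothetical cochlear-type plateau: the score holds once it peaks.
plateau = [(50, 60), (60, 84), (70, 88), (80, 88)]
print(pb_max_and_rollover(plateau))   # (88, 88, 0.0)
```

The index is only meaningful if scores were actually measured above the PB-max level; a single-level test cannot show rollover.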

Recorded, not live voice

Recorded materials are mandatory. Monitored-live-voice and recorded word scores differed in 72% of ears (by up to 80 percentage points), because the VU meter cannot track low-level consonants and a live talker subtly adapts. Only recorded, calibrated speech gives a reproducible, comparable result.[2020]

Figure (schematic). The same ear yields different numbers depending on method. Sentences score higher than words because context helps, so a listener can pass sentences yet fail words, flagging a temporal-processing or memory problem. Monitored live voice (MLV) inflates scores and is irreproducible (recorded and MLV scores differed in 72% of ears), so recorded materials are mandatory. Scores also fall at soft levels: conversational speech sits near 60 dB SPL, which is why 60 dBA is preferred over a flattering 70 dB. The candidacy cutoffs that use these materials live in the Candidacy chapter.

Level, words vs sentences, open vs closed

Three more choices shape the number. Presentation level: scores fall at soft levels, and since conversational speech sits near 60 dB SPL, candidacy-relevant testing favours 60 dBA over a flattering 70 dB. Words vs sentences: monosyllables strip away context while sentences add it, so a listener can pass sentences yet fail words — a dissociation that flags a temporal-processing or memory deficit. And open- vs closed-set formats trade realism against the need to correct for chance. How these materials become candidacy cutoffs is the subject of the next chapter.
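
The chance correction for closed-set formats can be illustrated with the standard correction for guessing: with n equally likely response alternatives, chance is 100/n percent, and the observed score is rescaled so that chance maps to 0% and a perfect score stays at 100%. The text above does not say which correction the course intends, so this is a sketch under that assumption, with a hypothetical function name.

```python
def correct_for_chance(percent_correct, n_alternatives):
    """Rescale a closed-set score so that guessing alone maps to 0%.

    Assumes n_alternatives equally likely choices per item. Scores at or
    below chance are reported as 0.
    """
    if n_alternatives < 2:
        raise ValueError("a closed set needs at least two alternatives")
    chance = 100.0 / n_alternatives
    corrected = 100.0 * (percent_correct - chance) / (100.0 - chance)
    return max(corrected, 0.0)


# Hypothetical four-alternative test: 70% observed, 25% expected from guessing.
print(correct_for_chance(70, 4))   # 60.0
print(correct_for_chance(25, 4))   # 0.0, indistinguishable from guessing
```

Open-set tests need no such step because the chance of guessing an unprompted word is effectively zero, which is part of why they are the more realistic format.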

Case 11.5 · Good words, then worse

An adult's word-recognition score peaks at 64% at a moderate level, then falls to 40% when the level is raised further. The audiogram is symmetric.

What does this pattern suggest?

Self-assessment · Module 5 · 2 questions

Question 1 · Trainee

What is the main role of the speech reception threshold (SRT)?

Question 2 · Clinician

Why are recorded (not monitored-live-voice) speech materials mandatory?
