Cochlear Implant Atlas
CI Atlas · The Measure of Success: Speech, Hearing and Real-World Outcomes · Module 15

15 Predicting and Benchmarking Outcomes

Patients and clinicians want a number before surgery: how well will I hear? Three decades of large datasets have produced prediction models built from the obvious variables (duration of deafness, age, etiology, residual hearing), and the effects of these variables are real and statistically robust. Yet together they explain only a small fraction of the variance between recipients, which is why a preoperative model can describe a population but cannot promise an individual a score. This module covers the major prediction efforts, the humbling amount of variance they leave unexplained, how registries let a clinic benchmark its own results, and when a result below benchmark should trigger the poor-performer work-up.

Prediction models and how little variance they explain

The large Blamey meta-analysis confirmed that duration of severe-to-profound loss, age at implantation, age at onset, etiology and implant experience all significantly affect outcome, yet these factors together accounted for only about 10% of the variance, down from about 21% in the earlier 1996 analysis. The Lazard conceptual-model study pooled 2,251 postlingual adult recipients across 15 international centres and improved the explained variance to about 22% by adding new factors. Lazard's newly significant variables included the duration of preceding moderate hearing loss and hearing-aid use during the profound period (which slowed the decline of speech-coding representations), plus residual better-ear pure-tone average and the proportion of active electrodes. The Holden prospective study of 114 postlingual adults identified duration of severe-to-profound loss, age at implantation, sound-field thresholds, electrode position/insertion depth and cognition as factors affecting open-set word recognition. Even the best models leave roughly three-quarters or more of inter-recipient variance unexplained, so a preoperative estimate is a population probability, not an individual guarantee. Gender, education and choice of better-versus-worse ear were not meaningful predictors in the large datasets.[2013][2012][2013]

Variance in adult CI outcome that the models explain

Model: explained / unexplained remainder
Blamey 1996: 21% / 79%
Blamey 2013: 10% / 90%
Lazard 2012: 22% / 78%

The 2,251-recipient Lazard model reached about 22%, the high-water mark, yet about 78% of outcome variance remains unexplained. Across the multivariate models of adult cochlear-implant outcome shown here, explained variance is only about 10–22%. Known predictors set expectations and flag risk, but they cannot forecast an individual’s result: most of what drives outcome still lies beyond the measured factors. Schematic.
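One way to see why a model with this much unexplained variance cannot promise an individual score is to convert explained variance into the spread that remains around a prediction. The sketch below assumes an ordinary linear model with roughly constant residual spread, which the source does not state, and the population standard deviation of 25 percentage points is a purely hypothetical figure chosen for illustration. Under those assumptions the residual spread is the population spread times sqrt(1 - R²), so an R² of 0.22 leaves about 88% of the original spread in place.

```python
import math

# Explained-variance fractions (R^2) quoted in the text above.
EXPLAINED = {"Blamey 1996": 0.21, "Blamey 2013": 0.10, "Lazard 2012": 0.22}


def residual_sd_fraction(r_squared: float) -> float:
    """Share of the population SD left around a prediction: sqrt(1 - R^2)."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("r_squared must lie between 0 and 1")
    return math.sqrt(1.0 - r_squared)


def residual_sd(population_sd: float, r_squared: float) -> float:
    """Residual SD in score points, given the SD of scores across recipients."""
    return population_sd * residual_sd_fraction(r_squared)


if __name__ == "__main__":
    # Hypothetical spread of word scores in percentage points; not from the source.
    ASSUMED_POPULATION_SD = 25.0
    for model, r2 in EXPLAINED.items():
        print(
            f"{model}: R^2={r2:.2f}, unexplained={1 - r2:.0%}, "
            f"residual SD ~ {residual_sd(ASSUMED_POPULATION_SD, r2):.1f} points"
        )
```

The practical reading is that even the best model narrows the plausible range for an individual only slightly, which is why the expected outcome is better quoted as a range than as a single predicted percent.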

Counselling honestly from imperfect prediction

The right way to use a model is to set a realistic expected range and to flag risk factors (very long duration, prelingual onset, cochlear nerve concerns), not to quote a single predicted percent. Because most variance is unexplained, both better-than-expected and worse-than-expected results are common, and counselling should prepare the patient for either. Modifiable or favourable factors (shorter deprivation, consistent hearing-aid use beforehand, more residual hearing) can be discussed honestly without overpromising. Self-reported communication benefit often improves even when predicted speech scores are modest, so counselling should frame outcome across both booth and daily-life domains. Documenting the expected range preoperatively makes it possible to recognise an under-performing result afterwards against the patient's own prediction.[2012][2013][2013]

Preoperative predictors vs factors that do NOT predict

Predicts outcome:
1. Duration of severe-to-profound loss
2. Age at implantation
3. Age at onset of deafness
4. Etiology of hearing loss
5. Better-ear pure-tone average
6. Prior hearing-aid use
7. Cognition

Does NOT predict:
× Gender
× Education level
× Implant better vs worse ear

Duration of severe-to-profound loss (predictive) is the strongest single predictor: the longer the deprived interval before implantation, the lower the expected open-set score. The established preoperative predictors are duration of severe-to-profound loss, age at implantation, age at onset, etiology, better-ear PTA, prior hearing-aid use and cognition. Variables repeatedly shown not to predict include gender, education level and whether the better or worse ear is implanted; knowing what does not matter prevents unwarranted exclusion. Schematic.

Benchmarking against registries and minimum standards

Because individual prediction is weak, a clinic's most useful reference is its own and peer-aggregated outcomes: registries and large pooled datasets define the distribution a recipient should fall within. Benchmarking compares a recipient's result, and a programme's aggregate results, against expected percentile bands for matched recipients rather than against a single pass threshold. Minimum-standard or expected-outcome bands let a clinic detect both individual outliers and systematic programme drift, for example a coding or fitting issue affecting many recipients. Pooled multicentre data such as the 2,251-patient dataset are valuable precisely because single-centre cohorts are too small to define stable expectations. Registry benchmarking also supports quality improvement and payer reporting by showing outcomes relative to an external reference.[2012][2013][2020]
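Programme-level drift can be illustrated with a simple count. The sketch below assumes that the lower edge of the expected band is the 10th centile (as in the figure caption later in this module), that recipients are independent, and that the band is well calibrated for the programme's case mix; none of these is guaranteed in practice. The function names and the 0.05 alert level are hypothetical choices for illustration, not a registry standard. If about 10% of recipients are expected below the band by chance, an exact binomial tail shows how surprising a larger count would be.

```python
from math import comb


def upper_tail(n: int, k: int, p: float = 0.10) -> float:
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more below-band results."""
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def drift_flag(n_recipients: int, n_below_band: int,
               p_expected: float = 0.10, alpha: float = 0.05) -> bool:
    """True when the below-band count is unlikely under the expected rate."""
    return upper_tail(n_recipients, n_below_band, p_expected) < alpha


if __name__ == "__main__":
    # Hypothetical programme: 40 recipients, 9 of them below the matched band.
    print(upper_tail(40, 9), drift_flag(40, 9))
```

A flag of this kind is only a prompt to look for a shared cause, such as a fitting practice affecting many recipients; it says nothing about any one recipient.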

Recipient score vs expected band for matched recipients

Axes: open-set score (0–100%) against months since switch-on (3, 6, 12 and 24 mo).
Example reading: visit 12 mo; recipient 36%; expected band 42–90%; status below band, work-up.

Each recipient is plotted against the 10th–90th-centile band of matched implant users at the same interval, with the registry median dashed. A point that drops below the band (flagged red at the 12-month visit here) triggers the poor-performer work-up: check mapping, electrode position, residual nerve and compliance before accepting the result. Comparing a programme’s aggregate against the pooled registry distribution audits the service the same way. Schematic.
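The band comparison described in the caption can be sketched in a few lines. This is a minimal illustration, assuming the matched recipients' scores at the same interval are available as a plain list; the function names, the cohort values and the use of linear interpolation between data points for the centiles are all assumptions, since the source does not specify how the band is computed.

```python
from statistics import median, quantiles


def expected_band(matched_scores, lower_centile=10, upper_centile=90):
    """Return (lower, median, upper) for matched recipients at one interval."""
    if len(matched_scores) < 2:
        raise ValueError("need at least two matched scores")
    # quantiles(n=100) returns 99 cut points; index c-1 is the c-th centile.
    cuts = quantiles(matched_scores, n=100, method="inclusive")
    return cuts[lower_centile - 1], median(matched_scores), cuts[upper_centile - 1]


def classify(score, band):
    """Place a recipient's score relative to the expected band."""
    lower, _, upper = band
    if score < lower:
        return "below band"  # signal for the poor-performer work-up
    if score > upper:
        return "above band"
    return "within band"


if __name__ == "__main__":
    # Hypothetical matched cohort at 12 months (percent open-set words correct).
    cohort = [38, 45, 52, 58, 61, 66, 70, 74, 80, 86, 92]
    band = expected_band(cohort)
    print(band, classify(36, band))
```

Plotting each visit against the band for the same interval, rather than a single pass threshold, is what lets a falling trajectory show up before the result is accepted as ordinary variance.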

When below-benchmark triggers the poor-performer work-up

A result clearly below the expected band for a matched recipient is a signal, not a verdict, and should prompt a structured search for a cause rather than acceptance as just variance. The differential spans device/integrity issues, electrode position or migration, suboptimal mapping, declining residual factors, and recipient-side contributors such as limited use or cognitive change. A divergence between objective scores and self-report (good booth, poor real-world, or vice versa) is itself a flag that warrants investigation. Honest preoperative documentation of the expected range is what makes a below-benchmark result interpretable later; without it, under-performance hides in the wide normal spread. The systematic evaluation of the unexpectedly poor performer is the subject of the programming chapter; benchmarking is the trigger that routes a recipient into it. The goal of benchmarking is not to label recipients but to catch correctable problems early and to keep programme quality measurable.[2013][2020][2013]

Case 18.15 · Predicting and Benchmarking Outcomes
A 62-year-old postlingual recipient with a 6-year duration of deafness and good preoperative residual hearing was counselled that he had favourable predictors. At 9 months his CNC word score is 28%, well below the expected band for someone with his profile, and his SSQ scores are correspondingly low. The map looks stable and he reports wearing the device all day.

What is the most appropriate response to this below-benchmark result?

Self-assessment · Module 15 · 2 questions
Question 1

Approximately how much of the variance in adult cochlear implant speech outcomes did the large Blamey 2013 analysis attribute to the standard preoperative factors combined?

Question 2

Which factor did the large Lazard model newly identify as significantly associated with better postimplant performance?
