CI Atlas · Speech-Coding Strategies: The Complete Lineage · Module 14

14 · The Frontier: Deep Learning, Closed Loops and Light

The lineage does not end with ACE and FS4. The next leap reframes coding itself: end-to-end deep networks that map raw audio straight to stimulation, closed-loop implants that record neural responses and self-adapt, individualised coding driven by each user's neural health, and optical/optogenetic stimulation aimed at escaping electric current spread entirely.

End-to-end deep-learning coding

End-to-end deep neural networks map raw audio directly to electrical stimulation patterns in a single unified model, replacing the separate front-end, filter bank, selection and mapping pipeline. Deep denoising sound-coding strategies have been demonstrated that learn band selection and mapping jointly. Generative speech enhancement uses GANs and diffusion models to clean the input before or within coding. Foundation models are too large for ear-level devices, so knowledge distillation compresses them into efficient on-device networks; a hearing aid with a dedicated neural-network chip shipped in 2024.[2023][2017]
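The distillation step mentioned above can be sketched with the standard temperature-softened loss. This is a minimal illustration only: the module does not specify a training objective, so the per-band "logits" (imagined here as band-selection scores for one audio frame), the temperature value and the function names are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The logits are hypothetical per-band scores for one frame; a real
    system would average this loss over many frames during training.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


# Hypothetical scores from a large teacher and a small student network.
teacher = [2.0, 0.5, -1.0, 0.1]
student = [1.5, 0.7, -0.8, 0.0]
loss = distillation_loss(teacher, student)
```

The loss is zero when the student reproduces the teacher's softened distribution and grows as the two diverge, which is what lets a compact on-device network be trained to imitate a model too large for the processor.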

~300 Hz · Upper limit of temporal (rate) pitch with single-electrode stimulation (Zeng 2002) [2002]

6-8% · Rate difference limen held to 600 pps in 18-electrode spread mode (Venter 2014) [2014]

Individualised and binaural coding

Individualised coding incorporates each user's electrode insertion depth, surviving-neuron distribution, electric-field characterization and etiology/duration of deafness. Binaural end-to-end coding can fuse latent spaces across both ears to model interaural excitation/inhibition and synchronise N-of-M band selection between sides. A novel spectral-feature-and-temporal-event (SFE) strategy using zero-crossing fine structure plus envelope has been proposed. Sound-coding optimisation specifically for music and singing has been pursued for CI users.[2025][2023]
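To make the idea of synchronised N-of-M selection concrete, here is a small rule-based sketch. It is not the end-to-end network described above: the shared band set is chosen from the per-band maximum across ears, which is an assumption made purely for illustration, and the function names are hypothetical.

```python
def select_n_of_m(envelopes, n):
    """Return the sorted indices of the n largest band envelopes (N-of-M)."""
    ranked = sorted(range(len(envelopes)), key=lambda i: envelopes[i], reverse=True)
    return sorted(ranked[:n])


def select_synchronised(left, right, n):
    """Choose one shared band set for both ears.

    Assumed rule: rank bands by the larger of the two ears' envelopes,
    so both sides always stimulate the same n bands.
    """
    combined = [max(l, r) for l, r in zip(left, right)]
    return select_n_of_m(combined, n)


# Hypothetical envelope levels for a 4-band frame at each ear.
left = [0.1, 0.9, 0.3, 0.2]
right = [0.8, 0.1, 0.2, 0.4]

independent_left = select_n_of_m(left, 2)     # [1, 2]
independent_right = select_n_of_m(right, 2)   # [0, 3]
shared = select_synchronised(left, right, 2)  # [0, 1]
```

In this toy frame the two ears, selecting independently, stimulate disjoint band sets, while the shared selection keeps the same bands active on both sides so that level differences between ears remain comparable band by band.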

Closed-loop, objective-measure-driven implants

Closed-loop implants record peripheral (ECAP) and cortical neural responses through the same electrodes and adapt stimulation/coding in real time. A closed-loop CI concept has been proposed in which the device self-adjusts autonomously based on embedded monitoring of peripheral and central neural activity. Objective-measure-based prescription rules could give more consistent fitting outcomes. Future implants integrate internal memory and DSP, adaptive current sources and record-while-stimulate capability.[2012][2008]
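The record-then-adapt cycle can be illustrated with a toy proportional controller. Everything here is assumed for the sake of the sketch: the linear ECAP growth function, its threshold and slope, the gain, and the clamping limits are hypothetical and do not describe any real device or fitting rule.

```python
def simulated_ecap_uv(current_level, threshold=100.0, slope=2.0):
    """Toy ECAP growth function (assumed): zero below threshold, linear above."""
    return max(0.0, (current_level - threshold) * slope)


def closed_loop_step(current_level, target_uv, gain=0.2,
                     lower=0.0, upper=255.0, measure=simulated_ecap_uv):
    """One proportional update of the stimulation level, clamped to limits."""
    error = target_uv - measure(current_level)
    updated = current_level + gain * error
    return min(upper, max(lower, updated))


def run_loop(start_level, target_uv, steps=50, upper=255.0):
    """Repeat measure-and-adjust for a fixed number of steps."""
    level = start_level
    for _ in range(steps):
        level = closed_loop_step(level, target_uv, upper=upper)
    return level


# Drive the simulated response toward a 100 uV target.
settled = run_loop(start_level=120.0, target_uv=100.0)

# The same loop with a lower upper limit, standing in for a comfort ceiling.
limited = run_loop(start_level=120.0, target_uv=100.0, upper=140.0)
```

With the assumed growth function the loop settles where the simulated response meets the target, and the clamp shows why any autonomous adjustment would need hard bounds: the controller stops at the ceiling even though the target response has not been reached.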

Beyond electricity: focusing and light

Current focusing and 'phantom'/partial-tripolar multipolar stimulation aim to sharpen the field and virtually extend the array, though benefit is mixed and combining focusing with steering is an active research direction. Optical/optogenetic stimulation replaces electrical current with light to activate optogenetically sensitised auditory neurons, promising much finer spatial (frequency) resolution and better performance in noise by escaping electric current spread. The overarching future goal is to better reproduce the fine spectral and temporal neural coding of normal hearing via improved electrode arrays plus improved coding systems. Optical stimulation requires gene therapy and new optical hardware and remains an emerging frontier.[2013][2021]
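A short sketch may help show how partial-tripolar stimulation divides the return current. It assumes the convention commonly used in the focusing literature, which this module does not spell out: a fraction sigma of the current returns through the two flanking electrodes (half each) and the remainder returns through the extracochlear ground. The function and key names are hypothetical.

```python
def partial_tripolar_currents(amplitude_ua, sigma):
    """Split current for one partial-tripolar pulse phase.

    sigma = 0 is monopolar (all return via the extracochlear ground);
    sigma = 1 is full tripolar (all return via the flanking electrodes).
    Positive values are sourced current, negative values are return current.
    """
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie between 0 and 1")
    flank = -sigma * amplitude_ua / 2.0
    return {
        "centre": amplitude_ua,
        "flank_apical": flank,
        "flank_basal": flank,
        "extracochlear": -(1.0 - sigma) * amplitude_ua,
    }


# Hypothetical 100 uA pulse with three quarters of the return current focused.
currents = partial_tripolar_currents(100.0, 0.75)
```

Under this convention the four currents always sum to zero, and raising sigma pulls more of the return path inside the cochlea, which is the mechanism by which focusing is meant to narrow the field.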

Open frontiers for next-generation coding

Pitch is rate-limited: single-electrode temporal pitch saturates near 300 Hz, while multi-electrode spread stimulation sustains 6-8% rate difference limens to 600 pps, hinting that smarter multi-site timing could push the limit (Zeng 2002; Venter 2014). Music and F0 remain hard: CI users' complex-tone F0 discrimination averages 7.56 semitones vs 1.12 for normal hearing, and rhythm-removed melody recognition is near chance (~12% vs 77%) (Gfeller 2002; Kong 2004). The electric dynamic range is compressed to ~10-20 dB versus ~120 dB acoustically, leaving CI users only ~20 discriminable loudness steps versus ~200, a hard target for adaptive/optimized mapping (Zeng 2004). Current focusing on top of steering improves virtual-channel discrimination (cumulative d' +2.04, Landsberger 2009), a near-term path to more usable channels. The effective-channel ceiling (~8) and noise-driven channel demand (4 in quiet, 8 at +5 dB SNR) define the gap that DNN noise reduction and closed-loop fitting aim to close (Friesen 2001).[2002][2014]
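The figures above are easier to compare once converted to common units. The sketch below uses only the numbers quoted in this section plus the equal-tempered definition of a semitone (a frequency ratio of 2^(1/12)); the helper names are hypothetical.

```python
def semitones_to_percent(semitones):
    """Frequency difference, in percent, spanned by an interval in semitones."""
    return (2.0 ** (semitones / 12.0) - 1.0) * 100.0


def rate_limen_pps(base_rate_pps, limen_percent):
    """Absolute rate difference limen in pulses per second."""
    return base_rate_pps * limen_percent / 100.0


def db_per_step(range_db, steps):
    """Average dynamic range covered by one discriminable loudness step."""
    return range_db / steps


# F0 discrimination: 7.56 semitones (CI) vs 1.12 semitones (normal hearing).
ci_f0_percent = semitones_to_percent(7.56)
nh_f0_percent = semitones_to_percent(1.12)

# Rate difference limen of 6-8% at 600 pps, in absolute terms.
limen_low = rate_limen_pps(600.0, 6.0)
limen_high = rate_limen_pps(600.0, 8.0)

# Loudness resolution: ~20 steps over 10-20 dB vs ~200 steps over ~120 dB.
ci_step_db = (db_per_step(10.0, 20), db_per_step(20.0, 20))
acoustic_step_db = db_per_step(120.0, 200)
```

Expressed this way, the CI F0 limen corresponds to a frequency change of roughly 55% against roughly 7% for normal hearing, the 6-8% limen at 600 pps is 36 to 48 pps, and each electric loudness step spans about 0.5 to 1 dB compared with about 0.6 dB acoustically, so the shortfall lies in the number of steps available rather than in their size.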

By the numbers

Case 15.14 · The Frontier

A research group proposes replacing a CI's entire front-end, filter bank, peak-picker and mapping stages with a single neural network trained to output stimulation patterns directly from raw audio, but worries the model is too large for an ear-level processor.

Which approach best addresses both the design goal and the size constraint?

Self-assessment · Module 14 · 2 questions

Question 1

What is the defining feature of end-to-end deep-learning sound coding?

Question 2

Why is optical/optogenetic stimulation pursued as a future direction?
