This demonstration presents CoNNear-processed speech files that aim to restore the cochlear response of a hearing-impaired (HI) listener toward that of a normal-hearing (NH) listener. The inset figure illustrates the closed-loop training method. It includes CNN-based descriptions of NH and HI auditory processing models that can be individualised to account for different aspects of sensorineural hearing damage: outer-hair-cell (OHC) damage and cochlear synaptopathy (CS) [1]. Details on the model training can be found in [2], and the example sentences we processed came from the English TIMIT speech corpus [3].

We evaluate the quality of our CoNNear-based sound processors by comparing three conditions: unprocessed audio, NAL-NL2 amplification [4], and CoNNearOHC processing. The CoNNearOHC processor compensates for a high-frequency sloping audiogram, characterized by hearing thresholds of 5 dB HL below 1 kHz and 35 dB HL at 8 kHz. Processed audio samples can be found at the bottom of this page, and the objective evaluation is shown in the figure below. We assessed how well each processing condition restores the cochlear response of the HI model toward that of the reference NH model. In all cases, the same sentence stimulus was used, with the unprocessed condition serving as the HI baseline. The HI unprocessed response is up to 7 dB lower than the NH response, illustrating how a typical sensorineural hearing loss reduces the root-mean-square (RMS) energy of speech across cochlear center frequencies (CFs).
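To make the RMS comparison concrete, the snippet below sketches how such a per-CF energy deficit in dB could be computed from two cochlear-model outputs. The array shapes, the random stand-in signals, and the 0.5 scaling of the HI output are all hypothetical placeholders; the actual figure is based on the CoNNear model responses to the TIMIT sentence.

```python
import numpy as np

def rms_db_per_cf(response):
    """RMS energy (dB) per cochlear center frequency.

    `response` is a hypothetical [n_cf, n_samples] array of
    cochlear-model outputs, one row per center frequency.
    """
    rms = np.sqrt(np.mean(response ** 2, axis=1))
    return 20.0 * np.log10(rms + 1e-12)  # small offset avoids log(0)

# Hypothetical NH and HI model outputs for the same sentence stimulus:
rng = np.random.default_rng(0)
nh = rng.standard_normal((21, 16000))   # 21 CF channels, 1 s at 16 kHz
hi = 0.5 * nh                           # crude stand-in for a reduced HI output

deficit = rms_db_per_cf(nh) - rms_db_per_cf(hi)  # dB shortfall per CF
```

With the halved stand-in HI signal, the deficit is a flat 20·log10(2) ≈ 6 dB across all CFs; for the real models the deficit grows toward the high-frequency channels, as the figure shows.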

The coloured lines show how the CoNNear or NAL-NL2 processing alters the RMS energy in the HI cochlear model to compensate for the hearing damage. Comparing our CoNNearOHC processing (green) to the NAL-NL2 processing (blue), we observe that at CFs above 2 kHz our method brings the processed output closer to the NH target. The CoNNear processing also behaves differently when the algorithm is trained to compensate for OHC damage alone than when it is trained to compensate for a mixture of OHC and CS damage: compared to the OHC processing, the OHC+CS processing (red) puts more speech energy into the higher frequencies.
Both CoNNear and NAL-NL2 are nonlinear processing schemes, but CoNNear relies on an entirely different, CNN-based end-to-end approach to sound processing [2]. Even though the CoNNear method is fundamentally different from industry-standard hearing-aid processing, the sound quality of our CoNNearOHC processing is as good as (or better than) that of the NAL-NL2 processing. Our CoNNear sound quality can also be compared to that achieved with a conventional CNN-based method. The specific CNN architecture we adopt for our CoNNear framework has the advantage that it does not suffer from the auto-encoder artifacts that trouble other approaches. This demonstrator shows that CoNNear-based methods for hearing-aid signal processing are a viable route to explore further in studies with patients.
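For readers unfamiliar with end-to-end CNN audio processing, the sketch below shows a minimal 1-D convolutional encoder-decoder with a skip connection, the general family such processors belong to. This is not the published CoNNear architecture [2]: the layer counts, kernel sizes, channel widths, and activations here are placeholders chosen for brevity.

```python
import torch
import torch.nn as nn

class TinyEndToEndCNN(nn.Module):
    """Illustrative encoder-decoder operating directly on the waveform.

    All hyperparameters are placeholders; see [2] for the actual
    CoNNear architecture and training procedure.
    """
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Conv1d(1, 16, kernel_size=9, stride=2, padding=4)
        self.enc2 = nn.Conv1d(16, 32, kernel_size=9, stride=2, padding=4)
        self.dec2 = nn.ConvTranspose1d(32, 16, kernel_size=9, stride=2,
                                       padding=4, output_padding=1)
        self.dec1 = nn.ConvTranspose1d(32, 1, kernel_size=9, stride=2,
                                       padding=4, output_padding=1)
        self.act = nn.Tanh()

    def forward(self, x):
        e1 = self.act(self.enc1(x))      # [B, 16, T/2]
        e2 = self.act(self.enc2(e1))     # [B, 32, T/4]
        d2 = self.act(self.dec2(e2))     # [B, 16, T/2]
        d2 = torch.cat([d2, e1], dim=1)  # skip connection -> 32 channels
        return self.dec1(d2)             # [B, 1, T] processed waveform

x = torch.randn(1, 1, 2048)              # one mono audio frame
y = TinyEndToEndCNN()(x)                 # same shape in, same shape out
```

Skip connections of this kind pass fine temporal detail around the bottleneck, which is one way such architectures avoid the auto-encoder artifacts mentioned above.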
[1] Verhulst, S., Altoè, A., & Vasilkov, V. (2018). Computational modeling of the human auditory periphery: Auditory-nerve responses, evoked potentials and hearing loss. Hearing Research, 360, 55-75.
[2] Drakopoulos, F., & Verhulst, S. (2023). A neural-network framework for the design of individualised hearing-loss compensation. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[3] Garofolo, J. S., Lamel, L. F., Fisher, W. M., Fiscus, J. G., & Pallett, D. S. (1993). DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1. NASA STI/Recon Technical Report N, 93, 27403.
[4] Keidser, G., Dillon, H., Flax, M., Ching, T., & Brewer, S. (2011). The NAL-NL2 prescription procedure. Audiology research, 1(1), e24.