This demonstration presents CoNNear-processed speech files that aim to restore the cochlear response of a hearing-impaired (HI) listener to that of a normal-hearing (NH) listener. The inset figure illustrates the closed-loop training method. It includes CNN-based descriptions of NH and HI auditory processing that can be individualised to account for different aspects of sensorineural hearing damage: outer-hair-cell (OHC) damage and cochlear synaptopathy (CS) [1]. Details on the model training can be found in [2], and the example sentences we processed come from the English TIMIT speech corpus [3].
We evaluate the quality of our CoNNear-based sound processors against an industry-standard hearing-aid processing scheme (NAL-NL2, [4]). Two hearing-loss profiles were considered. For the first, NAL-NL2 and CoNNear_OHC compensate for a high-frequency sloping audiogram with 5 dB HL at frequencies below 1 kHz and 35 dB HL at 8 kHz. For the second, the CoNNear_OHC+CS processing algorithm additionally compensates for a 50% loss of auditory-nerve fibers (i.e., cochlear synaptopathy). Processed audio samples can be found at the bottom of this page, and an objective evaluation is shown in the figure below. We tested how well the sound processors were able to restore the cochlear response of the HI model (black dashed) to that of the reference NH model (black solid). For both of these reference curves the sentence was unprocessed, so the HI curve corresponds to the HI_unprocessed response. The HI response is up to 7 dB lower than the NH response and shows how a typical sensorineural hearing loss affects the root-mean-square (RMS) energy of speech across cochlear center frequencies (CFs).
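The RMS comparison described above can be sketched in a few lines. This is a minimal illustration, not the evaluation code behind the figure: it assumes the cochlear model output is available as one time series per CF channel, and the function names `rms_db` and `rms_difference_db` as well as the toy sine signals are hypothetical stand-ins.

```python
import math

def rms_db(channel, eps=1e-12):
    """RMS energy of one cochlear channel (a sequence of samples), in dB."""
    mean_square = sum(x * x for x in channel) / len(channel)
    # 10*log10 of the mean square equals 20*log10 of the RMS value.
    return 10.0 * math.log10(mean_square + eps)

def rms_difference_db(hi_response, nh_response):
    """Per-CF RMS difference (HI minus NH) in dB; negative means HI is lower."""
    return [rms_db(hi) - rms_db(nh) for hi, nh in zip(hi_response, nh_response)]

# Toy stand-in for model outputs (assumption): two CF channels,
# with the HI response attenuated by a factor of 2 in the second channel.
nh = [[math.sin(0.1 * n) for n in range(1000)] for _ in range(2)]
hi = [nh[0], [0.5 * x for x in nh[1]]]
diff = rms_difference_db(hi, nh)
```

In this toy case the first channel is unchanged and the second is about 6 dB lower, which is the kind of per-CF gap that the dashed and solid black curves express for real speech.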
The coloured lines show how the CoNNear or NAL-NL2 processing alters the RMS energy in the HI cochlear model to compensate for the hearing damage. Comparing our CoNNear_OHC processing (green) to the NAL-NL2 processing (blue) shows that, at CFs above 2 kHz, our method brings the processed output closer to the NH target. Secondly, the CoNNear processing behaves differently depending on whether the algorithm is trained to compensate for OHC damage alone or for a mixture of OHC and CS damage: compared to the OHC processing, the OHC+CS processing (red) puts more speech energy into the higher frequencies.
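One way to condense such a comparison into a single number is the mean absolute deviation from the NH target over the CFs above 2 kHz. The sketch below is only an assumption about how this could be done; the function name `mean_abs_deviation_db` is hypothetical, and the RMS values are made-up placeholders, not results from this demonstration.

```python
def mean_abs_deviation_db(cfs_hz, processed_db, nh_db, min_cf_hz=2000.0):
    """Mean absolute RMS deviation (dB) from the NH target for CFs above min_cf_hz."""
    devs = [abs(p - t)
            for cf, p, t in zip(cfs_hz, processed_db, nh_db)
            if cf > min_cf_hz]
    if not devs:
        raise ValueError("no CF above min_cf_hz")
    return sum(devs) / len(devs)

# Placeholder values for illustration only (not measured data).
cfs = [500.0, 1000.0, 2000.0, 4000.0, 8000.0]
nh_target = [60.0, 58.0, 55.0, 50.0, 45.0]
scheme_a = [60.0, 58.0, 54.0, 49.0, 44.0]
scheme_b = [60.0, 57.0, 53.0, 46.0, 40.0]

score_a = mean_abs_deviation_db(cfs, scheme_a, nh_target)
score_b = mean_abs_deviation_db(cfs, scheme_b, nh_target)
```

A lower score means the processed HI response lies closer to the NH target in the selected frequency range.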
Both CoNNear and NAL-NL2 are nonlinear processing schemes, but CoNNear uses an entirely different approach to sound processing that is based on end-to-end CNN processing [2]. Even though the CoNNear method is fundamentally different from industry-standard hearing-aid processing, the sound quality of our CoNNear_OHC processing is as good as (or even better than) that of the NAL-NL2 processing. Our CoNNear sound quality can also be compared to that achieved with a conventional CNN-based method. The specific CNN architecture we adopt for our CoNNear framework has the advantage that it does not suffer from the auto-encoder artifacts that trouble other approaches. This demonstrator shows that CoNNear-based methods for hearing-aid signal processing are a viable route to explore further in studies with patients.
[1] Verhulst, S., Altoe, A., & Vasilkov, V. (2018). Computational modeling of the human auditory periphery: Auditory-nerve responses, evoked potentials and hearing loss. Hearing Research, 360, 55-75.
[2] Drakopoulos, F., & Verhulst, S. (2023). A neural-network framework for the design of individualised hearing-loss compensation. IEEE/ACM Transactions on Audio, Speech, and Language Processing.
[3] Garofolo, J. S., Lamel, L. F., Fisher, W. M., Fiscus, J. G., & Pallett, D. S. (1993). DARPA TIMIT acoustic-phonetic continuous speech corpus CD-ROM. NIST speech disc 1-1.1. NASA STI/Recon Technical Report N, 93, 27403.
[4] Keidser, G., Dillon, H., Flax, M., Ching, T., & Brewer, S. (2011). The NAL-NL2 prescription procedure. Audiology Research, 1(1), e24.