Noise-robust text-dependent speaker identification using cochlear models

Ying Xu, Travis Monk, Saeed Afshar, André van Schaik

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

One challenging issue in speaker identification (SID) is to achieve noise-robust performance. Humans can accurately identify speakers even in noisy environments. We can leverage our knowledge of the function and anatomy of the human auditory pathway to design SID systems that achieve better noise-robust performance than conventional approaches. We propose a text-dependent SID system based on a real-time cochlear model called the cascade of asymmetric resonators with fast-acting compression (CARFAC). We investigate the SID performance of CARFAC on signals corrupted by noise of various types and levels. We compare its performance with conventional auditory feature generators, including mel-frequency cepstral coefficients and frequency-domain linear prediction, as well as with another biologically inspired model called the auditory nerve model. We show that CARFAC outperforms the other approaches when signals are corrupted by noise. Our results are consistent across datasets, noise types and levels, speaking speeds, and back-end classifiers. We show that the noise-robust SID performance of CARFAC is largely due to its nonlinear processing of auditory input signals. Presumably, the human auditory system achieves noise-robust performance via inherent nonlinearities as well.
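
The abstract contrasts CARFAC with conventional front ends such as MFCC features computed from speech corrupted by additive noise at various levels. The following is a minimal illustrative sketch, not the authors' CARFAC implementation: it corrupts an utterance with white noise at a chosen signal-to-noise ratio and extracts an MFCC baseline of the kind the paper compares against. The file name, sampling rate, SNR value, and the use of librosa are assumptions made here for illustration only.

    # Illustrative sketch only: noise corruption at a target SNR plus a
    # conventional MFCC front end (one of the baselines the paper compares
    # CARFAC against). Not the authors' code; parameters are assumed values.
    import numpy as np
    import librosa

    def add_noise(signal, snr_db, seed=0):
        """Add white Gaussian noise so the result has the requested SNR in dB."""
        rng = np.random.default_rng(seed)
        noise = rng.standard_normal(len(signal))
        signal_power = np.mean(signal ** 2)
        noise_power = np.mean(noise ** 2)
        # Scale the noise so that signal_power / (scale^2 * noise_power) = 10^(snr_db/10)
        scale = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
        return signal + scale * noise

    # Load an utterance (hypothetical file name), corrupt it at 10 dB SNR,
    # and compute 13 MFCCs per frame as a conventional auditory feature baseline.
    y, sr = librosa.load("utterance.wav", sr=16000)
    noisy = add_noise(y, snr_db=10)
    mfcc = librosa.feature.mfcc(y=noisy, sr=sr, n_mfcc=13)  # shape: (13, n_frames)

In a pipeline of this kind, the resulting feature matrix would be fed to a back-end classifier; the paper's contribution is to replace such a front end with the CARFAC cochlear model, whose nonlinear processing is credited with the noise robustness reported above.
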
Original language: English
Pages (from-to): 500-516
Number of pages: 17
Journal: Journal of the Acoustical Society of America
Volume: 151
Issue number: 1
Publication status: Published - 2022

Open Access - Access Right Statement

©2022 Author(s). All article content, except where otherwise noted, is licensed under a Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
