Abstract
Automatic Speaker Identification (SID) is increasingly in demand for human-machine interaction in fields such as self-driving vehicles, smartphone and laptop access, and online security. These services become challenging when background noise is present. To achieve noise-robust performance in adverse conditions, we propose two front-end feature extraction algorithms based on the Auditory Nerve (AN) model. One algorithm uses the energies of the Inner Hair Cell (IHC) response from the AN model. The other uses the energies of the linear chirp filter from the AN model, followed by a cubic root and the Discrete Cosine Transform (DCT). We investigate which algorithm performs better in the SID task, using a modified Gammatone Filter Cepstral Coefficient (GFCC) as a reference. We tested these algorithms on text-dependent and text-independent speech under clean and noisy conditions. The results show that the proposed algorithms perform substantially better than the previously proposed algorithm using the AN model. The algorithms with conventional nonlinearities significantly outperform the IHC algorithm in the noise-robust SID task. However, applying conventional nonlinearities to the IHC algorithm significantly improves its SID performance.
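
The second front end in the abstract (filter energies, then a cubic root, then a DCT) can be illustrated with a minimal sketch. It is not the authors' implementation: the AN model and its linear chirp filter are not reproduced here, so the per-band energies are taken as given input. The function name `cube_root_dct_features`, the default of 13 coefficients, and the orthonormal DCT-II scaling are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def cube_root_dct_features(band_energies, num_coeffs=13):
    """Turn one frame of filterbank energies into cepstral-like coefficients.

    Assumptions (not stated in the abstract): orthonormal DCT-II scaling
    and a default of 13 retained coefficients.
    """
    n = len(band_energies)
    if n == 0:
        raise ValueError("need at least one band energy")
    if any(e < 0 for e in band_energies):
        raise ValueError("band energies must be non-negative")

    # Cubic-root compression, used here in place of the logarithm of MFCC.
    compressed = [e ** (1.0 / 3.0) for e in band_energies]

    # DCT-II across the frequency bands to decorrelate them.
    num_coeffs = min(num_coeffs, n)
    coeffs = []
    for k in range(num_coeffs):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            c * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
            for i, c in enumerate(compressed)
        )
        coeffs.append(scale * total)
    return coeffs


# Hypothetical frame with four equal band energies.
features = cube_root_dct_features([8.0, 8.0, 8.0, 8.0], num_coeffs=3)
```

With a flat spectrum, as in the hypothetical frame above, only the zeroth coefficient carries energy; the higher coefficients capture how the compressed spectrum varies across bands.
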
Original language | English |
---|---|
Title of host publication | Proceedings of the 6th International Conference on Frontiers of Signal Processing (ICFSP), 9-11 September 2021, Paris, France |
Publisher | IEEE |
Pages | 12-16 |
Number of pages | 5 |
ISBN (Print) | 9781665413459 |
Publication status | Published - 2021 |
Event | International Conference on Frontiers of Signal Processing - Duration: 9 Sept 2021 → … |
Conference
Conference | International Conference on Frontiers of Signal Processing |
---|---|
Period | 9/09/21 → … |