Exploring Bengali speech for gender classification: machine learning and deep learning approaches

Habiba Dewan Arpita, Abdullah Al Ryan, Md. Fahad Hossain, Md. Sadekur Rahman, Md Sajjad, Nuzhat Noor Islam Prova

Research output: Contribution to journal › Article › peer-review

4 Citations (Scopus)

Abstract

Speech conveys ideas clearly and powerfully. The human voice, rich in tone and emotion, plays a significant role in daily life. Vocal pitch varies by gender and is influenced by emotion and language. People perceive these nuances effortlessly, whereas machines often struggle to capture them. This project applies several machine learning (ML) and deep learning (DL) techniques to reliably determine a speaker's gender from a corpus of Bengali conversations. Our dataset comprises 3185 Bengali speech samples: 1100 from male speakers, 1035 from female speakers, and 1050 from speakers who identify as third gender. We extracted six audio features: spectral roll-off, spectral centroid, chroma-STFT, spectral bandwidth, zero crossing rate, and Mel-frequency cepstral coefficients (MFCC). We evaluated five ML algorithms on the dataset: extreme gradient boosting (XGBoost), support vector machines (SVM), K-nearest neighbors (KNN), decision tree classifier (DTC), and random forest (RF). To represent the DL approach, we also included a 1D convolutional neural network (CNN). The 1D CNN outperformed all other algorithms, achieving an accuracy of 99.37%.
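As an illustration of four of the spectral features named in the abstract, the following is a minimal NumPy sketch, not the authors' implementation. The function names (`frame_signal`, `spectral_features`), the frame and hop lengths, the Hann window, and the 0.85 roll-off fraction are all assumptions chosen for the example; MFCC and chroma-STFT are omitted because they need a mel or pitch-class filterbank.

```python
import numpy as np


def frame_signal(signal, frame_length=1024, hop_length=512):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_length) // hop_length
    starts = hop_length * np.arange(n_frames)
    return np.stack([signal[s:s + frame_length] for s in starts])


def spectral_features(signal, sample_rate, frame_length=1024, hop_length=512,
                      rolloff_fraction=0.85):
    """Return [zero crossing rate, centroid, bandwidth, roll-off],
    each averaged over all frames. Parameter defaults are assumptions."""
    frames = frame_signal(signal, frame_length, hop_length)

    # Zero crossing rate: fraction of adjacent sample pairs whose sign differs.
    negative = np.signbit(frames)
    zcr = np.mean(negative[:, 1:] != negative[:, :-1], axis=1)

    # Magnitude spectrum of each Hann-windowed frame.
    window = np.hanning(frame_length)
    magnitudes = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frame_length, d=1.0 / sample_rate)
    total = magnitudes.sum(axis=1) + 1e-10  # guard against silent frames

    # Spectral centroid: magnitude-weighted mean frequency.
    centroid = (magnitudes * freqs).sum(axis=1) / total

    # Spectral bandwidth: magnitude-weighted spread around the centroid.
    deviation = freqs[None, :] - centroid[:, None]
    bandwidth = np.sqrt((magnitudes * deviation ** 2).sum(axis=1) / total)

    # Spectral roll-off: lowest frequency below which the chosen fraction
    # of the total spectral magnitude is contained.
    cumulative = np.cumsum(magnitudes, axis=1)
    threshold = rolloff_fraction * cumulative[:, -1]
    rolloff_bin = np.argmax(cumulative >= threshold[:, None], axis=1)
    rolloff = freqs[rolloff_bin]

    return np.array([zcr.mean(), centroid.mean(),
                     bandwidth.mean(), rolloff.mean()])
```

Calling `spectral_features(samples, sample_rate)` on each recording would yield one fixed-length vector per sample, which could then be passed to any of the classifiers listed above.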
Original language: English
Pages (from-to): 328-337
Number of pages: 10
Journal: Bulletin of Electrical Engineering and Informatics
Volume: 14
Issue number: 1
Publication status: Published - Feb 2025
Externally published: Yes

Keywords

  • Deep learning
  • Gender classification
  • Machine learning
  • Mel-frequency cepstral coefficients
  • Speech recognition
