Abstract
With the availability of voice-enabled devices such as smartphones, mental health disorders such as depression could be detected and treated earlier, particularly post-pandemic. Current methods extract features directly from audio signals. In this paper, two methods are used to enrich voice analysis for depression detection: the transformation of voice signals into a visibility graph, and natural language processing of the transcript text based on representation learning. The results of processing text and voice with different features are fused to produce the final class labels. Experimental evaluation on the DAIC-WOZ dataset suggests that integrating text-based voice classification with learning from low-level and graph-based voice signal features can improve the detection of mental disorders such as depression. Our text-based method achieves an F1-score of 72.7%, which is higher than the other single-modal scores. The fusion of all voice- and text-based prediction models achieves an F1-score of 82.4%, outperforming the other models.
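The abstract does not spell out how the visibility graph is built or how predictions are fused, so the following is only a minimal sketch under stated assumptions: it uses the standard natural visibility graph construction (two samples are linked when every sample between them lies strictly below the straight line joining them) and plain majority voting, suggested by the "Voting Ensemble" keyword. The function names `natural_visibility_graph`, `degree_sequence`, and `majority_vote` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def natural_visibility_graph(series):
    """Return the edge set of the natural visibility graph of a 1-D series.

    Assumption: the 'natural' visibility criterion is used; the paper may
    use a different variant (e.g. horizontal visibility).
    """
    n = len(series)
    edges = set()
    for i in range(n - 1):
        for j in range(i + 1, n):
            yi, yj = series[i], series[j]
            # Samples i and j see each other if every intermediate sample
            # lies strictly below the line through (i, yi) and (j, yj).
            if all(
                series[k] < yi + (yj - yi) * (k - i) / (j - i)
                for k in range(i + 1, j)
            ):
                edges.add((i, j))
    return edges


def degree_sequence(edges, n):
    """Node degrees of the graph: one simple graph-based feature vector."""
    degrees = [0] * n
    for i, j in edges:
        degrees[i] += 1
        degrees[j] += 1
    return degrees


def majority_vote(labels):
    """Hard-voting fusion of per-model class labels (assumed fusion rule)."""
    return Counter(labels).most_common(1)[0][0]


if __name__ == "__main__":
    frame = [1, 3, 2, 4]  # toy stand-in for a short window of a voice signal
    edges = natural_visibility_graph(frame)
    print(sorted(edges))
    print(degree_sequence(edges, len(frame)))
    print(majority_vote([1, 0, 1]))  # labels from three hypothetical models
```

The pairwise loop is cubic in the window length in the worst case, which is acceptable for short frames but would need a faster construction for full recordings. Graph statistics such as the degree sequence could then be fed to a classifier alongside low-level audio features.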
| Original language | English |
|---|---|
| Title of host publication | Intelligent Systems Design and Applications - 22nd International Conference on Intelligent Systems Design and Applications ISDA 2022 - Volume 1 |
| Editors | Ajith Abraham, Sabri Pllana, Gabriella Casalino, Kun Ma, Anu Bajaj |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 332-341 |
| Number of pages | 10 |
| ISBN (Print) | 9783031274398 |
| Publication status | Published - 2023 |
| Externally published | Yes |
| Event | 22nd International Conference on Intelligent Systems Design and Applications, ISDA 2022 - Virtual, Online. Duration: 12 Dec 2022 → 14 Dec 2022 |
Publication series
| Name | Lecture Notes in Networks and Systems |
|---|---|
| Volume | 646 LNNS |
| ISSN (Print) | 2367-3370 |
| ISSN (Electronic) | 2367-3389 |
Conference
| Conference | 22nd International Conference on Intelligent Systems Design and Applications, ISDA 2022 |
|---|---|
| City | Virtual, Online |
| Period | 12/12/22 → 14/12/22 |
Bibliographical note
Publisher Copyright: © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
Keywords
- Multi-modal depression detection
- Natural Language Processing
- Speech Signal Processing
- Voting Ensemble