An end-to-end approach for child reading assessment in the Xhosa language

Sergio Chevtchenko, Nikhil Navas, Rafaella Vale, Franco Ubaudi, Sipumelele Lucwaba, Cally Ardington, Soheil Afshar, Mark Antoniou, Saeed Afshar

Research output: Chapter in Book / Conference Paper, Chapter, peer-review

Abstract

Child literacy is a strong predictor of outcomes at later stages of an individual’s life. This points to a need for targeted interventions in vulnerable low- and middle-income populations to help bridge the gap between literacy levels in these regions and those in high-income ones. Reading assessments are an important tool for measuring the effectiveness of such interventions, and AI can be a reliable and economical way to support educators with this task. Developing accurate automatic reading assessment systems for child speech in low-resource languages poses significant challenges due to limited data and the unique acoustic properties of children’s voices. This study focuses on Xhosa, a language spoken in South Africa, to advance child speech recognition capabilities. We present a novel dataset composed of child speech samples in Xhosa. The dataset is available upon request and contains ten words and letters, which are part of the Early Grade Reading Assessment (EGRA) system. Each recording is labeled by multiple markers using an online, cost-effective approach, and a subsample is validated by an independent EGRA reviewer. The dataset is evaluated with three fine-tuned state-of-the-art end-to-end models: wav2vec 2.0, HuBERT, and Whisper. The results indicate that the performance of these models can be significantly influenced by the amount and balance of the available training data, a key consideration for the cost-effective collection of large datasets. Furthermore, our experiments indicate that wav2vec 2.0 performance improves when the model is trained on multiple classes at a time, even when the number of available samples is constrained.

Original language: English
Title of host publication: Artificial Intelligence in Education: 26th International Conference, AIED 2025, Palermo, Italy, July 22-26, 2025, Proceedings, Part II
Editors: Alexandra I. Cristea, Erin Walker, Yu Lu, Olga C. Santos, Seiji Isotani
Place of Publication: Switzerland
Publisher: Springer
Pages: 106-119
Number of pages: 14
ISBN (Electronic): 9783031984174
ISBN (Print): 9783031984167
Publication status: Published - 2025
Event: International Conference on Artificial Intelligence in Education - Palermo, Italy
Duration: 22 Jul 2025 - 26 Jul 2025
Conference number: 26th

Publication series

Name: Lecture Notes in Computer Science
Volume: 15877
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Artificial Intelligence in Education
Abbreviated title: AIED
Country/Territory: Italy
City: Palermo
Period: 22/07/25 - 26/07/25

Keywords

  • Deep Learning
  • EGRA
  • Speech-to-Text
