Description
The project involves collecting a child reading dataset for Xhosa, a South African Bantu language. The collected dataset is processed with the help of native speakers and used to train state-of-the-art machine learning models focussed on assessing whether a child has spoken a word correctly. The dataset contains 14,972 recordings averaging 4 seconds each. Each recording captures a child speaking a particular Xhosa word or letter in a classroom setting and is annotated by three independent markers.
| Date made available | 21 May 2025 |
|---|---|
| Publisher | Western Sydney University |
| Date of data production | 1 Feb 2024 - 30 Nov 2024 |
UN SDGs
This dataset contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 4 Quality Education