Foreign Accented Australian English dataset

Dataset

Description

Foreign accent is a distinct attribute of second language (L2) speech production. Foreign-accented speech is characterized by deviations from target-language pronunciation due to the influence of the learner's native language (L1) (Best, 1995; Best & Tyler, 2007; Flege, 1995), specifically due to differences in and contact between L1 and L2 phonology and phonotactics (i.e., consonant/vowel acoustics and articulation, syllable structure), as well as prosody (i.e., intonation, rhythm, stress, pitch). The degree or strength of a foreign accent varies among L2 learners and depends on a number of factors, such as the age of L2 acquisition (Piske, MacKay, & Flege, 2001), language experience (i.e., length of residence in an L2-speaking country: Bohn & Flege, 1990), amount of use (Flege, Frieda, & Nozawa, 1997), and motivation or language learning aptitude (Piske et al., 2001).

Automatic speech recognition (ASR) systems are predominantly trained on native speech data. Native speakers adapt flexibly to L2 speech despite variation in L2 speakers’ accents, proficiency, and fluency, but automatic recognition of foreign-accented speech degrades considerably in comparison to recognition of native speech (Derwing, Munro, & Carbonaro, 2000). Thus, the key challenges in ASR for accented speech are to ensure fast model adaptation on limited (and often homogeneous) speech data and to facilitate recognition of unseen (untrained) accents (He & Zhao, 2002).

Standard Australian English (AusE) is a distinct regional variety of the English language (Cox & Palethorpe, 2007). To enable accurate ASR of AusE speech, models need to be specifically trained on AusE speech data, in addition to American English and British English data (Chengalvarayan, 2001).

This dataset contains audio data for 226 speakers of Australian English with an Arabic accent (150 males and 76 females). This dataset contains sensitive information.
To discuss the data, please contact [email protected] (ORCID: 0000-0002-6178-3825).
Date made available: 14 Sept 2023
Publisher: Western Sydney University
Date of data production: 1 Dec 2018 - 31 Mar 2019
Geographical coverage: Bankstown and Liverpool, NSW
Geospatial polygon: 150.884383,-33.97278 150.884383,-33.90613 151.062863,-33.90613 151.062863,-33.97278 150.884383,-33.97278
