SVM-based evaluation of Thai tone imitations by Thai-naive Mandarin and Vietnamese speakers

Juqiang Chen, Tianyi Ni, Benjawan Kasisopa, Mark Antoniou, Catherine Best

Research output: Chapter in Book / Conference Paper › Conference Paper › peer-review

Abstract

Native listener judgements and acoustic comparisons are sensitive to deviations between non-native speech and native productions, but both have drawbacks and are inefficient for evaluating large databases. To probe whether Support Vector Machines (SVM) might offer an efficient alternative, we used three SVM models trained with native Thai lexical tones to evaluate new native stimuli and non-native imitations by Mandarin and Vietnamese speakers. The optimal SVM model categorized native tones accurately but showed lower accuracy with non-native imitations, like native judges do, thus confirming its sensitivity to deviations from native productions. Thai falling tone imitations yielded the lowest classification accuracy, indicating that both groups' imitations were constrained by their native falling tones. Thai rising tones were better recognized for Vietnamese than Mandarin imitators, reflecting differences between their native rising tones. Thus, SVM modeling may provide an effective alternative to traditional perceptual- or acoustic-based evaluations of non-native speech.

Original language: English
Title of host publication: Proceedings of 2021 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 14-17 December 2021, Tokyo, Japan
Publisher: IEEE
Pages: 926-931
Number of pages: 6
ISBN (Print): 9789881476890
Publication status: Published - 2021
Event: Asia-Pacific Signal and Information Processing Association. Annual Summit and Conference -
Duration: 14 Dec 2021 → …

