Comparing language-specific and cross-language acoustic models for low-resource phonetic forced alignment
Publisher
University of Hawaii Press
Volume
19
Starting Page
201
Ending Page
223
Abstract
Phonetic forced alignment can greatly expedite spoken language analysis by providing automatic time alignments at the word and phone levels. In the case of low-resource languages, it remains an open question whether phone-level forced alignment will be more successful with a small language-specific acoustic model or a high-resource cross-language acoustic model. The present study directly compared the forced alignment performance of language-specific and cross-language acoustic models using the Urum and Evenki datasets from the DoReCo Corpus. We evaluated six language-specific acoustic models trained with 5, 10, 15, 20, 25, or approximately 70 minutes of language-specific speech data against four English-based cross-language acoustic models that differed in size and accent homogeneity (large Global English or homogeneous American English of varying data amounts). Acoustic models were developed or obtained from the Montreal Forced Aligner and evaluated against held-out manually aligned phone boundaries. Overall, the Global English model and the larger language-specific acoustic models were competitive with one another and outperformed the homogeneous cross-language and smaller language-specific acoustic models. From this analysis, we recommend that researchers use a language-specific model with at least 25 minutes of actual speech (not just recording duration) or a large, diverse cross-language acoustic model for low-resource forced alignment.
Citation
Chodroff, Eleanor, Emily P. Ahn, Hossep Dolatian. 2025. Comparing language-specific and cross-language acoustic models for low-resource phonetic forced alignment. Language Documentation & Conservation 19: 201-223.
Extent
23
Format
Article
Rights
Creative Commons Attribution-NonCommercial 4.0 International
