Forced Alignment for Understudied Language Varieties: Testing Prosodylab-Aligner with Tongan Data

Date
2017-03-02
Authors
Johnson, Lisa
Di Paolo, Marianna
Bell, Adrian
Holt, Carter
Speaker
Johnson, Lisa
Di Paolo, Marianna
Bell, Adrian
Holt, Carter
Description
Linguists engaged in language documentation and sociolinguistics face similar problems when it comes to efficiently processing large corpora of recorded speech. Though field recordings can be collected efficiently, it may take months or years to process the audio for certain types of analysis. Besides transcription, phonetic analysis often requires the time-consuming alignment of transcription to audio. The expense of this process may limit both the questions researchers can explore and the amount of data they can analyze. Recent advances in speech recognition technology have led to the development of tools that automate the time alignment of transcriptions to audio (Evanini, Isard, and Liberman 2009; Goldman 2011; Kisler, Schiel, and Sloetjes 2012; Reddy and Stanford 2015; Rosenfelder 2013). Such automation promises to expedite the process of preparing data for acoustic analysis. Unfortunately, the benefits of auto-alignment have generally been available only to researchers studying majority languages like English, for which large corpora exist and for which acoustic models have been created by large-scale research projects or corporate entities. Prosodylab-Aligner (Gorman, Howell, and Wagner 2011), developed at McGill University and available free of charge, was designed specifically to facilitate automated alignment and segmentation for less-studied languages. It allows researchers to train their own acoustic models using the same audio files for which alignments will be created. Those models can then be used to create Praat TextGrids aligned to those recordings, with boundaries marked at both the word and segment level. Our study tests the use of Prosodylab-Aligner on Tongan field recordings. The results show that automated alignment of recordings of an understudied language is feasible for linguists without programming experience and is less time-consuming than traditional manual alignment.
For the benefit of others who may wish to use Prosodylab-Aligner for their own research data, the paper also reviews the software and outlines the steps required to install software components, prepare data files, train acoustic models, and create time-aligned TextGrids. It also provides tips and solutions to problems we encountered along the way. In addition, since field recordings often contain more background noise than the kinds of laboratory recordings Prosodylab-Aligner was designed to use, the paper presents an analysis (using PraatR; Albin 2014) of the relative costs and benefits of removing background noise for both training and alignment purposes.
References
Albin, Aaron L. 2014. "PraatR: An architecture for controlling the phonetics software 'Praat' with the R programming language." The Journal of the Acoustical Society of America 135 (4):2198-2199.
Evanini, Keelan, Stephen Isard, and Mark Liberman. 2009. "Automatic formant extraction for sociolinguistic analysis of large corpora." INTERSPEECH.
Goldman, Jean-Philippe. 2011. "EasyAlign: an automatic phonetic alignment tool under Praat." Interspeech-2011:3233-3236.
Gorman, Kyle, Jonathan Howell, and Michael Wagner. 2011. "Prosodylab-Aligner: A Tool for Forced Alignment of Laboratory Speech." Canadian Acoustics 39 (3):192-193.
Kisler, Thomas, Florian Schiel, and Han Sloetjes. 2012. "Signal processing via web services: the use case WebMAUS." Digital Humanities Conference 2012.
Reddy, Sravana, and James Stanford. 2015. "Toward completely automated vowel extraction: Introducing DARLA." Linguistics Vanguard.
Rosenfelder, Ingrid. 2013. "Forced Alignment & Vowel Extraction (FAVE): An online suite for automatic vowel analysis." University of Pennsylvania Linguistics Lab, Last Modified December 8, 2013, accessed November 26, 2015. http://fave.ling.upenn.edu/index.html.