Forced Alignment for Understudied Language Varieties: Testing Prosodylab-Aligner with Tongan Data

dc.contributor.author Johnson, Lisa
dc.contributor.author Di Paolo, Marianna
dc.contributor.author Bell, Adrian
dc.contributor.author Holt, Carter
dc.contributor.speaker Johnson, Lisa
dc.contributor.speaker Di Paolo, Marianna
dc.contributor.speaker Bell, Adrian
dc.contributor.speaker Holt, Carter
dc.date.accessioned 2017-05-10T21:40:37Z
dc.date.available 2017-05-10T21:40:37Z
dc.date.begin 2017-03-02
dc.date.finish 2017-03-02
dc.date.issued 2017-03-02
dc.description Linguists engaged in language documentation and sociolinguistics face similar problems in efficiently processing large corpora of recorded speech. Though field recordings can be collected quickly, it may take months or years to process the audio for certain types of analysis. Besides transcription, phonetic analysis often requires the time-consuming alignment of transcription to audio. The expense of this process may limit both the questions researchers can explore and the amount of data they can analyze. Recent advances in speech recognition technology have led to the development of tools that automate time alignment of transcriptions to audio (Evanini, Isard, and Liberman 2009; Goldman 2011; Kisler, Schiel, and Sloetjes 2012; Reddy and Stanford 2015; Rosenfelder 2013). Such automation promises to expedite the preparation of data for acoustic analysis. Unfortunately, the benefits of auto-alignment have generally been available only to researchers studying majority languages like English, for which large corpora exist and for which acoustic models have been created by large-scale research projects or corporate entities. Prosodylab-Aligner (Gorman, Howell, and Wagner 2011), developed at McGill University and available free of charge, was designed specifically to facilitate automated alignment and segmentation for less-studied languages. It allows researchers to train their own acoustic models using the same audio files for which alignments will be created. Those models can then be used to create Praat TextGrids aligned to those recordings, with boundaries marked at both the word and segment level. Our study tests the use of Prosodylab-Aligner on Tongan field recordings. The results show that automated alignment of recordings of an understudied language is feasible for linguists without programming experience and is less time-consuming than traditional manual alignment.
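Training a model on the same recordings that will later be aligned presupposes that each audio file has a transcript and that every transcribed word has a pronunciation entry. The following is a minimal sketch of such a pre-training check, not the procedure used in the study. It assumes one plain-text transcript per recording with the same base name and a `.lab` extension, and a dictionary file with one entry per line in the form `WORD PHONE PHONE ...`; the function names `load_dictionary` and `check_corpus` are hypothetical.

```python
from pathlib import Path


def load_dictionary(dict_path):
    """Collect the headwords of a dictionary with lines like: WORD PHONE PHONE ..."""
    words = set()
    for line in Path(dict_path).read_text(encoding="utf-8").splitlines():
        fields = line.split()
        if fields:
            # Only the first field (the word) matters for this check.
            words.add(fields[0].upper())
    return words


def check_corpus(corpus_dir, dict_path, transcript_ext=".lab"):
    """Return (recordings lacking a transcript, words lacking a dictionary entry).

    Assumption: each recording NAME.wav has its transcript in NAME.lab,
    with words separated by whitespace.
    """
    known = load_dictionary(dict_path)
    missing_transcripts = []
    unknown_words = set()
    for wav in sorted(Path(corpus_dir).glob("*.wav")):
        transcript = wav.with_suffix(transcript_ext)
        if not transcript.exists():
            missing_transcripts.append(wav.name)
            continue
        for word in transcript.read_text(encoding="utf-8").split():
            if word.upper() not in known:
                unknown_words.add(word.upper())
    return missing_transcripts, sorted(unknown_words)
```

Calling `check_corpus("recordings", "tongan.dict")` (both paths are placeholders) returns two lists that can be reviewed before training: recordings that still need transcripts, and words that still need dictionary entries.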
For the benefit of others who may wish to use Prosodylab-Aligner for their own research data, the paper also reviews the software and outlines the steps required to install software components, prepare data files, train acoustic models, and create time-aligned TextGrids. It also provides tips and solutions to problems we encountered along the way. In addition, since field recordings often contain more background noise than the kinds of laboratory recordings Prosodylab-Aligner was designed to use, the paper presents an analysis (using PraatR; Albin 2014) of the relative costs and benefits of removing background noise for both training and alignment purposes.
References
Albin, Aaron L. 2014. "PraatR: An architecture for controlling the phonetics software “Praat” with the R programming language." The Journal of the Acoustical Society of America 135 (4):2198-2199.
Evanini, Keelan, Stephen Isard, and Mark Liberman. 2009. "Automatic formant extraction for sociolinguistic analysis of large corpora." INTERSPEECH.
Goldman, Jean-Philippe. 2011. "EasyAlign: an automatic phonetic alignment tool under Praat." Interspeech-2011:3233-3236.
Gorman, Kyle, Jonathan Howell, and Michael Wagner. 2011. "Prosodylab-Aligner: A Tool for Forced Alignment of Laboratory Speech." Canadian Acoustics 39 (3):192-193.
Kisler, Thomas, Florian Schiel, and Han Sloetjes. 2012. "Signal processing via web services: the use case WebMAUS." Digital Humanities Conference 2012.
Reddy, Sravana, and James Stanford. 2015. "Toward completely automated vowel extraction: Introducing DARLA." Linguistics Vanguard.
Rosenfelder, Ingrid. 2013. "Forced Alignment & Vowel Extraction (FAVE): An online suite for automatic vowel analysis." University of Pennsylvania Linguistics Lab, Last Modified December 8, 2013, accessed November 26, 2015. http://fave.ling.upenn.edu/index.html.
dc.identifier.uri http://hdl.handle.net/10125/42032
dc.title Forced Alignment for Understudied Language Varieties: Testing Prosodylab-Aligner with Tongan Data
dc.type.dcmi Text
dc.type.dcmi Sound
Files
Original bundle
Name:
42032.mp3
Size:
29.71 MB
Format:
Moving Picture Experts Group Layer-3 Audio
Name:
42032.pdf
Size:
4.22 MB
Format:
Adobe Portable Document Format