Automatic Speech Recognition for Supporting Endangered Language Documentation

Prud’hommeaux, Emily; Jimerson, Robbie; Hatcher, Richard; Michelson, Karin

dc.contributor.author	Prud’hommeaux, Emily
dc.contributor.author	Jimerson, Robbie
dc.contributor.author	Hatcher, Richard
dc.contributor.author	Michelson, Karin
dc.date.accessioned	2021-12-01T05:51:25Z
dc.date.available	2021-12-01T05:51:25Z
dc.date.issued	2021-11
dc.description.abstract	Generating accurate word-level transcripts of recorded speech for language documentation is difficult and time-consuming, even for skilled speakers of the target language. Automatic speech recognition (ASR) has the potential to streamline transcription efforts for endangered language documentation, but the practical utility of ASR for this purpose has not been fully explored. In this paper, we present results of a study in which both linguists and community members, with varying levels of language proficiency, transcribe audio recordings of an endangered language under timed conditions with and without the assistance of ASR. We find that both time-to-transcribe and transcription error rates are significantly reduced when correcting ASR for language learners of all levels. Despite these improvements, most community members in our study express a preference for unassisted transcription, highlighting the need for developers to directly engage with stakeholders when designing and deploying technologies for supporting language documentation.
dc.description.sponsorship	National Foreign Language Resource Center
dc.format.extent	23
dc.identifier.citation	Prud'hommeaux, Emily, Robbie Jimerson, Richard Hatcher, Karin Michelson. 2021. Automatic Speech Recognition for Supporting Endangered Language Documentation. Language Documentation & Conservation 15: 491-513.
dc.identifier.issn	1934-5275
dc.identifier.uri	http://hdl.handle.net/10125/74666
dc.language.iso	en-US
dc.publisher	University of Hawaii Press
dc.rights	Creative Commons Attribution-NonCommercial 4.0 International
dc.rights	Attribution-NonCommercial 3.0 United States
dc.rights.uri	http://creativecommons.org/licenses/by-nc/3.0/us/
dc.subject	Automatic speech recognition
dc.subject	Endangered languages
dc.subject	Transcription
dc.title	Automatic Speech Recognition for Supporting Endangered Language Documentation
dc.type	Article
dc.type.dcmi	Text
prism.endingpage	513
prism.publicationname	Language Documentation & Conservation
prism.startingpage	491
prism.volume	15

Files

Original bundle

Name: Prudhommeaux_etal.pdf
Size: 557.98 KB
Format: Adobe Portable Document Format

Collections

Volume 15 : Language Documentation & Conservation