Automatic Speech Recognition for Supporting Endangered Language Documentation

dc.contributor.author: Prud’hommeaux, Emily
dc.contributor.author: Jimerson, Robbie
dc.contributor.author: Hatcher, Richard
dc.contributor.author: Michelson, Karin
dc.date.accessioned: 2021-12-01T05:51:25Z
dc.date.available: 2021-12-01T05:51:25Z
dc.date.issued: 2021-11
dc.description.abstract: Generating accurate word-level transcripts of recorded speech for language documentation is difficult and time-consuming, even for skilled speakers of the target language. Automatic speech recognition (ASR) has the potential to streamline transcription efforts for endangered language documentation, but the practical utility of ASR for this purpose has not been fully explored. In this paper, we present results of a study in which both linguists and community members, with varying levels of language proficiency, transcribe audio recordings of an endangered language under timed conditions with and without the assistance of ASR. We find that both time-to-transcribe and transcription error rates are significantly reduced when correcting ASR for language learners of all levels. Despite these improvements, most community members in our study express a preference for unassisted transcription, highlighting the need for developers to directly engage with stakeholders when designing and deploying technologies for supporting language documentation.
dc.description.sponsorship: National Foreign Language Resource Center
dc.format.extent: 23
dc.identifier.citation: Prud'hommeaux, Emily, Robbie Jimerson, Richard Hatcher, Karin Michelson. 2021. Automatic Speech Recognition for Supporting Endangered Language Documentation. Language Documentation & Conservation 15: 491-513.
dc.identifier.issn: 1934-5275
dc.identifier.uri: http://hdl.handle.net/10125/74666
dc.language.iso: en-US
dc.publisher: University of Hawaii Press
dc.rights: Creative Commons Attribution-NonCommercial 4.0 International
dc.rights: Attribution-NonCommercial 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc/3.0/us/
dc.subject: Automatic speech recognition
dc.subject: Endangered languages
dc.subject: Transcription
dc.title: Automatic Speech Recognition for Supporting Endangered Language Documentation
dc.type: Article
dc.type.dcmi: Text
prism.endingpage: 513
prism.publicationname: Language Documentation & Conservation
prism.startingpage: 491
prism.volume: 15

Files

Original bundle
Name: Prudhommeaux_etal.pdf
Size: 557.98 KB
Format: Adobe Portable Document Format

License bundle
Name: license.txt
Size: 1.73 KB
Format: Item-specific license agreed upon to submission