Automatic Speech Recognition for Supporting Endangered Language Documentation

Generating accurate word-level transcripts of recorded speech for language documentation is difficult and time-consuming, even for skilled speakers of the target language. Automatic speech recognition (ASR) has the potential to streamline transcription efforts for endangered language documentation, but the practical utility of ASR for this purpose has not been fully explored. In this paper, we present results of a study in which both linguists and community members, with varying levels of language proficiency, transcribe audio recordings of an endangered language under timed conditions with and without the assistance of ASR. We find that, for language learners of all levels, both time-to-transcribe and transcription error rates are significantly reduced when correcting ASR output. Despite these improvements, most community members in our study express a preference for unassisted transcription, highlighting the need for developers to directly engage with stakeholders when designing and deploying technologies for supporting language documentation.

Keywords

Automatic speech recognition, Endangered languages, Transcription

Citation

Prud'hommeaux, Emily, Robbie Jimerson, Richard Hatcher, Karin Michelson. 2021. Automatic Speech Recognition for Supporting Endangered Language Documentation. Language Documentation & Conservation 15: 491-513.

URI

http://hdl.handle.net/10125/74666

Extent

23 pages

Rights

Creative Commons Attribution-NonCommercial 4.0 International
Attribution-NonCommercial 3.0 United States

Collections

Volume 15 : Language Documentation & Conservation
