Language Preservation 2.0: Crowdsourcing oral language documentation using mobile devices

File        Size      Format
25309.mp3   55.8 MB   MP3
25309.pdf   2.18 MB   Adobe PDF

Item Summary

Title: Language Preservation 2.0: Crowdsourcing oral language documentation using mobile devices
Authors: Bird, Steven
Contributors: Bird, Steven (speaker)
Date Issued: 12 Mar 2015
Description: In crude quantitative terms, Zipf’s law tells us that documentation of something as simple as word usage requires several million words of text or several hundred hours of speech, in a wide variety of genres and styles. The only way to achieve this goal for the majority of the world’s languages is to collect speech. Speech has the added advantage of providing information about phonetics, phonology, and prosody. Speech is also the primary register for dialogue, the most common form of language use. We argue that a combination of community outreach, crowdsourcing techniques, and mobile/web technologies makes it relatively easy to collect hundreds or thousands of hours of speech (Callison-Burch and Dredze, 2010; Hughes et al., 2010; Anon 2010).

On its own, this would leave us with a large archive of uninterpreted audio recordings and – once the languages are no longer spoken – an onerous and unverifiable decipherment problem. To avoid this problem and to ensure interpretability, there must be a documentary record that includes translation into a major language. We take as our guide the current typical practice in documentary linguistics, which is to record and report data as interlinear glossed text.
To this end, we add two layers of audio annotation to the primary recordings. The first layer is careful respeaking, or “audio transcription,” in which native speakers listen to the recordings phrase by phrase, and respeak each phrase slowly and carefully. The second layer is oral translation, in which bilingual speakers produce phrase-by-phrase interpretation of the original recordings into a widely spoken contact language such as English. This combination of respeaking and interpreting constitutes an “acoustic Rosetta stone” which, over time, will grow to a sufficient size to allow open-ended analysis of the language even when it is no longer spoken, including new methods for developing automatic phonetic recognizers and automatic translation systems (Liberman et al., 2013; Lee et al., 2013; Anon 2013).
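
As an illustration only, the phrase-by-phrase layering described above could be represented along the following lines. This is a minimal sketch and not the authors' implementation: the class names (AudioSpan, Phrase), the coverage function, and the file names are all hypothetical, and the assumption that each phrase is stored as a time span within an audio file is ours.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AudioSpan:
    """A stretch of audio inside one file, in seconds (hypothetical layout)."""
    path: str
    start: float
    end: float

    @property
    def duration(self) -> float:
        return self.end - self.start


@dataclass
class Phrase:
    """One phrase of the primary recording plus its two annotation layers."""
    original: AudioSpan
    respeaking: Optional[AudioSpan] = None   # layer 1: slow, careful respeaking
    translation: Optional[AudioSpan] = None  # layer 2: oral translation

    def is_interpretable(self) -> bool:
        # A phrase counts as interpretable once both layers are present.
        return self.respeaking is not None and self.translation is not None


def coverage(phrases: List[Phrase]) -> float:
    """Fraction of primary-recording time that carries both layers."""
    total = sum(p.original.duration for p in phrases)
    if total == 0:
        return 0.0
    done = sum(p.original.duration for p in phrases if p.is_interpretable())
    return done / total


# Made-up example: two phrases, only the first has both layers.
phrases = [
    Phrase(AudioSpan("story.wav", 0.0, 2.0),
           respeaking=AudioSpan("story_respoken.wav", 0.0, 4.0),
           translation=AudioSpan("story_english.wav", 0.0, 3.0)),
    Phrase(AudioSpan("story.wav", 2.0, 4.0),
           respeaking=AudioSpan("story_respoken.wav", 4.0, 8.0)),
]
print(coverage(phrases))
```

Keeping each layer as its own audio span, rather than as text, mirrors the idea that the annotations are themselves spoken; a coverage figure of this kind is one possible way to track how much of an archive has both layers.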

We will demonstrate a novel way to work with the speakers of endangered languages to collect these spoken language annotations and interlinear glossed texts on a large scale. Our approach addresses key issues in such areas as informed consent, quality control, workflow management, and the diverse technological situations of linguistic fieldwork. Our work promises to speed up the process of preserving the world’s languages and enable future study of these languages and access to knowledge that is captured in archived speech recordings.


Chris Callison-Burch and Mark Dredze. Creating speech and language data with Amazon’s Mechanical Turk. In Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon’s Mechanical Turk, pages 1–12. Association for Computational Linguistics, 2010. URL W10-0701.

Thad Hughes, Kaisuke Nakajima, Linne Ha, Atul Vasu, Pedro J. Moreno, and Mike LeBeau. Building transcribed speech corpora quickly and cheaply for many languages. In INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication Association, pages 1914–1917. ISCA, 2010.

Mark Liberman, Jiahong Yuan, Andreas Stolcke, Wen Wang, and Vikramjit Mitra. Using multiple versions of speech input in phone recognition. In ICASSP, 2013.

Chia-ying Lee, Yu Zhang, and James Glass. Joint learning of phonetic units and word pronunciations for ASR. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 182–192. Association for Computational Linguistics, 2013.
Rights: Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported
Appears in Collections: 4th International Conference on Language Documentation and Conservation (ICLDC)