Endangered language sound documentation and audio processing in the cloud

File       Size      Format
25255.mp3  55.01 MB  MP3
25255.pdf  1.28 MB   Adobe PDF

Item Summary

Title: Endangered language sound documentation and audio processing in the cloud
Authors: Chen, Min; Miyashita, Mizuki; Bezirganyan, Robert; Dong, Jingjing
Contributors: Chen, Min (speaker); Miyashita, Mizuki (speaker); Bezirganyan, Robert (speaker); Dong, Jingjing (speaker)
Date Issued: 12 Mar 2015
Description: Endangered language documentation places linguists in a race against time. Compared with pre-digital technology, recent digital technology provides fast, convenient recording devices and data processing software (e.g., Praat, ELAN). However, these tools still rely largely on manual data processing to make digital search possible. For example, to collect sound segments containing a certain phoneme for phonetics/phonology research, a researcher might search for them using transcripts or the marker function of the software. Transcribing, annotating, and/or marking sound files takes a great deal of time, because entire sound files (generally large in size) must be listened to, often multiple times. The system we introduce in our talk skips these time-consuming stages.

Our presentation introduces our project, PELDA (Platform for Endangered Language Documentation and Analysis), a one-stop cloud-based platform for sound documentation. At this stage, we have developed and deployed an audio search prototype on the Microsoft Azure cloud platform. Users need no tool other than a web browser to submit a sound example. The system finds segments in the database that match the target sound. Currently, the prototype is developed using Blackfoot; it searches for one Blackfoot phoneme, the velar fricative /x/. This audio processing and retrieval model is expected to support a more general "Query-by-Example" mechanism so that other phonemes and various linguistic features can be searched successfully. With the Azure cloud and its data management and version control capabilities, our one-stop platform has another significant merit: it supports collaborative projects.
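
The abstract does not say how the prototype matches a submitted sound against the database. As a rough illustration of the general "Query-by-Example" idea only, the sketch below slides a window over a recording and scores each window against the query with dynamic time warping (DTW), a common technique for comparing audio feature sequences. It is not confirmed to be PELDA's method. The function names, the per-frame feature vectors, and the threshold are all hypothetical; a real system would first extract acoustic features (for example, spectral frames) from the audio.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(query, segment):
    """DTW alignment cost between two frame sequences, divided by len(query) + len(segment)."""
    n, m = len(query), len(segment)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of the first i query frames with the first j segment frames
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(query[i - 1], segment[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def search(query, recording, threshold, hop=1):
    """Return (start_frame, distance) pairs for windows within the threshold, best match first."""
    width = len(query)
    hits = []
    for start in range(0, len(recording) - width + 1, hop):
        d = dtw_distance(query, recording[start:start + width])
        if d <= threshold:
            hits.append((start, d))
    return sorted(hits, key=lambda hit: hit[1])


# Toy data (hypothetical): one-dimensional "features" standing in for real acoustic frames.
query = [[0.0], [1.0], [0.0]]
recording = [[5.0], [5.0], [0.0], [1.0], [0.0], [5.0]]
hits = search(query, recording, threshold=0.1)
print(hits)
```

In this toy run the only window within the threshold starts at frame 2, where the recording repeats the query exactly. Fixed-width windows keep the sketch short; matching segments of varying duration would need variable window lengths or a subsequence variant of DTW.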

As the computer science literature shows, cloud computing is an emerging computing paradigm that has been increasingly adopted in fields ranging from education to scientific applications, owing to features such as elastic computing, customized services, centralized resource, data, and service management, cost reduction, and anywhere/anytime accessibility. Bringing these features to the field of language documentation is an exciting opportunity. In our presentation, we will describe how the system works, demonstrate what it can do, and explain how it contributes to documentation and how it will be expanded in the future.
Rights: Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported
Appears in Collections: 4th International Conference on Language Documentation and Conservation (ICLDC)
