Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/25255

Endangered language sound documentation and audio processing in the cloud

File         Size       Format
25255.mp3    55.01 MB   MP3
25255.pdf    1.28 MB    Adobe PDF

Item Summary

Title: Endangered language sound documentation and audio processing in the cloud
Issue Date: 12 Mar 2015
Description: Endangered language documentation places linguists in a race against time. Compared with pre-digital technology, recent digital technology provides fast, convenient recording devices and data processing software (e.g., Praat, ELAN). However, these tools still rely largely on manual data processing to make digital search possible. For example, to collect sound segments containing a certain phoneme for phonetics/phonology research, a researcher might search for them using transcripts or the marker function of the software. Transcribing, annotating, and/or marking sound files takes a lot of time, because entire sound files (generally large in size) have to be listened to all the way through, often multiple times. The system we introduce in our talk skips these time-consuming stages.

Our presentation introduces our project, PELDA (Platform for Endangered Language Documentation and Analysis), a one-stop cloud-based platform for sound documentation (http://peldaaudiosearch.azurewebsites.net/index.html). At this stage, we have developed and deployed an audio search prototype on the Microsoft Azure cloud platform. Users need no tool other than a web browser to submit a sound example. The system finds segments in the database that match the target sound. Currently, the prototype is developed using Blackfoot; it searches for one Blackfoot phoneme, the velar fricative /x/. This audio processing and retrieval model is expected to support a more general "Query-by-Example" mechanism so that other phonemes and various linguistic features can be searched for successfully. With the Azure cloud and its data management and version control capabilities, our one-stop platform has another significant merit: it supports collaborative projects.
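
The abstract does not say how PELDA matches a submitted example against the database, so the following is only a minimal sketch of the general query-by-example idea, not the project's implementation. It assumes that both the query and the stored recording have already been converted to per-frame feature vectors; the function names, the cosine-similarity scoring, the threshold, and the toy data are all hypothetical and chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_matches(query, recording, threshold=0.95):
    """Slide the query over the recording one frame at a time.

    Both arguments are lists of per-frame feature vectors (hypothetical
    features, not PELDA's). Returns (start_frame, score) pairs for every
    window whose mean frame-wise cosine similarity reaches the threshold.
    """
    n = len(query)
    if n == 0:
        return []
    hits = []
    for start in range(len(recording) - n + 1):
        window = recording[start:start + n]
        score = sum(cosine(q, w) for q, w in zip(query, window)) / n
        if score >= threshold:
            hits.append((start, score))
    return hits


# Toy data: two-dimensional "features" per frame, invented for illustration.
recording = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0], [1.0, 0.0]]
query = [[0.0, 1.0], [0.0, 1.0]]
matches = find_matches(query, recording)
```

In this toy run only the window starting at frame 2 resembles the query closely enough to be reported. A real system would likely need features and a matching strategy that tolerate differences in speaker and duration, which this sketch does not attempt.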

As shown in the computer science literature, cloud computing is an emerging computing paradigm that has been increasingly adopted in fields ranging from education to scientific applications, thanks to features such as elastic computing, customized services, centralized resource, data, and service management, cost reduction, and anywhere/anytime accessibility. Bringing these features to the field of language documentation is an exciting opportunity. In our presentation, we will describe how the system works, demonstrate what can be done, and explain how it contributes to documentation and how it will be expanded in the future.
URI/DOI: http://hdl.handle.net/10125/25255
Rights: Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported
Appears in Collections: 4th International Conference on Language Documentation and Conservation (ICLDC)



Items in ScholarSpace are protected by copyright, with all rights reserved, unless otherwise indicated.