Accessing, managing, and mobilizing an ELAN-based language documentation corpus: the Kwaras and Namuti tools

Date
2019-02
Authors
Caballero, Gabriela
Carroll, Lucien
Mach, Kevin
Publisher
University of Hawaii Press
Abstract
This paper introduces Kwaras and Namuti, two new tools for building, managing, accessing, and mobilizing ELAN-based language documentation corpora. Kwaras integrates WAV files, ELAN annotations, and document metadata into a web-based corpus, allowing immediate access to annotations and recordings. Namuti builds on Kwaras, enabling different uses of language documentation products for different audiences and providing links from linguistic analyses to language documentation corpora. These tools have three main goals: (i) to facilitate the use of language documentation in linguistic analysis; (ii) to increase the transparency of documentation-based analyses by giving interested users full access to the data on which generalizations are based, along with contextualization of the projects that generated the data; and (iii) to enable uses of language corpora that may serve the interests of multiple stakeholders, including academic researchers and community members interested in language maintenance and revitalization. We provide a basic overview of how Kwaras and Namuti work, give instructions for downloading and using Kwaras, and discuss the uses it currently supports. This article also calls for increased collaboration among linguists, community members, language activists, and software developers to further develop these and similar resources.
Keywords
language documentation, corpora, technology, ELAN
Citation
Caballero, Gabriela, Lucien Carroll & Kevin Mach. 2019. Accessing, managing, and mobilizing an ELAN-based language documentation corpus: the Kwaras and Namuti tools. Language Documentation & Conservation 13: 63-82.