Managing Fieldwork Data with Toolbox and the Natural Language Toolkit

Contributor

Advisor

Editor

Performer

Department

Instructor

Depositor

Speaker

Researcher

Consultant

Interviewer

Interviewee

Narrator

Transcriber

Annotator

Journal Title

Journal ISSN

Volume Title

Publisher

University of Hawai'i Press

Journal Name

Language Documentation & Conservation

Volume

1

Number/Issue

1

Starting Page

44

Ending Page

57

Alternative Title

Abstract

This paper shows how fieldwork data can be managed using the program Toolbox together with the Natural Language Toolkit (NLTK) for the Python programming language. It provides background information about Toolbox and describes how it can be downloaded and installed. The basic functionality of the program for lexicons and texts is described, and its strengths and weaknesses are reviewed. Its underlying data format is briefly discussed, and Toolbox processing capabilities of NLTK are introduced, showing ways in which it can be used to extend the functionality of Toolbox. This is illustrated with a few simple scripts that demonstrate basic data management tasks relevant to language documentation, such as printing out the contents of a lexicon as HTML.

Description

Citation

Robinson, Stuart, Greg Aumann, and Steven Bird. 2007. Managing fieldwork data with Toolbox and the Natural Language Toolkit. Language Documentation & Conservation 1(1):44–57.

DOI

Extent

Format

Type

Article

Geographic Location

Time Period

Related To

Related To (URI)

Table of Contents

Rights

Rights Holder

Catalog Record

Local Contexts

Email libraryada-l@lists.hawaii.edu if you need this content in ADA-compliant format.