Managing Fieldwork Data with Toolbox and the Natural Language Toolkit
Publisher
University of Hawai'i Press
Journal Name
Language Documentation & Conservation
Volume
1
Number/Issue
1
Starting Page
44
Ending Page
57
Abstract
This paper shows how fieldwork data can be managed using the program Toolbox together with the Natural Language Toolkit (NLTK) for the Python programming language. It provides background information about Toolbox and describes how it can be downloaded and installed. The basic functionality of the program for lexicons and texts is described, and its strengths and weaknesses are reviewed. Its underlying data format is briefly discussed, and the Toolbox-processing capabilities of NLTK are introduced, showing ways in which NLTK can be used to extend the functionality of Toolbox. This is illustrated with a few simple scripts that demonstrate basic data management tasks relevant to language documentation, such as printing out the contents of a lexicon as HTML.
Citation
Robinson, Stuart, Greg Aumann, and Steven Bird. 2007. Managing fieldwork data with Toolbox and the Natural Language Toolkit. Language Documentation & Conservation 1(1):44–57.
Type
Article
