A tool for sharing interlinearized and lexical data in diverse formats

Date
2017-03-02
Authors
Kaufman, Daniel
Finkel, Raphael
Speaker
Kaufman, Daniel
Finkel, Raphael
Description
The last decade has seen great advances in electronic tools for automated interlinearization, corpus creation and lexicon building (e.g. FieldWorks Language Explorer [FLEx]), as well as tools for creating time-aligned annotations (e.g. ELAN). However, methods for sharing these new data formats online lag far behind. While good options exist for lexical data (e.g. Webonary, Lexique Pro), there is no tool for turning a project created in FLEx into an online interlinearized corpus. We present a tool in development that does precisely that. With it, FLEx databases can be searched using regular expressions, and individual lines of a text can be linked to audio and video media. The tool can furthermore bring together linguistic data in diverse formats (from ELAN, Praat, FieldWorks, Toolbox, Shoebox) for a single query, and it allows queries over multiple language projects. We discuss the benefits of this program in relation to several ongoing fieldwork projects that are being used to evaluate it. These projects present several interesting challenges. In the first, we attempt to create a unified database from several centuries of documentation during which the language changed considerably. Similarly, in the second project we create a unified database for two lexically, syntactically and phonologically distinct dialects of the same language and show how an interlinearized database facilitates searching across dialects. Finally, in the third project, we show how video data can be integrated into an online FLEx database, a feature still lacking in the FLEx software itself. By way of conclusion, we show the audience how to upload their own data (either privately or publicly) and experiment with the tool’s features. Ultimately, the open source program will be available to anyone interested in hosting their own installation.