Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials

Date
2010
Authors
Stebbins, Tonya N.
Hellwig, Birgit
Publisher
University of Hawai'i Press
Volume
4
Starting Page
34
Ending Page
59
Abstract
This paper describes a pilot project to develop a machine-readable corpus of early twentieth-century Sm’algyax texts from a large collection of handwritten manuscripts collected by the Tsimshian ethnographer and chief William Beynon. The project seeks to ensure that the materials produced are maximally accessible to the Tsimshian community. It relates established principles for corpus design to practical issues in language retrieval, recognizing that the corpus will likely function as an intermediate stage between the original manuscripts and any language materials developed by the community. The paper is addressed primarily to linguists working on language retrieval projects but may also be of use to communities who are working with linguists, as it provides insight into the concerns and preoccupations that linguists bring to such tasks.
Keywords
corpus, Sm’algyax, William Beynon, Tsimshian
Citation
Stebbins, Tonya N. and Birgit Hellwig. 2010. Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials. Language Documentation & Conservation 4. 34-59.
Extent
26 pages
Rights
Creative Commons Attribution Non-Commercial No Derivatives License