Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/4466

Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials

File: stebbinshellwig.pdf
Size: 1.48 MB
Format: Adobe PDF

Item Summary

Title: Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials
Authors: Stebbins, Tonya N.; Hellwig, Birgit
Keywords: corpus; Sm’algyax; William Beynon; Tsimshian
Issue Date: 2010
Publisher: University of Hawai'i Press
Citation: Stebbins, Tonya N. and Birgit Hellwig. 2010. Principles and Practicalities of Corpus Design in Language Retrieval: Issues in the Digitization of the Beynon Corpus of Early Twentieth-Century Sm’algyax Materials. Language Documentation & Conservation 4. 34-59.
Abstract: This paper describes a pilot project to develop a machine-readable corpus of early twentieth-century Sm’algyax texts from a large collection of handwritten manuscripts collected by the Tsimshian ethnographer and chief William Beynon. The project seeks to ensure that the materials produced are maximally accessible to the Tsimshian community. It relates established principles for corpus design to practical issues in language retrieval, recognizing that the corpus will likely function as an intermediate stage between the original manuscripts and any language materials developed by the community. The paper is addressed primarily to linguists working on language retrieval projects but may also be of use to communities who are working with linguists, as it provides insight into the concerns and preoccupations that linguists bring to such tasks.
Sponsor: National Foreign Language Resource Center
Pages/Duration: 26 pages
URI/DOI: http://hdl.handle.net/10125/4466
ISSN: 1934-5275
Rights: Creative Commons Attribution Non-Commercial No Derivatives License
Appears in Collections: Volume 04: Language Documentation & Conservation


