Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/4512

Language-specific encoding in endangered language corpora

File: 03gippert.pdf
Size: 675.59 kB
Format: Adobe PDF

Item Summary

Title: Language-specific encoding in endangered language corpora
Authors: Gippert, Jost
Date Issued: Aug 2012
Publisher: University of Hawai'i Press
Citation: Gippert, Jost. 2012. Language-specific encoding in endangered language corpora. In Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts, and Paul Trilsbeek (eds). 2012. Potentials of Language Documentation: Methods, Analyses, and Utilization. 25-31. Honolulu: University of Hawai'i Press.
Series: LD&C Special Publication
Abstract: The paper addresses problems of corpus building and retrieval resulting from code-switching, which is a characteristic feature of endangered language recordings. The typical appearance of code-switching phenomena is first outlined on the basis of data collected in the DoBeS ‘ECLinG’ project, which dealt with three endangered Caucasian languages spoken in Georgia: Tsova-Tush (Batsbi), Udi, and Svan. The problem of language-specific retrieval is illustrated with examples showing the usage of the word da in Tsova-Tush contexts, a homonym that represents either a native copula form (‘it is’) or the Georgian conjunction ‘and’. The subsequent section discusses the annotation required to distinguish automatically between the languages involved in code-switching, with a focus on the emerging ISO standard 639-6. It is argued that the fine-grained distinction of varieties and subvarieties and their interrelationship – as aimed at in this standard – requires a thorough reconsideration if it is to be applied in the markup of corpus data.
URI: http://hdl.handle.net/10125/4512
ISBN: 978-0-9856211-0-0
Rights: Creative Commons Attribution Non-Commercial Share Alike License
Appears in Collections: LD&C Special Publication No. 3: Potentials of Language Documentation: Methods, Analyses, and Utilization
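
The retrieval problem described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the annotation scheme used in the paper or in the ECLinG corpus: the Token class and the find function are hypothetical, the tokens "x", "y" and "z" are placeholders rather than real corpus words, and the language tags are ISO 639-3 codes (bbl for Tsova-Tush, kat for Georgian) chosen for the sketch instead of the ISO 639-6 codes the paper discusses.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Token:
    """One word of a transcribed utterance with a per-token language tag."""
    form: str
    lang: str  # assumed here: ISO 639-3 code ("bbl" = Tsova-Tush, "kat" = Georgian)


def find(tokens: List[Token], form: str, lang: Optional[str] = None) -> List[int]:
    """Return the positions of tokens with the given form.

    If lang is given, only tokens tagged with that language are returned.
    """
    return [
        i
        for i, tok in enumerate(tokens)
        if tok.form == form and (lang is None or tok.lang == lang)
    ]


# Placeholder utterance with code-switching; "x", "y", "z" are not real words.
utterance = [
    Token("x", "bbl"),
    Token("da", "bbl"),  # tagged as the native copula form
    Token("da", "kat"),  # tagged as the Georgian conjunction
    Token("y", "kat"),
    Token("z", "bbl"),
]

all_hits = find(utterance, "da")                 # untagged search mixes both readings
copula_hits = find(utterance, "da", lang="bbl")  # Tsova-Tush reading only
conj_hits = find(utterance, "da", lang="kat")    # Georgian reading only
```

A search on the form alone returns both occurrences of da, while a search restricted by language tag separates them. The separation is only as good as the tagging, which is why the granularity of the language codes matters for markup.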


