Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/44831

TEI and the Mixtepec-Mixtec corpus: data integration, annotation and normalization of heterogeneous data for an under-resourced language

File | Size | Format
44831.mp3 | 41.85 MB | MP3
44831.pdf | 10.7 MB | Adobe PDF

Item Summary

Title: TEI and the Mixtepec-Mixtec corpus: data integration, annotation and normalization of heterogeneous data for an under-resourced language
Contributors: Bowers, Jack (speaker); Romary, Laurent (speaker)
Date Issued: 03 Mar 2019
Description: This paper presents our approaches to creating, editing, annotating and curating an extensible and reusable TEI corpus for Mixtepec-Mixtec. We cover issues particular to working with an under-resourced language and show how we integrate a variety of heterogeneous resources, normalize orthographic and phonetic data, and create searchable multi-layered annotations. (session 3.3.1)
URI: http://hdl.handle.net/10125/44831
Rights: Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported
Appears in Collections: 6th International Conference on Language Documentation and Conservation (ICLDC)

