ScholarSpace
 


Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/1803

Title: Rescuing Legacy Data
Author(s): Schmidt, Thomas; Bennöhr, Jasmine
Keywords: electronic language data; XML; legacy data; corpus
Issue Date: Jun-2008
Publisher: University of Hawai'i Press
Citation: Schmidt, Thomas and Jasmine Bennöhr. 2008. Rescuing Legacy Data. Language Documentation & Conservation 2(1):109–129.
Abstract: This paper discusses issues that arise in the transformation of electronic language data from outdated to modern, sustainable formats. We first describe the problem and then present four different cases in which corpora of spoken language were converted from legacy formats to an XML-based representation. For each of the four cases, we describe the conversion workflow and discuss the difficulties that we had to overcome. Based on this experience, we formulate some more general observations about transforming legacy data and conclude with a set of best practice recommendations for a more sustainable handling of language corpora.
Sponsor(s): National Foreign Language Resource Center
URI: http://hdl.handle.net/10125/1803
ISSN: 1934-5275
Appears in Collections: Volume 02 Issue 1 : Language Documentation & Conservation

Files in This Item:

File              Description      Size     Format
schmidt.pdf       Better quality   3.19 MB  Adobe PDF
schmidtsmall.pdf  Faster download  1.08 MB  Adobe PDF


This item is protected by original copyright


This item is licensed under a Creative Commons License

Items in ScholarSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

 
