Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/1803

Rescuing Legacy Data

File | Description | Size | Format
schmidt.pdf | Better quality | 3.19 MB | Adobe PDF
schmidtsmall.pdf | Faster download | 1.08 MB | Adobe PDF

Full Item Record

DC Field | Value | Language
dc.contributor.author | Schmidt, Thomas | en_US
dc.contributor.author | Bennöhr, Jasmine | en_US
dc.date.accessioned | 2008-06-27T22:46:51Z | -
dc.date.available | 2008-06-27T22:46:51Z | -
dc.date.issued | 2008-06 | en_US
dc.identifier.citation | Schmidt, Thomas and Jasmine Bennöhr. 2008. Rescuing Legacy Data. Language Documentation & Conservation 2(1):109–129. | en_US
dc.identifier.issn | 1934-5275 | en_US
dc.identifier.uri | http://hdl.handle.net/10125/1803 | -
dc.description.abstract | This paper discusses issues that arise in the transformation of electronic language data from outdated to modern, sustainable formats. We first describe the problem and then present four different cases in which corpora of spoken language were converted from legacy formats to an XML-based representation. For each of the four cases, we describe the conversion workflow and discuss the difficulties that we had to overcome. Based on this experience, we formulate some more general observations about transforming legacy data and conclude with a set of best practice recommendations for a more sustainable handling of language corpora. | en_US
dc.description.sponsorship | National Foreign Language Resource Center | en_US
dc.language.iso | eng | en_US
dc.publisher | University of Hawai'i Press | en_US
dc.subject | electronic language data | en_US
dc.subject | XML | en_US
dc.subject | legacy data | en_US
dc.subject | corpus | en_US
dc.title | Rescuing Legacy Data | en_US
dc.type | Article | en_US
dc.type.dcmi | Text | en_US
prism.publicationname | Language Documentation & Conservation | en_US
prism.volume | 2 | en_US
prism.number | 1 | en_US
Appears in Collections: Volume 02 Issue 1: Language Documentation & Conservation



This item is licensed under a Creative Commons License.