Rescuing Legacy Data

dc.contributor.author: Schmidt, Thomas
dc.contributor.author: Bennöhr, Jasmine
dc.date.accessioned: 2008-06-27T22:46:51Z
dc.date.available: 2008-06-27T22:46:51Z
dc.date.issued: 2008-06
dc.description.abstract: This paper discusses issues that arise in the transformation of electronic language data from outdated to modern, sustainable formats. We first describe the problem and then present four different cases in which corpora of spoken language were converted from legacy formats to an XML-based representation. For each of the four cases, we describe the conversion workflow and discuss the difficulties that we had to overcome. Based on this experience, we formulate some more general observations about transforming legacy data and conclude with a set of best practice recommendations for a more sustainable handling of language corpora.
dc.description.sponsorship: National Foreign Language Resource Center
dc.identifier.citation: Schmidt, Thomas and Jasmine Bennöhr. 2008. Rescuing Legacy Data. Language Documentation & Conservation 2(1):109–129.
dc.identifier.issn: 1934-5275
dc.identifier.uri: http://hdl.handle.net/10125/1803
dc.language.iso: eng
dc.publisher: University of Hawai'i Press
dc.subject: electronic language data
dc.subject: XML
dc.subject: legacy data
dc.subject: corpus
dc.title: Rescuing Legacy Data
dc.type: Article
dc.type.dcmi: Text
prism.endingpage: 129
prism.number: 1
prism.publicationname: Language Documentation & Conservation
prism.startingpage: 109
prism.volume: 2

Files

Original bundle
Name: schmidt.pdf
Size: 3.12 MB
Format: Adobe Portable Document Format
Description: Better quality
Name: schmidtsmall.pdf
Size: 1.06 MB
Format: Adobe Portable Document Format
Description: Faster download
License bundle
Name: license.txt
Size: 146 B
Format: Item-specific license agreed upon to submission