Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/42045

Two Main Contents in a Syllabus for Language Documentation: the Learning Data Models and an Assignation of Data Conversion

File: 42045-a.pdf, Size: 33.12 kB, Format: Adobe PDF
File: 42045-b.pdf, Size: 24.16 kB, Format: Adobe PDF

Item Summary

Title: Two Main Contents in a Syllabus for Language Documentation: the Learning Data Models and an Assignation of Data Conversion
Authors: Ohya, Kazushi
Issue Date: 02 Mar 2017
Description: Previously, in Author (2015a) at ICLDC4, we proposed a data format for time information and a data model for interlinear data; a strategy for language documentation along the same lines was proposed in Author (2016) at LREC 2016. Judging from this work, language documentation involves four fields: (a) learning to use tools, including devices and software; (b) learning the data models used in programming, which are grounded in computer science; (c) converting language data from one format to another when moving to the next phase of data handling; and (d) implementing data management software such as ELAN, FLEx, SQL engines, and web systems. The components actually chosen depend on a project's scale, duration, and members. However, if we suppose that the participants in language documentation are simply linguists and computer scientists and/or engineers, a clear picture of its structure emerges. Linguists are mainly engaged in (a), (b), and sometimes (c); computer scientists and/or engineers are engaged in (c) and (d). Linguists seem to be aware of the danger of depending on a single application, but few give it up. As the history of computer science and archival studies shows, there is no future in application dependence. Field (a) is therefore not appropriate as a field of language documentation. For linguists, learning data models is the only way to ensure the lifelong usability of their language data. However, there are not enough guidelines or textbooks on data models for linguists. In the poster we will propose a syllabus to fill this gap. For example, our syllabus covers two data models, a simple tree structure adopted in many markup languages and an interlinear data structure adopted in record-based data or in XML data formats such as those of ELAN and FLEx, together with their background theory and practical ways of manipulating them.
In terms of project architecture, (c) is the most problematic point because a theory of data conversion is missing: there is no format for defining the pattern of a final description or of converted data. In nine years of our projects, we have not found a good solution to this problem. In our syllabus, linguists learn the ideas of systems science and a way to divide roles with computer scientists and/or engineers.
References:
Durand, J. et al., eds., 2014, The Oxford Handbook of Corpus Phonology, Oxford University Press.
McCawley, J. D., 1981, Everything that Linguists Have Always Wanted to Know about Logic but Were Ashamed to Ask, University of Chicago Press.
[Author 2015a]
[Author 2015b]
[Author 2016]
Thieberger, N., ed., 2012, The Oxford Handbook of Linguistic Fieldwork, Oxford University Press.
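As an illustration of the two data models named in the description, the following is a minimal Python sketch, not the author's syllabus material. It assumes a record-based interlinear unit with one-to-one aligned word and gloss tiers, and converts it to and from a simple tree. The class name `Interlinear`, the functions `to_tree` and `from_tree`, the element names `unit`, `w`, and `translation`, and the millisecond time fields are all hypothetical; they do not reproduce the ELAN or FLEx formats.

```python
from dataclasses import dataclass
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class Interlinear:
    """A record-based interlinear unit (hypothetical layout, for illustration)."""
    start_ms: int        # assumed time information, in milliseconds
    end_ms: int
    words: List[str]     # transcription tier
    glosses: List[str]   # gloss tier, aligned one-to-one with words
    translation: str     # free translation of the whole unit


def to_tree(unit: Interlinear) -> ET.Element:
    """Convert the aligned tiers into a simple tree with hypothetical element names."""
    if len(unit.words) != len(unit.glosses):
        raise ValueError("words and glosses must align one-to-one")
    root = ET.Element("unit", start=str(unit.start_ms), end=str(unit.end_ms))
    for word, gloss in zip(unit.words, unit.glosses):
        # Each word becomes a child node; its gloss is stored as an attribute.
        node = ET.SubElement(root, "w", gloss=gloss)
        node.text = word
    ET.SubElement(root, "translation").text = unit.translation
    return root


def from_tree(root: ET.Element) -> Interlinear:
    """Convert the tree back into the record-based form."""
    nodes = root.findall("w")
    return Interlinear(
        start_ms=int(root.get("start", "0")),
        end_ms=int(root.get("end", "0")),
        words=[n.text or "" for n in nodes],
        glosses=[n.get("gloss", "") for n in nodes],
        translation=root.findtext("translation", default=""),
    )


# Placeholder data only; no real language material is implied.
example = Interlinear(0, 1200, ["word1", "word2"], ["gloss1", "gloss2"], "free translation")
print(ET.tostring(to_tree(example), encoding="unicode"))
```

In this sketch the conversion in each direction is an explicit function, so the tree and the record can be checked against each other by a round trip. A real project would still need an agreed description of the target format, which is the gap the description identifies for (c).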
URI/DOI: http://hdl.handle.net/10125/42045
Appears in Collections: 5th International Conference on Language Documentation and Conservation (ICLDC)



Items in ScholarSpace are protected by copyright, with all rights reserved, unless otherwise indicated.