A quantitative analysis of linguistic metadata

Date
2015-03-12
Authors
Hooshiar, Kavon
Speaker
Hooshiar, Kavon
Description
Language documentation is both labor-intensive and time-consuming. The number of man-hours that trained linguists can contribute to documenting the world's languages is not sufficient to document every language to a fully satisfactory level. Therefore, we should use all the tools at our disposal to decide which languages to focus on. Obviously, community interest and relationships with community members and consultants will always play a major role in determining which languages are documented, as will the personal interests of linguists themselves. Beyond this, what factors should we use in choosing languages to document, and which factors have we been using? Advice on this topic varies: some say endangerment is an important factor (Krauss, 1992), while others say it should be ignored completely (Newman, 2013). Our major funding bodies seek to support linguists documenting endangered languages (Nathan, 2013). Meanwhile, our literature routinely assumes an inverse correlation between language endangerment and the extent of completed linguistic research (King, 2008), while rejecting an inverse correlation between language size and endangerment (Nettle, 2000). In both cases, if evidence is given, it takes the form of specific languages that amount to statistically insignificant data. In short, linguists speculate about statistical relations among linguistic metadata even when no quantitative analysis has been carried out. Furthermore, linguists can be skeptical of quantitative methods and those who use them, treating data mining and analytics as suspect methods used by outsiders who are not sensitive to the issues linguists face. In this context, we as a community need to explore quantitative analyses of our data and see where they lead us.
In an attempt to begin this process, I ask whether correlations exist among the metadata we collect, including language size, endangerment, and typological classification, as well as the extent of available documentation and pedagogical materials. With this work I hope to establish a discussion about what the presence and absence of such correlations mean, what we should glean from this statistical analysis, and where we should go from there.

King, K. A. (2008). Sustaining linguistic diversity: Endangered and minority languages and language varieties. Georgetown University Press.
Krauss, M. (1992). The world's languages in crisis. Language, 68(1), 4-10.
Nathan, D. (2013). The Hans Rausing Endangered Languages Project. http://www.hrelp.org/aboutus/
Nettle, D. (2000). Vanishing voices: The extinction of the world's languages. Oxford University Press.
Newman, P. (2013). The law of unintended consequences: How the endangered languages movement undermines field linguistics as a scientific enterprise. Linguistics departmental seminar series, SOAS, London, October 15.
Rights
Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported