A quantitative analysis of linguistic metadata

Hooshiar, Kavon

Files

25299.mp3 (58.08 MB)

25299.pdf (12 MB)

Date

2015-03-12

Authors

Hooshiar, Kavon

Speaker

Hooshiar, Kavon

Description

Documentation is both labor-intensive and time-consuming. Trained linguists cannot contribute enough hours to document every one of the world's languages to a fully satisfactory level. Therefore, we should use all the tools at our disposal to decide which languages to focus on. Obviously, community interest and relationships with community members and consultants will always play a major role in determining which languages are documented, as will the personal interests of linguists themselves. Beyond this, what factors should we use in choosing languages to document, and which factors have we been using? Advice on this topic varies: some say endangerment is an important factor (Krauss, 1992), while others say it should be ignored completely (Newman, 2013). Our major funding bodies seek to support linguists documenting endangered languages (Nathan, 2013). Meanwhile, our literature routinely assumes an inverse correlation between language endangerment and the extent of completed linguistic research (King, 2008), while rejecting an inverse correlation between language size and endangerment (Nettle, 2000). In both cases, the evidence, if any is given, consists of individual languages, which amount to too small a sample to be statistically meaningful. In short, linguists speculate about statistical relations among linguistic metadata even when no quantitative analysis has been carried out. Furthermore, linguists can be skeptical of quantitative methods and of those who use them, treating data mining and analytics as suspect methods used by outsiders who are not sensitive to the issues linguists face. In this context, we as a community need to explore quantitative analyses of our data and see where they lead us.
In an attempt to begin this process, I ask whether correlations exist among the metadata we collect, including language size, endangerment, and typological classification, as well as the extent of available documentation and pedagogical materials. With this work I hope to open a discussion about what the presence and absence of such correlations mean, what we should glean from this statistical analysis, and where we should go from there.

King, Kendall A. (2008). Sustaining linguistic diversity: Endangered and minority languages and language varieties. Georgetown University Press.

Krauss, M. (1992). The world's languages in crisis. Language, 68(1), 4-10.

Nathan, David. (2013). The Hans Rausing Endangered Languages Project. http://www.hrelp.org/aboutus/.

Nettle, D. (2000). Vanishing Voices: The Extinction of the World's Languages. Oxford University Press.

Newman, Paul. (2013). "The Law of Unintended Consequences: How the Endangered Languages Movement Undermines Field Linguistics as a Scientific Enterprise." Linguistics departmental seminar series. SOAS, London. October 15th.

URI

http://hdl.handle.net/10125/25299

Rights

Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported

Collections

4th International Conference on Language Documentation and Conservation (ICLDC)
