Lüdeling, Anke
2012-07-05
2012-07-05
2012-08
Lüdeling, Anke. 2012. A corpus linguistics perspective on language documentation, data, and challenge of small corpora. In Frank Seifart, Geoffrey Haig, Nikolaus P. Himmelmann, Dagmar Jung, Anna Margetts, and Paul Trilsbeek (eds). 2012. Potentials of Language Documentation: Methods, Analyses, and Utilization. 39-45. Honolulu: University of Hawai'i Press.
978-0-9856211-0-0
http://hdl.handle.net/10125/4514
This paper deals with issues of corpus design that might prove problematic for the study of under-resourced languages, e.g. in language documentation. It argues that it is not yet well understood which linguistic and extra-linguistic (predictor) variables cause linguistic variation (i.e. the response variable), which means that the scope of a linguistic finding cannot always be assessed. In order to deal with this problem, it is argued that we need a flexible corpus architecture with the option of adding meta-data to corpora/sub-corpora at any point in time.
Creative Commons Attribution Non-Commercial Share Alike License
A corpus linguistics perspective on language documentation, data, and challenge of small corpora