Computing in the field: language modeling for elicitation and documentation of Shughni

Hippisley, Andrew
Stump, Gregory
Finkel, Raphael
We propose a way of enhancing computer-based approaches to language documentation by making use not only of the engineering capability of computing but also of its modeling capacity. Our proposal arises from a documentation pilot project in which we used computational modeling as an elicitation tool for documenting the complex verbal morphology of the underdocumented East Iranian Pamir language Shughni. Using the computable lexical knowledge representation language DATR (Evans & Gazdar 1996) and its variant KATR (Author et al. 2002), we wrote a theory of a fragment of the Shughni verb system based on what little we knew about the language. We then presented its theorems to our group of Shughni consultants, refined the model on the basis of their responses, consulted them on the new theorems, and so on to the next refinement. Cycling through these steps allowed us to refine our model and so led to a more accurate account of the data. Equally importantly, this method gave us an automated ‘questionnaire generator’, i.e. the model's theorems. This provided not only elicitation queries that, given enough time, we might have thought of ourselves, but also queries that might never have occurred to us. Both types of query were available to us precisely because our understanding of the grammar was formal and computationally implemented, and could therefore generate theorems automatically.
Computing plays a key language engineering role in language documentation and in making it accessible to a wider audience, from standard mark-up of data to its storage in a relational database for query-based retrieval. But computing serves a second purpose for linguists, that of language modeling: this is “the instrumental use of computation in the pursuit of linguistic goals” (Thompson 1983: 23). As we develop new methods for documentation, we need to explore the possibility of harnessing this second, language-modeling capacity of computing.
We demonstrate through our work on Shughni that computer modeling can furnish the field-worker with elicitation tasks whose results feed into an enhanced understanding of the data, which in turn shows the path to the next stage of elicitation. The ultimate result is a well-informed and robust account of the data that is already digitized and therefore exchangeable. Advances in technology, such as palm-held computers, mean that an automated model-theorem-refinement method is both a practical and a potentially highly valuable addition to the field-worker’s toolkit, both in the field and back in the lab.
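
The model-theorem-refinement cycle described above can be illustrated with a minimal sketch. This is not the DATR/KATR theory used in the project: it is a toy Python stand-in in which the most specific matching rule wins, loosely imitating default inheritance. The feature names (`T1`, `T2`), the suffixes (`-a`, `-b`), the stem `STEM`, and all function names are hypothetical placeholders, not Shughni data.

```python
from itertools import product

# Hypothetical feature inventory (placeholders, not actual Shughni categories).
FEATURES = {
    "tense": ["T1", "T2"],
    "person": ["1", "2", "3"],
    "number": ["sg", "pl"],
}

# Placeholder realization rules: (conditions, suffix).
# The rule with the most conditions that all match a cell wins,
# a crude stand-in for default inheritance with overrides.
RULES = [
    ({}, ""),                                                # default: bare stem
    ({"tense": "T2"}, "-a"),                                 # override for T2
    ({"tense": "T2", "person": "3", "number": "sg"}, "-b"),  # narrower override
]


def realize(stem, cell):
    """Return the form the model predicts for one paradigm cell."""
    matching = [
        (cond, suffix)
        for cond, suffix in RULES
        if all(cell[key] == value for key, value in cond.items())
    ]
    _, suffix = max(matching, key=lambda rule: len(rule[0]))
    return stem + suffix


def theorems(stem):
    """Yield (cell, predicted form) for every cell in the paradigm."""
    names = list(FEATURES)
    for values in product(*(FEATURES[name] for name in names)):
        cell = dict(zip(names, values))
        yield cell, realize(stem, cell)


def questionnaire(stem):
    """Turn the model's predictions into elicitation prompts."""
    return [
        f"{' '.join(cell.values())}: is '{form}' acceptable?"
        for cell, form in theorems(stem)
    ]


def mismatches(stem, responses):
    """Compare predictions with consultant responses.

    `responses` maps a cell, written as a tuple of feature values,
    to the form the consultant actually gave.
    """
    found = []
    for cell, predicted in theorems(stem):
        key = tuple(cell.values())
        attested = responses.get(key)
        if attested is not None and attested != predicted:
            found.append((key, predicted, attested))
    return found
```

In this sketch, `questionnaire("STEM")` enumerates every cell of the paradigm, including combinations a field-worker might not think to ask about, and `mismatches` lists the cells where the consultants' forms differ from the predictions. Each mismatch suggests a rule to add or revise in `RULES` before the next round of elicitation.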
Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported