Intonation Units and “Sentences” in ELAN and Toolbox

Sebastian Drude
Han Sloetjes
04 Mar 2017
Description: The very first step in creating annotations in language documentation is to segment the recording, in preparation for the subsequent basic annotation: adding a transcription and a translation. ELAN is one of the most widely used tools for these first steps, while Toolbox is still often chosen for the following step of annotation, basic glossing (splitting word forms into morphs and glossing each morph while building up a lexical database of morphemes and stems/words). One question that continues to cause difficulties and debate is what the basic units are into which a recording should be segmented: intonation units (= intonation[al]/prosodic units) or (larger) “sentences”. N. Himmelmann, for one (p.c.), argues that in language documentation the orality of speech should take primacy and that intonation units should therefore be the basic segments for annotation, as is often done in discourse and conversation analysis. It is sometimes even asked whether “sentences” are a phenomenon of written language only, one that often fails to apply to non-written languages.
This paper pursues two aims. First, it argues and illustrates, on the basis of the language that author A is investigating, that “sentences” (or more precisely: syntactic units, which are often larger than intonation units) not only exist in languages without a written tradition, but should be the very basis for any (morpho)syntactic analysis. At the same time, recognizing the importance of intonation units, the authors propose that both kinds of unit should be annotated in language documentation, and that from a methodological point of view it makes good sense to start by segmenting into intonation units.
This calls for an efficient workflow that (a) avoids duplicate segmenting and annotating in ELAN, and (b) includes a solid round-trip configuration for exporting basic annotation from ELAN to Toolbox, where basic glossing is done, and importing the result back from Toolbox into ELAN. The second aim of this talk is to present such a workflow for segmenting, transcribing, translating and glossing. The workflow, which has been developed over years of practicing and teaching language documentation, is illustrated step by step, and the recommended settings for ELAN and Toolbox are presented. Without much additional effort, one obtains documentation in a format that promises to be a good basis for both discourse and grammatical analysis.
References
Michael McCarthy and Ronald Carter (2001): Ten criteria for a spoken grammar. In: E. Hinkel and S. Fotos (eds.), New Perspectives on Grammar Teaching in Second Language Classrooms. Mahwah, NJ: Lawrence Erlbaum Associates, 51-75.
Ronald Carter and Michael J. McCarthy (1995): Grammar and the spoken language. Applied Linguistics 16 (2): 141-58.
Liesbeth Degand and Anne Catherine Simon (2009): On identifying basic discourse units in speech: theoretical and empirical issues. Discours 4, special issue: Linearization and Segmentation in Discourse.
ELAN: a professional tool for the creation of complex annotations on video and audio resources; release 4.9.4, May 19, 2016; Max Planck Institute for Psycholinguistics, The Language Archive, Nijmegen, The Netherlands.
Nikolaus P. Himmelmann, Meytal Sandler, Jan Strunk and Volker Unterladstetter (submitted): “On the robustness of intonational phrases in spontaneous speech – a crosslinguistic interrater study”.
Nikolaus P. Himmelmann (p.c.): “Prosody in language documentation: Taking spoken language seriously”. Talk given in similar form on several occasions, most recently at the Summer School for Digital Humanities and Language Documentation, Batumi, Georgia, August 2016.
Shlomo Izre'el (2005): Intonation Units and the Structure of Spontaneous Spoken Language: A View from Hebrew. In: Cyril Auran, Roxanne Bertrand, Catherine Chanet, Annie Colas, Albert Di Cristo, Cristel Portes, Alain Reynier and Monique Vion (eds.), Proceedings of the IDP05 International Symposium on Discourse-Prosody Interfaces.
Toolbox: The Field Linguist’s Toolbox, current version 1.5.8, released February 2010, SIL International.
Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., Author B. (2006): ELAN: a Professional Framework for Multimodality Research. In: Proceedings of LREC 2006, Fifth International Conference on Language Resources and Evaluation.
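As a rough illustration of the two-level segmentation proposed in the abstract, the following Python sketch groups time-aligned intonation units into larger syntactic units, so that the recording itself is segmented only once. It is not the authors' workflow and does not read or write ELAN or Toolbox files; the `Segment` class, the `merge_into_syntactic_units` function, the boundary flags and the sample data are all hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """A time-aligned stretch of speech (hypothetical structure)."""
    start_ms: int
    end_ms: int
    text: str


def merge_into_syntactic_units(units: List[Segment],
                               ends_unit: List[bool]) -> List[Segment]:
    """Group consecutive intonation units into larger syntactic units.

    ends_unit[i] is True when intonation unit i closes a syntactic unit.
    These flags stand in for the annotator's manual judgement; they are
    an assumption of this sketch, not an ELAN or Toolbox feature.
    """
    if len(units) != len(ends_unit):
        raise ValueError("need exactly one boundary flag per intonation unit")

    merged: List[Segment] = []
    group: List[Segment] = []
    for unit, is_last in zip(units, ends_unit):
        group.append(unit)
        if is_last:
            # The syntactic unit inherits its time span from the first
            # and last intonation unit it contains.
            merged.append(Segment(group[0].start_ms, group[-1].end_ms,
                                  " ".join(u.text for u in group)))
            group = []
    if group:
        # Trailing intonation units without a closing flag form a final unit.
        merged.append(Segment(group[0].start_ms, group[-1].end_ms,
                              " ".join(u.text for u in group)))
    return merged


if __name__ == "__main__":
    # Placeholder transcriptions; real data would come from the recording.
    intonation_units = [
        Segment(0, 1200, "iu-1"),
        Segment(1200, 2500, "iu-2"),
        Segment(2500, 4000, "iu-3"),
    ]
    for s in merge_into_syntactic_units(intonation_units, [False, True, True]):
        print(s.start_ms, s.end_ms, s.text)
```

Because the larger units reuse the time boundaries of the intonation units, no second pass of manual segmentation is needed in this sketch; only the grouping decision is added.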
