Lang*Reg corpus: Documenting intraspeaker variation across languages and registers
Date
2025-03
Publisher
University of Hawaii Press
Volume
19
Starting Page
40
Ending Page
66
Abstract
We present a new design for multilingual corpora that capture intra-speaker variation across different situational-functional contexts, primarily in the spoken but also in the written mode, with the aim of enhancing language documentation efforts and resources. We illustrate how this comparative design and the resulting cross-culturally applicable data collection procedure have been successfully realized to build the Lang*Reg corpus (Adli et al. 2024), which currently includes five languages from three different language families: German, Persian, Southern Kurdish, Yucatec Maya and Javanese. For each of these languages, the same native speakers were asked to produce language in two types of activities that naturally occur in all the respective cultural contexts: telling a story to a friend, and talking freely with various interlocutors (friend, stranger, taxi driver, university professor). Moreover, our design included the storytelling in two modes, which allows for the comparison between the spoken and written modes of the same language user. We show how Lang*Reg provides a versatile resource for many purposes, in particular research into register, owing to the variety of situational contexts involved. We further show how German and Persian exploit the right periphery for different register distinctions, and we invite others to use this resource. At the same time, we show how the methodology developed can be used as a template to complement language resources by creating comparable intra-individual, multi-purpose data sets.
Citation
Lehmann, Nico, Vahid Mortezapour, Jozina Vander Klok, Zahra Farokhnejad, David Müller, Elisabeth Verhoeven, Aria Adli. 2025. Lang*Reg corpus: Documenting intra-speaker variation across languages and registers. Language Documentation & Conservation 19: 40-66.
Extent
27
Format
Article
Rights
Creative Commons Attribution-NonCommercial 4.0 International