Lang*Reg corpus: Documenting intraspeaker variation across languages and registers

Date

2025-03

Publisher

University of Hawaii Press

Volume

19

Starting Page

40

Ending Page

66

Abstract

We present a new design for multilingual corpora that capture intra-speaker variation across different situational-functional contexts, primarily in the spoken mode but also in the written mode, with the aim of enhancing language documentation efforts and resources. We illustrate how this comparative design and the resulting cross-culturally applicable data collection procedure have been successfully realized to build the Lang*Reg corpus (Adli et al. 2024), which currently includes five languages from three different language families: German, Persian, Southern Kurdish, Yucatec Maya, and Javanese. For each of these languages, the same native speakers were asked to produce language in two types of activities that occur naturally in all the respective cultural contexts: telling a story to a friend, and talking freely with various interlocutors (friend, stranger, taxi driver, university professor). Moreover, our design included the storytelling in two modes, which allows for a comparison between the spoken and written modes of the same language user. We show how Lang*Reg provides a versatile resource for many purposes, in particular research into register, owing to the variety of situational contexts involved. We also show how German and Persian exploit the right periphery for different register distinctions, and we invite others to use this resource. At the same time, we show how the methodology developed can be used as a template to complement language resources by creating comparable intra-individual, multi-purpose data sets.

Citation

Lehmann, Nico, Vahid Mortezapour, Jozina Vander Klok, Zahra Farokhnejad, David Müller, Elisabeth Verhoeven, Aria Adli. 2025. Lang*Reg corpus: Documenting intra-speaker variation across languages and registers. Language Documentation & Conservation 19: 40-66.

Extent

27

Format

Article

Rights

Creative Commons Attribution-NonCommercial 4.0 International
