Volume 15 : Language Documentation & Conservation

    Between Stress and Tone: Acoustic Evidence of Word Prominence in Kurtöp
    (University of Hawaii Press, 2021-12) Hyslop, Gwendolyn
    Classic typologies within prosody tend to treat ‘tone’ languages as being diametrically opposed to ‘stress’ languages. However, Hyman (2006) highlights several languages that can have both, including Seneca, Fasu, and Copala Trique. As language documentation advances and our acoustic methodologies in the field are further refined, we have seen this list continue to expand. The aim in this article is to further this research trajectory by presenting the correlates of stress in Kurtöp, a tonal Tibeto-Burman language. Kurtöp has a word-level tone system, in which high versus low tone is required on the first syllable of every word. Stress, or prosodic word-level prominence, is realised on the first syllable of a root. Thus, stress and tone usually occur on the same syllable; they are only separated from each other when the negative prefix triggers movement of the tone to the initial syllable, leaving a stressed but toneless second syllable. Based on data collected in the field from three speakers, this article shows that the primary correlate of stress is duration, not pitch, intensity, or expansion of vowel space.
    Automatic Speech Recognition for Supporting Endangered Language Documentation
    (University of Hawaii Press, 2021-11) Prud’hommeaux, Emily ; Jimerson, Robbie ; Hatcher, Richard ; Michelson, Karin
    Generating accurate word-level transcripts of recorded speech for language documentation is difficult and time-consuming, even for skilled speakers of the target language. Automatic speech recognition (ASR) has the potential to streamline transcription efforts for endangered language documentation, but the practical utility of ASR for this purpose has not been fully explored. In this paper, we present results of a study in which both linguists and community members, with varying levels of language proficiency, transcribe audio recordings of an endangered language under timed conditions with and without the assistance of ASR. We find that both time-to-transcribe and transcription error rates are significantly reduced for language learners of all levels when they correct ASR output. Despite these improvements, most community members in our study express a preference for unassisted transcription, highlighting the need for developers to directly engage with stakeholders when designing and deploying technologies for supporting language documentation.
    Using YouTube as the Primary Transcription and Translation Platform for Remote Corpus Work
    (University of Hawaii Press, 2021-11) Rice, Alexander
    This paper presents a remote corpus work model that was developed between an outside researcher and community collaborator to continue transcription/translation work at a distance with previously collected material in response to the travel restrictions imposed by the coronavirus pandemic. The paper describes, in detail, the corpus work model, which is based on Ryan Pennington’s (2014) SayMore-FLEx-ELAN workflow and uses YouTube as the primary transcription/translation platform. The paper also describes the pros, cons, and specific situational context in which this model has proven useful so that other documentation teams in similar contexts might benefit. In addition to simply providing a method of doing corpus work remotely, the model also provides a way to maintain community capacity building at a distance.
    Mapping Urban Linguistic Diversity in New York City: Motives, Methods, Tools, and Outcomes
    (University of Hawaii Press, 2021-10) Perlin, Ross ; Kaufman, Daniel ; Turin, Mark ; Daurio, Maya ; Craig, Sienna ; Lampel, Jason
    Communities around the world have distinctive ways of representing language use across space and territory. The approach to and method of mapping languages that began with nineteenth-century European dialectology and colonial boundary making is one such way. Though practiced by relatively few linguists today, language mapping has developed considerably from its roots yet remains stymied by problems of ideology, representation, and data quality. In this paper, we argue that digital language mapping in hyperdiverse cities can both contribute to overcoming these problems and bring visibility and resources to communities using Indigenous, minority, and primarily oral languages. For these communities, official surveys like the census are often inadequate, leaving a gap that communities, linguists, and mapping experts working in partnership can address. Urban language mapping as a field should make space for Indigenous, minority, and primarily oral languages through geospatial visualization – in terms that the communities themselves recognize and with a public policy agenda. As a case study, we present our ongoing efforts with LANGUAGEMAP.NYC to map the most linguistically diverse urban center in the world: New York City.
    The Role of Input in Language Revitalization: The Case of Lexical Development
    (University of Hawaii Press, 2021-10) O’Grady, William ; Heaton, Raina ; Bulalang, Sharon ; King, Jeanette
    Immersion programs have long been considered the gold standard for school-based language revitalization, but surprisingly little attention has been paid to the quantity and quality of the input that they provide to young language learners. Drawing on new data from three such programs (Kaqchikel, Western Subanon, and Māori), each with its own particular motivation, objectives, and pedagogical practices, we examine a key component of this revitalization strategy, namely the amount and type of lexical input that children receive. Our findings include previously unknown facts about the number of words that children in these programs hear per hour, the ratio of word tokens to word types, and the skewed frequency distribution of the particular words that make up the input. We discuss our findings with reference both to comparable measures for first language acquisition in a home setting and to their relevance for pedagogical strategies in the classroom.
    Collaborative Fieldwork with Custom Mobile Apps
    (University of Hawaii Press, 2021-09) Bettinson, Mat ; Bird, Steven
    Mobile apps have the potential to support collaborative fieldwork even where web connectivity is unreliable or unavailable. To explore this potential, we developed portable network infrastructure and custom-made field tool apps. We deployed this solution in remote communities in the far north of Australia, in connection with co-located cooperative language work. Throughout a series of visits, we worked with community members to iterate the designs, optimising their suitability for the tasks and the context. We found that custom toolmaking provides the benefits of digital collaboration tailored for the specific needs of the environment and community. However, we argue that it is activity design – not the technology itself – that must be foregrounded, placing fieldworkers in the driving seat of innovation in digital fieldwork practice.
    The Conundrum of Friulian Language Vitality
    (University of Hawaii Press, 2021-09) De Cia, Simone
    Italy is characterized by a considerable amount of language variation. Only a few spoken vernaculars enjoy institutional support and are officially recognized as minority languages. Among these, Friulian is one of the largest in terms of number of speakers. In the past decade, the assessment of Friulian language vitality has yielded discordant conclusions. The aim of the present paper is to shed light on Friulian’s vitality by providing an informed discussion of the findings of the three most recent studies on the topic, namely De Cia (2013), Coluzzi (2015), and Melchior (2015). As a framework for discussion and means of synthesis among the different claims put forward on Friulian’s vitality, I will make reference to the nine factors of language vitality proposed by UNESCO (2003): each factor describes six possible sociolinguistic scenarios, which reflect six different levels of language vitality. Despite its official status and institutional support, Friulian lacks young native speakers and is used more and more infrequently in a limited number of social settings. The overall picture suggests that a marked process of language shift from Friulian to Italian is taking place. National and regional authorities should take immediate action to ensure the future survival of the minority language.
    The Utility of Orthographic Design for Different Users: The Case of the Approved Dagbani Orthography
    (University of Hawaii Press, 2021-09) Hudu, Fusheini Angulu
    This paper presents a critical assessment of the utility of the orthography of Dagbani (a Gur language of Ghana) in the documentation, linguistic research, and literacy acquisition of Dagbani. While written literature on Dagbani dates back over a century, it was only in 1997 that the only known documented orthographic rules of the language, the Approved Dagbani Orthography (ADO), were put together. Its stated goal was to address inconsistencies that existed in the orthographic rules at the time. It has since largely served this goal and has remained a resource for linguists engaged in language documentation and linguistic research as well as adult and young learners acquiring literacy in Dagbani in formal and informal settings. The paper discusses the influence of the orthography on the understanding of aspects of Dagbani linguistics and the challenges that remain with its use in modern-day multimodal communication. It shows that while the ADO has impacted literacy, documentation, and research on Dagbani linguistics, aspects of the design of the orthography have limited its potential impact and have given room for the emergence or maintenance of co-orthographic practices used for electronic communication and in the documentation of names in non-native official circles.
    Collecting and annotating corpora for three under-resourced languages of France: Methodological issues
    (University of Hawaii Press, 2021-06) Bernhard, Delphine ; Ligozat, Anne-Laure ; Bras, Myriam ; Martin, Fanny ; Vergez-Couret, Marianne ; Erhart, Pascale ; Sibille, Jean ; Todirascu, Amalia ; Boula de Mareüil, Philippe ; Huck, Dominique
    In contrast to French, the vast majority of regional languages of France can be considered under-resourced. In this article, we present the results of a research project aiming to produce annotated resources for three regional languages of France: Alsatian, Occitan, and Picard. These languages cover three different language families (Germanic and two subfamilies of Romance, Oïl and Oc languages) and different sociolinguistic situations. Yet, they all face issues common to many under-resourced languages: lack of human and financial resources and presence of geolinguistic variation. The originality of this project is that it brought together researchers from different fields (sociolinguistics, descriptive linguistics, dialectology, natural language processing, digital humanities) to work towards the common goal of developing annotated corpora for Alsatian, Occitan, and Picard. This created a favorable and stimulating working environment which could not have been achieved had different research groups worked independently, each on a single language. This article details the annotation process, with a special focus on the delimitation of the tokens and the definition of the part-of-speech tags.
    Virtual Frisian: A comparison of language use in North and West Frisian virtual communities
    (University of Hawaii Press, 2021-06) Belmar, Guillem ; Heyen, Hauke
    Social networking sites have become ubiquitous in our daily communicative exchanges, which has brought about new platforms of identification and opened possibilities that were out of reach for many minoritized communities. As they represent an increasing percentage of the media we consume, these sites have been considered crucial for revitalization processes. However, the growing importance of social media may also pose a problem for minoritized languages, as the need for communication with a wider audience seems to require the use of a language of wider communication. One way in which this apparent need for a global language can be avoided is by creating virtual communities where the minoritized languages can be used without competition, a virtual breathing space. This study analyzes language practices of eight communities: four North Frisian and four West Frisian virtual communities. The analysis focuses on the languages used in each community, the topics discussed, as well as the status of the minoritized language in the community. A total of 1,127 posts are analyzed to determine whether these communities function as breathing spaces, the factors that may foster or prevent the emergence of these spaces, and the similarities and differences between these two sociolinguistic contexts.