Please use this identifier to cite or link to this item: http://hdl.handle.net/10125/42041

A semi-automated workflow for producing time-aligned intermediate tonal representations

File        Size       Format
42041.pdf   1.17 MB    Adobe PDF
42041.mp3   30.07 MB   MP3

Item Summary

Title: A semi-automated workflow for producing time-aligned intermediate tonal representations
Authors: McPherson, Laura
Grabowski, Emily
Issue Date: 02 Mar 2017
Description: Tone can be one of the most daunting aspects of a language to document, particularly at the beginning of a project. Even if tonal categories can be determined in elicitation contexts, tone in running speech and narratives, the core focus of documentary linguistics, is notoriously difficult even for seasoned tonal specialists. When tone marking is included, it records the researcher’s analytical conclusions (e.g. H, L) rather than the speech melody itself. To address these issues and facilitate the inclusion of objective, replicable tonal annotations in language documentation, we have developed a semi-automated computational workflow that takes raw phonetic data (fundamental frequency, or f0) and turns it into a more easily interpretable intermediate representation: a system of levels, already used as a descriptive lingua franca in reference grammars either numerically or using dashes (e.g. HL = 51 = ). The analyst can set the number of levels to reflect the desired level of detail. For instance, if the researcher suspects a two-tone system, she may set the number of levels to 4 or 5 to capture processes such as declination, downdrift or upstep. For more complex tone systems, more levels might be employed. The workflow begins in Praat by creating a TextGrid annotation delimiting the tonal spans to be analyzed; for the most part, these will be vowels or syllable rimes, so that only tone-bearing units are analyzed. A Praat script extracts f0 from multiple points in each span. Next, a Python script converts these f0 values to semitones based on the speaker’s mean f0, excluding outliers. The speaker’s pitch range, excluding these outliers, is then divided into the desired number of levels, and each point designated by the researcher within a syllable (e.g. 20% and 80%) is assigned the number of the level into which it falls.
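As a rough illustration of the semitone conversion and level assignment described above, the following Python sketch shows one possible reading of those steps. It is not the authors' script. The function names, the 2-standard-deviation outlier cutoff, the equal-width levels, and the choice to compute the reference mean over all input values are assumptions made for this example; the abstract states only that outliers are excluded and that the pitch range is divided into the desired number of levels. Input values are assumed to be voiced, positive f0 measurements in Hz.

```python
import math
import statistics


def hz_to_semitones(f0_hz, ref_hz):
    """Distance of f0_hz from ref_hz in semitones (12 per octave)."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def assign_levels(f0_values, n_levels=5, z_cutoff=2.0):
    """Map each f0 value (Hz) to a level 1..n_levels; outliers map to None.

    ASSUMPTION: an outlier is any point more than z_cutoff standard
    deviations from the mean in semitone space. The source does not
    say how outliers are identified.
    """
    # Semitones relative to the speaker's mean f0.
    mean_hz = statistics.mean(f0_values)
    st = [hz_to_semitones(v, mean_hz) for v in f0_values]

    # Outlier screening in semitone space (hypothetical criterion).
    mu = statistics.mean(st)
    sd = statistics.pstdev(st)
    kept = [s for s in st if sd == 0 or abs(s - mu) <= z_cutoff * sd]

    # Pitch range of the retained points, split into equal-width levels.
    lo, hi = min(kept), max(kept)
    span = hi - lo

    levels = []
    for s in st:
        if s < lo or s > hi:
            levels.append(None)      # outside the retained pitch range
        elif span == 0:
            levels.append(1)         # degenerate case: a single pitch value
        else:
            idx = int((s - lo) / span * n_levels) + 1
            levels.append(min(idx, n_levels))  # top edge goes in top level
    return levels


# Example call: two measurement points (e.g. 20% and 80%) from each of
# three hypothetical spans, flattened into one list for the speaker.
example = assign_levels([210.0, 180.0, 150.0, 155.0, 200.0, 140.0], n_levels=5)
```

Under these assumptions, level 1 is the bottom of the speaker's retained range and the highest level is the top, so a fall from the top to the bottom of a five-level range would come out as the pair 5, 1.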
The output of the Python script is a text file with time stamps for each annotation, which can be imported into ELAN, thus tying together searchable tonal information with the broader text transcription. In this talk, we describe the workflow and demonstrate some analytical uses for the tool, including comparison of elicitation and free speech, interspeaker variation, and first-look analysis of unanalyzed tonal data.
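The time-stamped text output mentioned above might be produced along the following lines. This is a minimal sketch only: the abstract does not specify the file layout, so the four tab-separated columns (tier, start, end, label), the tier name `tone-levels`, the `?` placeholder for excluded points, and the function name `write_tone_rows` are all assumptions for illustration. A tab-delimited file of this kind is assumed to be read through a delimited-text import in which the user maps columns to tiers and times.

```python
import csv
import io


def write_tone_rows(spans, out, tier="tone-levels"):
    """Write one tab-separated row per tonal span.

    spans: iterable of (start_seconds, end_seconds, levels), where
    levels is a list of ints (or None for excluded points).
    ASSUMPTION: the column order and tier name are hypothetical.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for start, end, levels in spans:
        # Concatenate levels into a label such as "51"; this is only
        # unambiguous while the number of levels stays below 10.
        label = "".join("?" if lv is None else str(lv) for lv in levels)
        writer.writerow([tier, f"{start:.3f}", f"{end:.3f}", label])


# Example call with two hypothetical spans, written to an in-memory buffer.
buffer = io.StringIO()
write_tone_rows([(0.120, 0.310, [5, 1]), (0.450, 0.600, [3, None])], buffer)
```

Keeping the label as a plain digit string means the annotations stay searchable as ordinary text once they sit alongside the transcription tiers.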
URI/DOI: http://hdl.handle.net/10125/42041
Appears in Collections: 5th International Conference on Language Documentation and Conservation (ICLDC)



Items in ScholarSpace are protected by copyright, with all rights reserved, unless otherwise indicated.