Evaluating cross-linguistic forced alignment of conversational data in north Australian Kriol, an under-resourced language

dc.contributor.author Jones, Caroline
dc.contributor.author Li, Weicong
dc.contributor.author Almeida, Andre
dc.contributor.author German, Amit
dc.date.accessioned 2019-06-19T19:36:49Z
dc.date.available 2019-06-19T19:36:49Z
dc.date.issued 2019-06
dc.description.abstract Speech technology is transforming language documentation; acoustic models trained on “small” languages are now technically feasible. At the same time, forced alignment built for major world languages has matured and now offers ease of use through web interfaces requiring low technical expertise. This paper provides an updated and detailed evaluation of cross-linguistic forced alignment, the approach of using forced aligners untrained on the target language. We compare two options within MAUS (Munich Automatic Segmentation System), the language-independent mode versus a major world language system (here, Italian), on the same dataset, a comparison that has not previously been reported. The dataset comes from a corpus of adult conversational speech in Kriol, an English-based creole of northern Australia. The results of using MAUS Italian were better than those of using the language-independent mode and those reported in previous studies: the agreement rate at 20 ms was 72.1% at vowel onset and 57.2% at vowel offset. With completely misaligned tokens excluded, the overall agreement rate rose to 69.2% at 20 ms and over 90% at 50 ms. Most errors in the output SAMPA (Speech Assessment Methods Phonetic Alphabet) labels were resolvable with simple text replacements. These results offer updated benchmark data for an untrained, late-model forced alignment system. en_US
dc.description.sponsorship National Foreign Language Resource Center en_US
dc.format.extent 19 pages en_US
dc.identifier.citation Jones, Caroline, Weicong Li, Andre Almeida, & Amit German. 2019. Evaluating cross-linguistic forced alignment of conversational data in north Australian Kriol, an under-resourced language. Language Documentation & Conservation 13: 281-299. en_US
dc.identifier.issn 1934-5275
dc.identifier.uri http://hdl.handle.net/10125/24869
dc.language.iso en-US en_US
dc.publisher University of Hawaii Press en_US
dc.rights Creative Commons Attribution-NonCommercial 4.0 International en_US
dc.rights Attribution-NonCommercial 3.0 United States
dc.rights.uri http://creativecommons.org/licenses/by-nc/3.0/us/
dc.subject Australian Kriol en_US
dc.subject forced alignment en_US
dc.subject Creoles en_US
dc.subject language documentation en_US
dc.subject speech technology en_US
dc.title Evaluating cross-linguistic forced alignment of conversational data in north Australian Kriol, an under-resourced language en_US
dc.type Article en_US
dc.type.dcmi Text en_US
prism.endingpage 299 en_US
prism.publicationname Language Documentation & Conservation en_US
prism.startingpage 281 en_US
prism.volume 13 en_US
Files
Original bundle
Name: jones_et_al.pdf
Size: 1.3 MB
Format: Adobe Portable Document Format