Improving Speech-to-Text Transcription of Chinese Podcasts

Date
2022-08
Authors
Schmitt, Elliot
Publisher
Language Flagship Technology Innovation Center
Abstract
The internship was spent contributing to the Tech Center’s ongoing podcast project: an application that will collect language podcasts and extract information from them to help language learners and instructors find relevant language learning materials. The podcast audio files are transcribed by software, and most of the work of the internship was creating a markup tool that can improve the quality of the podcast transcriptions. The transcriptions were corrected by hand, and then a rule-based approach was developed to correct errors the transcription software consistently made. This adds a layer of polish to the project, yielding cleaner and more accurate English translations later on. The internship was largely exploratory, and the rest of the time was spent experimenting with other aspects of the project, such as researching lexical sophistication and how a metric for the sophistication of a text could be useful to teachers or learners trying to gather study materials.
Citation
Schmitt, E. (2022). Improving speech-to-text transcriptions of Chinese podcasts.
Format
Technical Report