Improving Speech-to-Text Transcription of Chinese Podcasts

dc.contributor.author: Schmitt, Elliot
dc.date.accessioned: 2022-11-01T22:04:48Z
dc.date.available: 2022-11-01T22:04:48Z
dc.date.copyright: 2022
dc.date.issued: 2022-08
dc.description.abstract: The internship was spent contributing to the Tech Center’s ongoing podcast project: an application that will collect language podcasts and extract information from them to help language learners and instructors find relevant language learning materials. The podcast audio files are transcribed by software, and most of the internship's work went into creating a markup tool that can improve the quality of the podcast transcriptions. The transcriptions were corrected by hand, and then a rule-based approach was developed to correct errors the transcription software consistently made. This adds a layer of polish to the project, yielding cleaner and more accurate English translations later on. The internship was largely exploratory, and the rest of the time was spent experimenting with other aspects of the project, such as researching lexical sophistication and how a metric for the sophistication of a text could be useful information to teachers or learners trying to gather useful study materials.
dc.format: Technical Report
dc.identifier.citation: Schmitt, E. (2022). Improving speech-to-text transcriptions of Chinese podcasts.
dc.identifier.uri: https://hdl.handle.net/10125/104270
dc.publisher: Language Flagship Technology Innovation Center
dc.rights.license: Attribution-NonCommercial-ShareAlike 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/us/
dc.title: Improving Speech-to-Text Transcription of Chinese Podcasts
dcterms.type: Text

Files

Original bundle
Name: Improving_Speech_to_Text_Transcriptions_of_Chinese_Podcasts (1).pdf
Size: 185.52 KB
Format: Adobe Portable Document Format
