Improving Speech-to-Text Transcription of Chinese Podcasts
dc.contributor.author | Schmitt, Elliot | |
dc.date.accessioned | 2022-11-01T22:04:48Z | |
dc.date.available | 2022-11-01T22:04:48Z | |
dc.date.copyright | 2022 | |
dc.date.issued | 2022-08 | |
dc.description.abstract | The internship was spent contributing to the Tech Center’s ongoing podcast project, an application that will collect language podcasts and extract information from those podcasts that can help language learners and instructors better find relevant language learning materials. The podcast audio files are transcribed by software, and most of the work of the internship was creating a markup tool that can improve the quality of the podcast transcriptions. The transcriptions were corrected by hand, and then a rule-based approach was developed to correct errors the transcription software consistently made. This adds a layer of polish to the project, yielding cleaner and more accurate English translations later on. The internship was largely exploratory, and the rest of the time was spent experimenting with other aspects of the project, such as researching lexical sophistication and how a metric for the sophistication of a text could be useful information to teachers or learners trying to gather useful study materials. | |
dc.format | Technical Report | |
dc.identifier.citation | Schmitt, E. (2022). Improving speech-to-text transcriptions of Chinese podcasts. | |
dc.identifier.uri | https://hdl.handle.net/10125/104270 | |
dc.publisher | Language Flagship Technology Innovation Center | |
dc.rights.license | Attribution-NonCommercial-ShareAlike 3.0 United States | |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/us/ | |
dc.title | Improving Speech-to-Text Transcription of Chinese Podcasts | |
dcterms.type | Text |
Files
Original bundle
- Name: Improving_Speech_to_Text_Transcriptions_of_Chinese_Podcasts (1).pdf
- Size: 185.52 KB
- Format: Adobe Portable Document Format