Improving Speech-to-Text Transcription of Chinese Podcasts

Date

2022-08

Publisher

Language Flagship Technology Innovation Center

Abstract

The internship was spent contributing to the Tech Center’s ongoing podcast project: an application that will collect language podcasts and extract information from them to help language learners and instructors find relevant language learning materials. The podcast audio files are transcribed by software, and most of the internship work went into creating a markup tool that can improve the quality of the podcast transcriptions. The transcriptions were corrected by hand, and then a rule-based approach was developed to correct errors the transcription software consistently made. This adds a layer of polish to the project, yielding cleaner and more accurate English translations later on. The internship was largely exploratory, and the rest of the time was spent experimenting with other aspects of the project, such as researching lexical sophistication and how a metric for the sophistication of a text could be useful to teachers or learners trying to gather study materials.

Citation

Schmitt, E. (2022). Improving speech-to-text transcriptions of Chinese podcasts.

Format

Technical Report
