“Data is Nice:” Theoretical and pedagogical implications of an Eastern Cherokee corpus

dc.contributor.author Frey, Benjamin
dc.date.accessioned 2020-05-18T23:32:58Z
dc.date.available 2020-05-18T23:32:58Z
dc.date.issued 2018
dc.description.abstract This paper serves as a proof of concept for the usefulness of corpus creation in Cherokee language revitalization. It details the initial collection of a digital corpus of Cherokee/English texts and enumerates how corpus material can augment contemporary language revitalization efforts rather than simply preserving language for future analysis. By collecting and analyzing corpus material, we can quickly create new classroom materials and media products, and answer deeper theoretical linguistic questions. With a large enough corpus, we can even implement machine translation systems to facilitate the production of new texts. Although the vast majority of print material in Cherokee is in the Western dialect, this corpus has focused on Eastern texts. Expanding the dataset to include both dialects, however, will allow for comparison and facilitate generalizations about the Cherokee language as a whole. A corpus of Cherokee data can answer second language learners’ questions about the structure of the language and provide patterns for more effective, targeted learning of Cherokee. It can also provide teachers with ready access to accurate representations of the language produced by native speakers. By combining documentation and technology, we can leverage the power of databases to expedite and facilitate language revitalization.
dc.description.sponsorship National Foreign Language Resource Center
dc.identifier.citation Frey, Benjamin. 2020. “Data is Nice:” Theoretical and pedagogical implications of an Eastern Cherokee corpus. In Silva, Wilson de Lima and Katherine J. Riestenberg. (Eds.) Collaborative Approaches to the Challenges of Language Documentation and Conservation: Selected papers from the 2018 Symposium on American Indian Languages (SAIL). Language Documentation & Conservation Special Publication no. 20 [PP 38-53] Honolulu: University of Hawai'i Press.
dc.identifier.isbn 978-0-9973295-8-2
dc.identifier.uri http://hdl.handle.net/10125/24931
dc.publisher University of Hawai'i Press
dc.relation.ispartof LD&C Special Publication
dc.rights Creative Commons Attribution Non-Commercial Share Alike License
dc.subject language documentation
dc.subject Cherokee
dc.subject language corpus
dc.subject machine translation
dc.subject language technology
dc.title “Data is Nice:” Theoretical and pedagogical implications of an Eastern Cherokee corpus