Understanding the role of text length, sample size and vocabulary size in determining text coverage

Date
2005-04
Authors
Chujo, Kiyomi
Utiyama, Masao
Publisher
University of Hawaii National Foreign Language Resource Center
Center for Language & Technology
Abstract
Although "text coverage" is increasingly used to measure the intelligibility of reading materials in the field of vocabulary teaching and learning, few studies to date have addressed the methodological variables that can affect the reliability of text coverage calculations. The objective of this paper is to investigate how vocabulary size, text length, and sample size affect the stability of text coverage, and to define relevant parameters. In this study, 23 vocabulary sizes taken from the high-frequency words of the British National Corpus and 26 text lengths taken from the Time Almanac corpus were used with 10 different sample sizes, in 1,000 iterations, to calculate text coverage; the results were then analyzed using the distribution of the mean score and standard deviation. The results empirically demonstrate that text coverage is more stable when the vocabulary size is larger, the text length is longer, and more samples are used. It was also found that text coverage is more stable when calculated from a larger number of shorter samples than from a smaller number of longer samples. As a practical guideline for educators, a table of minimum parameters is included for reference when computing text coverage.
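The calculation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes that text coverage is the proportion of running words found in a given vocabulary list, that each sample is a contiguous stretch of text starting at a random position, and that stability is judged by the standard deviation of the mean coverage across iterations. The function names `text_coverage` and `coverage_stability` and all parameter names are hypothetical.

```python
import random
import statistics


def text_coverage(tokens, vocabulary):
    """Return the fraction of running words (tokens) found in the vocabulary.

    Assumes `vocabulary` is a set of lowercase word forms.
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    known = sum(1 for token in tokens if token.lower() in vocabulary)
    return known / len(tokens)


def coverage_stability(tokens, vocabulary, text_length, n_samples,
                       iterations=1000, seed=0):
    """Estimate the mean and spread of text coverage under repeated sampling.

    In each iteration, `n_samples` contiguous samples of `text_length` tokens
    are drawn at random positions (an assumed sampling scheme) and their
    coverages are averaged. Returns the mean and the population standard
    deviation of those per-iteration averages.
    """
    if text_length > len(tokens):
        raise ValueError("text_length exceeds the size of the corpus")
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    last_start = len(tokens) - text_length
    iteration_means = []
    for _ in range(iterations):
        coverages = []
        for _ in range(n_samples):
            start = rng.randint(0, last_start)
            sample = tokens[start:start + text_length]
            coverages.append(text_coverage(sample, vocabulary))
        iteration_means.append(statistics.fmean(coverages))
    return statistics.fmean(iteration_means), statistics.pstdev(iteration_means)
```

Under these assumptions, a smaller standard deviation for a given combination of vocabulary size, text length, and number of samples would correspond to what the abstract calls more stable text coverage.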
Keywords
text coverage, sample size, text length, vocabulary size, standard deviation, sampling methodology