Reading in a Foreign Language
April 2023, Volume 35, No. 1
ISSN 1539-0578
pp. 48-71

The Counts of Dracula and Monte Cristo: Homonym Frequencies in Graded Readers

Kevin Parent
Korea Maritime and Ocean University
Korea

Stuart McLean
Momoyama Gakuin University
Japan

Brandon Kramer
Kwansei Gakuin University, School of Education
Japan

Young Ae Kim
Kansai University
Japan

Abstract

Graded readers are a great asset to learners acquiring the vocabulary of another language. Homonyms, on the other hand, are a recognized source of trouble for students with that same goal. Publishers of graded readers control the presentation of old and new words, but does this control extend to homonyms? Are only the word forms controlled for, in which case the unrelated meanings of match (a pairing and a stick for starting fire) would together constitute two uses of the word? Or would these tally as separate words which, semantically and etymologically, they are? A comparison of a 4.2 million-word corpus of graded readers with previous research on the distributions of homonymic meanings in general English reveals that the meanings presented to learners are frequently quite different to those in general-purpose texts.

Keywords: homonymy, polysemy, L2 reading, extensive reading, graded readers, L2 vocabulary acquisition, lexical development

One issue facing language learners as they acquire large numbers of vocabulary items in the target language is that of homonyms. When encountering homonyms in reading, learners may not know the word at all, or they may know the word and the meaning employed, or they may know the word but only be familiar with a different meaning.
Those unfamiliar with the word may consult the dictionary, find a list of candidate meanings, and then scan the texts for clues to help determine which meaning to apply; the choice is sometimes obvious but other times hampered by their still-incomplete understanding of the text. Learners who already know only a different meaning, however, tend to apply it even when doing so fails to produce a logical sentence (Laufer, 1997); thus, partial knowledge is a potential source of interference. Though frequently acknowledged as a source of trouble, homonyms clearly present a learning challenge seldom addressed explicitly.

Graded readers are a primary source of new vocabulary for many second language learners (Nation & Waring, 2020) as they provide repeated exposure to new words of the appropriate level. But how do such books treat homonymy? Do they use only the most frequent meaning, assisting in that meaning’s acquisition but providing an unbalanced presentation? Or are they careful to use each of the meanings, and if so, do the relative frequencies of these correspond roughly to what we might find in texts written for native speakers?

In other words, graded readers are known for their control of vocabulary presentation, but to what level does this control extend? At a substantial level, meanings would be accorded due attention from the publishing staff. At a superficial level, however, only the word form would be controlled for, so a word like rest, used once to indicate taking a break and used again to denote the remainder (and the rest), would count as two instances of the same word. This would be problematic because, etymologically, these are two different words, and if learners are not aware of their shared form, a haphazard presentation complicates their acquisition.
By tallying and comparing the uses of homonyms in graded readers with the same words used by native speakers, we find, in fact, that homonyms in graded readers are not controlled for meaning. Discrete meanings are used indiscriminately, posing difficulty for learners who, instead of being exposed to target words, are encountering words whose previous encounters may be incompatible with the current one.

Cobb (2013) writes of the importance of observing such distinctions. As texts such as graded readers are classified by their word level, it could take only a few misanalyzed words for shorter texts to be bumped out of a more accurate classification. Seemingly small differences, such as those between a 95% coverage level and a 98% level, can be very robust in their effect.

Background

Homonyms are words with unrelated meanings, or, more accurately, two or more completely different words that happen to have identical forms. Sometimes they come from different languages, such as match (they’re a good match), of Germanic origin, and match (for starting fire), from Old French and ultimately Latin. Other homonyms may come from the same language background but trace back to different words, such as case (in my case and put it in its case), which stem from the Latin words casus (fall) and capsa (box) respectively.

There is often confusion over the terms homonyms, homographs, homophones and polysemes among language users, teachers and researchers alike. (Such is the confusion that Newsweek reports one extreme case in which a teacher in the US was fired for blogging about homophones for fear that the school would be “associated with homosexuality” (Schonfeld, 2014, July).) Homographs are words with identical written forms whose spoken forms remain distinct, such as lead, which is read differently in lead singer and lead poisoning, or close (adjective) and close (verb). Homophones reverse this: same pronunciation but different written forms, such as some and sum, or for, four and fore.
Together, homonyms, homographs and homophones are called homoforms (Cobb, 2013; Nation & Parent, 2016; Parent, 2012).

When the senses are related, we call such words polysemes, and we speak of senses (with meanings reserved for unrelated cases). However, it sometimes happens that what began as senses have, over time, drifted so far apart that any relatedness is obscured. These we call “cognitive homonyms” (Nation & Parent, 2016, p. 47) and include such lexemes as chest (a person's torso and treasure chest) and organ (the musical instrument and an internal body part). These can be thought of as single-word idioms, as knowledge of one meaning provides no meaningful clue to the other.

The term monoseme denotes words with only one meaning. Ruhl (1989) argues that all words should be considered monosemic until polysemy or homonymy has been clearly established, and that we may too easily accept the notion of multiple meanings. This stance has not been universally adopted (Cruse, 1992).

In certain contexts, these are academic distinctions. There may be effectively no difference between homonyms and homographs for the second language reader, just as in listening tasks, the distinction between homonyms and homophones is of equally questionable usefulness. The difference between (pure) homonyms and cognitive homonyms is essentially a diachronic distinction.

How are homonyms distributed in English?

Few studies have focused on how homoforms are distributed in English. We examine this question here from a pedagogical perspective, where words are taught by frequency and in bands of 1,000 items, so that the 1,000 most frequent words are learnt before the second 1,000, etc.

Parent (2012) used the British National Corpus (BNC) to answer the question of how many homonyms occurred in West's (1953) General Service List (GSL).
The GSL contains about 2,000 words, determined largely by frequency, and has served as the de facto list for educational materials, though it is not without problems. Cobb (2010, p. 192) found it contained about 500 “fairly infrequent” items. Parent used etymology to determine homonymy, and 500 instances of each word, including inflections, were randomly chosen from the BNC and tagged for meaning. For the word file, for example, each line of data included an instance of the forms file, files, filed or filing and was tagged for which homonymic meaning was employed. References to a physical file in a filing cabinet, computer data, or a direction to file under “miscellaneous” would receive one tag, while references to the tool (i.e., for fingernails) and the related verb were assigned a different tag.

The study found that 10% of the GSL were homoforms. Of the 75 homonyms found,¹ seven were shown to use only one meaning in the samples. About 66% had one meaning accounting for 90% or more of the uses. Only one homonym had a 50-50 split. The list of homonyms and relative frequencies in the GSL is included as part of the Appendix.

¹ The current study recognises 74 homonyms, omitting right, which, on re-examination, is not a homonym based on the original criteria.

Wang and Nation (2004) investigated homoforms in Coxhead's 570-item Academic Word List (2000), finding that about 10% of the words were actually homographs and therefore belonged to different word families, with only 21 of these occurring with sufficient frequency to justify their inclusion on the list. Three word forms, once split into separate words, no longer met the criteria for inclusion in the AWL.
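The sampling-and-tallying procedure described above for Parent (2012) can be sketched roughly as follows. This is only an illustrative Python sketch and not the original study's tooling: the function names `draw_sample` and `meaning_shares`, the tag labels, and the fixed random seed are all assumptions introduced here, and the meaning tags themselves would in practice be assigned by hand.

```python
import random
from collections import Counter


def draw_sample(concordance_lines, sample_size=500, seed=0):
    """Return up to `sample_size` randomly chosen concordance lines.

    The 500-line sample size follows the description above; the seed is an
    assumption made only so that the sketch is reproducible.
    """
    lines = list(concordance_lines)
    if len(lines) <= sample_size:
        return lines
    return random.Random(seed).sample(lines, sample_size)


def meaning_shares(tags):
    """Convert a list of manually assigned meaning tags into percentages."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: 100 * n / total for tag, n in counts.most_common()}


# Hypothetical tags for a tiny sample of the lemma "file"
# (file, files, filed, filing); these are not data from the study.
example_tags = ["storage", "storage", "tool", "storage"]
shares = meaning_shares(example_tags)
```

Keeping the sampling step separate from the tallying step mirrors the order described in the text: lines are drawn first, tagged by a human reader, and only then converted to relative frequencies.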
Homoforms, thus, may seem to account for a relatively small percentage of words, but it must be remembered that this is not 10% of English, but rather 10% of the most frequent words of English. Tallying words like match1 and match2 as the separate entities they are, then, could potentially not only decimate the list if the decoupled items no longer meet the criteria for inclusion, but promote other words in their stead that had previously just missed the cutoff. Cobb (2013) used Parent’s tallies to show, tentatively, the extent of the decoupling effect.

How do we process homonyms in our first language?

A brief survey of homonym processing in our first language provides context for an overview of second language homonyms; the two, as shall be seen, take very different methodological approaches.

As young as four, children are aware that words and meanings do not always have 1:1 mappings (Peters & Zaidel, 1980); nonetheless, Mazzocco (1997) and Doherty (2003) have shown that the familiar meaning of a homonym can interfere with children's lexical acquisition process until at least the age of 10, a phenomenon dubbed “word rigidity” by Johnson and Pearson (1978, as cited in Kang, 1993; see, though, Storkel, Maekawa, & Aschenbrenner, 2013).

In adult lexical processing, early studies examining polysemy (Forster & Bednall, 1976, p. 56; Jastrzembski, 1981) found that words with many meanings were accessed quicker than words with few, leading to the term “ambiguity advantage” and arguments for multiple entries in the mental lexicon. Later studies, however, significantly refined these results. Rodd, Gaskell and Marslen-Wilson (2002), measuring reaction times in lexical decision tasks, found that only polysemes were responded to faster, and that the response rate for homonyms was significantly slower.

Around the same time, Beretta, Fiorentino and Poeppel (2005) found similar results using magnetoencephalography to measure the timings and locations of neural activity.
Adults were measured for their neural reaction to stimuli of words with one sense or few senses (ant), words with one meaning but several senses (mask), homonyms with few additional meanings (calf), and homonyms with many meanings (bark), based on entries in the Wordsmyth dictionary.² Like Rodd, Gaskell and Marslen-Wilson (2002), they found significantly longer reaction times for homonyms than for words with one meaning, and that polysemes were accessed faster than all. A clear implication is that the various meanings of homonymous words compete with each other, strongly suggesting separate mental entries, while polysemes have a single, unified entry with multiple paths to access them. Thus there is a sense advantage for polysemes over monosemes and an ambiguity disadvantage for homonyms, so in terms of access times, polysemes < monosemes < homonyms.

² Although also employed in the Rodd, Gaskell and Marslen-Wilson study, the use of dictionaries for determining the number of senses is questionable practice. Jorgensen (1990a, 1990b) found that dictionaries tend to conflate words with more, and sometimes far more, meanings than language users distinguish. Dictionaries are not models of the mental lexicon.

Reaction time experiments, however, do not show what happens between stimulus and response. EEG studies provide glimpses into brain activity during this period. One pattern, the N400, is associated with semantic processing. Discovered in 1980 by Kutas and Hillyard (see Luck, 2014), the N400 pattern appears when semantic expectations are violated. In hearing, “We enjoyed a nice dinner of spaghetti and garlic shoes,” an N400 pattern would be expected, as it also would be with mismatched word pairs: lemon:sour would not produce a strong N400 effect but car:holy presumably would.
N400 effects vary in size such that those words more irreconcilable with their context will exhibit bigger effects when the EEG data is analyzed. Klepousniotou, Pike, Steinhauer and Gracco (2012) exploited this for their work on polysemy and homonymy. The polysemes of their study were of two types, metaphoric (lip) and metonymic (rabbit, denoting the animal or the meat), and the homonyms were balanced (both meanings frequent) and unbalanced (one considerably rarer than the other, as in coach). Regardless of relative frequency, both meanings are activated and competing to some degree, but there is a definite preference for the dominant meaning, even when the relative frequencies are close. Their findings on polysemes again strongly suggest a single representation and show reduced N400 effects across both hemispheres. Differences between dominant and subordinate senses were found only in cases of metaphor, demonstrating that the neural generators underlying the processing of homonymy and polysemy are clearly different.

Homonyms continue to provide a rich avenue for researchers. Employing functional magnetic resonance imaging, Musz and Thompson-Schill (2017) examined the competition between the dominant and subordinate meanings of homonyms, especially in cases where the subordinate meaning is primed, confirming previous findings with eye-tracking (Duffy, Morris & Rayner, 1988; Sereno, O’Donnell & Rayner, 2006) that cognitive competition remains an active process even after one meaning has been selected, suggesting that one meaning is raised to “candidate” (rather than “definite”) status. Using semantic priming (tree:bark, dog:bark) and various fMRI analyses, Hoffman and Tamm (2020) identified two different regions of the brain involved in homonym decoding, with a third interfacing the two.

These studies highlight the topics of lexical storage and retrieval, offering insight into how the brain organizes its lexicon.
The issue of the L2 lexicon throws in further complications if it is, in early stages at least, layered over the brain’s L1 lexicon.

How do second-language learners acquire homonyms?

As we have just seen, native speakers do not process all words equally; homonyms take longer while polysemes are accessed quicker. Second-language learners are, by definition, not operating with fully developed mental lexicons in the target language. We can assume that a word known to be a homonym may take some extra time to process, while a homonym with only one meaning studied may be accessed more quickly but misinterpreted. Research in L2 lexical processing usually lacks the technology employed in the studies above, focusing instead on how homonyms are interpreted.

In a qualitative study, Kang (1993), like Laufer (1997), shows that learners familiar with one sense of polysemes applied that one meaning to each encounter with the word, with effects that could influence their understanding of sentences and sometimes even their interpretations of entire texts. Despite the processing differences, the differences between homonyms and polysemes may be an academic distinction with pedagogical implications relevant only after learners have developed some semantic flexibility with their mental lexicons and can connect related senses to each other.

Mashhady, Lotfi and Noura (2010) found a speed advantage in presenting L2 homonymic meanings together, but only in comparison to learning synonyms, and only short-term recall was tested. This may indicate a benefit in presenting homonymic meanings together; however, the effects of both absolute and relative frequency (a very common meaning coupled with an extremely rare one) remain unexplored.
Ushiro, Hoshino, Shimizu, Kai, Nakagawa, Watanabe, and Takaki (2010) examined the interpretation of homonyms by Japanese university students divided into two proficiency groups. No difference was found in the ability to correctly interpret the homonyms: in both groups, a quarter of the participants applied the wrong meaning even when doing so did not fit the context. There were, however, differences in the types of errors made. Successful homonym interpretation may depend on flexibility at least as much as on understanding the context. When context was understood, the advanced students were better able to change their interpretations of the homonym to fit the context than were the less advanced learners, who, even when the context was understood, again tended to cling to the familiar meaning. Contextual flexibility, then, would appear to be a factor that increases with overall proficiency.

While L2 studies of homonym processing lack the hard science found in L1 research, they nonetheless highlight that L2 homonym processing also entails Johnson and Pearson’s (1978) word rigidity. Further, we may assume that if L1 speakers take longer to access homonyms because of the ambiguity disadvantage, L2 learners must have, at minimum, the same delay. The problem of homonyms for learners, then, is twofold: first, during early stages, learners possess incomplete knowledge (knowing one meaning and not knowing that there may be others), and second, when additional meanings are acquired, it may now take longer to process the word.

Extensive reading and graded readers

As the current study examines homonyms in graded readers, we shall review some of the relevant studies. In an influential paper, Elley and Mangubhai (1983) reported significant results in a study of 600 learners over eight months in Fiji.
The students in the experimental group exhibited highly significant improvements in their fluency and writing accuracy and, further, these were long-term gains not found in appreciable quantities among the control group.

Other studies have focused on the learning gains of reading a single text, showing that while vocabulary learning does occur, a learner’s vocabulary enrichment may be so slender from reading one book that the case is made for more (indeed “extensive”) reading. Waring and Takaki (2003) examined vocabulary gains based on reading only one graded reader, one adapted so that 25 words occurring at certain frequencies within the text were substituted with English-like non-words. Using various assessment instruments, each administered immediately, one week later and three months later, the study found that students did not learn the majority of the targeted words, and most words remembered at one assessment point were forgotten by the next. While seemingly pessimistic, these results bolster the need for extensive reading, as the meager gains from reading a single text do show that words can be learned from context and that frequency of encounters affects acquisition. It is therefore important that repeated encounters occur within a few days (Nation & Waring, 2020), hence a need for extensive reading. Similar findings are seen in Brown, Waring and Donkaewbua (2008), which employed similar methodology and different methods of assessment.

This informs the basis of the current study. One method of vocabulary acquisition entails the word being encountered, in context, a sufficient number of times, but when a word form is homonymic, the number of encounters is divided among what are essentially different words, complicating their acquisition.
Secondary meanings (that is, meanings other than the most frequent one, such as the “combined resources” meaning of pool) would likely cause trouble for the learner still learning the more common meaning (in this case, swimming pool). This raises research questions we aim to address in this report.

Research Questions

This paper partially replicates Parent (2012), but rather than examining homonyms in the BNC, it examines them in a specially constructed graded reader corpus, revealing the meanings learners are most likely to meet. This research answers the following questions:

1. Is the distribution of relative frequencies of homonymic meanings approximately the same in a corpus of graded readers (GR) as in the British National Corpus (BNC)?
2. If not, are learners being exposed to the secondary meanings through graded readers?
3. For words that do have multiple meanings in the graded reader (GR) corpus, how do the proportions between first and secondary meanings differ from those in the BNC?

Methodology

The GR (graded reader) corpus constructed for this study is significantly larger and more representative than any used in previous studies (e.g., Allan, 2016), as seen in Table 1. To be sufficiently representative, texts were collected from a large number of publishers, a wide range of reading levels (Extensive Reading Foundation levels 2-16) and a mix of genres (535 fiction, 25 non-fiction). The corpus contains 560 texts, comprising 4,212,706 running words.

Table 1
Distribution of Texts in the Graded Reader (GR) Corpus

Each book was manually scanned using a high quality DSLR camera and custom-made cradle. The images were batch edited using Adobe Lightroom (v. 4.4) to crop everything but the targeted pages, reduce the image to black and white, etc.
The images were organized and subjected to optical character recognition (OCR) using ABBYY FineReader (v. 12.0), producing a text file for each book. The resulting corpus was manually cleaned of OCR errors, broken words, and construct-irrelevant text such as captions, comprehension questions, page numbers, etc.

The target words used in this study were the 74 homonyms³ within West’s General Service List (GSL) (1953) identified by Parent (2012). A quick look at their distribution in general use is warranted. The results of a short R script referencing them against Nation's British National Corpus and Corpus of Contemporary American English lists (2017d) are shown in Table 2. A full two-thirds of them are found in the most frequent 1000 words of English (Nation, 2017a), while a further quarter are found in the second (Nation, 2017b), yielding a total of 92% of these homonyms occurring in the first two thousand words. Five more are found in the third (2017c) 1000-word band (host, net, pupil, rail, weave), with the remaining one (steep) occurring in the fourth. Recall, though, that any homonym is at least two words, and lumping them into one count is the very practice this research warns against.

Table 2
Distribution of Homonyms in Nation’s BNC and COCA Lists

BNC, COCA k-band    Number of Homonyms    Ratio     Cumulative Ratio
First               49                    66.22%    66.22%
Second              19                    25.68%    91.89%
Third               5                     6.76%     98.65%
Fourth              1                     1.35%     100%

³ Technically, there are 157 homonyms as each of the 74 words is two or more separate words. These are included in the Appendix.

Using AntConc (Anthony, 2018), files were created from the GR corpus which presented the target word in red, embedded in a 16-word context taken from the GR corpus (Figure 1).
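To make the idea of a key-word-in-context (KWIC) file concrete, the following Python sketch builds concordance lines for a set of word forms. It is not AntConc and makes no claim about how AntConc works internally; the function name `kwic`, the simple regular-expression tokenizer, and the reading of "16-word context" as eight words on either side of the target are all assumptions made for illustration.

```python
import re


def kwic(text, forms, window=8):
    """Return (left context, key word, right context) for every hit.

    `forms` lists the inflected forms of one target word; `window` is the
    number of words kept on each side (8 + 8 = 16 is an assumption).
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    wanted = {form.lower() for form in forms}
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() in wanted:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Invented sentence, used only to show the shape of the output.
sample_text = "He lit a match and then watched the match on TV"
hits = kwic(sample_text, ["match", "matches"], window=2)
```

Each tuple corresponds to one line that a human tagger would read and label with the homonymic meaning in use.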
The meanings of homoforms were tagged by the second and fourth authors; where either author was unsure of the meaning of a homoform, they conferred with each other, referring back to the corpus when further context was necessary.

Figure 1
Example of a KWIC File Used in this Study

The original study drew 500 random samples for each word, sometimes minus a few manual deletions (in the sample for page, references to Jimmy Page, for example, were removed), while in the current study, all relevant instances were examined. In some cases, the number of words examined is considerably greater (go: 25,872), while in others, it is very small (tend occurs only 16 times in the GR corpus). In addition to the relative ratios, this tells us how often learners are exposed to the secondary and sometimes-rare meanings of homonyms.

Fisher's exact test was applied to test the significance of differences between the current results and the original study. Because a few homonyms have a third and fourth meaning, the test examined the ratio of first meaning to non-first meanings.

A note on the use of this test may be in order as there are several statistics that could be used. The chi-square goodness-of-fit test is based on assumptions not relevant to word lists. Treating the homonyms as keywords, other methods are available such as the difference coefficient (Leech & Fallon, 1992), the relative frequency ratio (Damerau, 1993), the log-likelihood (Dunning, 1993, p. 71) or even the chi-squared test (Oakes & Farrow, 2006). In his paper on the use of log-likelihood as a measure of surprise in text analysis, Dunning (1993) writes: “Measures based on Fischer’s [sic] exact method may prove even more satisfactory than the likelihood ratio measures described in this paper.” Fisher's exact test, a conservative measure, is chosen here because the p-values it generates provide a clear threshold in relation to a chosen alpha; our interest is in which words cross that boundary.
Results

How are homonymic meanings distributed in graded readers?

The first research question asked if the relative frequencies of homonymic meanings in the graded reader (GR) corpus are comparable to those in Parent (2012). Looking at the results of Fisher's exact test, even at the level of p < 0.001, we find 29 homonyms (39%) significantly different than in the BNC texts. Table 3 summarizes this at various levels of significance, but no matter where the line is drawn, a large percentage of homonyms, around half, differ significantly between the two studies in terms of their distributions of first and non-first meanings.

Table 3
Number of Homonyms Differing from Original Study, at Different Levels of Significance

                       p < .05    p < .01    p < .001
# of Homonyms          41         33         29
% of Total Homonyms    55.40%     44.59%     39.18%

Table 4 gives an overview of the results. Of the 74 homonyms, 41 do not differ significantly between the GR and BNC corpora at p < 0.01. These include instances in which the secondary meanings are present in both corpora, but also cases where the secondary meaning is not present in the GR but occurred with such low frequency in the BNC that its absence is not considered significant. In these cases, the secondary meaning may be the same as in the BNC, or more frequent or less frequent but not to appreciable degrees.
Table 4
Summary of Homonymic Differences Between the GR and BNC Corpora (p < 0.01)

GR Meaning 2 Relative to BNC                 N     %       Examples
Not significantly different                  41    55%     box, even, last, shoot
GR Meaning 2 significantly less frequent     18    24%     arm(s), hide, race, spell
GR Meaning 2 significantly more frequent     10    14%     bill, deal, like, rest
BNC Meaning 2 becomes GR Meaning 1           5     7%      count, deal, match, policy, rest
TOTAL                                        74    100%

However, there are eighteen homonyms in which the secondary meanings, while present in the GR, appear so rarely that their distributions are flagged as significantly different, including arm(s), where the “weapon” meaning can be found in the graded readers but at a considerably smaller ratio than it is in the BNC. This meaning of arm accounts for 1% of the word's usage in the GR as opposed to the BNC's 17%. An additional five homonyms have no secondary meanings present in the GR despite being well represented in the BNC, such as the “brief period” meaning of spell, which accounts for a quarter of all uses in the BNC. Conversely, in 17 homonyms the meaning deemed secondary by the BNC tallies is used significantly more often in the graded readers.

If a clear majority of meaning distributions corresponded to the original findings, we might have cause to say an effort to match the meaning distributions was being made, while a clear minority might lead us to believe that secondary meanings were being deliberately avoided so as to assist learners with the primary meaning. But the some-do-some-don’t results we find here suggest some care may be given to the use of the word form but not to the meanings employed.

There are clear implications for learnability. First, the principle of spaced repeated retrieval presumes unrelated meanings are not being used each time the form is encountered (Macalister & Nation, 2020, pp. 50-53).
A correctly guessed meaning is a hypothesis that may be incorrectly rejected when the same word form is encountered again with a drastically different meaning.

Second, since some word forms do have unrelated meanings, explicit attention to a different and less frequent meaning of a known word would be beneficial here given the research (Kang, 1993; Laufer, 1997) highlighting the tendency of learners to adhere to the one known meaning. Knowledge of one meaning, that is, provides a false confidence in a word’s semantic contribution which is not easily overturned by contextual clues to the contrary; even when the interpretation of the whole is odd or bizarre (a soldier loading a magazine into a gun), the familiarity of that one meaning prevents that word from being targeted as the source of any confusion. If learners are unaware that a given word has additional, unrelated meanings (despite an awareness that words sometimes do), and the ability to see when a known word uses an unknown meaning is undeveloped, then such instances, if used, should be explicitly glossed or otherwise have attention drawn to them.

Which meanings are learners not exposed to?

The second research question inquired about homonymic meanings that did not occur in the GR corpus but did in the BNC study. Among the 74 homonymic forms, 23 (31%) occur among the graded readers with exactly one meaning, although three of these did not have secondary meanings in the random BNC samples either (these are asterisked).
The words appearing with only one meaning in the GR corpus are bite (with teeth), boil (cooking), camp (camping), down (not up), ear (organ of hearing), egg (noun), fast (adjective), file (storage), fold (paper), go* (verb), page (book leaf), pan (cooking), pen (writing utensil), pot (cooking), pupil (student), repair (to fix), roll (to spin), school* (education), shoot (a gun), slip* (misstep), sock (foot garment), spell (letter-by-letter, including incantation) and wake (awaken). This may be misleading, though, as eleven of the words have secondary meanings of 1% or less in the BNC. Three words, however, have significant secondary uses, between 15% and 24%, which are not found in the GR corpus. These are pan, spell and wake.

One meaning of pan is, of course, the instrument used for cooking, which includes, by extension, instances of panning for gold. Etymologically, the use referring to criticism (as in 'The movie was widely panned', stemming from 'to give the movie a pan') is related to this, though the connection is not synchronically obvious. More clearly unconnected, however, is the use referring to moving ('panning') a camera, which is clipped from the word panorama. This usage accounted for 15% of the group pan-pans-panned-panning but is completely absent in the GR corpus, as shown in Table 5.

Table 5
Relative Frequencies for pan in BNC and GR Corpora

pan                    Meaning 1:            Meaning 2:
                       cooking utensil,      to move a camera
                       also to criticise     (panorama)
BNC                    84.84%                15.16%
Graded Readers (GR)    100%                  0%

One meaning of spell unites the “letter-by-letter” use (i.e., spelling) with the magical incantation use, which is unrelated to the uses clustering around the period of time meaning.
The latter meaning is often used with weather (cold spell, dry spell) but also includes other instances regarding employment (his recent spell as guest conductor, a spell at the Chicago Tribune) and other uses (dizzy spells). In 500 random samples from the BNC (Table 6), we can expect this meaning to occur around 120 times, or about one out of every four instances (24%), so this is not an insignificant use, making its absence in graded readers conspicuous.

Table 6
Relative Frequencies for spell in BNC and GR Corpora
spell | Meaning 1: letters, incantation | Meaning 2: short time interval
BNC | 75.95% | 24.05%
Graded Readers (GR) | 100% | 0%

The form wake likewise exhibits a missing semantic cluster (Table 7). Here there are three clusters: those related to the verb (“to wake up”), those related to the track of a boat, and those related to the funereal vigil. The latter does not occur in the GR corpus but was also infrequent in the original study at 1.2% so we shall ignore it here. Of greater interest is the nautical meaning and its metaphorical extensions. A water-faring vessel of sufficient speed leaves a trail of disturbance in the water which will soon normalize, but until it does, another boat crossing over it will be in for a short spell of choppy riding. This meaning has been extended metaphorically to include the aftermath of major, frequently negative, events: in the wake of the killings, leaving a trail of controversy in her wake, etc. This use almost always occurs in prepositional phrases headed by the word in. Similar to spell, this has a distribution of about one-quarter of all uses in the original study but is completely missing in the GR corpus.

Table 7
Relative Frequencies for wake in BNC and GR Corpora
wake | Meaning 1: awaken, etc. | Meaning 2: a track (in water, etc.) | Meaning 3: funereal vigil
BNC | 75.8% | 23% | 1.2%
Graded Readers (GR) | 100% | 0% | 0%

We can take this analysis a step further.
The original analysis was conducted at the lemma level, so as both the nautical and funereal readings of wake can only occur as wake and wakes (and not as waked, woke or woken), these instances automatically appear rarely. Here we briefly examine all instances of uninflected wake in the BNC, not a random sample but all occurrences trimmed to one instance per text (since if a funeral wake figures in a text, the word may appear many times). The ratio changes rather drastically, with the wake up usage occurring only 49.96% of the time and the metaphoric reading only slightly below this at 47.73%, as opposed to the 71% and 25% figures in the original study including inflections (the funereal vigil usage rises from 1.2% to 2.31%).

Like all items examined, these three words are high frequency items. Wake is in the first of Nation’s 1000-word bands, while pan and spell are in the second, the latter likely occurring even more frequently for learners. Nation’s lists are not based purely on frequency but on range as well, and the first two 1000-level word lists were prepared with a different corpus composition than the remainder, but they also were not consistently split into homonymic uses. Based on our observations of these three words in the GR corpus, the less-frequent-but-not-infrequent meanings appear “blocked” by the more dominant meaning. Perhaps it is felt that the less-frequent uses of these high-frequency words would confuse readers and are therefore avoided, but if so, this is an assumption in need of airing out.

Are secondary meanings in graded readers used with frequencies comparable in texts for native speakers?

Our first research question found that about half the homonyms exhibited distributions quite different to those in the original study.
The second question found that nearly a third (31%, or 23 homonyms) did not have secondary meanings at all, even when some of these are clearly important. Our third research question, then, focuses on the remaining cases deemed significantly different but in which a secondary meaning is present. How does the distribution of these secondary meanings differ between the GR corpus and the BNC?

There is no single, unified answer to the question. We might expect the differences to be significant because the secondary meanings are present, just less prevalent, but of the 26 homonyms in this category, this is true of only eleven, these being arm, die, hide, lay, leave, miss, net, pool, race, sound and yard. In some cases, detailed below, the secondary meaning is far more frequent in the GR corpus, in some instances even outnumbering the primary meaning.

The word miss, for example, is drastically different. In the 2012 study, the verb meanings (fail to hit, want to see) and the title for unmarried women had a 50-50 distribution, while the current study finds the title usage accounting for over 91% of the occurrences. Parent (2009), containing a pilot study for the 2012 work based on a smaller corpus, found a 25% distribution of the title meaning. The 91% distribution of the title in graded readers is surprising but is attributable to certain texts like Little Women and Pride and Prejudice in which characters are frequently referred to by their titles and family names. This is known as a “whelks problem” after Kilgarriff (1997), where a surprisingly high frequency score was assigned to the lexeme whelk because a corpus included a book specifically about sea snails. This issue has given rise to dispersion statistics, scores measuring spread, not unlike standard deviation in function, that factor in the number of texts or other corpus parts an item occurs in.
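To make the idea of a dispersion score concrete, the following minimal sketch computes one such measure, the Deviation of Proportions (DP) of Gries (2008), for a word form across corpus texts. It was not part of this study's analysis; the function name, the four equal-sized texts and the counts are hypothetical and chosen only to show how a whelks-type concentration in a single text pushes the score upward.

```python
def deviation_of_proportions(part_sizes, item_counts):
    """Gries's DP: half the summed absolute differences between each
    part's share of the corpus and its share of the item's occurrences."""
    if len(part_sizes) != len(item_counts):
        raise ValueError("part_sizes and item_counts must align")
    total_size = sum(part_sizes)
    total_count = sum(item_counts)
    if total_size == 0 or total_count == 0:
        raise ValueError("corpus and item must both be non-empty")
    expected = [size / total_size for size in part_sizes]
    observed = [count / total_count for count in item_counts]
    return 0.5 * sum(abs(o - e) for o, e in zip(observed, expected))


# Hypothetical corpus: four texts of equal length (in running words).
text_sizes = [10_000, 10_000, 10_000, 10_000]

# Hypothetical item A: 20 occurrences spread evenly over the texts.
dp_even = deviation_of_proportions(text_sizes, [5, 5, 5, 5])

# Hypothetical item B: the same 20 occurrences, all in one text.
dp_clustered = deviation_of_proportions(text_sizes, [20, 0, 0, 0])
```

Under these assumed inputs, the evenly spread item scores 0 and the item confined to a single text scores 0.75; values nearer 1 indicate that occurrences cluster in a small part of the corpus.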
Dispersion measures have been around since at least Juilland and Chang-Rodriguez (1964) and the introduction of Juilland’s D (see Brezina, 2018). More recent developments include Gries’s (2008) Deviation of Proportions.

Nine homonyms in this study, however, see the secondary meaning appear significantly more frequently than it had in the BNC analysis. These are ball (social function), bank (embankment), bear (animal), brush (undergrowth), case (container), lie (falsehood), like (admire), mean (cruel) and tend (attend). The social function meaning of ball is central to the telling of Cinderella, while bears are more frequently characters in stories written for children. That this meaning of ball is not blocked by the more common meaning (as suggested for wake, etc., above) even when it is easily replaced by party suggests that frequency alone fails to account for the absence of secondary meanings in graded readers, and that another factor, perhaps imageability or cognitive keyness (the word ball being strongly associated with the narrative of Cinderella), may be present.

Tend, however, is an interesting anomaly. The first of its etymologically distinct uses is the catenative verb (I tend to exaggerate), from which the word tendency is derived. The second is a clipped form of attend as found in bartender and is often followed by the preposition to (I should tend to my garden). Thus, both homonymic forms of tend are frequently followed by a homonymic to. The GR corpus illuminates two points of interest regarding this word. First, the distribution is markedly different from Parent (2012). While the original study found the tendency meaning to account for 97% of the uses of the form, that meaning appears in the GR corpus only 69% of the time. Second, and more surprising than the relative frequency, is the total frequency.
This is a high frequency lexical item, appearing in the first 1,000-frequency band in Nation's BNC-COCA word lists as well as in the New General Service List (Browne, Culligan and Phillips, 2013), but it appears only 16 times total in the GR corpus. It occurs this many times in the first five BNC files (about 1% of the corpus), and 148 individual BNC files each have 16 or more occurrences of the word form. With so few instances in graded readers, the word form itself is unlikely to be present often enough for homonymic awareness to spark.⁴

Among the homonyms in which the second meaning differed from the original study, five saw the 'secondary' meaning present in such numbers that it became the primary meaning in the GR corpus, these being count, deal, match, policy and rest. The primary meaning of policy, as in foreign policy, may be primed to occur in newspapers (12% of the BNC texts) more than in texts produced for learners. This meaning accounts for only four of 28 usages in the graded readers. However, 20 occurrences of the insurance policy meaning were from a single book. The rise in the royalty-related meaning of count is also easily accounted for. It occurs in only 15 of the 560 books, including Dracula, The Count of Monte Cristo, The Three Musketeers, etc., some of which are present in the corpus in multiple versions; 69% of the occurrences of this meaning appear in just three books.

This leaves one final homonym to account for, and that word is bill. The beak of a fowl reading did not occur at all among the BNC samples, but ten uses of this meaning are present in the GR corpus. However, all ten of these came from a single text about birds.

It is worthwhile, however, to examine not just the ratio of secondary meanings to the primary, but also the actual frequencies.
Given the situation where a learner knows only one meaning, we would expect a threshold effect, where x number of occurrences of an unknown meaning is necessary for the learner to begin to suspect another meaning may be at play. We make no attempt here to define x, but it is surely not a matter of simple frequency, as proximity of encounters is likely an issue (i.e., that reading five instances of the period of time meaning of spell over the course of a week is more informative than the same five encounters spread over a year) and, at least for learners of a certain level, shifts in word class are likely another (i.e., that the unknown meaning of pan is a verb rather than the expected concrete noun). Although we cannot yet calculate that threshold level, it does not seem controversial that ten instances of a secondary meaning spread over the 560 corpus texts are insufficient if acquisition of the secondary meaning is the goal. Table 8 summarizes how many homonymic word forms are present with ten or fewer instances of a secondary meaning.

⁴ Two other homonyms appear fewer than 30 times total: policy and boil.

In the Appendix we present the combined results of a past study, that of Parent (2012), and the current study. In Parent (2012) the BNC corpus was examined. In the current study, a specially constructed graded reader (GR) corpus was examined. To accommodate the constraints of the printed page, the eight word forms with third and fourth meanings are given after the words with only two homonymic meanings.
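The cumulative ratios reported in Table 8 follow from dividing a running total of word forms by the 74 homonymic forms examined. The short sketch below reproduces that arithmetic as a check; the variable and function names are illustrative rather than taken from the study's own tooling.

```python
# Number of homonymic word forms examined in the study.
TOTAL_HOMONYMS = 74

# Table 8: instances of the secondary meaning -> number of word forms.
words_by_instances = {
    0: 23, 1: 9, 2: 5, 3: 1, 4: 2, 5: 3,
    6: 1, 7: 0, 8: 2, 9: 1, 10: 1,
}


def cumulative_ratios(counts, total):
    """Return (instances, words, cumulative % of all homonyms) rows."""
    rows = []
    running = 0
    for instances in sorted(counts):
        running += counts[instances]
        rows.append((instances, counts[instances],
                     round(100 * running / total, 2)))
    return rows


table_8 = cumulative_ratios(words_by_instances, TOTAL_HOMONYMS)
words_at_or_below_ten = sum(words_by_instances.values())

for instances, words, cumulative in table_8:
    print(f"{instances:>2}  {words:>2}  {cumulative:.2f}%")
```

Read this way, 48 of the 74 word forms (64.86%) offer ten or fewer encounters with their secondary meaning across the whole corpus, and half of them offer two or fewer.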
Table 8
Distributions of Homonyms with Ten or Fewer Instances of Secondary Meanings in the GR Corpus
Instances of secondary meaning | Number of words | Cumulative ratio of homonyms
0 | 23 | 31.08%
1 | 9 | 43.24%
2 | 5 | 50.00%
3 | 1 | 51.35%
4 | 2 | 54.05%
5 | 3 | 58.11%
6 | 1 | 59.46%
7 | 0 | 59.46%
8 | 2 | 62.16%
9 | 1 | 63.51%
10 | 1 | 64.86%
TOTAL | 48 |

Discussion and Implications

Homonymy does not appear to be something the producers of graded readers are particularly concerned with, given that the uses of homonyms correspond to what we find in the native-speaker market around half the time while differing significantly, and in various ways, the other half. Common and useful meanings are sometimes missing, and less common meanings are sometimes elevated to the status of most frequent.

Nonetheless, this study is not without its limitations. By necessity, we were locked into comparing the graded reader (GR) corpus to results from a slightly older, closed corpus rather than having the freedom of choice. Future research would also need to examine dispersion measures which, not being part of the original study, were not considered here.

The classroom instructor would benefit from familiarity with the homonym list, as a reminder of words potentially taken for granted that cause problems for learners and to help pinpoint the source of misunderstandings. This is especially true if teaching materials are not explicitly highlighting that some words have the same form.

There are further implications for the publishers of graded readers. First, while homonymy is a frequent occurrence in the language, for learners it can be minimized, if not outright avoided, at elementary and even intermediate levels since, in many cases, they would not need the rarer meaning (the shoal meaning of school, for example). Second, an editorial policy should be explicit for any publishers of materials for language learners.
This means that any use of a homonym or its inflected form should be flagged by the software. It is also recommended that the word lists used by publishers split homonyms into separate entries, for example fast1 denoting quickness and fast2 denoting abstinence. If a homonym is flagged and the rare meaning is employed, then it should be paraphrased just as a rare word would be. An exception arises when the rare meaning occurs repeatedly throughout the text (count in texts regarding royalty); in that case, the secondary meaning should be made explicit, perhaps in the form of a footnoted gloss. This is especially important in instances where the common meaning is likely to be known (a reader of Monte Cristo would probably be familiar with the enumeration meaning of count); however, in the case of Cinderella prepared for elementary readers, little is gained by retaining ball over party. It is further suggested that a glossary at the back include both meanings, even presumably known ones, so as to draw attention to the homonymy. This is especially true in cases when the homonymic meanings are different parts of speech and the less common meaning may therefore employ inflections unavailable to the more frequent usage; the forms steeped and steeping, for example, clearly signal the use of the verb rather than the more common adjective steep.

Third, homonymy plays a crucial role in later stages of word consciousness activities, or expanding the depth (rather than breadth) of one’s lexical knowledge. This may occur with collocation activities, underlying meaning activities (Visser, 1989), etc. Graded readers can contribute greatly to this phase. They can include appendices on the homonyms contained in each volume, explaining the contrasting etymologies and introducing still other meanings or senses.
These special chapters might also include activities giving learners the chance to practice meaning selection (‘Which meaning is used in this sentence?’).

The main finding of this comparison is not that publishers of graded readers present homonyms to learners carelessly but, rather, that they do not have a policy regarding homonyms. This is strongly suggested by a large proportion of the GSL homonyms being statistically similar to the GR homonyms while another chunk of homonyms is statistically different. Exposing the learners to the level-appropriate words is a major selling point of graded readers, but in the case of homonyms, a more refined strategy is clearly warranted for the learners to acquire these multi-faceted words. Publishers can make use of the previous research on homonymic meaning distribution or create their own to make informed decisions on which words require special attention and when, and how, such information can be presented.

References

Allan, R. (2016). Lexical bundles in graded readers: To what extent does language restriction affect lexical patterning? System 59, 61–72. https://doi.org/10.1016/j.system.2016.04.005
Anthony, L. (2018). AntConc (3.5.7). Available: www.laurenceanthony.net/software
Beretta, A., Fiorentino, R., & D. Poeppel (2005). The effects of homonymy and polysemy in lexical access: An MEG study. Cognitive Brain Research, 24, 57-65.
Brezina, V. (2018). Statistics in corpus linguistics: A practical guide. Cambridge University Press: Cambridge.
Brown, R., Waring, R., & S. Donkaewbua (2008). Incidental vocabulary acquisition from reading, reading-while-listening, and listening to stories. Reading in a Foreign Language, 20, 136–163.
Browne, C., Culligan, B. & J. Phillips (2013). The New General Service List.
Available: http://www.newgeneralservicelist.org
Cobb, T. (2010). Learning about language and learners from computer programs. Reading in a Foreign Language, 22(1), 181-200.
Cobb, T. (2013). Frequency 2.0: Incorporating homoforms and multiword units into pedagogical frequency lists. In C. Bardel, C. Lindqvist & B. Laufer (Eds.), L2 vocabulary acquisition, knowledge and use: New perspectives on assessment and corpus analysis. Eurosla Monographs Series, 2. http://www.eurosla.org/monographs/EM02/EM02tot.pdf
Coxhead, A. (2000). A new academic word list. TESOL Quarterly, 34(2), 213-238. http://dx.doi.org/10.2307/3587951
Cruse, D.A. (1992). Monosemy vs. polysemy: Review article. Linguistics - An Interdisciplinary Journal of the Language Sciences, 30(3), 577-600.
Damerau, F. J. (1993). Generating and evaluating domain-oriented multi-word terms from texts. Information Processing & Management, 29(4), 433–447. http://dx.doi.org/10.1016/0306-4573(93)90039-G
Dang, T.N.Y. & S. Webb (2016). Evaluating lists of high-frequency words. ITL-International Journal, 167, 132-158.
Doherty, M.J. (2004). Children’s difficulty in learning homonyms. Child Language, 31, 203-214. http://dx.doi.org/10.1017/S030500090300583X
Duffy, S.A., Morris, R.K., & K. Rayner (1988). Lexical ambiguity and fixation times in reading. Journal of Memory and Language, 27, 429–446. http://dx.doi.org/10.1016/0749-596X(88)90066-6
Dunning, T. (1993). Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19, 61-74.
Elley, W. & F. Mangubhai (1983). The impact of reading on second language learning. Reading Research Quarterly 19(1), 53-67. http://dx.doi.org/10.2307/747337
Forster, K.I., & E.S. Bednall (1976). Terminating and exhaustive search in lexical access. Memory & Cognition 4, 53–61. https://doi.org/10.3758/BF03213255
Gries, S. T. (2008). Dispersions and adjusted frequencies in corpora. International Journal of Corpus Linguistics 13, 403–37. http://dx.doi.org/10.1075/ijcl.13.4.02gri
Hoffman, P. & A. Tamm (2020). Barking up the right tree: Univariate and multivariate fMRI analyses of homonym comprehension. NeuroImage 219. http://dx.doi.org/10.1016/j.neuroimage.2020.117050
Jastrzembski, J.E. (1981). Multiple meanings, number of related meanings, frequency of occurrence, and the lexicon. Cognitive Psychology, 13(2), 278-305. https://doi.org/10.1016/0010-0285(81)90011-6
Johnson, D.D. & P.D. Pearson (1978). Teaching reading vocabulary. New York, NY: Holt, Rinehart and Winston.
Jorgensen, J.C. (1990a). Definitions as theories of word meaning. Journal of Psycholinguistic Research 19(5), 293–316. http://dx.doi.org/10.1007/BF01074362
Jorgensen, J.C. (1990b). The psychological reality of word senses. Journal of Psycholinguistic Research 19(3), 167-190. http://dx.doi.org/10.1007/BF01077415
Kang, H-W. (1993). How can a mess be fine? Polysemy and reading in a foreign language. Mid-Atlantic Language Pedagogy i, 35-49.
Kilgarriff, A. (1997). Putting frequencies in the dictionary. International Journal of Lexicography, 10(2), 135-155. http://dx.doi.org/10.1093/ijl/10.2.135
Klepousniotou, E.G, Pike, B., Steinhauer, K. & V. Gracco (2012). Not all ambiguous words are created equal: An EEG investigation of homonymy and polysemy. Brain and Language, 123(1), 11-21. http://dx.doi.org/10.1016/j.bandl.2012.06.007
Kutas, M., & S.A. Hillyard (1980). Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207(4427), 203-205. http://dx.doi.org/10.1126/science.7350657
Laufer, B. (1997). The lexical plight in second language reading: Words you don't know, words you think you know, and words you can't guess. In J. Coady & T. Huckin (Eds.), Second Language Vocabulary Acquisition (pp. 20-34). Cambridge, United Kingdom: Cambridge University Press. http://dx.doi.org/10.1017/CBO9781139524643.004
Leech, G.N., & R. Fallon (1992). Computer corpora: what do they tell us about culture? ICAME Journal, 16, 29–50.
Luck, S.J. (2014). An introduction to the event-related potential technique (2nd ed.). Cambridge: Massachusetts Institute of Technology.
Macalister, J., & Nation, I.S.P. (2020). Language Curriculum Design (2nd ed.). New York: Taylor & Francis.
Mashhady, H., Lotfi, B. & M. Noura (2010). Word type effects on L2 word retrieval and learning: Homonym versus synonym vocabulary instruction. Iranian Journal of Applied Language Studies, 3(1), 97-118. https://ijals.usb.ac.ir/article_80.html
Mazzocco, M.M. (1997). Children's interpretations of homonyms: A developmental study. Journal of Child Language, 24(2), 441-467. http://dx.doi.org/10.1017/S0305000997003103
Musz, E., & S.L. Thompson-Schill (2017). Tracking competition and cognitive control during language comprehension with multi-voxel pattern analysis. Brain and Language, 165, 21-32. http://dx.doi.org/10.1016/j.bandl.2016.11.002
Nation, I.S.P. (2017a). The BNC/COCA Level 1 word family lists (Version 1.0.0) [Data file]. Available from http://www.victoria.ac.nz/lals/staff/paul-nation.aspx
Nation, I.S.P. (2017b). The BNC/COCA Level 2 word family lists (Version 1.0.0) [Data file]. Available from http://www.victoria.ac.nz/lals/staff/paul-nation.aspx
Nation, I.S.P. (2017c). The BNC/COCA Level 3 word family lists (Version 1.0.0) [Data file]. Available from http://www.victoria.ac.nz/lals/staff/paul-nation.aspx
Nation, I.S.P. (2017d). The BNC/COCA word family lists (Version 1.0.0) [Data file]. Available from http://www.victoria.ac.nz/lals/staff/paul-nation.aspx
Nation, I.S.P. (2001). Learning vocabulary in another language. Cambridge University Press, Cambridge. http://dx.doi.org/10.1017/CBO9781139524759
Nation, I.S.P. & K. Parent (2016). Polysemy. In I.S.P. Nation (Ed.), Making and using word lists for language learning and testing. Amsterdam: John Benjamins Publishing Company. http://dx.doi.org/10.1075/z.208
Nation, P. & R. Waring (2020). Teaching extensive reading in another language. New York: Routledge. http://dx.doi.org/10.4324/9780367809256
Oakes, M. P., & M. Farrow (2006). Use of the chi-squared test to examine vocabulary differences in English language corpora representing seven different countries. Literary and Linguistic Computing, 22(1), 85-99. http://dx.doi.org/10.1093/llc/fql044
Parent, K. (2009). Polysemy: A pedagogical concern. (Unpublished doctoral dissertation). Victoria University of Wellington, New Zealand.
Parent, K. (2012). The most frequent English homonyms. RELC Journal, 43(1), 69-81. http://dx.doi.org/10.1177/0033688212439356
Peters, A.M., & E. Zaidel (1980). The acquisition of homonymy. Cognition, 8(2). https://doi.org/10.1016/0010-0277(80)90012-8
Rodd, J., Gaskell, G., & W. Marslen-Wilson (2002). Making sense of semantic ambiguity: semantic competition in lexical access. Journal of Memory and Language, 46, 245-266. https://doi.org/10.1006/jmla.2001.2810
Ruhl, C. (1989). On monosemy: A study in language semantics. New York: State University of New York Press.
Schonfeld, Z. (2014, July). Education blogger fired for writing about homophones and confusing Homophobes. Newsweek. Retrieved: https://www.newsweek.com/education-blogger-fired-writing-about-homophones-and-confusing-homophobes-262404
Sereno, S.C., O’Donnell, P.J. & K. Rayner (2006). Eye movements and lexical ambiguity resolution: Investigating the subordinate-bias effect. Journal of Experimental Psychology: Human Perception and Performance, 32(2), 335. http://dx.doi.org/10.1037/0096-1523.32.2.335
Ushiro, Y., Hoshino Y., Shimizu H., Kai A., Nakagawa C., Watanabe F., & S. Takaki (2010). Disambiguation of homonyms by EFL readers: The effects of primary meaning and context interpretation. ARELE, 21, 161-170.
https://doi.org/10.20581/arele.21.0_161
Visser, A. (1989). Learning core meanings. Guidelines, 11(2), 10-17.
Wang, K., & I.S.P. Nation (2004). Word meaning in academic English: Homography in the Academic Word List. Applied Linguistics, 25, 291–314. https://doi.org/10.1093/applin/25.3.291
Waring, R., & M. Takaki (2003). At what rate do learners learn and retain new vocabulary from reading a graded reader? Reading in a Foreign Language, 15(2), 130-163. http://hdl.handle.net/10125/66776

Appendix A
Comparisons of Relative Frequencies of Homonyms in ER Corpus and BNC

Word | First meaning: gloss | Freq. | This study % | Parent (2012) % | Second meaning: gloss | Freq. | This study % | Parent (2012) %
ARM | body part | 2238 | 99.16% | 83.00% | weapon | 19 | 0.84% | 17.00%
BALL | round object | 330 | 71.58% | 96.00% | social event | 131 | 28.20% | 4.00%
BAND | group of people | 250 | 85.32% | 79.00% | ring | 43 | 14.68% | 21.00%
BANK | finance | 415 | 76.15% | 90.00% | embankment | 130 | 23.85% | 10.00%
BEAR | (verb) | 495 | 59.28% | 96.00% | animal | 340 | 40.72% | 4.00%
BILL | official written statement | 139 | 93.29% | 100.00% | (of a duck) | 10 | 6.71% | 0%
BITE | (verb) | 177 | 100.00% | 99.00% | binary digit | 0 | 0.00% | 1.00%
BOIL | (verb) | 25 | 100.00% | 96.00% | swelling | 0 | 0.00% | 4.00%
BOWL | dish | 101 | 94.39% | 96.00% | game | 6 | 5.61% | 4.00%
BOX | container | 1408 | 99.02% | 99.00% | sport | 14 | 0.98% | 1.00%
BRIDGE | to span | 470 | 99.16% | 97.00% | card game | 4 | 0.84% | 3.00%
BRUSH | for hair | 109 | 93.16% | 99.80% | undergrowth | 8 | 6.84% | 0.20%
CAMP | Encampment | 399 | 100.00% | 99.60% | corny | 0 | 0.00% | 0.40%
CAN | (modal verb) | 14424 | 99.99% | 99.80% | tin | 1 | 0.01% | 0.20%
CASE | Situation | 532 | 74.93% | 97.40% | container | 178 | 25.07% | 2.60%
CHECK | (various polysemous uses) | 441 | 99.77% | 100.00% | pattern of crossed lines | 1 | 0.23% | 0.00%
COUNT | (verb) | 184 | 35.02% | 92.80% | royalty rank | 350 | 64.98% | 7.20%
DATE | related to calendar | 219 | 98.65% | 99.20% | fruit | 3 | 1.35% | 0.80%
DEAL | an amount | 73 | 30.93% | 84.00% | to distribute | 163 | 69.07% | 16.00%
DIE | to cease living | 1681 | 99.94% | 98.40% | singular of noun 'dice' | 1 | 0.06% | 1.60%
EAR | organ of hearing | 458 | 100.00% | 98.80% | unit of corn | 0 | 0.00% | 0.20%
EGG | produced by females | 217 | 100.00% | 99.20% | to egg on | 0 | 0.00% | 0.80%
EVEN | still, yet | 2469 | 99.92% | 99.60% | (denoting) | 2 | 0.08% | 0.40%
FAST | quick | 1012 | 100.00% | 97.38% | to abstain | 0 | 0.00% | 2.62%
FILE | Folder | 207 | 100.00% | 99.38% | tool for smoothing | 0 | 0.00% | 0.62%
FINE | good/small | 683 | 97.71% | 95.37% | penalty | 16 | 2.29% | 4.63%
FIRM | Business | 133 | 80.12% | 80.12% | solid, strong | 33 | 19.88% | 19.88%
FOLD | to bend | 48 | 100.00% | 97.80% | flock | 0 | 0.00% | 2.20%
GO | to depart | 25872 | 100.00% | 100.00% | East Asian game | 0 | 0.00% | 0.00%
HIDE | to conceal | 1344 | 99.93% | 98.60% | skin | 1 | 0.07% | 1.40%
LAST | previous/final | 3376 | 95.96% | 94.60% | to continue | 142 | 4.04% | 5.40%
LAY | to place | 977 | 99.90% | 92.59% | non-clergy | 1 | 0.10% | 7.41%
LEAVE | to depart/bequeath | 6127 | 93.47% | 78.96% | direction (left) | 308 | 4.70% | 17.03%
LIE | to be prostrate | 811 | 60.21% | 93.40% | falsehood | 536 | 39.79% | 6.60%
LIGHT | opposite of dark | 2267 | 95.73% | 93.75% | opposite of 'heavy' | 101 | 4.27% | 6.25%
LIKE | to resemble | 6177 | 58.46% | 76.20% | opposite of dislike | 4390 | 41.54% | 23.80%
LINE | geometric figure | 545 | 98.38% | 95.93% | to apply lining | 9 | 1.62% | 4.07%
LOCK | to use a key | 869 | 99.43% | 97.36% | (of hair) | 5 | 0.57% | 2.64%
MATCH | sporting game/to be paired with | 96 | 48.48% | 92.51% | small wooden stick | 102 | 51.52% | 7.49%
MISS | title | 2191 | 91.22% | 50.00% | fail to hit | 211 | 8.78% | 50.00%
NET | Web | 58 | 98.31% | 59.36% | total | 1 | 1.69% | 40.64%
PAGE | leaf of a book, internet | 966 | 100.00% | 99.58% | to call out | 0 | 0% | 0.42%
PAN | cooking (including to 'criticize') | 46 | 100.00% | 84.84% | to move a camera ('panorama') | 0 | 0% | 15.16%
PEN | writing utensil | 135 | 100.00% | 99.58% | animal enclosure | 0 | 0% | 0.42%
POLICY | (as in 'foreign policy') | 4 | 14.29% | 95.60% | (as in 'insurance policy') | 24 | 85.71% | 4.40%
POOL | Water | 173 | 98.86% | 78.62% | combined resources, billiards | 2 | 1.14% | 21.38%
POT | Cookware | 186 | 100.00% | 99.60% | marijuana | 0 | 0% | 0.40%
PUPIL | Students | 37 | 100.00% | 99.00% | eye | 0 | 0% | 1.00%
RACE | competition of speed | 783 | 98.99% | 92.00% | species | 8 | 1.01% | 8.00%
RAIL | horizontal beam | 28 | 93.33% | 99.00% | to rail against | 2 | 6.67% | 1.00%
REPAIR | to fix | 98 | 100.00% | 99.40% | to return to | 0 | 0% | 0.60%
REST | the remainder | 517 | 45.79% | 62.20% | to recuperate | 612 | 54.21% | 37.80%
RING | sound of bell | 679 | 58.89% | 67.47% | circle | 474 | 41.11% | 41.11%
ROLL | to spin | 180 | 100.00% | 96.20% | a list | 0 | 0% | 3.80%
SCHOOL | educational setting | 1635 | 100.00% | 100.00% | group of fish | 0 | 0% | 0%
SET | to place/to be firm | 398 | 78.68% | 80.40% | collection | 58 | 21.32% | 19.60%
SHOOT | to spout, etc | 1156 | 100.00% | 99.80% | interjection | 0 | 0% | 0.20%
SLIP | to misstep | 97 | 100.00% | 100.00% | undergarment | 0 | 0% | 0%
SOCK | garment | 40 | 100.00% | 98.59% | to punch | 0 | 0% | 1.14%
SPELL | Letter-by-letter / incantation | 43 | 100.00% | 75.95% | time interval | 0 | 0% | 24.05%
STEEP | (adjective) | 85 | 98.00% | 91.94% | verb | 2 | 2% | 8.06%
STEP | (verb) | 1211 | 99.00% | 100.00% | mother | 13 | 1% | 0.00%
SWALLOW | to gulp | 58 | 98.00% | 96.77% | bird | 1 | 2% | 3.23%
TEND | habitual actions | 11 | 69.00% | 96.60% | to take care of | 5 | 31% | 3.40%
WEAVE | interlaced thread | 53 | 98.00% | 87.80% | to move from side to side | 1 | 2% | 12.20%
WELL | good / interjection | 5387 | 99.00% | 100.00% | spring | 57 | 1% | 0.00%
YARD | land | 176 | 73.00% | 56.60% | 36 inches | 66 | 27% | 43.40%

Appendix B
Comparisons of Relative Frequencies of Homonyms in ER Corpus and BNC: Words with Third and Fourth Meanings

Word | Meaning | Gloss | Freq. | This study % | Parent (2012) %
DOWN | First | (opposite of up) | 8288 | 100.00% | 99.80%
DOWN | Second | feathers | 0 | 0.00% | 0.20%
DOWN | Third | downland | 0 | 0.00% | 0.00%
HOST | First | of a party, etc. | 54 | 96.43% | 85.28%
HOST | Second | multitude | 2 | 3.57% | 13.91%
HOST | Third | sacrificial victim | 0 | 0.00% | 0.81%
LEAVE | First | to depart/bequeath | 6127 | 93.47% | 78.96%
LEAVE | Second | direction (left) | 308 | 4.70% | 17.03%
LEAVE | Third | permission | 0 | 0.00% | 0.80%
LEAVE | Fourth | plural of leaf | 120 | 2% | 1.83%
MEAN | First | to have meaning | 2202 | 96.12% | 99.60%
MEAN | Second | cruel | 89 | 3.88% | 0.40%
MEAN | Third | average | 0 | 0% | 0%
POUND | First | Monetary unit/weight | 639 | 99.38% | 97.63%
POUND | Second | to crush | 4 | 0.62% | 2.07%
POUND | Third | dog pound | 0 | 0% | 0%
SCALE | First | measurement/weight | 35 | 97.22% | 90.22%
SCALE | Second | to climb | 0 | 0% | 7.86%
SCALE | Third | reptile skin | 1 | 2.78% | 3.02%
SOUND | First | audio phenomenon | 1979 | 100% | 93.19%
SOUND | Second | sturdy | 3 | 0% | 6.41%
SOUND | Third | to inquire (sound out) | 0 | 0.00% | 0.40%
SOUND | Fourth | sea inlet | 2 | 0.11% | 0.00%

About the Authors

Kevin Parent is a native Chicagoan and came to Korea in 1997. He completed his Ph.D. at Victoria University of Wellington in 2009 on the topic of polysemy and second language pedagogy. He is an associate professor at Korea Maritime and Ocean University in Busan, with classes including sociolinguistics and Shakespeare. E-mail: ksparent1@gmail.com

Stuart McLean is interested in vocabulary and comprehension (reading or listening) research. He is currently making online self-marking form-recall and meaning-recall (orthographic and phonological) vocabulary levels tests. Teachers can download automatically marked responses, actually typed responses, and the time taken to complete responses (vocableveltest.org). Stuart McLean is a Language Testing Editorial Board member. E-mail: stumc93@gmail.com

Brandon Kramer earned his undergraduate degree in Mathematics and taught English in Japanese high schools for seven years before working at the university level. He is now teaching and researching while working on his Ph.D. in Education (concentration in Applied Linguistics) at Temple University, Japan. E-mail: brandon.l.kramer@gmail.com

Young Ae Kim is an instructor at Kindai University and Kyoto Seika University, and holds an MA from Kansai University.
She is working on research related to test item format type, word counting units, listening and word difficulty, as well as making the items for the tests available at vocableveltest.com. E-mail: youngaekim1227@gmail.com