CONTEXTUAL VOCABULARY ACQUISITION:
Development of a Computational Theory and Educational Curriculum

William J. Rapaport, Project Director and co-Principal Investigator
Department of Computer Science and Engineering
Center for Cognitive Science
rapaport@cse.buffalo.edu
www.cse.buffalo.edu/ rapaport

Michael W. Kibby, co-Principal Investigator
Department of Learning and Instruction
Center for Literacy and Reading Instruction
mwkibby@acsu.buffalo.edu
www.gse.buffalo.edu/FAS/Kibby/

State University of New York at Buffalo
Buffalo, NY 14260

18 May 2001

-------------------------------------------------------------------------
Abstract.
-------------------------------------------------------------------------

No doubt you have on occasion read some text containing an unfamiliar word, but you were unable or unwilling to find out from a dictionary or another person what the word meant. Nevertheless, you might, consciously or not, have figured out a meaning for it. Suppose you didn't, or suppose your hypothesized meaning was wrong. If you never see the word again, it may not matter. However, if the text you were reading were from science, mathematics, engineering, or technology (SMET), not understanding the unfamiliar term might seriously hinder your subsequent understanding of the text. If you do see the word again, you will have an opportunity to revise your hypothesis about its meaning. The more times you see the word, the better your definition will become. And if your hypothesis development were deliberate, rather than "incidental", your command of the new word would be stronger.
We propose (a) to extend and develop algorithms for computational contextual vocabulary acquisition (CVA): learning, from context, meanings for "hard" words: nouns (including proper nouns), verbs, adjectives, and adverbs, (b) to unify a disparate literature on the topic of CVA from psychology, first- and second-language (L1 and L2) acquisition, and reading science, in order to help develop these algorithms, and (c) to use the knowledge gained from the computational CVA system to build and to evaluate the effectiveness of an educational curriculum for enhancing students' abilities to use deliberate (i.e., non-incidental) CVA strategies in their reading of SMET texts at the middle-school and college undergraduate levels: teaching methods and guides, materials for teaching and practice, and evaluation instruments. The knowledge gained from case studies of students using our CVA techniques will feed back into further development of our computational theory. This project falls within Quadrant 2 (fundamental research on behavioral, cognitive, affective and social aspects of human learning) and Quadrant 3 (research on SMET learning in formal and informal educational settings).

-------------------------------------------------------------------------
DESCRIPTION OF REVISED SCOPE OF PROJECT FOR PILOT-PROJECT PURPOSES
-------------------------------------------------------------------------

What follows is a plan for, and further justification of, a pilot project for our research on Contextual Vocabulary Acquisition--Development of a Computational Theory and Educational Curriculum.
First, some background (which partially addresses the concerns of some of the reviewers): It is generally agreed among researchers in contextual vocabulary acquisition (CVA) that so-called "incidental" vocabulary acquisition does occur; i.e., people know more words than they are explicitly taught; therefore, they must have learned most of them as a by-product of reading (and other language-related activities, such as listening). Furthermore, at least some of this incidental acquisition was the result of conscious processes of guessing, inferring, etc., the meaning of unknown words from context. It is also generally agreed that we don't know *how* readers do much of this. There are studies in the first- and second-language-learning literature that suggest various strategies for doing this, but most of them are quite vague (e.g.: step 1: examine the immediately preceding context of the unknown word (looking for causal, temporal, categorical information, etc.); step 2: examine the immediately following context of that word; step 3: guess the meaning of the word -- hardly a detailed algorithm that could easily be followed by a student). One reason for this vagueness in the educational literature is that it is not clear exactly how context operates, in large part because of the lack of research on this topic. In turn, this means there is no generally accepted curriculum or set of strategies for teaching CVA. We need to know more about how context operates and how we can teach its use strategically. With this knowledge, we could, broadly speaking, help students more effectively to become more aware of context and to know better how to use it. There are also computational theories that implement various CVA methods, and that do--because they are computational (see below)--go into much more detail on how to use context to infer meaning. But most of these theories and programs assume the prior existence of a known concept that the unknown word is to be mapped to.
As Ellen Prince, a linguist at the University of Pennsylvania, suggested to me in conversation, that makes the task more like a multiple-choice quiz; whereas CVA as our system does it is more like an essay test. What is needed (and what we have been working on) is a *general* method that (a) shows how CVA *can* be done and (b) is explicit enough to be taught to human readers. Such a theory is best expressed algorithmically, for then the methods are made fully explicit and can be tested computationally. Admittedly, this does not solve the problem of how *humans* actually do CVA, though it does provide testable ideas of how they *might* do it. And it certainly provides ideas for how they *could* do it and, hence, how it might be taught. An NSF program officer has asked us to respond to the following remarks made by one reviewer of the original proposal about the value of such computational studies: "... the application of the programming is to teach people to operate a la computers. ... in what world does learning or teaching occur as it does in a computer? Further, the project is going to circumvent people going to dictionaries to look up the meanings of words they don't know .... Why not just use a dictionary in the first place?" The latter question is easy to answer and was in fact addressed in the original proposal: Not all words are in dictionaries, nor are dictionaries always readily available. In addition, many researchers have pointed out that dictionary definitions are not always correctly understood by readers, nor are they always useful. Further, it is a fact that upwards of ninety percent of all the words we know are learned from context while reading or listening. There is no intention here of demeaning the value of the dictionary, but this reviewer seems to take the stand that all or most new words are learned by consulting a dictionary.
This view simply is not compatible with the research on vocabulary acquisition between ages 0 and 18; the dictionary simply is not the major source of learning word meanings in elementary, middle, and high schools. Our intent here, speaking broadly, is to find ways to facilitate students' natural CVA by developing a more rigorous knowledge base of how context operates and creating a more systematic and viable curriculum for teaching students to use CVA strategies. We will respond to the former question by elaborating on our remarks above about computation, beginning with some comments on the nature of Artificial Intelligence (AI). AI can be viewed in at least three ways:

(1) as a branch of engineering whose goal is to advance the field of computer science; this, however, is neither our immediate goal nor our methodology;

(2) as "computational psychology", where the goal is to study human cognition using computational techniques; a good computational-psychology computer program will simulate some human cognitive task in a way that is faithful to human performance, with the same failures as well as successes--AI as cognitive psychology can tell us something about the human mind; and

(3) as "computational philosophy", where the goal is to learn which aspects of cognition in general are computable; a good computational-philosophy computer program will simulate some cognitive task but not necessarily in the way that humans would do it--AI as computational philosophy can tell us something about the limits and scope of cognition in general, but not necessarily about human cognition in particular.

The present project falls under the category of computational psychology (and to a lesser extent under the category of computational philosophy). Our goal is not to teach people to "think like computers" (except, of course, insofar as computers might already "think"--i.e., process information--like humans do!).
Rather, our goal is to explicate methods for inferring the meanings of unknown words from context. The "vague" strategy mentioned above is *not* a caricature; it is the actual recommendation of one writer in the field of vocabulary acquisition! But neither is it an algorithm--i.e., an explicit, step-by-step procedure for solving a problem correctly. Our goal is to "teach" (i.e., program) a computer to do the "educated" guessing--or inferencing--that is left vague in the strategy above. To do that, we must determine what information is needed and what inference methods must be supplied, and we must spell this all out in enough detail so that "even" a computer could do it. But that is not all: For once we have such a method, we can then actually teach it to people, rather than leave them wondering what to do with all the contextual information that they might have found in steps 1 and 2 of the above vague strategy--we can teach them what information to look for and what to do with it. This is our final goal. Furthermore, whereas our original proposal can be seen as having three parts, this revised pilot project focuses on only one of them: Originally, we proposed: (1) to make our existing computational CVA system more robust: to improve or create algorithms for inferring the meanings of unknown nouns, verbs, adjectives, and adverbs; to utilize grammatical, morphological, and etymological information in the inference process; etc., (2) to develop and fully test educational curricula at the secondary and post-secondary levels for teaching CVA methods, and (3) to integrate these two tasks by using the results of the computational theory to help develop the educational curriculum. As the reviewers of that proposal noted, task (3) was rather vague. The purpose of this pilot project is to focus on task (3) in order to facilitate the eventual transfer between, and mutual interaction of, tasks (1) and (2). 
The pilot project that we are proposing will continue to have two research "streams" as in our full proposal: a computational stream and an educational stream. The computational stream has as its main goal the development and implementation of a *computational theory* of CVA; the educational stream has as its main goal the development and implementation of an *educational curriculum* in CVA. Although these two streams still have independent goals and, to some extent, independent methodologies, their full development must be intimately integrated in a synergistic fashion. We intend this pilot project to primarily address our lack of clarity on such integration. Thus, the development of the bridge between the two research streams is its focus. Accordingly, in the computational stream, less time will be spent on developing new algorithms, and more time will be spent on developing the existing system for use by the educational stream. Similarly, the educational stream will spend less time on *testing* new curricula, and more time on *using the computational system* to begin the *development* of new curricula, as well as *providing feedback* to the computational stream based on students' actual CVA techniques. We believe that this bridge-building can be accomplished as follows:

1. We will begin by identifying common texts (not necessarily text*books*) that we will both work on. This will be the focus of the "synergy" between the two streams. We will start with the texts that the computational stream has already developed algorithms for. But clearly we must find real texts that the students in our study would be reading. Although we will continue to search for *textbooks* in science, math, engineering, and/or technology (SMET), we will also look at popular science writing such as is found in _Scientific American_ or even a daily newspaper.
The ability to read and understand such texts is an important aspect of SMET literacy, and such writing is more likely to require CVA skills than SMET text*books* (which are often quite detailed and specific in giving definitions of terms).

2. The computational research stream will (a) develop grammars for those texts, (b) develop knowledge representations of the background knowledge necessary for understanding and reasoning about them, (c) test the current system on the unknown words in them, and (d) develop new algorithms, as necessary, for CVA on them, as well as ones based on student protocols (see below: in the educational stream, we will be eliciting student protocols (or "thinking-out-loud" records) of students' attempts to figure out the meanings of unknown words; the researchers in the computational stream will then try to formalize and implement any (successful!) methods that students actually use but that we have not (yet) implemented). Another task that we might tackle (one not discussed explicitly in the original proposal, but that would be a clear case of integration of the two streams) is to develop an *explanation facility* for our computational system ("Cassie") that can be used by the students when they are stuck. That is, if students have trouble figuring out the meaning of an unknown word (or merely want to check to see if their answer is acceptable), they could ask Cassie for an explanation of how "she" figured it out (or, perhaps, for guidance on how the *student* could figure it out).

3. The educational stream will (a) identify students who will participate in the experiments, (b) have them read the chosen texts, (c) have them figure out the meaning of the unknown words in those texts, (d) elicit protocols of their thought processes while doing this (which will be used to modify Cassie; see above), and (e) begin to develop educational curricula to teach them Cassie's (successful) techniques.
We expect that by the end of one year, we will have a working demonstration of how the two streams of our project can be bridged.

Finally, a program officer has asked us to respond to the following issues: "What are the broader implications of the project for science and mathematics education? That is, how could your line of research (assuming it gets carried to its logical conclusions) affect instruction and/or learning practice in the future? Tell me a bit more about how you view this in the longer term agenda to improve teaching and learning?" Why, then, is our research important? We have addressed some of these issues in passing, above. For instance, as noted above, it is a fact that most meaning vocabulary is learned from context; teachers have too little time for directly teaching an extensive list of meaning vocabulary. Also as noted above, though the use of the dictionary is extremely important in learning words, it is the case that dictionaries are not always available, that dictionary definitions are not always decipherable, and that sometimes, humans just do not bother to go "look it up." Learning words from context is simply required if a student is to try to learn the many new terms and words that must be known to learn science, mathematics, and technology topics. Further, newly revised educational standards for language arts, science, social studies, and mathematics all call for students to have a greater command of concepts and the words that signify those concepts. Since these concepts and their words and terms cannot all be taught directly in the classroom, it is important not only that we devote more instructional time in school to teaching CVA, but also that we gain more knowledge about what context is and how it operates. Learning when and how to use CVA strategies has broader implications than just classroom learning and learning standards, however.
Students learn a great deal of science, math, and technology from reading trade books (i.e., books that are not textbooks), articles in general-interest children's magazines (e.g., _Highlights_, _Cricket_, _Spider_, _Weekly Reader_), and children's magazines devoted especially to science and science-related topics (e.g., _Spinner_, _TimeKids_, _Science Scholastic_, _Quantum_). If students are better able to use surrounding context to help determine the meaning of unknown words or terms, then more science, math, and technology will be learned when students are independently engaged in reading these materials. It is usually in reading magazines, trade books, and websites such as these that students first encounter articles on science, math, and technology. If schools are more effective in teaching CVA, and if the writers and editors of these articles structure their texts to accommodate CVA, then students will gain more knowledge and heighten their interest and motivation in science, math, and technology. There are also considerations from a broader science-education perspective: One of the goals of education should be to instill in students the knowledge--and the confidence and life-long desire to use that knowledge--of how to learn on one's own. Most often, there are no ultimate authorities or experts to consult when one has a problem to solve or a question to answer. This is just as true in the world of, say, particle physics (where there are no "answers in the back of the book") as it is when one comes across an unknown word while reading (and there is no dictionary or glossary at hand, or no other person who knows the word). 
The skills required for CVA are not only useful for helping one read (and hence learn) on one's own, but are also among those most useful in science and mathematics: finding clues or evidence (among the context surrounding the unknown word), integrating them with one's background knowledge, and using both to infer (whether by deduction, induction, or abduction) the meaning of the unknown word. CVA is a wonderful model "in the small" of the scientific method of hypothesis formation, testing, and revision, as well as a useful tool for learning on one's own.
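The cycle of hypothesis formation, testing, and revision described above can be illustrated with a toy sketch. It is an illustration only, not the algorithm used by Cassie: the class WordHypothesis, its methods, the clue strings, and the made-up word "zorble" are all assumptions introduced here, and the clues are supplied by hand rather than extracted from text by a grammar and knowledge base.

```python
from dataclasses import dataclass, field


@dataclass
class WordHypothesis:
    """A revisable guess about an unknown word, built only from context clues.

    Hypothetical illustration; not the project's actual representation.
    """
    word: str
    # Maps each clue to the number of passages that supported it.
    support: dict[str, int] = field(default_factory=dict)
    # Clues that a later passage contradicted.
    rejected: set[str] = field(default_factory=set)

    def revise(self, clues, contradicted=()):
        """Fold in the evidence from one more passage containing the word."""
        for clue in contradicted:
            self.rejected.add(clue)
            self.support.pop(clue, None)
        for clue in clues:
            if clue not in self.rejected:
                self.support[clue] = self.support.get(clue, 0) + 1

    def definition(self, min_support=1):
        """Current best definition: clues seen in at least min_support passages."""
        kept = sorted(c for c, n in self.support.items() if n >= min_support)
        body = "; ".join(kept) if kept else "(no hypothesis yet)"
        return f"{self.word}: {body}"


# Three hand-coded encounters with the made-up word "zorble".
h = WordHypothesis("zorble")
h.revise(["is an animal", "lives in water"])
h.revise(["is an animal", "has fur"])
h.revise(["is an animal", "lives on land"], contradicted=["lives in water"])

print(h.definition())               # all surviving clues
print(h.definition(min_support=2))  # only clues confirmed more than once
```

Each call to revise plays the role of one more encounter with the word: earlier guesses are kept, strengthened, or withdrawn, and the definition can be reported at any point, much as a reader can state a provisional meaning after any number of encounters. The hard part, which this sketch deliberately leaves out, is deciding which clues a passage actually provides.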