CONTEXTUAL VOCABULARY ACQUISITION:
Development of a Computational Theory and Educational Curriculum

William J. Rapaport, Project Director and co-Principal Investigator
Department of Computer Science and Engineering
Center for Cognitive Science
rapaport@cse.buffalo.edu
www.cse.buffalo.edu/~rapaport

Michael W. Kibby, co-Principal Investigator
Department of Learning and Instruction
Center for Literacy and Reading Instruction
mwkibby@acsu.buffalo.edu
www.gse.buffalo.edu/FAS/Kibby/

State University of New York at Buffalo
Buffalo, NY 14260

18 May 2001

-------------------------------------------------------------------------
Abstract.
-------------------------------------------------------------------------

No doubt you have on occasion read some text containing an unfamiliar
word, but you were unable or unwilling to find out from a dictionary or
another person what the word meant.  Nevertheless, you might,
consciously or not, have figured out a meaning for it.  Suppose you
didn't, or suppose your hypothesized meaning was wrong.  If you never
see the word again, it may not matter.  However, if the text you were
reading were from science, mathematics, engineering, or technology
(SMET), not understanding the unfamiliar term might seriously hinder
your subsequent understanding of the text.  If you do see the word
again, you will have an opportunity to revise your hypothesis about its
meaning.  The more times you see the word, the better your definition
will become.  And if your hypothesis development were deliberate,
rather than "incidental", your command of the new word would be stronger.

We propose (a) to extend and develop algorithms for computational
contextual vocabulary acquisition (CVA):  learning, from context,
meanings for "hard" words: nouns (including proper nouns), verbs,
adjectives, and adverbs, (b) to unify a disparate literature on the
topic of CVA from psychology, first- and second-language (L1 and L2)
acquisition, and reading science, in order to help develop these
algorithms, and (c) to use the knowledge gained from the computational
CVA system to build and to evaluate the effectiveness of an educational
curriculum for enhancing students' abilities to use deliberate (i.e.,
non-incidental) CVA strategies in their reading of SMET texts at the
middle-school and college undergraduate levels: teaching methods and
guides, materials for teaching and practice, and evaluation
instruments.  The knowledge gained from case studies of students using
our CVA techniques will feed back into further development of our
computational theory.  This project falls within Quadrant 2 (fundamental
research on behavioral, cognitive, affective and social aspects of human
learning) and Quadrant 3 (research on SMET learning in formal and
informal educational settings).

-------------------------------------------------------------------------
DESCRIPTION OF REVISED SCOPE OF PROJECT FOR PILOT-PROJECT PURPOSES
-------------------------------------------------------------------------

What follows is a plan for, and further justification of, a pilot project
for our research on Contextual Vocabulary Acquisition--Development of a
Computational Theory and Educational Curriculum.

First, some background (which partially addresses the concerns of
some of the reviewers):  It is generally agreed among researchers in
contextual vocabulary acquisition (CVA) that so-called "incidental"
vocabulary acquisition does occur:  I.e., people know more words than
they are explicitly taught; therefore, they must have learned most of
them as a by-product of reading (and other language-related activities,
such as listening).  Furthermore, at least some of this incidental
acquisition was the result of conscious processes of guessing, inferring,
etc., the meaning of unknown words from context.  

It is also generally agreed that we don't know *how* readers do much of
this.  There are studies in the first- and second-language-learning
literature that suggest various strategies for doing this, but most of
them are quite vague (e.g.:  step 1:  examine the immediately preceding
context of the unknown word (looking for causal, temporal, categorical
information, etc.); step 2:  examine the immediately following context of
that word; step 3:  guess the meaning of the word -- hardly a detailed
algorithm that could easily be followed by a student).  One reason for
this vagueness in the educational literature is that it is not clear
exactly how context operates, in large part because of the lack of
research on this topic. In turn, this means there is no generally
accepted curriculum or set of strategies for teaching CVA.  We need to
know more about how context operates and how we can teach it
strategically.  With this knowledge, we could, broadly speaking, more
effectively help students become more aware of context and learn how
to use it.

There are also computational theories that implement various CVA methods,
and that do--because they are computational (see below)--go into much
more detail on how to use context to infer meaning.  But most of these
theories and programs assume the prior existence of a known concept that
the unknown word is to be mapped to.  As Ellen Prince, a linguist at the
University of Pennsylvania, suggested to me in conversation, that makes
the task more like a multiple-choice quiz; whereas CVA as our system does
it is more like an essay test.

What is needed (and what we have been working on) is a *general* method
that (a) shows how CVA *can* be done and (b) is explicit enough to be
taught to human readers.  Such a theory is best expressed
algorithmically, for then the methods are made fully explicit and can be
tested computationally.  Admittedly, this does not solve the problem of
how *humans* actually do CVA, though it does provide testable ideas of
how they *might* do it.  And it certainly provides ideas for how they
*could* do it and, hence, how it might be taught.

An NSF program officer has asked us to respond to the following remarks
made by one reviewer of the original proposal about the value of such
computational studies:

	"... the application of the programming is to teach people to
	operate a la computers. ... in what world does learning or
	teaching occur as it does in a computer?  Further, the project is
	going to circumvent people going to dictionaries to look up the
	meanings of words they don't know ....  Why not just use a 
	dictionary in the first place?"

The latter question is easy to answer and was in fact addressed in the
original proposal:  Not all words are in dictionaries, nor are
dictionaries always readily available.  In addition, many researchers
have pointed out that dictionary definitions are not always correctly
understood by readers, nor are they always useful.  Further, it is a fact
that upwards of ninety percent of all the words we know are learned from
context while reading or listening.  There is no intention here of
demeaning the value of the dictionary, but this reviewer seems to take
the stand that all or most new words are learned by consulting a
dictionary. This view simply is not compatible with the research on
vocabulary acquisition between ages 0 and 18; the dictionary simply is
not the major source of learning word meanings in elementary, middle, and
high schools. Our intent here, speaking broadly, is to find ways to
facilitate students' natural CVA by developing a more rigorous knowledge
base of how context operates and creating a more systematic and viable
curriculum for teaching students to use CVA strategies.

We will respond to the former question by elaborating on our remarks
above about computation, beginning with some comments on
the nature of Artificial Intelligence (AI).  AI can be viewed in at least
three ways:  

	(1) as a branch of engineering whose goal is to advance the
	    field of computer science; this, however, is neither our
	    immediate goal nor our methodology

	(2) as "computational psychology", where the goal is to study
	    human cognition using computational techniques; a good
	    computational-psychology computer program will simulate some
	    human cognitive task in a way that is faithful to human
	    performance, with the same failures as well as successes--AI
	    as cognitive psychology can tell us something about the human
	    mind

    and (3) as "computational philosophy", where the goal is to learn
	    which aspects of cognition in general are computable; a good
	    computational-philosophy computer program will simulate some
	    cognitive task but not necessarily in the way that humans
	    would do it--AI as computational philosophy can tell us
	    something about the limits and scope of cognition in general,
	    but not necessarily about human cognition in particular.

The present project falls under the category of computational psychology
(and to a lesser extent under the category of computational philosophy).
Our goal is not to teach people to "think like computers" (except, of
course, insofar as computers might already "think"--i.e., process
information--like humans do!).  Rather, our goal is to explicate methods
for inferring the meanings of unknown words from context.  The "vague"
strategy mentioned above is *not* a caricature; it is the actual
recommendation of one writer in the field of vocabulary acquisition!
But neither is it an algorithm--i.e., an explicit, step-by-step
procedure for solving a problem correctly.

Our goal is to "teach" (i.e., program) a computer to do the "educated"
guessing--or inferencing--that is left vague in the strategy above.  To
do that, we must determine what information is needed and what inference
methods must be supplied, and we must spell this all out in enough
detail so that "even" a computer could do it.  But that is not all:  For
once we have such a method, we can then actually teach it to people,
rather than leave them wondering what to do with all the contextual
information that they might have found in steps 1 and 2 of the above
vague strategy--we can teach them what information to look for and what
to do with it.  This is our final goal.

Furthermore, whereas our original proposal can be seen as having three
parts, this revised pilot project focuses on only one of them:
Originally, we proposed:  (1) to make our existing computational CVA
system more robust:  to improve or create algorithms for inferring the
meanings of unknown nouns, verbs, adjectives, and adverbs; to utilize
grammatical, morphological, and etymological information in the
inference process; etc., (2) to develop and fully test educational
curricula at the secondary and post-secondary levels for teaching CVA
methods, and (3) to integrate these two tasks by using the results of
the computational theory to help develop the educational curriculum.  As
the reviewers of that proposal noted, task (3) was rather vague.  The
purpose of this pilot project is to focus on task (3) in order to
facilitate the eventual transfer between, and mutual interaction of,
tasks (1) and (2).

The pilot project that we are proposing will continue to have two
research "streams" as in our full proposal:  a computational stream and
an educational stream.  The computational stream has as its main goal the
development and implementation of a *computational theory* of CVA; the
educational stream has as its main goal the development and
implementation of an *educational curriculum* in CVA.  Although these two
streams still have independent goals and, to some extent, independent
methodologies, their full development must be intimately integrated in a
synergistic fashion.  

We intend this pilot project to primarily address our lack of clarity on
such integration.  Thus, the development of the bridge between the two
research streams is its focus.  Accordingly, in the computational stream,
less time will be spent on developing new algorithms, and more time will
be spent on preparing the existing computational system for use by the
educational stream.  Similarly,
the educational stream will spend less time on *testing* new curricula,
and more time on *using the computational system* to begin the
*development* of new curricula, as well as *providing feedback* to the
computational stream based on students' actual CVA techniques.

We believe that this bridge-building can be accomplished as follows:

1.  We will begin by identifying common texts (not necessarily
text*books*) that we will both work on.  This will be the focus of the
"synergy" between the two streams.  We will start with the texts that the
computational stream has already developed algorithms for.  But clearly
we must find real texts that the students in our study would be reading.
Although we will continue to search for *textbooks* in science, math,
engineering, and/or technology (SMET), we will also look at popular
science writing such as is found in _Scientific American_ or even a daily
newspaper.  The ability to read and understand such texts is an important
aspect of SMET literacy, and such writing is more likely to require CVA
skills than SMET text*books* (which are often quite detailed and
specific in giving definitions of terms).

2.  The computational research stream will (a) develop grammars for those
texts, (b) develop knowledge representations of the background knowledge
necessary for understanding and reasoning about them, (c) test the
current system on the unknown words in them, and (d) develop new
algorithms, as necessary, for CVA on them, as well as ones based on
student protocols.  (See below:  in the educational stream, we will be
eliciting student protocols, or "thinking-out-loud" records, of students'
attempts to figure out the meanings of unknown words; the researchers in
the computational stream will then try to formalize and implement any
(successful!) methods that students actually use but that we have not
(yet) implemented.)

Another task that we might tackle (one not discussed explicitly in the
original proposal, but that would be a clear case of integration of the
two streams) is to develop an *explanation facility* for our computational
system ("Cassie") that can be used by the students when they are stuck.
That is, if students have trouble figuring out the meaning of an unknown
word (or merely want to check to see if their answer is acceptable),
they could ask Cassie for an explanation of how "she" figured it out
(or, perhaps, for guidance on how the *student* could figure it out).

3.  The educational stream will (a) identify students who will
participate in the experiments, (b) have them read the chosen texts,
(c) have them figure out the meaning of the unknown words in those texts,
(d) elicit protocols of their thought processes while doing this (which
will be used to modify Cassie; see above), and (e) begin to develop
educational curricula to teach them Cassie's (successful) techniques.	

We expect that by the end of one year, we will have a working
demonstration of how the two streams of our project can be bridged.

Finally, a program officer has asked us to respond to the following
issues:

	"What are the broader implications of the project for science and
	mathematics education?  That is, how could your line of research
	(assuming it gets carried to its logical conclusions) affect
	instruction and/or learning practice in the future?  Tell me a
	bit more about how you view this in the longer term agenda to
	improve teaching and learning?"

Why, then, is our research important?  We have addressed some of these
issues in passing, above.  For instance, as noted above, it is a fact that
most meaning vocabulary is learned from context; teachers have too little
time for directly teaching an extensive list of meaning vocabulary. Also
as noted above, though the use of the dictionary is extremely important
in learning words, it is the case that dictionaries are not always
available, that dictionary definitions are not always decipherable, and
that sometimes, humans just do not bother to go "look it up." Learning
words from context is simply required if a student is to try to learn the
many new terms and words that must be known to learn science, mathematics,
and technology topics. 

Further, newly revised educational standards for language arts, science,
social studies, and mathematics all call for students to have a greater
command of concepts and the words that signify those concepts. Since
these concepts and their words and terms cannot all be taught directly in
the classroom, it is important that we not only devote more
instructional time in school to teaching CVA but also gain more
knowledge about what context is and how it operates.

Learning when and how to use CVA strategies has broader implications than
just classroom learning and learning standards, however.  Students
learn a great deal of science, math, and technology from reading trade
books (i.e., books that are not textbooks), articles in general-interest
children's magazines (e.g., _Highlights_, _Cricket_, _Spider_, _Weekly
Reader_), and children's magazines devoted especially to science and
science-related topics (e.g., _Spinner_, _TimeKids_, _Science
Scholastic_, _Quantum_). If students are better able to use surrounding
context to help determine the meaning of unknown words or terms, then
more science, math, and technology will be learned when students are
independently engaged in reading these materials. It is usually in
reading magazines, trade books, and websites such as these that students
first encounter articles on science, math, and technology.  If schools
are more effective in teaching CVA, and if the writers and editors of
these articles structure their texts to accommodate CVA, then students
will gain more knowledge and heighten their interest and motivation in
science, math, and technology.

There are also considerations from a broader science-education
perspective:  One of the goals of education should be to instill in
students the knowledge--and the confidence and life-long desire to use
that knowledge--of how to learn on one's own.  Most often, there are no
ultimate authorities or experts to consult when one has a problem to
solve or a question to answer.  This is just as true in the world of,
say, particle physics (where there are no "answers in the back of the
book") as it is when one comes across an unknown word while reading (and
there is no dictionary or glossary at hand, or no other person who knows
the word).  

The skills required for CVA are not only useful for helping one read (and
hence learn) on one's own, but are also among those most useful in
science and mathematics:  finding clues or evidence (among the context
surrounding the unknown word), integrating them with one's background
knowledge, and using both to infer (whether by deduction, induction, or
abduction) the meaning of the unknown word.  CVA is a wonderful model "in
the small" of the scientific method of hypothesis formation, testing, and
revision, as well as a useful tool for learning on one's own.
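
The cycle of finding clues, integrating them with background knowledge,
and revising a hypothesis can be illustrated with a toy sketch.  This is
*not* the algorithm used by our system; it is only a minimal illustration
under several assumptions:  the names Hypothesis, revise, and is_subclass
are hypothetical, the unknown word "zorp" and its contexts are invented,
the clues from each context are assumed to have been identified already
(by hand), and the background knowledge is assumed to be a small, acyclic
table of class-subclass links.

```python
from dataclasses import dataclass, field


@dataclass
class Hypothesis:
    """Current best guess about an unknown word (toy representation)."""
    word: str
    categories: set = field(default_factory=set)  # e.g. {"animal"}
    properties: set = field(default_factory=set)  # e.g. {"white"}
    actions: set = field(default_factory=set)     # e.g. {"hunts"}

    def definition(self) -> str:
        """Render the hypothesis as a short dictionary-like definition."""
        cats = " or ".join(sorted(self.categories)) or "thing"
        text = f"a kind of {cats}"
        modifiers = []
        if self.properties:
            modifiers.append("is " + ", ".join(sorted(self.properties)))
        if self.actions:
            modifiers.append(", ".join(sorted(self.actions)))
        if modifiers:
            text += " that " + " and ".join(modifiers)
        return f"{self.word}: {text}"


def is_subclass(specific, general, background):
    """True if `specific` lies below `general` in the background table.

    `background` maps each class to its immediate superclass and is
    assumed to contain no cycles.
    """
    current = specific
    while current in background:
        current = background[current]
        if current == general:
            return True
    return False


def revise(hyp, clues, background):
    """Fold the clues found in one new context into the hypothesis."""
    for category in clues.get("categories", ()):
        # Revision: drop any held category that the new one refines.
        hyp.categories = {c for c in hyp.categories
                          if not is_subclass(category, c, background)}
        # Keep the new category unless a more specific one is already held.
        if not any(is_subclass(c, category, background)
                   for c in hyp.categories):
            hyp.categories.add(category)
    hyp.properties |= set(clues.get("properties", ()))
    hyp.actions |= set(clues.get("actions", ()))
    return hyp


# Invented background knowledge and two invented encounters with "zorp".
background = {"hound": "dog", "dog": "animal"}
encounters = [
    {"categories": ["animal"], "properties": ["white"]},
    {"categories": ["dog"], "actions": ["hunts"]},
]

hyp = Hypothesis("zorp")
for clues in encounters:
    revise(hyp, clues, background)
    print(hyp.definition())
```

After the first context the sketch reports only that a zorp is a kind of
animal; after the second, the background link from "dog" to "animal"
lets it replace the general category with the more specific one.  The
hard parts of CVA (finding the clues in real text and choosing which
inferences to draw) are exactly what this sketch assumes away.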