PROGRAMMING PROJECT

Contextual Vocabulary Acquisition

Last Update: 6 April 2004

In this project, you will use the SNePS knowledge-representation and reasoning system to represent the information in a text passage that contains an "unknown" word, together with the "prior knowledge" needed to help figure out a meaning for that word from context. You will then run a definition algorithm on your representation to see what definition it computes, making any necessary changes to your representation or the algorithm in order to improve its performance. You will write up your work in a conference-style paper, to be accompanied by copies of the relevant computer files and annotated demos. Your grade will be a function of the quality of both your work and your writing. If warranted, the final report and accompanying computer files will be placed on the CVA website's "Papers, progress reports, and related documents" page.

More precisely, please do the following:

If you are not familiar with SNePS, please do the SNePS Tutorial.
- If you do this tutorial, please hand in "Project 1" (which you will find at the end of the tutorial) as soon as possible (preferably no later than Fri., Feb. 6).
Familiarize yourself with the CVA website. In particular, read the following documents that are available on that website:
1. Rapaport, William J., & Ehrlich, Karen (2000), "A Computational Theory of Vocabulary Acquisition", in Lucja M. Iwanska & Stuart C. Shapiro (eds.), Natural Language Processing and Knowledge Representation: Language for Knowledge and Knowledge for Language (Menlo Park, CA/Cambridge, MA: AAAI Press/MIT Press): 347-375.
2. Rapaport, William J., & Kibby, Michael W. (2002), "ROLE: Contextual Vocabulary Acquisition: From Algorithm to Curriculum".
Choose a sentence containing an "unknown" (or otherwise "hard") word. For ideas, consider:
- the CVA website's "Words and Contexts" page,
- a new sentence containing a word that has already been worked on, or
- a "hard" word that you have come across in your own reading.
Please get my approval of your choice before proceeding to the next step. Please do this no later than Fri., Feb. 13, if possible.
Conduct an informal experiment with friends, asking them to read the passage and to figure out the meaning of the word. If it is a word that you think they already know the meaning of, substitute a made-up word or replace the word with a blank. If you do this, please also do two things:
1. Make sure that the made-up word is morphologically similar to the real word (e.g., change a verb in the past tense to a made-up word that looks like a past-tense verb) and pronounceable in English. For help on doing this, please see me.
2. Emphasize to your experimental subjects that they are not trying to guess what real word the made-up word (or the blank word) is. Rather, they are trying to figure out what it might mean, i.e., to come up with a dictionary-like entry for it.
Keep a record (written or taped) of these sessions; such records are called "think-aloud" (or "verbal") "protocols".
Don't give the subjects any help, but, for each proposed definition they come up with, do ask them why or how they came up with it; i.e., try to elicit what information in the text or from their prior knowledge they used to help them figure out a meaning.
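The word-substitution step described above can be sketched in Python. This is a minimal illustration, not part of the CVA materials: the function names `make_nonce` and `mask_word`, the short suffix list, and the sample sentence are all hypothetical. The sketch only carries an inflectional ending over to a made-up stem; it does not check pronounceability, so the result still needs to be reviewed by hand.

```python
import re

# Inflectional endings to preserve, longest first. This short list is an
# illustrative assumption, not a full account of English morphology.
SUFFIXES = ("ing", "ed", "es", "s")


def make_nonce(real_word: str, nonce_stem: str) -> str:
    """Attach the real word's inflectional ending (if any) to a made-up stem."""
    lowered = real_word.lower()
    for suffix in SUFFIXES:
        # Require at least two characters before the ending.
        if lowered.endswith(suffix) and len(lowered) > len(suffix) + 1:
            return nonce_stem + suffix
    return nonce_stem


def mask_word(passage: str, real_word: str, replacement: str) -> str:
    """Replace whole-word, case-insensitive occurrences of real_word."""
    pattern = re.compile(r"\b" + re.escape(real_word) + r"\b", re.IGNORECASE)
    # A function is used as the replacement so that it is inserted literally.
    return pattern.sub(lambda match: replacement, passage)


if __name__ == "__main__":
    sentence = "She walked home."          # hypothetical passage
    nonce = make_nonce("walked", "florp")  # hypothetical made-up stem
    print(mask_word(sentence, "walked", nonce))
    print(mask_word(sentence, "walked", "_____"))
```

Passing a row of underscores as the replacement gives the blank-word variant of the same passage.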
Represent the sentence containing the hard word in SNePS.
- Use the "standard" SNePS case frames for this project; lists of these are on the CVA website's "Resources" page.
- If necessary, you may also use or adapt "standard" SNePS case frames from the SNePS Case-Frame Dictionary.
- For any new or modified case frame, please give its syntax and semantics (follow the style given in the various case-frame dictionaries above).
Decide what prior knowledge is needed for figuring out a meaning. In general, you will need a "meaning postulate" (i.e., a necessary or sufficient condition) for each important term in the sentence, and various other kinds of facts or rules that provide appropriate background knowledge, world knowledge, commonsense knowledge, domain knowledge, etc.
1. Then represent this prior knowledge in SNePS, following the directions above.
Run the appropriate definition algorithm (for nouns, verbs, or adjectives) on your representations. To do this, create a SNePS "demo" file. This is just a plain text file containing lines of commented SNePS or Lisp code that SNePS reads and executes. Your demo file will:
- begin with the necessary code to load various kinds of information that are necessary for running all CVA demos,
- followed by the background knowledge for your passage,
- followed by your passage,
- and ending with an invocation of the appropriate definition algorithm.
A template for the demo file is available for you to save and adapt.
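The four-part layout listed above can be sketched as a small Python helper that assembles a demo file in the required order. This is only an illustration under stated assumptions: `build_demo`, the section titles, and the placeholder strings are hypothetical, and the use of `;` for comment lines is assumed from Lisp convention. The real setup lines, representations, and algorithm invocation should come from the course template and your own work.

```python
def build_demo(setup: str, background: str, passage: str, invocation: str) -> str:
    """Join the four parts of a demo file in the order the assignment lists."""
    parts = [
        ("Setup: load what every CVA demo needs", setup),
        ("Background (prior) knowledge", background),
        ("Passage containing the unknown word", passage),
        ("Invoke the definition algorithm", invocation),
    ]
    chunks = []
    for title, body in parts:
        # Each section starts with a comment header (assumed ';' syntax).
        chunks.append(f";;; {title}\n{body.strip()}\n")
    return "\n".join(chunks)


if __name__ == "__main__":
    # All four arguments are placeholders, not real SNePS code.
    demo_text = build_demo(
        setup="; <setup lines from the template>",
        background="; <your prior-knowledge representations>",
        passage="; <your representation of the sentence>",
        invocation="; <call to the noun, verb, or adjective algorithm>",
    )
    print(demo_text)
```

Keeping the sections as separate strings makes it easy to revise the prior knowledge or the passage representation independently between runs.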
Modify your representations, prior knowledge, or the algorithm itself (but the latter will require some knowledge of Lisp!) in order to improve the performance of the definition algorithm.
Turn in a research report containing:
1. an abstract, consisting of brief, 1-or-2-sentence summaries of each of the following points (2-6, below)
  - This is the sort of information you might find yourself having to give, extemporaneously, in a job interview, in an informal discussion at a conference/convention, or even in a "real" job when your boss sees you in the hall or even in the mall :-)
  - This should be completely self-contained; i.e., it should be understandable by someone who is not familiar with our CVA project, as well as by someone who does not bother to read the rest of your paper! And your paper should not assume that the reader has read the abstract!
2. brief descriptions of the CVA project and SNePS
3. the role of your task in the overall project, e.g., which passage you're working on, what you're trying to do with it, etc.
4. what you have accomplished, including:
  1. a report on any human protocols you ran
  2. an annotated transcript of your demos
  3. commented SNePS representations of the sentence and prior knowledge.
5. what the immediate next steps in your part of the project are
  - (i.e., what you would have done had you had another week or so to work on it)
6. what longer-term future steps need to be taken
Please prepare all documents using a word processor (preferably LaTeX), and hand in hard copy to me on or before the due date announced in the syllabus.
For inclusion on the CVA website, I would also like online versions of:
1. your complete report, with all appendices (preferably in PDF format, but .doc is OK, too)
2. your demo file (plain ASCII text, not .doc)
3. a transcript of your demo (plain ASCII text, not .doc)
- For further information on how to prepare your report, as well as pointers on grammar, etc., see my webpage "How to Write".
- To see what some other reports look like, browse through the "Progress Reports" section of the CVA website.
- Further hints:
  1. You should create and edit the files you will need by using your favorite text editor and then running the files using SNePS's "demo" command. (For details, see the SNePS User Manual, esp. the relevant section on "Using Auxiliary Files".)
  2. Use Unix's "script" (or Emacs's equivalent) to create transcripts of your interactions. (See "How to Use the UNIX 'script' Program" for details.)