PROGRAMMING PROJECT

Contextual Vocabulary Acquisition

Last Update: 6 April 2004

Note: NEW or UPDATED material is highlighted


In this project, you will use the SNePS knowledge-representation and reasoning system to represent the information in a text passage that contains an "unknown" word, together with the "prior knowledge" needed to help figure out a meaning for that word from context. You will then run a definition algorithm on your representation to see what definition it computes, making any necessary changes to your representation or the algorithm in order to improve its performance. You will write up your work in a conference-style paper, to be accompanied by copies of the relevant computer files and annotated demos. Your grade will be a function of the quality of both your work and your writing. If warranted, the final report and accompanying computer files will be placed on the CVA website's "Papers, progress reports, and related documents" page.

More precisely, please do the following:

  1. If you are not familiar with SNePS, please do the SNePS Tutorial.

  2. Familiarize yourself with the CVA website. In particular, read the following documents that are available on that website:

    1. Rapaport, William J., & Ehrlich, Karen (2000), "A Computational Theory of Vocabulary Acquisition", in Lucja M. Iwanska & Stuart C. Shapiro (eds.), Natural Language Processing and Knowledge Representation: Language for Knowledge and Knowledge for Language (Menlo Park, CA/Cambridge, MA: AAAI Press/MIT Press): 347-375.

    2. Rapaport, William J., & Kibby, Michael W. (2002), "ROLE: Contextual Vocabulary Acquisition: From Algorithm to Curriculum".

  3. Choose a sentence containing an "unknown" (or otherwise "hard") word. For ideas, consider:

    Please get my approval of your choice before proceeding to the next step. Please do this no later than Fri., Feb. 13, if possible.

  4. Conduct an informal experiment with friends, asking them to read the passage and to figure out the meaning of the word. If it is a word that you think they already know the meaning of, substitute a made-up word or replace the word with a blank. If you do this, please also do two things:

    1. Make sure that the made-up word is morphologically similar to the real word (e.g., change a verb in the past tense to a made-up word that looks like a past-tense verb) and pronounceable in English. For help on doing this, please see me.

    2. Emphasize to your experimental subjects that they are not trying to guess what real word the made-up word (or the blank word) is. Rather, they are trying to figure out what it might mean, i.e., to come up with a dictionary-like entry for it.

    Keep a record (written or taped) of these "think-aloud" (or "verbal") "protocols" (as they are called).

    Don't give the subjects any help, but, for each proposed definition they come up with, do ask them why or how they came up with it; i.e., try to elicit what information in the text or from their prior knowledge they used to help them figure out a meaning.

  5. Represent the sentence containing the hard word in SNePS.

  6. Decide what prior knowledge is needed for figuring out a meaning. In general, you will need a "meaning postulate" (i.e., a necessary or sufficient condition) for each important term in the sentence, and various other kinds of facts or rules that provide appropriate background knowledge, world knowledge, commonsense knowledge, domain knowledge, etc.

    1. Then represent this prior knowledge in SNePS, following the directions above.

  7. Run the appropriate definition algorithm (for nouns, verbs, or adjectives) on your representations. To do this, create a SNePS "demo" file. This is just a plain text file containing lines of commented SNePS or Lisp code that SNePS reads and executes. Your demo file will:

    A template for the demo file is available; you can save and adapt it.

  8. Modify your representations, prior knowledge, or the algorithm itself (but the latter will require some knowledge of Lisp!) in order to improve the performance of the definition algorithm.

  9. Turn in a research report containing:

    1. an abstract, consisting of brief, 1-or-2-sentence summaries of each of the following points (items 2-6, below)

      • This is the sort of information you might find yourself having to give, extemporaneously, in a job interview, in an informal discussion at a conference/convention, or even in a "real" job when your boss sees you in the hall or even in the mall :-)

      • This should be completely self-contained; i.e., it should be understandable by someone who is not familiar with our CVA project, as well as by someone who does not bother to read the rest of your paper! And your paper should not assume that the reader has read the abstract!

    2. brief descriptions of the CVA project and SNePS

    3. the role of your task in the overall project, e.g., which passage you're working on, what you're trying to do with it, etc.

    4. what you have accomplished, including:

      1. a report on any human protocols you ran
      2. an annotated transcript of your demos
      3. commented SNePS representations of the sentence and prior knowledge.

    5. what the immediate next steps in your part of the project are

      • (i.e., what you would have done had you had another week or so to work on it)

    6. what longer-term future steps need to be taken

  10. Please prepare all documents using a word processor (preferably LaTeX), and hand in hard copy to me on or before the due date announced in the syllabus.

  11. NEW: For inclusion on the CVA website, I would also like online versions of:

    1. your complete report, with all appendices (preferably in PDF format, but .doc is OK, too)
    2. your demo file (plain ASCII text, not .doc)
    3. a transcript of your demo (plain ASCII text, not .doc)
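
To make steps 5 through 7 above more concrete, here is a minimal sketch of the general idea: assert the passage as propositions, add prior knowledge (class hierarchies plus a meaning postulate written as a rule), infer what follows, and then collect what is known about the unknown word. This is plain Python, not SNePS, and it is not the CVA definition algorithm. The names `Prop`, `forward_chain`, `inherit_membership`, `hounds_hunt`, and `define_noun`, as well as the example sentence and facts, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prop:
    """One asserted proposition: a relation name plus its arguments."""
    relation: str   # "member", "subclass", or "property" in this toy example
    args: tuple


def inherit_membership(known):
    """Prior-knowledge rule: members of a class are members of its superclasses."""
    subclass = {p.args for p in known if p.relation == "subclass"}
    member = {p.args for p in known if p.relation == "member"}
    return {Prop("member", (x, sup))
            for (x, cls) in member
            for (sub, sup) in subclass
            if cls == sub}


def hounds_hunt(known):
    """Toy meaning postulate (a necessary condition): if x is a hound, x hunts."""
    return {Prop("property", (p.args[0], "hunts"))
            for p in known
            if p.relation == "member" and p.args[1] == "hound"}


def forward_chain(facts, rules):
    """Apply every rule repeatedly until no new proposition is derived."""
    known = set(facts)
    while True:
        derived = set().union(*(rule(known) for rule in rules)) - known
        if not derived:
            return known
        known |= derived


def define_noun(word, known):
    """Collect the classes and properties of every individual that is a `word`."""
    instances = {p.args[0] for p in known
                 if p.relation == "member" and p.args[1] == word}
    classes = sorted({p.args[1] for p in known
                      if p.relation == "member"
                      and p.args[0] in instances and p.args[1] != word})
    properties = sorted({p.args[1] for p in known
                         if p.relation == "property" and p.args[0] in instances})
    return {"word": word, "classes": classes, "properties": properties}


# Hypothetical passage: "The brachet, a small hound, ..." with b1 as the individual.
passage = [
    Prop("member", ("b1", "brachet")),
    Prop("member", ("b1", "hound")),
    Prop("property", ("b1", "small")),
]
# Hypothetical prior knowledge: hounds are dogs, and dogs are animals.
prior_knowledge = [
    Prop("subclass", ("hound", "dog")),
    Prop("subclass", ("dog", "animal")),
]
rules = [inherit_membership, hounds_hunt]

network = forward_chain(passage + prior_knowledge, rules)
definition = define_noun("brachet", network)
print(definition)
```

In this sketch, the "definition" contains only what the passage and the prior knowledge jointly support. Removing a rule or a fact changes the result, which mirrors the kind of adjustment step 8 asks for.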




Copyright © 2003-2004 by William J. Rapaport (rapaport@cse.buffalo.edu)
file: 740/S04/cvaproject-2004-06-06.html