file: cassie-vs-humans-2004-09-09
------------------------------------------------------------------------
     How Cassie Computes (or: Acquires, or: Learns) a Meaning for
	    (or: a Hypothesis about) an Unfamiliar Word

		  -- a work very much in progress --
========================================================================

1.  Cassie is initially supplied with a stock of beliefs; call this her
"prior knowledge" (PK).  Despite the term `knowledge' (which ordinarily
implies that what's known is true), not all of her PK need be true,
since she might well have some false beliefs.  In fact, perhaps "prior
*information*" would be the best (more or less neutral) term.

    In SNePS, there are three basic kinds of PK (though there may be
others): call them "basic propositions", "node-based rules", and
"path-based rules".  (She might also have beliefs about *how to do*
certain things, though so far we have not explored this in our CVA
project.  She might also have "mental images" (e.g., be able to
mentally visualize what she reads).  She also has "subconscious" (or
"tacit") linguistic knowledge--see below.)

    Examples of basic propositions include "Someone is named John",
"Someone is tall", "Someone likes someone (else)", "Some particular
kind of thing belongs to someone", etc. (see below for more examples).
In general, basic propositions are expressed by simple
subject-predicate sentences (usually without proper names---that
someone has a certain name is itself a basic proposition; see below)
and by simple relational sentences---the sorts of sentences represented
in first-order logic by atomic sentences of the form Fx or Rxy, for
instance.  Basic propositions are probably most easily characterized
negatively: they are not rules.

    Node-based rules are primarily conditional propositions of the form
"if P, then Q" and usually involve universally quantified variables
(e.g., "for all x, if P(x), then Q(x)").
They are interpreted by the SNePS Inference Package, which supplies
rules of inference (such as Modus Ponens), allowing Cassie to infer,
from a rule of the form "if P, then Q" and a (typically basic)
proposition of the form P, that she should believe Q.  Path-based rules
enable a generalization of the inheritance feature of semantic
networks, allowing Cassie to infer, e.g., that Fido is an animal if she
believes that Fido is a brachet, that brachets are dogs, and that dogs
are animals.  The difference between the two kinds of rules roughly
corresponds to the difference between "consciously believed" and
"subconsciously believed" rules.  (This is all a vast
oversimplification, but will suffice for now.)

2.  PK can be of a wide variety.  In practice, for our research, we try
to limit the PK to propositions that are necessary for Cassie to
understand the meaning of all words in the co-text of the unknown word
X.  In fact, we often use even less than this, limiting ourselves to
that PK about the co-text that our analysis indicates is necessary for
Cassie to compute the meaning of X (notation: [[X]]).

    Besides basic propositions (usually meaning postulates about the
crucial terms in the co-text, i.e., necessary *or* sufficient
conditions concerning those terms), there is need for rules of a very
special and general sort.  I have not yet found a way to characterize
them.  Here are a few examples:

    (a) IF x is a subclass of z, AND x is a subclass of y,
	AND z has the property unknown, AND y is a subclass of w,
	THEN presumably, z is a subclass of y, and z is a subclass of w.

    (b) IF action A is performed by agent Y on object Z,
	AND action B is performed by agent Y on object Z,
	THEN A and B are similar.

    (c) IF x does a AND a has the property p,
	AND x does b AND b is unknown,
	THEN possibly b also has the property p.
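Rule (c), for instance, can be sketched as a defeasible forward-chaining
rule over a toy belief store.  (This is a Python sketch, not SNePS; the
belief tuples and all example data are illustrative, not part of any
actual implementation.)

```python
# Minimal sketch (not SNePS): rule (c) as a defeasible inference.
# Beliefs are tuples; the example data are illustrative.

beliefs = {
    ("does", "knight", "ride"),       # x does a
    ("property", "ride", "noble"),    # a has the property p
    ("does", "knight", "joust"),      # x does b
    ("unknown", "joust"),             # b is unknown
}

def apply_rule_c(beliefs):
    """IF x does a AND a has property p, AND x does b AND b is unknown,
    THEN possibly b also has property p (a defeasible conclusion)."""
    inferred = set()
    for (_, x, a) in [t for t in beliefs if t[0] == "does"]:
        for (_, a2, p) in [t for t in beliefs if t[0] == "property"]:
            if a2 != a:
                continue
            for (_, x2, b) in [t for t in beliefs if t[0] == "does"]:
                if x2 == x and b != a and ("unknown", b) in beliefs:
                    inferred.add(("possibly-property", b, p))
    return inferred

print(apply_rule_c(beliefs))
# {('possibly-property', 'joust', 'noble')}
```

The conclusion is tagged "possibly-property" rather than "property" to
mark its defeasibility: later evidence could retract it.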
    For now, let's just say that they are fairly abstract, not specific
to any particular domain of discourse (i.e., they are very general),
perhaps abductive in nature, perhaps analogical in nature, and
defeasible.  They are, I believe, essential to CVA.

3.  How does Cassie get this PK?  In practice, we give it to her,
though once she has it, it can be stored ("memorized") and re-used
again and again.  In general, Cassie would acquire her PK from reading,
being told, previous inferencing, etc.; in short, she would learn it in
any of the variety of ways that one learns anything (not excluding some
of it being "innate knowledge").  Cassie's PK is always unique, as is
each human reader's.

    Sometimes, we give Cassie PK that, while not strictly needed
according to any of the (informal) criteria mentioned above, is PK that
human readers have indicated that they have (and use), as shown by our
protocol case studies.  Thus, we feel justified in giving Cassie some
PK, including rules, that human readers seem to use, even if, on the
face of them, they seem unmotivated.

4.  Armed with her PK, Cassie begins to read the text.  When our full
system is implemented, she will read all texts in the manner to be
described below.  Currently, we hand-code the output of this process.

    We input the text to a computational grammar, which outputs a
semantic representation of the text.  Currently, the grammar is
implemented in an ATN (augmented transition network) formalism, though
we are planning to experiment with the LKB computational grammar (which
is more modern and has wider coverage) in the future.  The output
consists of a semantic network in the formalism of the SNePS
knowledge-representation and reasoning system.

    For ease of grammar development, we try to constrain the possible
input sentences to a small set, including those in Sect. 5, below.
(Where these prove insufficient, we extend the set.
However, each extension usually requires a corresponding extension to
the definition algorithms, as will be explained below.)

    The main idea behind this analysis is to take complex sentences and
analyze them into the following "basic" propositions.  The assumption
is that the meaning of a complex sentence should be the (combined)
meanings of the shorter sentences into which it gets analyzed.

5.  Sentences of the form...            Are encoded as this SNePS net(*):
                                            (*) rough notation only
------------------------------------------------------------------------
X is a Y                                (add member (build lex X)
  i.e., NP_indiv is Art+NP_common            class  (build lex Y))

X is a Y                                (add subclass   (build lex X)
  i.e., NP_common is a NP_common             superclass (build lex Y))

X is Y                                  (add object   (build lex X)
  i.e., NP is Adj                            property (build lex Y))
------------------------------------------------------------------------
Note:  If X is an N_proper, then we represent this using:

	(add object #base-node proper-name X)

       and we replace (build lex X) above by:  *base-node
------------------------------------------------------------------------
X is Y's Z                              (add possessor (build lex Y)
  i.e., NP is NP's NP                        rel       (build lex Z)
                                             object    #base-node)

X does Y {with respect to Z}            (add agent X
                                             act (build action Y
                                                        {object Z}))

X stands in relation R to Y             (add object1 X
                                             rel     R
                                             object2 Y)

X causes Y                              (add cause X effect Y)

X is a part of Y                        (add part X whole Y)

An XY                                   (build classmod  X
  (e.g., a toy gun, a small                    classhead Y)
   elephant, a fire hydrant, etc.)

X is (extensionally) the same as Y      (add equiv X equiv Y)

X and Y are synonyms                    (add synonym X synonym Y)

(For more information on SNePS, see "Essential SNePS Readings" at
[http://www.cse.buffalo.edu/~rapaport/snepsrdgs.html].)

    The effect of using "add" instead of "assert" is to have Cassie do
"forward" inference upon reading each sentence.  This models a reader
who thinks about each thing that s/he reads.
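The case frames in the table above can be mimicked with plain data
structures.  (A Python sketch, not actual SNePSUL: the toy `add` below
merely records propositions and does no inference; the example sentences
and encodings are illustrative.)

```python
# Minimal sketch (not SNePS): representing a few of the case frames
# above as assertions in a toy semantic network. Illustrative only.

network = []  # list of proposition nodes, each a dict of (arc -> node)

def add(**arcs):
    """Assert a proposition node with the given arcs (no inference here,
    unlike SNePSUL's add, which also triggers forward inference)."""
    network.append(arcs)
    return arcs

# "A brachet is a dog"          -> subclass / superclass
add(subclass="brachet", superclass="dog")

# "Fido is a brachet"           -> member / class
add(member="Fido", **{"class": "brachet"})   # 'class' is a Python keyword

# "The brachet is white"        -> object / property
add(object="brachet", property="white")

# "The knight picks up the brachet" -> agent / act(action, object)
add(agent="knight", act={"action": "pick up", "object": "brachet"})

for proposition in network:
    print(proposition)
```

The nested dict under the act arc mirrors the nested (build ...) inside
the (add agent ... act ...) case frame.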
6.  If any text sentence matches the antecedent of any PK rule, that
rule will fire (usually; sometimes it has to be "tricked" into
firing--an implementation-dependent phenomenon).  This is the primary
means by which Cassie infers the new information needed to hypothesize
a definition.

7.  At any point, we can ask Cassie to define any N or V; call it X.
(Adjectives (and adverbs) are harder, and we do not yet have a general
adjective-definition algorithm; we can, however, produce definitions of
certain adjectives.  This is an area of current research.)

    If X is not in Cassie's lexicon (because she has never read the
word, not even in the current text), she will respond with "I don't
know".  If X is in her lexicon, then--whether or not X occurs in the
current text (though typically, of course, it will)--Cassie will search
her entire network for a subset of the information required to fill in
the slots of a definition frame.  In general, Cassie will look for
general, basic-level information, though in its absence she will report
(in "possible" slots) information specific to known instances of X.  In
the worst case, Cassie will "algebraically/syntactically" manipulate
the only sentence containing X so that X becomes its subject.

    [Speculation:  A possible algorithm for computing a definition is:
     manipulate the sentence containing X in that way; then replace
     each other word by its definition.  Compare solving an equation
     for one unknown.]

    Repeat the above for subsequent occurrences of X until a stable
definition is reached.

8.  The N algorithm looks for...

    [ need to give a good English description of the current defineNoun
      algorithm ]

9.  The V algorithm looks for...

    [ need to give a good English description of the current defineVerb
      algorithm ]

10.  What a human would have to do in order to simulate Cassie.

    [ how does what humans do differ from what Cassie does? ]

11.  What we would have to teach humans to do.
[ what things that Cassie does automatically and easily would have to be taught to humans? ]
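The slot-filling search described in Sect. 7 can be sketched roughly as
follows.  (A Python sketch only; the two frame slots, the network
encoding, and the example data are assumptions for illustration, not the
actual defineNoun algorithm.)

```python
# Rough sketch of filling a definition frame for an unknown noun X
# from a toy network. Not the actual defineNoun algorithm; the slots
# and encoding are illustrative.

propositions = [
    {"member": "Fido", "class": "brachet"},        # Fido is a brachet
    {"subclass": "brachet", "superclass": "dog"},  # brachets are dogs
    {"object": "Fido", "property": "white"},       # Fido is white
]

def define_noun(x, props):
    frame = {"class inclusions": [], "possible properties": []}
    # General information: superclasses of X fill the class-inclusion slot.
    for p in props:
        if p.get("subclass") == x:
            frame["class inclusions"].append(p["superclass"])
    # In its absence or alongside it, report information specific to
    # known instances of X in a "possible" slot.
    instances = [p["member"] for p in props if p.get("class") == x]
    for p in props:
        if p.get("object") in instances and "property" in p:
            frame["possible properties"].append(p["property"])
    return frame

print(define_noun("brachet", propositions))
# {'class inclusions': ['dog'], 'possible properties': ['white']}
```

The "possible properties" slot is hedged by construction: it records
properties of known instances, which need not hold of all brachets.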