Go up to Top
Go forward to Logic databases

Introduction to Logic Programming

  • Introduction
  • Syntax
  • Facts and queries
  • Facts
  • Queries
  • Terms
  • Implicit quantification of variables
  • Existential queries
  • Universal facts
  • Conjunctive queries
  • Rules
  • Semantics
  • Introduction

    Niklaus Wirth wrote a book, published in 1976, with the title,
    Algorithms + Data Structures = Programs
    
    This equation highlights the separation between algorithms and data structures. An algorithm by itself does not determine a program. As a programmer you must make a decision about the data structure(s) over which the algorithm will operate. A program is an algorithm specialized for a particular data structure.

    Robert Kowalski wrote an article, published in 1979, with the title,

    Algorithm = Logic + Control
    
    This equation highlights the independence of logic and control in determining an algorithm. Logic specifies what is to be solved, while control specifies how it should be solved. Given this separation of specification from computation, there is the potential at least of removing from the programmer the burden of specifying the control component of the equation. This is the ideal of logic programming: purely declarative programming, in which the programmer only specifies the structure of a solution, and leaves the search for that solution to the logic programming system. This ideal has yet to be attained.

    Our commitment to logic brings with it a commitment to meaning. What does a program mean? How do we know a program is correct? These are questions to consider as we study logic programming. For an absurdly ridiculous example of a program without an obvious meaning, have a look at this program.

    Syntax

    Let us now look at how we can express the logic of a program. We will use a subset of first order logic as our language. As with any programming language we have to define both the syntax and the semantics of the language. The syntax of a language defines what are well-formed strings of symbols in the language. The semantics of a language tells us how to interpret the syntactically well-formed strings of symbols. In this section we discuss only the syntax of logic programming languages; their semantics will be covered later.

    A note on terminology

    If you have taken some courses in logic, you may find that the terminology used here differs from what you are accustomed to. I will follow the terminology used in The Art of Prolog, but I will highlight the differences in order to minimize confusion.


    Facts and queries

    In logic programming there are only three types of statements: facts, queries, and rules. Facts and queries will be introduced here, while rules will be discussed later.

    Facts

    A fact expresses a relation which holds of objects. Examples of facts are:
    owns(john,theArtOfProlog).
    likes(sally,pizza).
    
    Let us consider the syntax of the second example. It is composed of a goal, "likes(sally,pizza)" and a period ".", which is a punctuation symbol.

    likes is the name of a relation (also called a predicate), while sally and pizza are names of individuals of which the relation holds. Since the predicate named likes has two arguments (sally and pizza) we call it a binary predicate. In the general case, a predicate taking n arguments is called an n-ary predicate; we also say that it has arity n. A predicate is characterized by both its name and its arity. Suppose we had a three-place relation with the name likes:

    likes(carl,beer,friday).
    
    This (ternary) predicate is distinct from the binary predicate of the same name. When we need to refer unambiguously to a predicate of a specific arity, we will use the notation name/arity. Thus, we can refer to the three-place predicate named likes unambiguously as likes/3, while the two-place predicate introduced earlier is likes/2.

    likes, sally and pizza are atoms. In general, alphanumeric strings which start with a lower case letter are atoms, and a variety of symbol sequences are atoms. For example, + and - are atoms, as are +-, -+, ++. The pair of symbols [] is an atom, but neither [ nor ] alone is an atom. We will introduce other ways of forming atoms as the need arises.

    We will define a logic program, for the time being, to be a finite set of facts.


    A note on terminology

    In logic an atom refers to an atomic formula, one whose truth value is non-decomposable. Thus likes(sally,pizza) is an atomic formula, while likes(sally,pizza) & likes(sally,beer) is not. This is a semantic notion of atomicity.

    In a logic programming context a syntactic notion of atomicity is often used. From a syntactic point of view, likes(sally,pizza) can be decomposed into "likes", "(", "sally", ",", "pizza", and ")". We will call "(", "," and ")" punctuation symbols. "likes", "sally", and "pizza" are called atoms. Thus, atoms are not the only things which are syntactically nondecomposable; you can think of atoms as names of things.


    Queries

    A query asks about logical consequence. For example, the query(posed with respect to a program P)
    owns(john,theArtOfProlog)?
    
    asks whether or not the goal
    owns(john,theArtOfProlog)
    
    is a logical consequence of P.

    In order to determine the logical consequences of a program we need rules of deduction which tell how we may draw conclusions from a program. The first rule of deduction which we will see is a very simple one.

    Identity

    From F deduce F

    Identity tells us that if we know that if F is a fact, then F is a logical consequence of the program. We will see more interesting rules of deduction later.

    Aside

    In general, if G is a goal then "G." is a fact asserting the goal G, while "G?" is a query asking whether or not the goal G is a logical consequence of the given program.

    A goal is interpreted as a fact or a query depending on context; in these notes and in The Art of Prologthe requisite context is supplied by the "." and "?" punctuation. In SICStus Prolog the context is supplied by the current mode of the system. The prompt indicates the current mode. The "|" prompt is used in consult mode, in which goals are interpreted as facts; the "| ?-" prompt is used in query mode, in which goals are interpreted as queries.


    The query

    owns(john,theArtOfProlog)?
    
    is a logical consequence of the example program given above, and repeated here:
    owns(john,theArtOfProlog).
    likes(sally,pizza).
    
    according to the identity rule of deduction. Hence, we may answer "Yes" to the query.

    By way of contrast, the query (if posed with respect to the same program)

    owns(sally,theArtOfProlog)?
    
    should be answered negatively, as it is not a logical consequence of the program.

    It is very important to notice that it might be true in the real world that Sally owns The Art of Prolog. When we pose a query to our logic programming system, however, we are not asking the query about the real world. We are asking the query with respect to a program; this program may be an attempt to axiomatize (part of) the real world. One of the hardest things in logic programming is to come up with the correct axiomatization.

    It is also very important to note that all a logic programming system does it manipulate symbols. The symbols have no meaning to the machine. It is up to you as a programmer to assign a mapping from symbols in your program to objects in a domain of discourse. For example, we may write the above example program as follows:

    o(j,t).
    l(s,p).
    
    This is less readable to a human, but no more or less readable to a machine. In order to be able to interpret answers to queries, you must clearly specify your intended interpretation of the symbols in your program.

    Terms

    The logical term is the only data structure in logic programming. In order to define the syntax of terms, we must first define constants and variables.

    A constant is either an atom or a number. We have seen atoms already. Numbers are are things like "17" and "3.1415926535".

    A variable in logic programming is used to refer to an unspecified individual. This interpretation of what a variable is differs substantially from the interpretation of what a variable is in other programming languages. In an imperative language a variable is thought of as a storage location, into which different values can be put at different times. In (pure) functional programming (think of Scheme without mutation) variables are names which are associated with values once and for all. It is important to keep these three conceptions of what variables are separate.

    The name of a variable must begin with either a capital letter or an underscore "_".

    We can now (recursively) define what a term is:

  • constants are terms
  • variables are terms
  • if t1,...,tn are terms, and f is an atom, then f(t1,...,tn) is a term (called a compound term).
  • Terms are referred to as ground if they contain no variables, and nonground otherwise.


    A note on terminology

    In logic the syntax of well-formed strings of symbols (also called well-formed formulas, or wffs) is built up with a view to how the semantics of wffs is to be defined. Some of the distinctions which are made explicitly in the syntax in logic are left to context in logic programming.

    In logic symbols are partitioned into several sets as follows,

  • constant symbols
  • variable symbols
  • function symbols
  • predicate symbols
  • connectives (e.g. not, and, or, implies, iff)
  • quantifiers (e.g. forall, exists)
  • punctuation (e.g. "(", ")", ",")
  • A term, as defined in logic, refers to an individual in the domain of discourse, and is defined recursively. To distinguish between the logic usage of "term" and the usage of "term" in logic programming, let us call a term as defined in logic an "l-term". We can now define what an l-term is. A constant is an l-term. A variable is an l-term. If f is an n-ary function symbol and t1,...,tn are l-terms, then f(t1,...,tn) is an l-term.

    Well-formed formulas, which have truth values, are also defined recurively. If p is an n-ary predicate symbol and t1,...,tn are l-terms, then p(t1,...,tn) is a well-formed formula. Since the truth value of such a formula is non-decomposable, it is called an atomic formula, or an atom. This is different from the syntactic notion of atomicity used in logic programming.

    If F and G are wff's, then so are

  • not(F)
  • and(F,G)
  • or(F,G)
  • implies(F,G)
  • iff(F,G)
  • forall(F)
  • exists(F)
  • This takes care of the syntax. Note that a distinction is made in the syntax between function and predicate symbols, and between l-terms and formulas. In logic programming l-terms and atomic formulas are distinguished solely by the context in which they appear. Constants and variables are clearly only l-terms (they refer to individuals in the domain of discourse). Compound terms are l-terms if they appear as arguments of other compound terms. Compound terms are atomic formulas if they do not appear as arguments of other terms.

    Let us work through some examples. Consider the goal likes(carl, pizza). The term likes(carl, pizza) is an atomic formula consisting of a predicate whose name is likes and whose arity is 2. The terms carl and pizza are constants, and so must be l-terms.

    A slighly more interesting example is the goal likes(child_of(henry), six_pack_of(beer)). The term likes(child_of(henry), six_pack_of(beer)) is an atomic formula consisting of a predicate whose name is likes and whose arity is 2. The terms child_of(henry) and six_pack_of(beer) are arguments of a term (an atomic formula), and hence are interpreted as l-terms.

    The next two examples highlight the fact that in logic programming no syntactic distinction is made between function and predicate symbols. Context determines the interpretation of terms.

    Consider the goal p(f(0)). The term p(f(0)) does not appear as the argument of another term, so it is an atomic formula consisting of a predicate whose name is p and whose arity is 1. The term f(0) is an argument of a term (an atomic formula), and hence is interpreted as an l-term.

    Consider the goal f(p(0)). The term f(p(0)) does not appear as the argument of another term, so it is an atomic formula consisting of a predicate whose name is f and whose arity is 1. The term p(0) is an argument of a term (an atomic formula), and hence is interpreted as an l-term.

    The last examples drives home the fact that context and not syntactic form determines the difference between a (semantic) predicate and a (semantic) term.

    Consider the goal f(f(0)). The term f(f(0)) does not appear as the argument of another term, so it is an atomic formula consisting of a predicate whose name is f and whose arity is 1. The term f(0) is an argument of a term (an atomic formula), and hence is interpreted as an l-term.


    Implicit quantification of variables

    Variables are implicitly quantified in logic programming statements.

    Existential queries

    Variables in queries are (implicitly) existentially quantified. Thus, in order to answer the query (posed with respect to a program P),
    owns(john,X)?
    
    it must be determined whether there exist an X such that owns(john,X) is a logical consequence of P. In effect, when we are posing a query we are asking for a constructive proof that the query follows from the program.

    It should be clear that identity, our by now lonely rule of deduction, is not sufficient to answer an existential query. We need to expand our repertoire of deduction rules, but first we need a few definitions.

    "A substitution is a finite set (possibly empty) of pairs of the form Xi = ti, where Xi is a variable and ti is a term, and Xi=/= Xj for every i=/= j, and Xi does not occur in tj, for any i and j." [The Art of Prolog, pg. 5]

    "A is an instance of B if there is a substitution Z such that A=BZ." [The Art of Prolog, pg. 5]

    We can now present our second rule of deduction.

    Generalization

    An existential query Q is a logical consequence of any instance of it, QZ, for any substitution Z.

    In other words, if an instance of the query is a logical consequence of the program, then so is the existentially quantified query.

    Returning to our example, the query

    owns(john,X)?
    
    is a logical consequence of the program below
    owns(john,theArtOfProlog).
    likes(sally,pizza).
    
    according to the generalization rule of deduction since "owns(john,theArtOfProlog)" is an instance of the "owns(john,X)" given the substitution {X=theArtOfProlog}. Hence, we may answer "Yes" to the query.

    Universal facts

    Variables in facts are (implicitly) universally quantified. The fact
    owns(john,X).
    
    thus expresses that for any X, owns(john,X) is a deducible. This is our third rule of deduction, which is called instantiation.
    Instantiation

    From a universally quantified statement R, and any substitution Z, deduce the instance RZ.

    As an example, the query owns(john,theArtOfProlog) can be answered affirmatively with repsect to the fact owns(john,X) since the query is an instance of the fact given the substitution {X=theArtOfProlog}.

    Conjunctive queries

    So far we have only seen a single goal G interpreted as a query. We can conjoin several goals and pose the conjunction as a query. A conjoined query G1, G2, ..., Gn?, posed with respect to a program P, is a logical consequence of P if and only if the conjunction of the Gis is a logical consequence of P.

    Variables in conjunctive queries are, as expected, (implicitly) existentially quantified, and the scope of the quantification is the whole conjunction. This means that a variable which is shared by several of the conjuncts must refer to the same (unspecified) individual. For example the query (posed with respect to a program P),

    p(X), q(X)?
    
    asks whether there exists an X such that the relation p and the relation q are both deducible (from P) of the same individual X. In other words, the conjoined query above is a logical consequence of P if and only if there is some individual X such that both p(X) and q(X) are logical consequences of P. The sharing of variables between goals is a way of constraining possible solutions.

    A note on terminology

    In joining goals to form a collection of goals the comma "," denotes the logical conjunction "and". This use of the comma is distinct from the comma as a separator of arguments in a compound term.


    Rules

    Finally we see the third type of statement of logic programming, the rule. An example of a rule is,
    dog(X) <- mammal(X), furry(X), barks(X).
    
    This rule is intended to assert that something is a dog if it is a furry mammal which barks. The general form of a rule is,
    A <- B1, B2, ..., Bn.
    A and all of the Bis are goals. A is called the head of the clause, and the conjunction of the Bis is the body of the clause.

    A note on terminology

    The arrow (<-) denotes logical implication, though the typical p -> q appears "backwards" as q <- p.


    A fact is just the special case of a rule with n=0. Thus, as we might expect, variables in rules are (implicitly) universally quantified. A rule such as,

    daughter(X,Y) <- father(Y,X), female(X).
    
    is understood to mean that for all X and for all Y the daughter relation holds between X and Y if both the father relation holds of Y and X and the female relation holds of X. This illustrates the declarative (or logical) reading of a rule.

    One can also read a rule in a procedural manner; in this case the above rule is interpreted as saying that in order to prove that two individuals X and Y are in the daughter relation it is sufficient to prove that Y and X are in the father relation and that X is in the female relation. Under the procedural reading a rule tells us that to answer the query A? we can answer the conjunctive query B1, B2, ..., Bn.


    A note on quantification

    We have said that variables appearing in rules are (implicitly) universally quantified. Consider an example of a rule in which some variables appear only in the rule's body:

    grandparent(X,Z) <- parent(X,Y), parent(Y,Z).
    
    This rule can be read as saying that for all X's, Y's, and Z's, if X is Y's parent and Y is Z's parent, then X is Z's grandparent. Alternatively, we may read this rule as saying that for all X's and Z's, X is Z's grandparent if there is some Y such that Y is both X's child and Y's parent.

    You might now ask where the existential quantifier came from. Recall that For all(X) r(X) is equivalent to Not There exists(X) Not r(X), and that p -> q is equivalent to Not p | q. In other words,

    
    For all X, For all Y, For all Z:
        (grandparent(X,Z) <- (parent(X,Y) & parent(Y,Z)))
    
    For all X, For all Y, For all Z:
        (grandparent(X,Z) | Not(parent(X,Y) & parent(Y,Z)))
    
    For all X, For all Z:
        (grandparent(X,Z) | For all Y: (Not(parent(X,Y) & parent(Y,Z))))
    
    For all X, For all Z:
        (grandparent(X,Z) | Not There exists Y: Not (Not(parent(X,Y) & parent(Y,Z))))
    
    For all X, For all Z:
        (grandparent(X,Z) | Not There exists Y: (parent(X,Y) & parent(Y,Z)))
    
    For all X, For all Z:
        (grandparent(X,Z) <- There exists Y: (parent(X,Y) & parent(Y,Z)))
    
    


    Where does this leave us with respect to our rules of deduction? We need to introduce a fourth deduction rule, called modus ponens. This rule is probably quite familiar to you. It is the rule which sanctions the deduction from "All humans are mortal" and "Socrates is human" to the conclusion that "Socrates is mortal".

    Universal Modus Ponens [The Art of Prolog, pg. 10]

    The law of universal modus ponens says that from the rule

    R = ( A <-B1, B2, ..., Bn.)
    and the facts
    B'1.
    B'2.
    .
    .
    .
    B'n.
    A' can be deduced if A' <- B'1, B'2, ..., B'n is an instance of R.

    It turns out that instantiation and identity are special cases of universal modus ponens.

    We can now generalize our definition of what a logic program is, from a finite set of facts to a finite set of rules. We also define a procedure to be a finite set of rules all of which have the same predicate in the head.

    Semantics

    Let us now give a simple characterization of the meaning of a logic program. We have considered several examples of queries. In order to answer a query Q with repect to a program P we need to decide whether or not Q is a logical consequence of the program P. Let us define logical consequence more formally than we have up to this point.

    Logical consequence [The Art of Prolog, pg. 10]

    An existentially quantified goal G is a logical consequence of a program P if there is a clause in P with a ground instance A <-B1, ..., Bn where n >= 0, such that B1, ..., Bn are logical consequences of P, and A is an instance of G.

    We now define the meaning of a logic program P simply as the set of all ground goals which are logical consequences of P. Let us call this set M(P). We can calculate M(P) by repeatedly applying universal modus ponens:

    M(P) is also called the declarative semantics of P.

    The Art of Prolog presents the following abstract interpreter for computing the answers to queries. This algorithm specifies a procedural semantics for logic programs.

    Abstract interpreter [The Art of Prolog, pg. 12]
    INPUT
    Goal G and program P
    OUTPUT
    yes if G is a logical consequence of P, no otherwise
    ALGORITHM
    initialize
    the resolvent to G
    while
    the resolvent is not empty
    choose
    a goal A from the resolvent
    choose
    a ground instance of a rule A' <-B1, ..., Bn from P such that A and A' are identical
    (if not possible, exit the while loop)
    replace
    A by B1, ..., Bn in the resolvent
    output
    yes if the resolvent is empty, no otherwise

    Notice that the following three choices are left unspecified in the algorithm:

    1. which goal to choose from the resolvent
    2. which rule to choose from the program
    3. which ground instance of the chosen rule to use
    As we will see later, it is possible to compute the proper instance of a rule to use, so the third decision will disappear. The first two decisions remain, however. We will explore the consequences of the choices that Prolog makes with respect these remaining decisions later in the course. It turns out that given Prolog's choices, the procedural and declarative semantics of Prolog programs do not always coincide.

    The declarative semantics of a logic program P characterizes the meaning of P in a formal manner. M(P) is, if you will, what P means to the machine. M(P) is neither right nor wrong.

    If your logic program P "has bugs", then there is a mismatch between what you intended the program to mean (let us call this the intended meaning of P, IM(P)) and the actual meaning of P, M(P). A program P is said to be correct with respect to IM(P) if M(P) is a subset of IM(P). P is said to be complete with repsect to IM(P) if IM(P) is a subset of M(P). Ideally your programs should be both correct and compelete: IM(P) = M(P).

    In order to determine whether your program is correct and complete you need to specify its intended meaning. You must specify the intended interpretation of all the symbols in your program. You must also specify the relation scheme for your predicates. For example, I can define part of the meaning of the plus binary relation as follows.

    p(1,2,1).        p(2,3,1).        p(3,4,1).
    p(1,3,2).        p(2,4,2).        p(3,5,2).
    p(1,4,3).        p(2,5,3).        p(3,6,3).
    
    I should specify that p/3 is a predicate which (partially) defines the plus relation, and that p(A,B,C) is intended to hold when B is the sum of A and C, where A, B, and C are numbers.

    It might seem more natural to define,

    plus(1,1,2).        plus(2,1,3).        plus(3,1,4).
    plus(1,2,3).        plus(2,2,4).        plus(3,2,5).
    plus(1,3,4).        plus(2,3,5).        plus(3,3,6).
    
    where the predicate plus/3 (partially) defines the plus relation, and where plus(X,Y,SUM) is intended to hold when SUM is the sum of X and Y (where X, Y, and SUM are numbers). The point is that you as programmer are free to decide how to define your predicates. With this freedom comes the responsibility to document your intended interpretation.
    horsch@cs.ubc.ca

    Go up to Top
    Go forward to Logic databases