0% found this document useful (0 votes)
4 views49 pages

Using Predicate Logic

This chapter discusses the representation of facts using predicate logic, emphasizing its advantages over propositional logic for deriving new knowledge through mathematical deduction. It highlights the challenges of translating English sentences into logical statements, including ambiguity and the need for additional information to reason effectively. The chapter also explores the use of instance and isa relationships in representing knowledge and the complexities involved in logical reasoning within artificial intelligence.

Uploaded by

riarathore.6767
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
0% found this document useful (0 votes)
4 views49 pages

Using Predicate Logic

This chapter discusses the representation of facts using predicate logic, emphasizing its advantages over propositional logic for deriving new knowledge through mathematical deduction. It highlights the challenges of translating English sentences into logical statements, including ambiguity and the need for additional information to reason effectively. The chapter also explores the use of instance and isa relationships in representing knowledge and the complexities involved in logical reasoning within artificial intelligence.

Uploaded by

riarathore.6767
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF or read online on Scribd
CHAPTER 5 USING PREDICATE LOGIC Nature cares nothing for logic, our human logic: she has her own, whick we do not recognive and do not acknowledge until we are crushed under tes wticel ~—Ivan Turgenev (1818-1883), Russian novelist and playwright In this chapter, we begin exploring one particular way of representing facts — the language of logic. Other representational formalisms are discussed in later chapters, The logical formalism is appealing because it immediately suggests a powerful way of deriving new knowledge from old — mathematical deduction. In this formalism, we can conclude that a new statement is true by proving that it follows from the statements that are already known. Thus the idea of a proof, as developed in mathematics as a rigorous way of demonstrating the truth of an already believed proposition, can be exteaded to include deduction as a way of deriving answers to questions and solutions to problems. One of the early domains in which AI techniques were explored was mechanical theorem proving, by which was meant proving statements in various areas of mathematics. For example, the Logic Theorist (Newell eral, {963} proved theorems from the first chapter of Whitehead and Russell's Principia Mathematica [1950]. Another theorem prover {Gelernter et al., 1963] proved theorems in geometry. Mathematical theorem proving is still an active area of Al research. (See, for example, Wos et al. {1984].) But, as we show in this chapter, the usefulness of some mathematical techniques extends well beyond the traditional scope of mathematics. It turns out that mathematics is no different from any other complex intellectual endeavor in requiring both reliable deductive mechanisms and a mass of heuristic knowledge to control what would otherwise be a completely intractable search problem, At this point, readers who are unfamiliar with propositional and predicate logic may want to consult a good introductory logic text before reading the rest of this chapter. Readers who want a more complete and formal presentation of the material in this chapter should consult Chang and Lee [1973]. Throughout the chapter, we use the following standard logic symbols: “>” (material implication), “+” (not), “\" (or), “\” (and), “0” (for all), and “3” (there exists) Using Predicate Logic 99 5.1 REPRESENTING SIMPLE FACTS IN LOGIC Let's first explore the use of propositional logic as a way of representing the sort of world knowledge that an Al system might need. Propositional logic is appealing because it is simple to deal with and a decision procedure for it exists. We can easily represent real-world facts as logical propositions written as well-formed formulas (voff’) in propositional logie, as shown in Fig. 5.1. Using these propositions, we could, for example, conclude from the fact that itis raining the fact that itis not sunny, But very quickly we run up against the limitations of. propositional logic. Suppose we Want to represent the obvious fact stated by the classical sentence {tis raining. RAINING tis sunny, ‘SUNNY itis windy, winoy Utitis raining, then itis not sunny. RAINING +7 SUNNY Fig. 5.1 Some Simple Facts in Prepositional Logic ‘Socrates is 4 man, We could write: SOCRATESMAN But if we also wanted to represent Plato is a man. ‘we would have 10 write something such as; PLATOMAN which would bea totally separate assertion, and we would not be able to draw any conclusions about similarities between Socrates and Plato. It would be much better to represent these facts as: MAN(SOCRATES) MAN(PLATO) since now the structure of the representation reflects the structure of the knowledge itself. But 10 do that, we need to be able to use predicates applied to arguments. We are in even more difficulty if we try to represent the equally classic sentence . All men are mortal. ‘We could represent this as: MORTALMAN 100 Artificial Intelligence But that fails to capture the relationship between any individual being a man and that individual being a mortai, To do that, we really need variubles and quantification unless we are willing to write separate statements about the mortality of every known man. So we appear to be forced to move to first-order predicate logic (or just predicate logic, since we do not uss higher order theories in this chapter) as a way of representing knowledge because it permits representations of things that cannot reasonably be represented in prepositional logic. Ia predicate logic, we can represent real-world facts as statements written as wits. But a major motivation for choosing to use logic at all is that if we use logical statements as a way of representing knowledge. then we have available a good way of reasoning with that knowledge. Determining the validity of a proposition in propositional logic is straightforward, although it may be computationally hard, So before we adopt predicate logic as a good medium for representing knowledge, we need to ask whether it also provides a good way of reasoning with the knowledge. AL first glance, the answer is yes. It provides a way of deducing new statements from old ones, Unfortunately, however, unlike propositional logic, it does not possess a decision procedure, even an exponential one. There do exist procedures that will find a proof of a proposed theorem if indeed itis a theorem. But these procedures are not guaranteed to halt if the proposed statement is not a theorem, In other words, although first-order predicate logic is not decidable. ‘it is semidecidable. A simple such procedure is to use the rules of inference to generate theorem’s from the axioms in some orderly fashion, testing each to see if itis the one for which a proof is sought. This method is not particutarly efficient, however, and we will want to try to find a better one. Although negative results, such as the fact that there can exist no decision procedure for predicate logic, generally have little direct effect on a science such as Al, which secks positive methods for doing things, this. particular negative result is helpfut since it tells us that in our search for an efficient proof procedure, we should he content if we find one that will prove theorems, even if it is not guaranteed te halt if given a nontheorem. And the fact that there cannot exist a decision procedure that halts on all possihle inputs does not ‘mean that there cannot exist one that will halt on almost all che inputs it would see in the process of trying to solve real problems. So despite the theoretical undecidability of predicate logic. it can still serve as a useful Way of representing and manipulating some of the kinds of knowledge that an AI system might need Let’s now explore the use of predicate logic as a way of representing knowledge by looking at a specific example. Consider the following set of sentences: Marcus was a man, Marcus was a Pompeian. All Pompeians were Romans. Caesar was a ruler. All Romans were either loyal to Caesar or hated him, Everyone is loyal to someone. People only try t0 assassinate rulers they are not loyal to. Marcus tried to assassinate Caesar. The facts described by these sentences can be represented as a set of wft’s in predicate logic as follows 1. Marcus was a man. PA ewae ‘man(Marcus) This representation captures the critica) fact of Marcus being & man. It fails to capture some of the information in the English sentence, namely the notion of past tense, Whether this omission is acceptable or not depends on the use to which we intend to put the knowledge, For this simple exampfe, it will be all right. 6. Using Predicate Logic 101 ne Marcus was a Pompeian. Pompeian(Marcus) All Pompeians were Romans. Wax : Pompeian(x) > Roman(x) Caesar was a ruler. rnuler{Caesar) Here we ignore the fact that proper names are often not references to unique individuals, since many people share the same name. Sometimes deciding which of several people of the same name is being referred to in a particular statement may require a fair amount of knowledge and reasoning. All Romans were either loyal to Caesar or hated him Wx; Roman(x) ~ loyalto(x, Caesar) \/ hate(x, Caesar) In English, the word “or” sometimes means the logical inclusive-or and sometimes means the logical exclusive-oe (KOR). Here we have used the inclusive interpretation, Some people will argue, however, that this English sentence is really stating an,exclusive-or. To express that, we would have to write: Vx : Romman(x) > [loyal to(x, Caesar) \/ hate(x, Caesar)) \ ~Wloyalto(x, Caesar) \ hate(x, Caesar))] Everyone is loyal to someone, Wx: y: loyalto(xy) ‘A major problem that arises when trying to convert English sentences into logical statements is the scope of quantifiers. Does this sentence say, as we have assumed in writing the logical formula above, that for each person there exists someone to whom he or she is loyal, possibly a different someone for everyone? Or does it say that there exists someone to whom everyone is loyal (which would be written as Jy : Vx : loyalto(x,y))? Often only one of the two interpretations seems likely, so people tend to favor it. People only try to assassinate rulers they are not loyal to. Var: Vy: personte) A rulenty) A tryassassinate(s,y) ~» = loyalto(s.y) This sentence, too, is ambiguous. Does it mean that the only rulers that people try to assassinate are those to whom they are not loyal (the interpretation used here), or does it mean that the only thing people try to do is to assassinate rulers to whom they are not loyal” In representing this sentence the way we did, we have chosen to write “try to assassinate” as a single predicate. This gives a fairly simple representation with which we can reason about trying to assassinate. But using this representation, the connections between trying to assassinate and trying (0 do other things and between trying to assassinate and actually assassinating could not be made easily. If such connections were necessary, we would need to choose a different representation, Marcus tried to assassinate Caesar. tryassassinate (Marcus, Caesar) 102 Artificiol Intelligence From this brief attempt to convert English sentences into logical statements, it should be clear how difficult the task is. For a good description of many issues involved in this pracess, see Reichenbach 11947}. : Now suppose that we want to use these statements to answer the question Was Marcus loyal 0 Caesar? It seems that using 7 and 8, we should be able to prove that Marcus was not loyal to Caesar (again ignoring the distinction between past and present tense). Now let’s try to produce 2 format proof, reasoning backward from the desired goal: wloyaltotMarcus, Caesar) In order to prove the goal, we need to use the rules of inference to transform it into another goal (or possibly a set of goals) that can in turn be transformed, and so on, until there are no unsatisfied goals remaining. This process may require the search of an AND-OR graph (as described in Section 3.4) when there are alternative ways of satisfying individual goals. Here, for simplicity, we show only a single path. Figure 5.2 shows an attempt to produce a proof of the goal by reducing the set of necessary but as yet unalzained goals to the empty set. The attempt fails, however, since there is no way to satisty the goal person (Murcus) with the statements we have available. The problem is that, although we know that Marcus was a man, we do not have any way to conclude from that that Marcus was a person. We need to add the representation of another fact to our system, namely: + loyalto{ Marcus, Caesar) T (7, substitution) person(Marcus)/ ruler( Caesar trassassinata (Marcus, Caesar) (4) person{ Marcus) tryassassinate(Marcus, Caesar} T (8) person{Marcus) Fig. 5.2 An Attempt to Prove ~ioyalto(Marcus,Caesar) All men are people. Ws man(x) > person(x) Now we can satisfy the last goal and produce a proof that Marcus was not loyal to Caesar. From this simple example, we see that three important issues must be addressed in the process of converting English sentences into Jogical statements and then using those statements to deduce new ones: ‘© Many English sentences are ambiguous (for example, 5, 6, and 7 above). Choosing the correct interpretation may be difficult. © There is often a choice of how to represent the knowledge (as discussed in connection with 1, and 7 above). Simple representations are desirable, but they may preclude certain kinds of reasoning. The expedient representation for @ particular set of sentences depends on the use to which the knowledge contained in the sentences will be put. Using Predicate Logic 103 enn ‘+ Even in very simple situations, a set of sentences is unlikely to contain all the information necessary to reason about the topic at hand. In order to be able to use a set of statements effectively. it is usually necessary to have access to another set of statements that represent facts that people consider too obvious to mention. We discuss this issue further in Section 10.3. An additional problem arises in situations where we do not know in advance which statements to deduce. In the example just presented, the object was to answer the question “Was Marcus loyal 0 Caesar?” How would a program decide whether it should try to prove loyalto(Marcus. Caesar) “loyalto(Marcus, Caesar) ‘There are several things it could do. It could abandon the strategy we have outlined of reasoning backward from a proposed truth to the axioms and instead try to reason forward and see which answer it gets to. The problem with this approach is that, in general, the branching factor going forward from the axioms is so great that it would probably not get to either answer in any reasonable amount of time. A second thing it could do is use some sort of heuristic rules for deciding which answer is more likely and then try to prove that one first If it fails to find a proof after some reasonable amount of effort, it can iry the other answer. This notion of limited effort is important, since any proof procedure we use may not halt if given a nontheorem, Another thing it could do is simply try to prove both answers simultaneously and stop when one effort is successful. Even here, however, if there is not enough information available to answer the question with certainty, the program may never halt. Yet a fourth strategy is to try both to prove one answer and to disprove it, and to use information gained in one of the processes to guide the other. 5.2 REPRESENTING INSTANCE AND ISA RELATIONSHIPS. In Chapter 4, we discussed the specific attributes instance and isa and described the important role they play in a particularly useful form of reasoning, property inheritance. But if we look back at the way we just represented our knowledge about Marcus and Caesar, we do not appear to have used these attributes at all. We certainly have not used predicates with those names. Why not? The answer is that although we have not used the predicates instance and ise explicitly, we have captured the relationships they are used to express, namely cfass membership and class inclusion. Figure 5.3 shows the first five sentences of the last section represented in logic in three different ways. The first part of the figure contains the representations we have already discussed. In these representations, class membership is represented with unary predicates (such as Roman), each of which corresponds to a class, Asserting that P(.) is true is equivalent to asserting that x is an instance (or element) of P. The second part of the figure contains representations that use the instance predicate explicitly. The predicate instance is.a binary one, whose first argument is an object and whose second argument is a class to which the object belongs. But these representations do not use an explicit isa predicate. Instead, subclass relationships, such as that between Pompeians and Romans, ate described as showa in sentence 3. The implication rule there states that if an object is an instance of the subclass Pompeian then it is an instance of the superclass Roman, Note that this rule is equivalent to the standard set-theoretic definition of the subclass-superclass relationship. The third part contains representations that use both the instance and isa predicates explicitly. The use of the isa predicate simplifies the representation of sentence 3, but it requires that one additional axiom (shown here as number 6) be provided. This additional axiom describes how an instance relation and an isa relation can be combined to derive a now instance relation. This one additional axiom is general, though, and does not need to be provided separately for additional isa relations, 104 Artificial intelligence eae ‘mar(Marcus) Pompeian(Marcus) ‘vx : Pompeian(x) > Foman(x) ruler(Caesan) ‘vx: Romarkx) + loyaltolx, Caesar) \/ hatetx, Caesar) instance{Marcus, man) instance(Marcus, Pompefan) ‘yx: instance(x, Pompeian) -> instencetx, Rorran instance{ Caesar, ruler) ‘Wx: instance(x, Roman} ~> loyalio(x, Caesar) \y hate(x, Caesar) instance(Marcus, man) instance(Marcus, Pompetan) isa(Pompeian, Roman) instance Caesar, ru) ‘yx: instance(x, Roman) -> loyalto(x. Caesar) \/ hate(x, Caesar x: Vy: Vz: instances, y) isaly, 2) instance(x, 2) Fig. 5.3 Three Ways of Representing Class Membership OO RON+= OheNs FeONG These examples illustrate two points. The first is fairly specific, It is that, although class and superclass memberships are important facts that need to be represented, those memberships need not be represented with predicates labeled instance and isa. In fact, in a logical framework it is usually unwieldy to do that, and instead unary predicates corresponding to the classes are often used. ‘The second point is more general. There are usually several different ways of representing a given fact within a particular representational framework, be it logic or anything else. ‘The choice depends partly on which deductions need to be supported most efficiently and partly on taste. The only important thing is that within a particular knowledge base consistency of representation is critical, Since any particular inference rule is designed to work on one particular form of, representation, it is necessary that all the knowledge to which that rule is intended to apply be in the form that the rule demands. Many errors in the reasoning performed by knowledge-based programs are the result of inconsistent representation decisions. The moral is simply to be careful. There is one additional point that needs to be made here on the subject of the use of isa hierarchies in logic based systems, The reason that these hierarchies are so important is not that they permit the inference of superclass membership. Itis that by permitting the inference of superclass membership, they permit the inference of other properties associated with membership in that superclass. So, for example, in our sample knowledge base it is important to be able to conclude that Marcus is a Roman because we have some relevant knowledge about Romans, namely that they either hate Caesar or are loyal to him. But recall that in the baseball example of Chapter 4, we were able 10 associate knowledge with superclasses that could then be overridden by more specific knowledge associated either with individual instances or with subclasses. In other words, we recorded default values that could be accessed whenever necessary. For example, there Was a height associated with adult males and a different height associated with baseball players. Our procedure for manipulating the isa hierarchy guaranteed that we always found the correct (i.e, most specific) value for any attribute. Unfortunately, reproducing this result in logic is difficult. ‘Suppose, for example, that, in addition to the facts we already have, we add the following.' Pompeian(Paulus) = [levalto(Paulus, Caesar) \y hate(Paulus,Caesar)| "For convenience, we now return to our original notation using unary predicates to denote class relations. Using Predicate Logic 105 In other words, suppose we want to make Paulus an exception to the general rule about Romans and their feelings toward Ceesar, Unfortunately, we cannot simply add these facts to our existing knowledge base the way We could just add new nodes into a semantic net, The difficulty is that if the old assertions are left unchanged, then the addition of the new assertions makes the knowledge base inconsistent. In order to restore consistency, it is necessary to modily the original assertion to which an exception is being made. So our original sentence 5 must become: Vx : Roman(x) Aneq(x,Paulus) — doyalto(x, Caesar) \y hate(x, Caesar) in this framework, every exception to a general rule” must be stated twice, once in a particular statement ‘and once in an exception list that forms part of the general rule. This makes the use of general nules in this framework less convenient and less efficient when there are exceptions than is the use of general rules in a semantic net A further problem arises when information is incomplete and itis not possible to prove that no exceptions apply in a particular instance. But we defer consideration of this problem until Chapter 7. 5.3 COMPUTABLE FUNCTIONS AND PREDICATES In the example we explored in the last section, all the simple facts were expressed as combinations of individual predicates, such as: sryassussinate(Marcus,Caesar) This is fine if the number of facts is not very large or if the facts themselves are sufficiently unstructured that there is litle alternative. But suppose we want to express simple facts, such as the following greater-than and less-than relationships: att,0) WO) at(2,1) 12) 2t(3.2) 12,3) Clearly we do not want to have to write out the representation of each of these facts individually. For one thing, there are infinitely many of them. But even if we only consider the finite number of them that can be represented, say, using a single machine word per number, it would be extremely inefficient to store explicitly a large sct of statements when we could, instead, so easily compute cach one as we need it, Thus it becomes useful to augment our representation by these computable predicates. Whatever proof procedure we use, when it comes upon one of these predicates, instead of searching for it explicitly in the database or attempting {o deduce it by further reasoning, we can simply invoke a procedure, which we will specify in addition to our regular rules, that will evaluate it and return true or false. Its often also useful to have computable functions as well as computable predicates. Thus we might want to be able to evaluate the truth of (2 + 3,1) To do so requires that we first compute the value of the plus function given the arguments 2 and 3, and then send the arguments 5 and 1 to gi. The next example shows how these ideas of computable functions and predicates can be useful. It also makes use of the notion of equality and allows equal objects to be substituted for each other whenever it appears helpful to do so during a proof. 106 Artificial inteligence Consider the following set of facts, again involving Marcus: Marcus was a man. ‘man(Marcus) Again we ignore the issue of tense. Marcus was a Pompeian. Pompeian(Marcus) Marcus was born in 40 A.D. born(Marcus, 40) For simplicity, we will not represent A.D. explicitly, just as we normally omit it in everyday discussions, If we ever need to represent dates B.C., then we will have to decide on a way to do that, such as by using negative numbers. Notice thatthe representation of a sentence does not have to look like the sentence itself as long as there is a way to convert back and forth between them. This allows us to choose a representation, such as positive and negative numbers, that is easy for a program to work with, All men are mortal. ‘We: man(x} > mortal(x) Ail Pompeians died when the volcano erupted in 79 A.D. erupted(voleano, 79) \ ¥x : (Pompeian(x) — died(x, 79)] This sentence clearly asserts the two facts represented above. It may also assert another that we have not shown, namely that the eruption of the volcano caused the death of the Pompeians. People often assume causality between concurrent events if such causality seems plausible. Another problem that arises in interpreting this sentence is that of determining the referent of the phrase “the volzano.” There is more than one volcano in the world. Clearly the one referred to here is ‘Vesuvius, which is near Pompeii and erupted in 79 A.D. In general, resolving references such as these can require both a lot of reasoning and a lot of additional knowledge. No mortal lives longer than 150 years. Wer Wey: ¥ty i montal(x) A born(x, t)) A git ~ f,150) dead, t9) ‘There are several ways that the content of this sentence could be expressed. For example, we could introduce a function age and assert that its value is never greater than 150. The representation shown above is simpler, though, and it will suffice for this example. Itis now 1991. now = 1991 Here we will exploit the idea of equal quantities that can be substituted for each other. [Now suppose we want 10 answer the question “Is Marcus alive?” A quick glance through the statements ‘we have suggests that there may be two ways of deducing an answer. Rither we can show that Marcus is dead because he was killed by the volcano or we can show that he must be dead because he would otherwise be more than 150 years old, which we know is not possible. As soon as we attempt to follow Using Predicate Logic 107 either of those paths rigorously, however, we discover, just as we did in the last example, that we need some additional knowledge. For example, our statements talk about dying, but they say nothing that relates to being alive, which is what the question is asking. So we add the following facts: 8, Alive means not dead. + lalivets, 1) 9 ~dead(x,1)] A [dead(x, 1) alive(a, 91 not strictly correct, since “dead implies alive only for animate objects. (Chairs can be neither dead nor alive.) Again, we will ignore, this for now. This is an example of the fact that rarely do two expressions have truly identical meanings in all circumstances. 9. If someone dies, then he is dead at all later times. Vert Vr: Vip: died(x, 4) A get t)) > dead, 2) This representation says that one is dead in all years after the one in which one died. It ignores the question of whether one is dead in the year in which one died, ‘man Marcus) Pombeian(Marcus) ‘borr{ Marcus, 40) \vx:man(x) —> mortalx) vv: Pompeiartx) - aieats, 79) eruptedtyoleano,79) WG: Wi: mortakx) “borm(x, ) A gtlty~ t,. 150) > dead, t3) row= 1991 Vax: vt falive(x, > ~deadkx, 1 \ Esdeadkx, > alivalx, 4] Wah: Vly die, ) Ott, ) > dead, &) Fig.5.4 A Set of Facts about Marcus To answer that requires breaking time up into smaller units than years. If we do that, we can then add rules that say such things as “One is dead at time (year 1, month 1) if one died during (year 1, month 7) and month 2 precedes month 11.” We can extend this to days, hours, etc., as necessary. But we do not want to reduce all time statements to that level of detail, which is unnecessary and often not available. ‘A summary of all the facts we have now represented is given in Fig. 5.4. (The numbering is changed slightly because sentence 5 has been split into two parts.) Now let's attempt to answer the question “Is Marcus alive?” by proving: walive(Marcus, now) ‘Two such proofs are shown in Fig, 5.5 and 5.6. The term nif at the end of each proof indicates that the list of conditions remaining to be proved is empty and so the proof has succeeded. Notice in those proofs that whenever a statement of the form: ahboe was used, a and b were set up as independent subgoals. In one sense they are, but in another sense they are not if they share the same bourd variables, since, in that case, consistent substitutions must be made in each of them. For example, in Fig. 5.6 look at the step justified by statement 3. We can satisfy the goal 108 Artificial Intelligence bor(Marcus, 1,) using statement 3 by binding A to 40, but then we must also bind A to 40 in gt(now = 1, 150) since the two 1,’s were the same variable in statement 4, from Which the two goals came. A good computational proof procedure has to include both a way of determining that a match exists and a way of guaranteeing uniform substitutions throughout a proof. Mechanisms for doing both those things are discussed below. nalive(Marcus, now) 1 (9, substitution) dead Marcus, now) 7 (10, substitution) deck Marcus, ,) /\ gkinow, t) t (5, substitution) Pompeiar{Marcus) /\ g{now, 79) T (2) gtnow, 79) T substitute equals) .91991,78) + (compute at) nil Fig. 5.5 One Way of Proving That Marcus Is Dead From looking at the proofs we have just shown, two things should be clear: * Even very simple conclusions can require many steps to prove. ‘+ A variety of processes, such as matching, substitution, and application of modus ponens are involved in the production of a proof. This is true even for the simple statements we are using. It would be worse if we had implications with more than a single term on the right ox with complicated expressions involving amis and ors on the left, The first of these observations suggests that if we want to be able to do nontnvial reasoning, we are going to need some statements that allow us to take bigger steps along the way. These should represent the facts that people gradually acquire as they become experts. How to get computers to acquire them is a hard problem for which no very good answer is known, The second observation suggests that actually building a program to do what people do in producing proofs such as these may not be easy. In the next section, we introduce a proof procedure called resofution that reduces some of the complexity because it operates on statements that have first been converted to a single canonical form. 5.4 RESOLUTION As we suggest above, it would be useful from a computational point of view ff we had a proof procedure that carried out in a single operation the variety of processes involved in reasoning with statements in predicate Jogic. Resolution is such a procedure, which gains its efficiency from the fact that it operates on statements that have been converted to a very convenient standard form, which is described below. Using Predicate Logic 109 nalive(Marcus, now) 1 (9, substitution) dead Marcus, now} 1 (7, substitution) ‘mortak Marcus) \ bom(Marcus, 4). ‘atnow— 1, 180) 1 (4, substitution) ‘man(Marcus) \ bom(Marcus, th) \ ‘atinow — t, 150) 1 a bom(Marcus, ,) /\ gtinow ~t,, 150) T @ gtnow — 40,150) T @) .gt'1991 — 40,150) 1 (compute minus) 9f'1951,150) 1 (compute gt) ni Fig. 5.6 Another Way of Proving That Marcus is Dead Resolution produces proofs by refutation, In other words, to prove a statement (i.e., show that it is valid), resolution attempts to show that the negation of the statement produces a contradiction with the known statements (je., that it is unsatisfiable). This approach contrasts with the technique that we have been using to generate proofs by chaining backward from the theorems to be proved (o the axioms. Further discussion of how resolution operates will be much more sicaightforward afier we have discussed the standard form in which statements will be represented, so we defer it until then. 5.4.1 Conversion to Clause Form Suppose we know that all Romans who know Marcus either hate Caesar or think that anyone who hates anyone is crazy. We could represent that in the following wf: Wx: [Romantx) A know(, Marcus)] > [hate(x,Caesar) \/ (My : Bz! hate(y, 2) thinkerazy(x, y))] ‘To use this formula in a proof requires a complex matching process. Then, having matched one piece of it, such as thinkcrazy(2;y), itis necessary to do the right thing with the rest of the formula including the pieces in which the matched part is embedded and those in which it is not. If the formula were in a simpler form, this proc;ess would be much easier. The formula would be easier to work with if © Tt were flatter, ic., there was less embedding of components. + The quantifiers were separated from the rest of the formula so that they did not need to be considered. Conjunctive normal form [Davis and Putnam, 1960) has both of these properties, For example, the formula siven above for the feelings of Romans who know Marcus would be represented in conjunctive normal form as “Roman(s) A know x, Marcus) \/ hhate(s, Caesar) \/ “hately, 2)\V thinkcrazy(x, 2) 110 Artificial Intelligence en Since there exists an algorithm for converting any w{f into conjunctive normal form, we lose no generality if we employ a proof procedure (such as resolution) that operates only on wif's in this form. In fact, for resolution to work, we need to go one step further. We need to reduce a set of wif's to a set of clauses, where 4 clause is defined to be a wff in conjunctive normal form but with no instances of the connector A. We can do this by first converting each wf into conjunctive normal form and then breaking apart each such expression into clauses, one for each conjunct. All the conjuncts will be considered to be conjoined together as the proof procedure operates. To convert a wff into clause form, perform the following sequence of steps. Algorithm: Convert to Clause Form 1. Bliminate —, using the fact that a — b is equivalent to -a vy b. Performing this transformation on the wif given above yields Ves [Romania A knows x, Marcus) \/ Uhatetx, Caesar) \y (Wy: Gz : hately, 2)y thinkerazy(uy 2. Reduce the scope of each = to a single term, using the fact that ~(-p) = p, deMorgan's laws [which say that aA 6) =a \y band ay 6) = 7a /\ 7b], and the standard correspondences between quantifiers [AV Ply = 3c: P(x) and “x: P(x) = Wx: +P (3)]. Performing this transformation on the wif from step 1 yields Wax: [>Roman(x) \/ sknow(x, Marcus} \V Wares, Caesar) \/ (Wy : Vz: hately, 2) \V thinkerazytx, YL 3. Standardize variables so that each quantifier binds a unique variable. Since variables are just dummy names, this process cannot affect the truth value of the wff. For example, the formula ve: Po) yx: Ole) would be converted to va: PONV Ye: 20) ‘This step is in preparation for the next. 4, Move all quantifiers to the left of the formula without changing their relative order. This is possible since there is no conflict among variable names. Performing this operation on the formula of'step 2, we get Wr: Wy: Ve: [>Romants) \V aknow(x Marcus) thate(x, Caesar) \) (rhate(y, 2) \/ thinkerazy(x,y))] ‘At this point, the formula is in what is known as prenex normal form. It consists of a prefix of quantifiers followed by a matrix, which is quantifier-free. 5. Eliminate existential quantifiers. A formula that contains ar existentially quantified variahle asserts that there isa value that can be substituted for the variable that makes the formula true, We can eliminate the quontifier by substituting for the variable a reference to 2 function that produces the desired value. Since we do not necessarily know how to produce the value, we must create a new function name for every such replacement. We make no assertions about these functions except that they must exist. So, for example, the formula Ay : Presidentty) can be transformed into the formula President(St) Using Predicate Logic 114 where S] is a function with no arguments thal somehow produces a value that satisfies President. If existential quantifiers occur within the scope of universal quantifiers, then the value that satisfies the predicate may depend on the values of the universally quantified variables. For example, in the formula Wot Dy sfather-of x) the value of y that satisfies father-of depends on the particular value of x, Thus we must generate functions with the same number of arguments as the number of universal quantifiers in whose scope the expression occurs. So this example would be transformed into Vx : father-oftS2x),x)) ‘These generated functions ase called Skolem functions. Sometimes ones with no arguments are called Skolem constants. . Drop the prefix. At this point, all remaining variables are universally quantified, so the prefix can just ‘be dropped and any proof procedure we use can simply assume that any variable it sees is universally quantified. Now the formula produced in step 4 appears as (Romance) y ~know(. Marcus)] (hate(x, Caesar) \y (shatey, 2) \v thinkerazy/x, y))] Convert the matrix into a conjunction of disjuncis. tn the case of our example, since there are no and’s, is only necessary to exploit the associative property of or [i.e.. (a \ b) ve = (aye) A(b A.e)] and simply remove the parentheses, giving Roman(x) \¢ aknow(x, Marcus) \/ hhate(x, Caesar) \) ~hately, 2) thinkerasytt, ¥) However, itis also frequently necessary to exploit the distributive property [ie., (a A 6) v.e=(a vc) A(bV Ol For example, the formula (winter \ wearingboots) \ (summer \ wearingsandals) becomes, after one application of the rule [winter \V (summer “\ wearingsandals)] A {wearingboots \ (summer \ wearingsandals)} and then, after a second application, required since there are still conjuncts joined byOR’s, winter \y summer) \ (winter \V wearingsandats) A (wearingboots \7 summer) (wearingboots \/ wearingsandals) . Create a separate clause corresponding to each conjunct. In order for a wff to be true, all the clauses that are generated from it must be true, If we are going to be working with several wf's, all the clauses generated by each of them can now be combined to represent the same set of facts as were represented by the original wff's. Standardize apart the variables in the set of clauses generated in step 8. By this We mean rename the variables so that no two clauses make reference to the same variable. In making this transformation, we rely on the fact that 112 Artificial Intelligence (x: Pix) A QC) = Wx: POD A Wx: OG) Thus since each clause is a separate conjunct and since all the variables are universally quantified, there need be no relationship between the variables of two clauses, even if they were generated from the same wif, Performing this final step of standardization is important because during the resolution procedure it is sometimes necessary 10 instantiate a universally quantified variable (i.e., substitute for ita particular value). But, in general, we want to keep clauses in their most gereeral form as long as possible. So when a variable is instantiated, we want to know the minimum number of substitotions that must be made to preserve the truth value of the system. After applying this entire procedure to a set of wi’s, we will have a set of clauses, each of which is a disjunction of literals, These clauses can now be exploited by the resolution procedure to generate proofs, 5.4.2 The Basis of Resolution The resolution procedure is a simple iterative process: at each step, two clauses, called the parent clauses, are compared (resolved), yielding a new clause that has been inferred from them. The new clause represents ways that the two parent clauses interact with each other. Suppose that there are two clauses in the system: winter \y summer winter \/ cold Recall that this means that both clauses must be true (i.e., the clauses, although they look independent, are really conjoined) Now we observe that precisely one of winter and -winter will be teue at any point, If winter is true, then cold must be true to guarantee the truth of the second clause. If ater is true, then summer must be true to guarantee the truth of the first clause. Thus we see that from these two clauses we can deduce summer \¢ cold This is the deduction that the resolution procedure will make. Resolution operates by taking two clauses that each contain the same literal, in this example, winter. The literal must occur in positive form in one clause and in negative form in the other. The resolvent is obtained by combining all of the literals of the two parent clauses except the ones that cancel, If the clause that is produced is the empty clause, then a contradiction has been found, For example, the two clauses winter swinter will produce the empty clause. If a contradiction exists, then eventually it will be found, Of course, if no contradiction exists, itis possible that the procedure will never terminate, although as we will sce, there are often ways of detecting that no contradiction exists. So far, we have discussed only resolution in prepositional logic. in predicate logic, the situation is more complicated since we must consider all possible Ways of substituting values for the variables, The theoretical basis of the resolution procedure in predicate logic is Herbrand’s theorem [Chang and Lee, 1973], which tells us the following: Using Predicate Logic 113 ‘« To show that a set of clauses § is unsatisfiable, it is necessary to consider only interpretations over & particular set, called the Herbrand universe of S. + A set of clauses S is unsatisfiable if and only if'a finite subset of ground instances (in which all bound variables have had a value substituted for them) of $ is unsatisfiable. ‘The second part of the theorem is important if there is to exist any computational procedure for proving unsatisfiability, since in a finite amount of time no procedure will be able to examine an infinite set. The first part suggests that one way to go about finding a contradiction is to try systematically the possible substitutions and see if each produces a contradiction. But that is highly inefficient. The resolution psincipte, first introduced by Robinson [1965], provides a way of finding contradictions by trying a minimum nurnber of substitutions. The idea is-to keep clauses in their general form as long as possible and only introduce specific substitutions when they are required. For more details on different kinds of resolution, see Stickel [1988}. 5.4.3 Resolution in Propositional Logic In order to make it clear how resolution works, we first present the resolution procedure for propositional logic. We then expand it to include predicate logic. In propositional logic, the procedure for producing a proof by resolution of proposition P with cespect to a set of axioms F is the following. Algorithm: Propositional Resolution 1. Convert all the propositions of F to clause form. 2. Negate P and convert the result to clause form, Add it to the set of clauses obtained in step 1. 3. Repeat until either a contradiction is found or no progress can be made: (a) Select two clauses. Call these the parent clauses. (b) Resolve them together. The resulting ciause, called the resolvent, will be the disjunction of all of the literals of both of the parent clauses with the following exception: If there are any pairs of literals Z and AL, such that one of the parent clauses contains £ and the other contains 7L, then select one such pair and eliminate both L and =. from the resolvent. {e) the resolvent is the empty clause, thea a contradiction has been found. if it is not, then add it to the set of clauses available to the procedure. Let's look at a simple example. Suppose we are given the axioms shown in the first column of Fig. 5.7 and ‘we want to prove A. First we convert the axioms to clause form, as shown in the second column of the figure. Given Axioms Converted to Clause Form Pp P 4) (PAQ>R “PV OVA 2 (SvNwao “SV O @) -TVva @ T T 6) Fig. 5.7. A Few Facts in Propositional Logic Then we negate R, producing “R, which is already in clause form, Then we begin selecting pairs of clauses to resolve together. Aithough any pair of clauses can be resolved, only those pairs that contain complementary literals will produce a resolvent that is likely to lead to the goal of producing the empty clause (shown as a box). We might, for example, generate the sequence of resolvents shown in Fig. 5.8. We begin by resolving with the clause 7 since that is one of the clauses that must be involved in the contradiction we are trying to find. 114 Artficial Intelligence One way of viewing the resolution process is that ietakes a set of clauses that are _p._ ay, all assumed to be true and, based on information provided by the others, it generates SO ® new clauses that represent restrictions on the way cach of those original clauses oa OP can be made true. A contradiction occurs when a clause becomes so restricted that ae there is no way it can be true. This is indicated by the generation of the empty =TvQ =@ clause. To see how this works, let's look again at the example. In order for proposition SS 7 2 to be true., one of three things must be true: =P, 7, or R. But we are assuming ~ NZ that ~2 is true, Given that, the only way for proposition 2 to be true is for one of two things to be true: ~? or ~Q. That is what the first resolvent clause says. But Fig, $8 Resolution én proposition | says that P is true, which means that +P cannot be true, which leaves. Propositional only one way for proposition 2 to be true, namely for +@ to be true (as shown in Logic the second resolvent clause). Proposition 4 can be true if either ~T or Q is true. But since we now know that @ must be true, the only way for proposition 4 to be true is for ~7 to be true (the third resolvent). But proposition 5 says that Tis true, Thus there is no way for all of these clauses to be ¢rue in a single interpretation. This is indicated by the empty clause (the last resolvent). 5.4.4 The Unification Algorithm In propositional logic, it is easy to determine that two literals cannot both be true at the same time. Simply look for L and -L In predicate logic, this matching process is more complicated since the arguments of the predicates must be considered. For example, maan(John) and =man{John) is a contradiction, while man(John) artd ~man(Spot) is not Thus, in order to determine contradictions, we need a matching procedure that compares two literals and discovers whether there exists a set of substitutions that makes them identical. There is a straightforward recursive procedure, called the unification algorithm, that does just this. “The basic idea of unification is very simple. To attempt to unify two literals, we first check if their initial predicate symbols are the same, if so, we can proceed. Otherwise, there is no way they can be unified, regardless of their arguments. For example, the two literals tryassassinate(Marcus, Caesar) hate(Marcus, Caesar) cannot be unified. If the predicate symbols match, then we must check the arguments, one pair at a time. If the first matches, we can continue with the second, and so on. To test each argument pair. we can simply call the unification procedure recursively. The matching rules are simple. Different constants or predicates cannot match; identical ones can. A variable can match another variable, any constant, oF a predicate expression, with the restriction that the predicate expression must not contain any instances of the variable being matched. The only complication in this procedure is that we must find a single, consistent substitution for the entire literal, not separate ones for each piece of it. To do this, we must take each substitution that we find and apply it to the remainder of the literals before we continue trying to unify them. For example, suppose we want to unify the expressions Pour) P(y.z) ‘The two instances of P match fine, Next we compare x and y, and decide that if we substitute y for x, they could match. We will write that substitution as whe Using Predicate Logic 118 (We could, of course, have decided instead to substitute x for y, since they are both just dummy variable names. The algorithm will simply pick one of these two substitutions.) But now, if we simply continue and match x and z. we produce tne substitution man(x,) might be true, making ‘mortal(x;) irrelevant to the truth of the complete clause, So the resolvent generated by clauses 1 and 2 must be ‘mortal(Marcus), which we get by applying the result of the unification process [Link] resolvent. The resolution process can then proceed from there to discover whether mortai(Marcus) feads to a contradiction with other available clauses. This example illustrates the importance of standardizing variables apart during the process of converting expressions 0 clause form. Given that that standardization has bee’n done, it is easy (o determine how the Using Predicate Logic 117 unifier must be used to perform substitutions to create the resolvent. If two instances of the same variable occur, then they must be given identical substitutions We can now state the resolution algorithm far predicate logic as follows, assuming a set of given statements F and a statement to be proved P: Algorithm: Resolution 1. Convert all the statements of F to clause form. 2. Negate P and convert the result to clause form. Add it to the set of clauses obttfHied in 1. 3. Repeat until either a contradiction is found, no progress can be made, or a prede- termined amount of effort has been expended. (a) Select two clauses. Call these the parent clauses. (b) Resolve them together. The resolvent will be the disjunction of all the literals of both parent clauses with appropriate substitutions performed and with the following exception: If there is one pair of literals 71 and “72 such that one of the parent clauses contains 72 and the other contains 71 and if TI and 72 are unifiable, then neither 71 nor T2 should appear in the resolvent. We call T1 and 72 Complemeniary literals. Use the substitution produced by the unification to create the resolvent. If there is more than one pair of complementary literals, only one pair should be omitted from the resolvent. (©) If the resolvent is the empty clause, then a contradiction has been found, If it is not, then add it to the set of clauses available to the procedure. ithe choice of clauses to resolve together at each step is made in certain systematic ways, then the resolution procedure will find a contradiction if one exists, However, it may take a very long time. There exist strategie for making the choice that can speed up the process considerably: © Only resolve pairs of clauses that contain complementary literals, since only such resolutions produce new clauses that are harder to satisfy than their parents. To facilitate this, index clauses by the predicates they contain, combined with an indication of whether the predicate is negated. Then, given a particular clause, possible resolvents that contain a complementary occurrence of one of its predi- cates can be located directly. + Eliminate certain clauses as soon as they are generated so that they cannot partic- ipate in later resohions. ‘Two kinds of clauses should be eliminated: tantologies (which can never be unsatisfied) and clauses that are subsumed by other clauses (i.e., they are easier ;0 satisfy. For example, P\y Q is subsumed by P) ‘* Whenever possible, resolve either with one of the clauses that is part of the statement we are trying to refute or with a clause generated by a resolution with such a clause. This is called the set-of:support strategy and corresponds to the intuition that the contradiction we are looking for must involve the statement we are trying to prove. Any other contradiction would say that the previously believed statements were inconsistent. ‘* Whenever possible, resolve with clauses that have a single literal, Such resolutions generate new clayses with fewer literals than the larger of their parent clauses and thus are probably closer to the goal of a resolvent with zero terms. This method is called the unit-preference strategy. Ler’s now return to our discussion of Marcus and show how resolution ean be used to prove new things about him. Let's first consider the set of statements introduced in Section 5.1. To use them in resolution proofs, we must convert them to clause form as described in Section 5.4.1. Figure 5.9(a) shows the results of that conversion. Figure 5.9(b) shows a resolution proof of the statement hate(Marcus,Caesar) 118 Artificial ineeligence ‘Axioms in clause form: 1. man(Marcus) 2. Pompeian(Marcus) 3. ~Pompeian(x,) \y Flomantx,) 4 nuto(Cassar) 5. -Roman(x,) \/ dato(x, Caesan \ hatalx,Caosar) 6. loyaltolxa,fhxs)) 7. amanix4) \V ~wuledy,) \/ stryassassinatelxyy,) V loyato(%,y;) 8. tryassassinate(Marcus,Caeserh () Prove: hate (Marcus, Caesar) hate (Marcus, Caesar) “Marcus! x, 3. —Roman (Marcus) \ loyalto (Marcus, Caesar} NT Marcuse —Pompeian (Marcus) \y loyatto (Marcus, Caesar) 2 ae 7. tayatto (Marcus, Caeser) Neca, Casarly 1 man (Marcus) \j ~ruler (Caesar) \ ~tyassassinate (Marcus, Cesar) ee ~nuler(Coesar) \V ~tryassassinate (Marcus, Caesar) 4 —— “tryassassinate (Marcus, Caoser) (b) Fig. 5.9. A Resolution Proof 8 Of course, many more resolvents could have been generated than we have shown, but we used the heuristics described above to guide the search. Notice that what we have done here essentially is to reason backward from the statement we want to show is a contradiction through a set of intermediate conclusions to the final conclusion of inconsistency. ‘Suppose our actual goal in proving the assertion hate(Marcus,Caesar) was to answer the question “Did Marcus hate Caesar?” In that case, we might just as easily have attempted to prove the statement Thate(Marcus,Caesar) Using Predicate Logic 119 To do so, we would have added hate(Marcus, Caesar) to the set of available clauses and begun the resolution process. But immediately we notice that there are no clauses that contain a literal involving hare. Since the resolution process can only generate new clauses that are composed of combinations of literals from already existing clauses, we know that no such clause can be generated and thus we conclude that haze(Marcus,Caesar) will not produce a contradiction with the known statements. This is an example of the kind of situation in which the resolution procedure can detect that no contradiction exists. Sometimes this situation is detected not at the beginning of a proof, but part way through, as shown in the example in Figure 5.10(q), based on the axioms given in Fig. 5.9. But suppose our knowledge base contained the two additional statements ~oyelto (Mercus, Cassar) 5 ‘Marcus/xg 3. Roman (Marcus) v foyaito (Marcus, Caesar) NL arcs ~Pompelan (Marcus) \ hate (Marcus, Caeser) 2 hate (Marcus, Caesar) @) hate (Marcus, Caesar) 10 NL arouse, Caeser persecute (Caesar, Marcus) 9 NL Marcus, Caesarty, hate (Mercus, Caesar) Cy Fig. 5.10 An Unsuccessful Attempt at Resolution 9. persecute(x, y) —> hate(y, x) 10. hate(x, y) — persecute(y, x) Converting to clause form, we get 9. npersecutelts, ¥2)\y hate(¥> Xs) 10. ~hatelxg, »3) V persecutely. %) ‘These statements enable the proof of Fig. 5.10(a) to continue as shown in Fig. 5.10(5). Now to detect that there is no contradiction we must discover that the only resolvents that can be generated have been generated before. In other words, although we can generate resolvents, we can generate no new ones, 120 Artificial intelligence Given: 1. sfathartx, ¥) \v swoman(x) (ie, fathenx, ¥) -+ -woman(x)) 2. -motherx, v1 wommants) (.8., motherx. yj ~» woman(x)) ‘mother Chris Mary) father{Chris, Bit 2 father (xy) Mother, (cy) 3 ~ 2 Cate, Marly father (Chris, Mary) Fig. 5.11 The Need to Standardize Variables ‘Axioms in clause form: 1, man(bfarcus) 2. Pompeian(Marcus) shom(Marcus, 40) marx) \V mortals) Pompoiarixy) \/ diedx,78) erupted volcano,79) smorta(x,) \f “DOM, 4) \V “Ol ~ ty, 150) V dead, ) now = 2008 mallve Xe, 6) \/ “dead xs, ty) dead xs, i) \/ alive(xs, &) ICR ts) \V “Oth le, &5) \V dead xe, fe) Prove: ~alive(Marcus, now) BSE evomay alive (Marcus, now) 9a Marcus |x, now/t, ~dead (Marcus, now) 40 NL inarcus x, no lt —dead (Marcus, fs) \V ~gt (now, ts) 5 NL itareus ng, 78s —Pompeian (Marcus) \/ “gt (now, 79) NN eubttte equats —Pompeian (Marcus} \/ ~gt(2008, 79) NL rotce —Pompefan (Marcus) ag ee Fig. 5.12 Using Resolution with Equatity and Reduce Using Predicate Logic 121 Recall that the final step of the process of converting a set of formulas to clause form was to standardize apart the variables that appear in the final clauses. Now that we have discussed the resolution procedure, we can see clearly why this step is so important. Figure 5.11 shows an example of the difficulty that may arise if standardization is not done. Because the variable y occurs in both clause I and clause 2, the substitution ar the second resolution step produces a clause that is too restricted and so does not lead to the contradiction that is present in the database. If, instead, the clause father Chris, ») had been produced, the contradiction with clause 4 would have emerged. This would have happened if clause 2 had been rewritten as cmorher(a, b)\ woman(a) In its pure form, resolution requires all the knowledge it uses to be represented in the form of clauses. But as we pointed out in Section 5.3, itis often more efficient to represent certain kinds of information in the form of computable functions, computable predicates, and equality relationships. It is not hard to augment resolution to handle this sort of knowledge. Figure 5.12 shows, a resolution proof of the statement ~alive(Marcus,now) based on the statements given in Section 5.3. We have added two ways of generating new clauses, in addition to the resolution rule: © Substitution of one value for another to which it is equal. « Reduction of computable predicates. If the predicate evaluates to FALSE, it can simply be dropped. since adding V FALSE to a disjunction cannot change its truth value. If the predicate evaluates to TRUE, then the generated clause is a tautology and cannot lead to a contradiction 5.4.6 The Need to Try Several Substitutions Resolution provides a very good way of finding a refutation proof without actually trying all the substitutions that Herbrand’s theorem suggests might be necessary. But it does not always eliminate the necessity of trying more than one substitution. For example, suppose we know, in addition to the statements in Section 5.1, that hate(Marcus, Paulus) hate(Mareus, Julian) Now if we want to prove that Marcus hates some ruler, we would be likely to try each substitution shown in Figure 5.13(a) and (b) before finding the contradiction shown in (c). Sometimes there is no way short of ‘very good luck to avoid trying several substitutions, 5.4.7 Question Answering Very early in the history of AT it was realized that theorem-proving techniques could be applied to the problem of answering questions. As we have already suggested, this seems natural since both deriving theorerns from axioms and deriving new facts (answers) from old facts employ the process of deduction. We have already shown how resolution can be used to answer yes-no questions, such as “Is Marcus alive?” In this section, we show how resolution can be used to answer fill-in-the-blank questions, such as “When did Marcus die?” or 122 Artificial Intelligence “Who (ried to assassinate a ruler?” Answering these questions involves finding a known statement that matches the terms given in the question and then responding with another piece of that samc statement that fills the slot educated(x) In this form, the statement suggests a way of deducing that someone is educated. But when the same statement is converted to clause form, ajudge(s) \/ crooked(x) \/ educated) it appears also to be a way of deducing that someone is not a judge by showing that he is not crooked and not educated. Of course, in a logical sense, it is, But itis almost certainly not the best way, or even a very good ‘way, to go about showing that someone is not a judge. The heuristic information contained in the original statement has been lost in the transformation. ‘Another problem with the use of resolution as the basis of a theorem-proving system is that people do not think in resolution. Thus itis very difficult for a person to interact with a resolution theorem prover, either to Using Predicate Logic 125 give it advice or to be given advice by it. Since proving very hard things is something that computers still do poorly, it is important from a practical standpoint that such interaction be possible. To facilitate it, we are foiced to look for a way of doing machine theorem proving that corresponds more closely to the processes used in human theorem proving. We are thus led to what we call, mostly by definition, natural deduction. Natural deduction is not a precise term, Rather it describes a melange of techniques, used in combination to solve problems that are not tractable by any one method alone. One common technique is to arrange knowledge, not by predicates, as we have been doing, but rather by the objects involved in the predicates, Some techniques for doing this are described in Chapter 9. Another technique is to use a set of rewrite rules that not only describe logical implications but also suggest the way that those implications can be exploited in proofs For a good survey of the variety of techniques that can be exploited in a natural deduction system, see Bledsoe [1977]. Although the emphasis in that paper is on proving inathematical theorems, many of the ideas in it can be applied to a variety of domains in which it is necessary to deduce new statements from known ones. For another discussion of theorem proving using natural mechanisms, see Boyer and Moore [1988], which describes a system for reasoning about programs. It places particular emphasis on the use*of mathematical induction as a proof technique. SUMMARY In this chapter we showed how predicate logic can be used as the basis of a technique for knowledge representation. We also discussed a problem-solving technique, resolution, that can be applied when knowledge is represented in this way. The resolution procedure is not guaranteed to hall if given a nontheorem to prove. But is it guaranteed io halt and find a contradiction if one exists? This is called the completeness question. In the form in which we have presented the algorithm, the answer to this question is no. Some small changes, usually not implemented in theorem-proving systems, must be made to guarantee completeness. But, from a computational point of view, completeness is not the only important question. Instead, we must ask whether proof can be found in the limited amount of time that is available. There are two ways to approach achieving this computational goal. The first is to search for good heuristics that can inform a theorem-proving program. Current theorem-proving research attempts to do this. The other approach is to change not the program but the data given to the program. In this approach, we recognize that a knowledge base that is just a list of logical assertions possesses no structure. Suppose an information-bearing structure could be imposed on such a knowledge base. Then that additional information could be used to guide the program that uses the knowledge. Such a program majuiot look a lot like a theorem prover, although it will still be a knowledge-based problem solver. We discuss this idea further in Chapter 9. A second difficulty with the use of theorem proving in AI systems is that there are some kinds of information that are not easily represented in predicate logic. Consider the following examples: ‘+ “tis very hot today.” How can relative degrees of heat be represented? ‘* “Blond-haired people often have blue eyes.” How can the amount of certainty be represented? ‘© “If there is no evidence to the contrary, assume that any adult you meet knows how to read.” How can we represent that one fact should be inferred from the absence of another? * “Its better to have more pieces on the board than the opponent has.” How can we represent this kind of heuristic information? ‘© “I know Bill thinks the Giants will win, but I nk they are going to lose.” How can several different 126 Artificial Intelligence eee enna mt belief systems be represented at once? ‘These examples suggest issues in knowledge representation that we have not yet satisfactorily addressed. They deal primarily with the need to make do with a knowledge base that is incomplete, although other problems also exist, such as the difficulty of representing continuous phenomena in a discrete system. Some solutions to these problems are presented in the remaining chapters in this part of the book. EXERCISES 1. Using facts 1-9 of Section 5.1, answer the question, “Did Marcus hate Caesar?” 2. In Section 5.3, we showed that given our facts, there were two ways o prove the statement walive(#arcus, now). In Fig. 5.12(a) resolution proof corresponding to one of those methods is shown, Use resolution to derive another proof of the statement using the other chain of reasoning. 3. Trace the operation of the unification algorithm on each of the following pairs of literals: (a) f(Marcus) and f(Caesar) (b) fox) maf(gis)) (c) f(Marcus, g(x, y)) andf(x, g(Caesar, Marcus)) 4. Consider the following sentences: fa) (b) ( @ © John likes all kinds of food. Apples are food. Chicken is food. Anything anyone eats and isn’t killed by is food. Bill eats peanuts and is still alive. Sue eats everything Bill eats. Translate these sentences into formulas in predicate logic. Prove that John likes peanuts using backward chaining. Convert the formulas of part a into clause form. Prove that John likes peanuts using resolution. Use resolution to answer the question, “What food does Sue eat?” 5. Consider the following facts: ‘The members of the En St, Bridge Club are Joe, Sally, Bill, and Ellen. Joe is married to Sally. Bill is Ellen's brother, The spouse of every married person in the club is also in the club. The last meeting of the club was at Joe's house. Represent these facts in predicate logic. From the facts given above, most people would be able to decide on the truth of the following, additional statements: ‘The last meeting of the club was at Sally's house. Ellen is not married Can you construct resolution proofs to demonstrate the truth of each of these statements given the five facts listed above? Do so if possible. Otherwise, add the facts you need and then construct the proofs, 6. Assume the following facts: . Steve only likes easy courses. Science courses are hard. Using Predicate Logtc 127 © All the courses in the basketweaving department are easy. * BK301 is a basketweaving course. Use resolution to answer the question, “What course would Steve like?” In Section 5.4.7, we answered the question, “When did Marcus die?” by using resolution to show that there was a time when Marcus died. Using the facts given in Fig, 5.4, and the additional fact Wes Vt: dead, 1:) 9 3tp + gi(ty 3) died(x, 12) there is another way to show that there was a time when Marcus died. (a) Do a resolution proof of this other chain of reasoning, (b) What answer will this proof give to the question, “When did Marcus die?” . Suppose that we are attempting to resolve the following clauses: loves(father{a), 2) nlovesty, 3) v loves(sx, y) (a) What will be the result of the unification algorithm when applied to clause | and the first term of clause 27 (b) What must be generated as a result of resolving these two clauses? (c) What does this example show about the order in which the substitutions determined by the unification procedure must be performed? 1. Suppose you are given the following facts: Wx. ys 2s gela, YA gts 2) > gts 2) Va,b : suceta. b) > gt(a,b) Wr s ngt(x,x) ‘You want to prove that g(5.2) Consider the following attempt at a resolution proof: ngtt5, 2) “atl, ¥)V 7atly, 2) v gtx, 2) Si 2 gHS, y) \/ gt, 2) asucc{a, b)\y gta, 6) NN Tat(S. y) Vv ~succly. 2) Tat, AV Oty, 2) v aX, 2) Neen A9H5. Vv VGH. Y) V ~succly, 2) r (a) What went wrong? (b) What needs to be added to the resolution procedure to make sure that this does not happen? 128 10. Me 12. 13. Artificial Intelligence The answer to the last problem suggests that the unification procedure could be simplified by omitting the check that prevents.x and f(x) from being unified together (the occur check). This should be possible since no two clauses will ever share variables. If x occurs in one, f(x) cannot occur in another. But suppose the unification procedure is given the following two clauses {in the notation of Section 5.4.4): PLO) PfCa),a) Trace the execution of the procedure. What does this example show about the need for the occur check? What is wrong with the following argument (Henle, 1965}? © Men are widely distributed over the earth. * Socrates is a man. + Therefore, Socrates is widely distributed over the earth, How should the facts represented by these sentences be represented in logic so that this problem does not arise? Consider all the facts about baseball that are represented in the slot-and-filier structure of Fig. 4.5. Represent those same facts as a set of assertions in predicate logic. Show how the inferences that were derived from that knowledge” in Section 4.2 can be derived using logical deduction. What problems woutd be encountered in attempting to represent the following statements in predicate togic? It should be possible to deduce the final statement from the others, John only likes to see French movies. Te’s safe to assume a movie is American unless explicitly told otherwise. ‘The Playhouse rarely shows foreign films. People don't do things that will cause them to be in situations that they don’t lke John doesn’t go to the Playhouse very often. UUGda REPRESENTING KNOWLEDGE USING RULES To be useful, a system has to do more than just correctly perform some task. —John McDermott, Al Researcher In this chapter, we discuss the use of rules to encode knowledge. This is a particularly important issue since rule-based reasoning systems have played a very important role in the evolution of Al from a purely laboratory science into a commercially significant one, as we see later in Chapter 20. We have already talked about rules as the basis for a search program. But we gave little consideration to the way knowledge about the world was represented in the rules (although we can see a simple example of this in Section 4.2). In particular, we have been assuming that search control knowledge was maintained completely separately from the rules themselves, We will now relax that assumption and consider a set of rules to represent both knowledge about relationships in the world, as Well as knowledge about how to solve problems using the content of the rules. 6.1 PROCEDURAL VERSUS DECLARATIVE KNOWLEDGE Since our discussion of knowledge representation has concentrated so far on the use of logical asserti use logic as 2 starting point in our discussion of rule-based systems. In the previous chapter, we viewed logical assertions as declarative representations of knowledge. A declurative representation is one in which knowledge is specified, but the use to which that knowledge is to be put is not given. To use a declarative representation, we must augment it with a program that specifies what is to be done to the knowledge and how. For example, a set of logical assertions can be combined with a resolution theorem prover to give a complete program for solving problems. There is a different way, though, in which logical assertions can be viewed, namely as a program, rather than as data to a program. In this view, the implication statements define the legitimate reasoning paths and the atomic assertions provide the starting points (or, if we reason backward, the ending points) of those paths. These reasoning paths define the possible execution paths of the program in much the same way that traditional control constructs, such as if-then-else. define the execution paths through traditional programs. In other words, we could view logical assertions as Ss, WE 130 Artificial Intelligence procedural representations of knowledge. A procedural representation is one in which the control information that is necessary to use the knowledge is considered to be embedded in the knowledge itself. To use a procedural representation, we need to augment it with an interpreter that follows the instructions given in the knowledge. Actually, viewing logical assertions as code is not a very radical idea, given that all programs are really data to other programs that interpret (or compile) and execute them. The teal difference between the declarative and the procedural views of knowledge fies in where control information resides. For example, consider the knowledge base: san(Marcus) ‘man(Caesar) person(Cleopatra) "Ye: man(x) > person(x) Now consider trying to extract from this knowledge base the answer to the question BY : person(s) We want to bind y to a particular value for which person is true. Our knowledge base justifies any of the following answers: Marcus Caesar Cleopatra Because there is more than one value that satisfies the predicate, but only one value is needed, the answer to the question wil] depend on the order in which the assertions are examined during the search for a response. If we view the assertions as declarative, then they do not themselves say anything about how they will be examined. If we view them as procedural, then they do, Of course, nondeterministic programs are possible — for example, the concurrent and parallel programming constructs described in Dijkstra [1976], Hoare [1985], and Chandy and Misra [1989]. So, we could view these assertions as a nondeterministic program whose output is simply not defined. If we do this, then we have a “procedural” representation that actually contains no more information than does the “declarative” form. But most systems that view knowledge as procedural do not do this. The reason for this is that, at least if the procedure is to execute on any sequential or on most existing parallel machines, some decision must be made about the order in which the assertions will be examined. There is no hardware support for randomness. So if the interpreter must have a way of deciding, there is no real reason not to specify it as part of the definition of the language and thus to define the meaning of any particular program in the language. For example, we might specify that assertions will be examined in the order in which they appear in the program and that search will proceed depth-first, by which we mean that if a new subgoa! is established then it will be pursued immediately and other paths will only be examined if the new one fails. If we do that, then the assertions we gave above describe a program that will answer our auestion with Cleopatra ‘To see clearly the difference between declarative and procedural representations, consider the following assertions: Representing Knowledge Using Rules 131 man(Marcus) ‘man(Caesur) Wx : man(x) > person(2) person(Cleopatra) ‘Viewed declaratively, this is the same knowledge base that we had before. All the same answers are supported by the system and no one of them is explicitly selected. But viewed procedurally, and using the control model we used to get Cleopatra as our answer before, this is a different knowledge base since now the answer to our question is Marcus. This happens because the first statement that can achieve the person goal is the inference rule Wx : man(x) -> person(x). This rule sets up a subgoal to find a man, Again the statements are examined from the beginning, and now Marcus is found to satisfy the subgoal and thus also the goal. So Marcus is reported as the answer. It is important to keep in mind that although we have said that a procedural representation encodes control information in the knowledge base, it does so only to the extent that the interpreter for the knowledge base recognizes that control information. So we could have gotten a different answer to the person question by leaving ‘our original knowledge base intact and changing the interpreter so that it examines statements from last to first (but still pursuing deptt-first search). Following this control regime, we report Caesar as our answer. There has been a great deal of controversy in AT over whether declarative or procedural knowledge representation frameworks are better. There is no clearcut answer to the question. As you can see from this discussion, the distinction between the two forms is often very fuzzy. Rather than try to answer the question of which approach is better, what we do in the rest of this chapter is to describe ways in which rule formalisms and interpreters can be combined to solve problems. We begin with a mechanism called logic programming, and then we consider more flexible structures for rule-based systems. 6.2 LOGIC PROGRAMMING Logic programming is a programming language paradigm in which ‘ogical assertions are viewed as programs, as described in the previous section. There are several logic programming systems in use today, the most popular of which is PROLOG [Clocksin and Mellish, 1984; Bratko, 1986]. Programming in PROLOG has been described in more detail in Chapter 25. A PROLOG program is described as a series of logical assertions, each of which is a Horn clause.' A Hom clause is a clause (as defined in Section 5.4.1) that has at most one positive literal. Thus p, =p Vg, and p 9 q are all Hom clauses. The last of these does not look like a clause Wx: patty) A smaix) > apartmentpen») ‘Wx aka) \ dogs) -> ports) ‘Wx: poodle(x) — dog\x) /\ smatx) poodle fly) ‘A Representation In Logic apartmentpet (x) :- pet (%), small (x) pet (x) i= cat (x). pet (X) :- dogix) dog (xX) :- poodle (x) . smal] (x) := poodle (x) poodle (fluffy) ‘A Representation in PROLOG Fig.6.1. A Declarative and a Procedural Representation "Programs written in pure PROLOG are composed only of Hor clauses. PROLOG, as an actual programming language, however, allows departures fom Horn clauses. In the rest of this section, we limit our discussion to pure PROLOG. 132 Artificial Intelligence and it appears to have two positive literals. But recall from Section 5.4.1 that any logical expression can be converted to clause form. If we do that for this example, the resulting clause is 9p \y q. which is a well-formed Hom clause. As we will see below, when Hora clauses are written in PROLOG programs, they actually look ‘more like the form we started with (an implication with at most one literal on the right of the implication sign) than the clause form we just produced. Some examples of PROLOG Hom clauses appear below. “The fact that PROLOG programs are composed only of Homn clauses and not of arbitrary logical expressions has two important consequences. The first is that because of the uniform representation a simple and efficient interpreter can be written. The second consequence is even more important. The logic of Horn clause systems is decidable (unlike that of full first-order predicate logic). The contro! structure that is imposed on a PROLOG program by the PROLOG interpreter is the same one we used at the beginning of this chapter to find the answers Cleopatra and Marcus. The input to a program is a goal to be proved. Backward reasoning is applied to try to prove the goal given the assertions in the program. ‘The program is read top to bottom, left to right and search is performed depth-first with backtracking, Figure 6.1 shows an example of a simple knowledge base represented in standard logical notation and then in PROLOG. Both of these representations contain two types of statements, facts, which contain only constants G.c., no variables) and rules, which do contain variables. Facts represent statements about specific objects, Rules represent statements about classes of objects. Notice that there are several superficial, syntactic differences between the logic and the PROLOG representations, including: 1. In logic, variables are explicitly quantified. In PROLOG, quantification is provided implicitly by the way the variables are interpreted (sce below). The distinction between variables and constants is made in PROLOG by having all variables begin with upper case letters and all constants begin with lower case letters or numbers. 2. In logic, there are explicit symbols for and (/\) and or (\/). In PROLOG, there is an explicit symbol for and (,), but there is none for or. Instead, disjunction must be represented as a list of alternative statements, any one af which may provide the basis for a conclusion. 3. In logic, implications of the form “p implies q” are written as p —> q. In PROLOG, the same implication ig written “backward,” as q : - p. This form is natural in PROLOG because the interpreter always works backwards from a goal, and this form causes every rule to begin with the component that must therefore be matched first. This first component is called the head of the rule. The first wo of these differences arise naturally from the fact that PROLOG programs are actually sets of Horn clauses that have been transformed as follows: 1. If the Horn clause contains no negative literals (Le., it contains a single literal which is positive), then leave it as it is. 2. Otherwise, rewrite the Hom clause as an implication, combining all of the negative literals into the antecedent of the implication and leaving the single positive literal (ir there is one) as the consequent. This procedure causes a clause, which originally consisted of a disjunction of literals (all but one of which were negative), to be transformed into a single implication whose antecedent is a conjunction of (what are now positive) literals. Further, recall that in a clause, all variables are implicitly universally quantified. But, when we apply this transformation (which essentially inverts several steps of the procedure we gave in Section 5.4.1 for converting to clause form), any variables that occurred in negative literals and so now occur in the antecedent become existentially quantified, while the variables in the consequent (the head) are still universally ‘quantified. For example, the PROLOG clause P(x) 2 @be yd Representing Knowledge Using Rules 133 is equivalent to the logical expression Wei By: Ql, y) > Pix) A key difference between logic and the PROLOG representation is that the PROLOG interpreter has a fixed control strategy, and so the assertions in the PROLOG program define a particular search path to an answer to any question. In contrast, the logical assertions define only the set of answers that they justify; they themselves say nothing about how to choose among those answers if there are more than one, The basic PROLOG control strategy outlined above is simple. Begin with a problem statement, which is viewed as a goal to be proved. Look for assertions that can prove the goal. Consider facts, which prove the goal directly, and also consider any rule whose head matches the goal. To decide whether a fact or a rule can be applied to the current problem, invoke a standard unification procedure (recall Section 5.4.4), Reason backward from that goal until a path is found that terminates with assertions in the program. Consider paths using a depth-first search strategy and using backtracking. At each choice point, consider options in the order in which they appear in the program. If a goal has more than one conjunctive part, prove the parts in the order in which they appear, propagating variable bindings as they are determined during unification. We can illustrate this strategy with a simple example. Suppose the problem we are given is to find a value of X that satisfies the predicate apartmentpet (x). We state this goal to PROLOG as 2- apartmentpet (x). Think of this as the input to the program. The PROLOG interpreter begins looking for a fact with the predicate apartmentpet or a rule with that predicate as its head. Usually PROLOG programs are written with the facts containing a given predicate coming before the rules for that predicate so that the facts can be used immediately if they are appropriate and the rules will only be used when the desired fact is not immediately available, In this example, there are no facts with this predicate, though, so the one rule there is must be used. Since the rule will succeed if both of the clauses on its right-hand side can be satisfied, the next thing the interpreter does is to try to prove each of them. They will be tried in the order in which they appear. There are no facts with the predicate pet: but again there are rules with it on the right-hand side. But this time there are two such rules, rather than one. All that is necessary for a proof though is that one of them succeed. They will be tried in the order in which they occur. The first will fail because there are no assertions about the predicate cat in the program. The second will eventually lead to success, using the rule about dogs and poodles and using the fact poodle (£1u££y). This results in the variable X being bound to £1uf fy. Now the second clause smal (X) of the initial rule must be checked. Since X is now bound to fluffy, the more specific goal, small (fluffy), must be proved. This too can be done by reasoning backward to the assertion poodle (£luf fy). The program then halts with the result apartmentpet (fluffy). Logical negation (=) cannot be represented explicitly in pure PROLOG. So, for example, itis not possible to encode directly the logical assertion We: dogtx) > reai(x) Instead, negation is represented implicitly by the lack of an assertion. This leads to the problem-solving strategy called negation as failure (Clark, 1978]. If the PROLOG progtam of Fig. 6.1 were given the goal 2 cat (fluffy).

You might also like