Logic and Language Models for Computer Science
HENRY HAMBURGER
George Mason University

DANA RICHARDS
George Mason University

PRENTICE HALL
Upper Saddle River, New Jersey 07458
Library of Congress Cataloging-in-Publication Data
CIP data on file.

Vice President and Editorial Director, ECS: Marcia Horton
Senior Acquisitions Editor: Petra J. Recter
Vice President and Director of Production and Manufacturing, ESM: David W. Riccardi
Executive Managing Editor: Vince O'Brien
Managing Editor: David A. George
Production Editor: Lakshmi Balasubramanian
Composition: PreTeX, Inc.
Director of Creative Services: Paul Belfanti
Creative Director: Carole Anson
Art Director: Jayne Conte
Art Editor: Greg Dulles
Cover Designer: Bruce Kenselaar
Manufacturing Manager: Trudy Pisciotti
Manufacturing Buyer: Lisa McDowell
Marketing Manager: Jennie Burger

© 2002 by Prentice Hall
Prentice-Hall, Inc.
Upper Saddle River, NJ 07458

All rights reserved. No part of this book may be reproduced in any form or by any means, without permission in writing from the publisher.

The author and publisher of this book have used their best efforts in preparing this book. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. The author and publisher make no warranty of any kind, expressed or implied, with regard to these programs or the documentation contained in this book.

Printed in the United States of America
10 9 8 7 6 5 4 3 2 1

ISBN 0-13-065487-6

Pearson Education Ltd., London
Pearson Education Australia Pty. Ltd., Sydney
Pearson Education Singapore, Pte. Ltd.
Pearson Education North Asia Ltd., Hong Kong
Pearson Education Canada, Inc., Toronto
Pearson Educacion de Mexico, S.A. de C.V.
Pearson Education-Japan, Tokyo
Pearson Education Malaysia, Pte. Ltd.
Pearson Education, Upper Saddle River, New Jersey
To Judith, with love. -H.H.
To Nelda, for her patience and support. -D.R.
Contents

Preface

1 Mathematical Preliminaries
  1.1 Operators and Their Algebraic Properties
  1.2 Sets
  1.3 Strings
  1.4 Relations and Functions
  1.5 Growth Rates of Functions
  1.6 Graphs and Trees
  1.7 Computing with Mathematical Objects

I Logic for Computer Science

2 Propositional Logic
  2.1 Propositions
  2.2 States, Operators, and Truth Tables
  2.3 Proofs of Equivalence with Truth Tables
  2.4 Laws of Propositional Logic
  2.5 Two Important Operators

3 Proving Things: Why and How
  3.1 Reasons for Wanting to Prove Things
  3.2 Rules of Inference
  3.3 Proof by Rules
  3.4 Assumptions
  3.5 Proof Examples
  3.6 Types of Theorems and Proof Strategies

4 Predicate Logic
  4.1 Predicates and Functions
  4.2 Predicates, English, and Sets
  4.3 Quantifiers
  4.4 Multiple Quantifiers
  4.5 Logic for Data Structures

5 Proving with Predicates
  5.1 Inference Rules with Predicates
  5.2 Proof Strategies with Predicates
  5.3 Applying Logic to Mathematics
  5.4 Mathematical Induction
  5.5 Limits of Logic

6 Program Verification
  6.1 The Idea of Verification
  6.2 Definitions
  6.3 Inference Rules
  6.4 Loop Invariants
  6.5 The Debate About Formal Verification

7 Logic Programming
  7.1 The Essence of Prolog and Its Relation to Logic
  7.2 Getting Started Using Prolog
  7.3 Database Operations in Prolog
  7.4 The General Form and a Limitation of Prolog
  7.5 How Prolog Works
  7.6 Structures
  7.7 Lists and Recursion
  7.8 Built-In Predicates and Operators

II Language Models for Computer Science

8 Language and Models
  8.1 Programming Languages and Computer Science
  8.2 Ambiguity and Language Design
  8.3 Formal Languages
  8.4 Operations on Languages
  8.5 Two Levels and Two Language Classes
  8.6 The Questions of Formal Language Theory

9 Finite Automata and Their Languages
  9.1 Automata: The General Idea
  9.2 Diagrams and Recognition
  9.3 Formal Notation for Finite Automata
  9.4 Finite Automata in Prolog
  9.5 Nondeterminism: The General Idea
  9.6 Nondeterministic Finite Automata
  9.7 Removing Nondeterminism
  9.8 Λ-Transitions
  9.9 Pattern Matching
  9.10 Regular Languages

10 Regular Expressions
  10.1 Regular Sets
  10.2 Regular Expressions and What They Represent
  10.3 All Regular Sets Are FA Languages
  10.4 All FA Languages Are Represented by REs

11 Lex: A Tool for Building Lexical Scanners
  11.1 Overview
  11.2 Lex Operators and What They Do
  11.3 The Structure and Processing of Lex Programs
  11.4 Lex Examples with C
  11.5 States
  11.6 Using Lex in Unix
  11.7 Flex and C++

12 Context-Free Grammars
  12.1 Limitations of Regular Languages
  12.2 Introduction to Context-Free Grammars
    12.2.1 The Four Parts of a CFG
    12.2.2 Deriving the Strings
    12.2.3 Balanced Parentheses
  12.3 RE Operators in CFGs
  12.4 Structure, Meaning, and Ambiguity
    12.4.1 Modeling Structure for Meaning
    12.4.2 A Grammar of Algebraic Expressions
  12.5 Backus Normal Form and Syntax Diagrams
    12.5.1 Backus Normal Form
    12.5.2 Syntax Diagrams
  12.6 Theory Matters
    12.6.1 Practicing Proving with Palindromes
    12.6.2 Proving that G10 Does the Right Thing

13 Pushdown Automata and Parsing
  13.1 Visualizing PDAs
  13.2 Standard Notation for PDAs
  13.3 NPDAs for CFG Parsing Strategies
  13.4 Deterministic Pushdown Automata and Parsing
  13.5 Bottom-Up Parsing
    13.5.1 Comparison and Properties
    13.5.2 Design and Construction of the NPDA
  13.6 Pushdown Automata in Prolog
  13.7 Notes on Memory

14 Turing Machines
  14.1 Beyond Context-Free Languages
  14.2 A Limitation on Deterministic Pushdown Automata
  14.3 Unrestricted Grammars
  14.4 The Turing Machine Model
  14.5 Infinite Sets
  14.6 Universal Turing Machines
  14.7 Limits on Turing Machines
  14.8 Undecidability
  14.9 Church-Turing Thesis
  14.10 Computational Complexity

Index
Preface

So you are a computer science (CS) major and you are sitting down to see what this book is about. It has been assigned, the course is required, you have no choice. Still you chose your institution, your major. Maybe your instructor made a good choice. Let's hope so.

Okay, you are not a computer science major, perhaps not even a student, but you have picked up this book. Maybe the title intrigued you. Will you be able to read it, to learn from it? We think so. We will try to interest you too.

Or you are teaching a course that might use this book, maybe in discrete math, maybe including logics or formal language or both. If you want your CS students to see the applicability of mathematical reasoning to their own field or your math students to see the usefulness of their field outside itself, it is your students whom we have in mind.

If you are a CS major, you have already noticed that this course is different from the others you have taken so far. It is not an introduction to computing, programming, problem solving, or data structures. No, this book is about something called models: models of language and knowledge. It is also about formal methods.

You know something about models if you have built or seen a model airplane. In Kitty Hawk, North Carolina, you can see the wind tunnel that the Wright brothers built to test the lift capabilities of various wing shapes. A model can help us simplify and think more clearly about a complex problem (powered flight) by selecting a part (the wing) and focusing on some aspect of it (its aerodynamics). The other, temporarily ignored parts and aspects must ultimately be addressed, of course, if the original problem is to be solved.

The models in this book are simplifications too, but not of material objects like airplanes. For computer scientists, the objects of study lie mainly in the world of symbols.
In this book, it is computer software, and especially the programming languages in which that software is written, from which we draw our models and to which we apply them.

A model, then, is a collection of precisely stated interacting ideas that focus on a particular aspect or part of our subject matter. A good model can simplify a topic to its essence, stripping away the details so that we can understand the topic better and reason precisely about it. The model keeps only those parts and processes that are of interest.

We reason both formally and informally. Informal methods draw on analogies to your knowledge of other things in the world in general and your common sense, typically expressed in a human language like English and perhaps a diagram. Formal methods use abstract symbols, like the famous "x" of high school algebra, and clearly stated rules about how to manipulate them. A formal method based on a simple but precise model of a situation can enable us to prove that we have got things right, at least as reflected in the model.

If this concern with precision and proof makes you think this is a theory book, you are partly right. If you think that means it is not of practical value, we ask you to think again. It is often said that experience is the best teacher. However, learning from experience means transferring ideas across situations by seeing the essential similarities in nonidentical situations. This abstracted essence, by which we learn from history or from our mistakes, is an informal model. Formalizing the model and reasoning carefully about it (that is, theory) is the scientist's and engineer's path to knowledge and action in the real world.

So what do we theorize about? We have chosen to focus on language, the crucial link between hardware and software. Programming languages permit software to be written, and language processors (compilers, interpreters, and assemblers) permit hardware to run that software. Sometimes a model proves to be so interesting and widely applicable that it becomes an object of study in its own right. That is the case with the logic and language models in this book.

Two key aspects of language are structure and meaning. We study models of each. The structure of language has to do with the arrangement of symbols into permitted sequences, called sentences in human language and statements in programming languages. This topic is usually called formal models of language. It underlies key aspects of compilers, the study of what computers can do efficiently, and the processing of human language for translation and easy interaction between people and computers.

Symbol arrangements are of interest not only in their own right, but also because they express ideas about meaning and computation. Expressing meaning can be done in various ways, including logic. Of the many logics, the simplest is propositional logic. It finds application in the tiny hardware components called logic gates, in the conditions for branching and loops in high-level programming languages, and
in mathematical rules of proof that can be applied via software throughout engineering and science. Predicate logic builds on propositional logic to permit knowledge representation in database systems, artificial intelligence, and work on program correctness in software engineering.

Computer science students may notice that several phrases in the prior paragraphs are the names of upper division courses in computer science. To further emphasize the practical value of the two major topics of this book, we introduce an important programming language based on each. Lex, based on formal language, is a tool for building a lexical scanner, a key component of a compiler. Prolog, a programming language whose core is based on predicate logic, supports rapid prototyping of intelligent systems.

Formalisms for language and logic have ancient roots: India for language and Greece for logic. Each field has enjoyed renewed attention in more recent times, starting in the nineteenth century for logic and early in the twentieth century for language. These latter thrusts are more formal yet still independent of computing. The venerable histories of logic and linguistics suggest the inherent fascination that each has held for human minds. Building on that motivation, this book stresses the relationship of each to computer science. The two fields are also related to each other in various ways that emerge in this text. Watch for these important links among logic, formal language, and computing.

• Complementarity: Logic and formal language share the job of modeling, with logic providing models of meaning and formal language paying attention to form.

• Recursion: In logic, formal language, and elsewhere, recursive definitions provide a finite means to specify expressions of unlimited size.

• Proofs: Logic supports proofs of results throughout formal language, mathematics, and computer science, notably in the area of program verification.

• Correspondences: Language categories defined by grammar types are in direct correspondence to the recognition capabilities of types of automata (models of computing).

• Compiling: Design strategies for some (pushdown) automata reflect language processing techniques for compilers. Concepts of formal languages and automata directly support compiler tools.

• Computation: Another class of automata (Turing machines) provides an apparently correct characterization of the limits of computing.

• Programming: Logic-based languages such as Prolog support the declarative style of programming. Prolog in turn is used to implement some automata and database concepts.

H. HAMBURGER
D. RICHARDS
Chapter 1

Mathematical Preliminaries

This text is concerned with formal models that are important to the field of computer science. Because the models are formal, we make substantial use of mathematical ideas. In many ways, the topics in this book (logic, languages, and automata) are a natural extension of a Discrete Mathematics course, which is generally required for computer science (CS) majors. This text steers clear of excessive mathematical notation, focusing instead on fundamental ideas and their application. However, it is impossible to appreciate the power that comes from the rigorous methods and models in this book without some background in discrete mathematics. This chapter is a brief overview of the needed mathematical background and may be useful for self-evaluation, review, and reference.

1.1 Operators and Their Algebraic Properties

Operators are crucial to all of mathematics, starting with the first one we learn in childhood: the addition operator of ordinary arithmetic. The things that an operator operates on are called its operands. Each operand of an operator must come from some domain. For present purposes, we assume that both operands of addition are from the domain of real numbers, which includes things like -273, π, .406, and √5. The real numbers are closed under addition, because the result of adding two of them is also a real number; roughly speaking, "closed" means staying within a domain.

In the case of addition, the order of the operands does not affect the result. For example, 2 + 3 and 3 + 2 are both 5. More generally, x + y = y + x for any x and y. Since that is the case, the operator is commutative. Multiplication is also commutative, but subtraction and division are not. Being commutative, or the
property of commutativity, is one of several properties of operators that are of interest to us.

Another key property of addition is associativity. Like commutativity, it can be expressed by an equality. To say that addition is associative (which it is) is the same as saying that (x + y) + z = x + (y + z) for any x, y, and z. The identity element for addition is 0 (zero) since, whenever it is one of the operands, the result is the other operand: x + 0 = x for any x. Every real number x has an inverse, -x, such that the two of them add up to the identity: x + (-x) = 0.

Multiplication of reals is commutative and associative. It has an identity element, 1, and for each element except 0 there is an inverse. The multiplication operator is often left invisible, as in xy, the product of x and y. Here the operator has been expressed by simply writing the operands next to each other.

A property of the interaction of addition and multiplication is distributivity, the fact that multiplication distributes over addition. This fact is written x(y + z) = xy + xz, for all x, y, and z. Addition does not distribute over multiplication, however, since it is not true in general that x + yz = (x + y)(x + z).

Equality, less than ("<"), and greater than (">") are known as relational operators. The pairwise combinations of them are also relational operators: inequality ("≠"), less than or equal ("≤"), and greater than or equal ("≥"). An important property of all these operators except "≠" is transitivity. For example, to say that less than is transitive means that if x < y and y < z are both true, then x < z must also be true.

Operators apply not only to numbers, but to other categories as well. An excellent example occurs in the next section, where sets and their operators are introduced. We find that not only are the two key operators, union and intersection, both commutative and associative, but also each distributes over the other.
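The properties described in this section can be spot-checked mechanically. The following is a minimal Python sketch under stated assumptions: the helper names (is_commutative, is_associative, distributes_over, is_transitive) and the sample values are ours, chosen for illustration, and exact fractions stand in for real numbers to avoid rounding error. Checking a finite sample can refute a property by finding a counterexample, but it cannot prove that the property holds for all values.

```python
import operator
from fractions import Fraction
from itertools import product

# Hypothetical sample values; exact rationals stand in for real numbers.
SAMPLES = [Fraction(-273), Fraction(0), Fraction(1), Fraction(203, 500), Fraction(5, 2)]
# Small sets, used to try union (|) and intersection (&).
SET_SAMPLES = [frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({2, 3})]


def is_commutative(op, values):
    """True if op(x, y) == op(y, x) for every pair of sample values."""
    return all(op(x, y) == op(y, x) for x, y in product(values, repeat=2))


def is_associative(op, values):
    """True if op(op(x, y), z) == op(x, op(y, z)) for every sample triple."""
    return all(
        op(op(x, y), z) == op(x, op(y, z))
        for x, y, z in product(values, repeat=3)
    )


def distributes_over(outer, inner, values):
    """True if outer(x, inner(y, z)) == inner(outer(x, y), outer(x, z)) on all triples."""
    return all(
        outer(x, inner(y, z)) == inner(outer(x, y), outer(x, z))
        for x, y, z in product(values, repeat=3)
    )


def is_transitive(rel, values):
    """True if rel(x, y) and rel(y, z) together always imply rel(x, z)."""
    return all(
        rel(x, z)
        for x, y, z in product(values, repeat=3)
        if rel(x, y) and rel(y, z)
    )


if __name__ == "__main__":
    print("+ commutative:", is_commutative(operator.add, SAMPLES))
    print("- commutative:", is_commutative(operator.sub, SAMPLES))
    print("* over +:", distributes_over(operator.mul, operator.add, SAMPLES))
    print("+ over *:", distributes_over(operator.add, operator.mul, SAMPLES))
    print("< transitive:", is_transitive(operator.lt, SAMPLES))
    # Inequality fails: 0 != 1 and 1 != 0, yet 0 == 0.
    print("!= transitive:", is_transitive(operator.ne, SAMPLES))
    print("union over intersection:", distributes_over(operator.or_, operator.and_, SET_SAMPLES))
```

On these samples the checks agree with the text: addition and multiplication pass the commutativity and associativity tests, subtraction fails both, multiplication distributes over addition but not the reverse, and union and intersection each distribute over the other.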
In later chapters, we see that discussions of operators and their algebraic properties are highly significant for the principal topics of this book: logic and formal languages.

1.2 Sets

A set is a collection of distinct elements. The elements are typically thought of as objects such as integers, people, books, or classrooms, and they are written within braces like this: {Friday, Saturday, Sunday}. When working with sets, it can be important to specify the universe, U, of elements (e.g., the set of days of the week) from which the elements of particular sets are drawn. Note that the universe is a set: the set of all elements of a given type. Sometimes the universe is only tacitly specified, when the reader can easily figure out what it is. The elements are said to
be in the set and may also be called its members.

Sets can be presented in two forms. The extensional form enumerates the elements of the set, whereas the intensional form specifies the properties of the elements. For example:

S = {11, 12, 13, 14}
S = {x | x is an integer, and 10 < x < 15}

are extensional and intensional forms of the same set. The second of these is read "those x such that x is an integer greater than 10 and less than 15." Note that the universe, the set of integers, is tacit in the first example and only informally specified in the second. The empty set is a set with no elements and is denoted ∅.

Because the elements of a set are distinct, you should write sets with no repetition. For example, suppose a student database includes countries of origin and shows the participants in a seminar as being from China, France, China, Egypt, and France. Then the set of countries represented in this class is {China, France, Egypt}. Further, there is no concept of ordering within a set; there is no "first" element, and so on. For example, the sets {4, 2, 3} and {2, 3, 4} are the same set; it does not matter which form is used.

If ordering is important, then one speaks of a sequence of elements. In the extensional form of a sequence, the elements appear in order, within parentheses, not braces. For example, the sequence (4, 2, 3) is different from (2, 3, 4). Further, sequences need not have distinct elements, so the sequence (2, 3, 3, 4) is different from (2, 3, 4). Sequences are often implemented as one-dimensional arrays or as linked lists. A sequence of length 2 is called an ordered pair. A sequence of length 3, 4, or 5 is called a triple, quadruple, or quintuple, respectively; in the general case of length n, the word is n-tuple.

Set operators let us talk succinctly about sets. We begin with notions of membership and comparison. The notation x ∈ S means that x is an element of the set S, whereas x ∉ S means that x is not in S.
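The distinctions drawn so far in this section (extensional versus intensional form, no repetition, no ordering, and sets versus sequences) can be tried out directly. The following is an illustrative Python sketch; the variable names are ours, and a finite range is assumed as a stand-in for the universe of integers, which a program cannot enumerate in full.

```python
# Extensional form: enumerate the elements outright.
S_extensional = {11, 12, 13, 14}

# Intensional form: state the property the elements must satisfy.
# Assumption: a finite range stands in for the universe of all integers.
universe = range(-100, 101)
S_intensional = {x for x in universe if 10 < x < 15}

# Both forms describe the same set.
print(S_extensional == S_intensional)

# Repetition collapses: a set holds each element only once.
countries = {"China", "France", "China", "Egypt", "France"}
print(len(countries))

# Order is irrelevant for sets ...
print({4, 2, 3} == {2, 3, 4})

# ... but it matters for sequences, modeled here as tuples.
seq_a = (4, 2, 3)
seq_b = (2, 3, 4)
print(seq_a == seq_b)

# Sequences may also repeat elements, so these two differ.
print((2, 3, 3, 4) == (2, 3, 4))

# Membership: Python's "in" and "not in" play the roles of ∈ and ∉.
print(12 in S_extensional, 16 not in S_extensional)
```

Python's built-in set and tuple types happen to match the two notions closely, which is why they are used here; an array or linked list would serve equally well for the sequences.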
With S = {11, 12, 13, 14} as in the prior example, 12 ∈ S and 16 ∉ S. We say that S₁ is a subset of S₂, written S₁ ⊆ S₂, if each element of S₁ is also an element of S₂. For example, {12, 14} ⊆ {11, 12, 13, 14}. Since (of course) a set contains all of its own elements, it is correct to write S ⊆ S. Now consider subset T, which is not equal to S because it is missing one or more elements of S. Although it is correct to write T ⊆ S, we may choose to write T ⊂ S, which states that T is a proper subset of S. For contrast, one may refer to any set as an improper subset of itself.

Two sets are equal, S₁ = S₂, if (and only if) they contain exactly the same elements. It is important to observe that S₁ = S₂ exactly when S₁ ⊆ S₂ and S₂ ⊆ S₁. To show that S₁ = S₂ one needs to argue that both S₁ ⊆ S₂ and S₂ ⊆ S₁.
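The definitions of subset, proper subset, and equality translate almost word for word into code. The following is a minimal Python sketch; the function names is_subset, is_proper_subset, and sets_equal are ours, introduced for illustration. Equality is deliberately written as inclusion in both directions, mirroring the proof strategy just described.

```python
def is_subset(s1, s2):
    """S1 ⊆ S2: each element of S1 is also an element of S2."""
    return all(x in s2 for x in s1)


def is_proper_subset(t, s):
    """T ⊂ S: T is a subset of S, but S has some element that T lacks."""
    return is_subset(t, s) and not is_subset(s, t)


def sets_equal(s1, s2):
    """S1 = S2 exactly when S1 ⊆ S2 and S2 ⊆ S1."""
    return is_subset(s1, s2) and is_subset(s2, s1)


S = {11, 12, 13, 14}
T = {12, 14}

print(is_subset(T, S))          # T ⊆ S
print(is_subset(S, S))          # every set is an (improper) subset of itself
print(is_proper_subset(T, S))   # T ⊂ S
print(is_proper_subset(S, S))   # no set is a proper subset of itself
print(sets_equal({4, 2, 3}, {2, 3, 4}))
```

Python's own set operators express the same comparisons: `T <= S` for subset, `T < S` for proper subset, and `==` for equality. The explicit functions are spelled out only to show how each definition reduces to membership tests.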