Machines and languages theory Lecture 1
Machines and languages theory Instructor: Fatemeh Daneshfar E-mail: f.daneshfar@uok.ac.ir TA:? Text: An Introduction to Formal Languages and Automata, (5th ed.) - Linz (Jones & Bartlett) Webpage: http://eng.uok.ac.ir/daneshfar
Machines and languages theory: Requirements: 30% - Exam 1 (21st of Aban); 30% - Exam 2 (26th of Azar); 40% - Final Exam (two hours, cumulative); 5% - Presentation. Students may not do extra credit work to make up for missing exams or poor exam grades.
Course overview: The course discusses two very closely related concepts: models of computing and languages. The computing models are simplified as much as possible, so as to boil them down to their most important elements. Some of the models will be more powerful than others. Languages are effectively sets of strings built by particular rules, called a grammar. The more complex the grammar, the more complicated the language can be.
Models of computing: Finite state machines (automata): pattern recognition, simple circuits (e.g. elevators, sliding doors). Automata with stack memory (pushdown automata): parsing computer languages. Automata with limited tape memory. Automata with infinite tape memory, called Turing machines: the most powerful model possible, capable of solving anything that is solvable.
Chomsky hierarchy of grammars: Regular grammars; Context-free grammars; Context-sensitive grammars; Unrestricted grammars. We'll define what these mean later, but the important point is that the grammars become more complex as we go down the list, and each class contains those above it.
Computers recognize languages! Computers can be made to recognize, or accept, the strings of a language. There is a correspondence between the power of the computing model and the complexity of the languages that it can recognize! Finite automata accept only the languages generated by regular grammars. Pushdown automata can also accept the languages generated by context-free grammars. Turing machines can accept the languages generated by all grammars. This is why we study both in this course!
Syllabus
Week  Section  Topic
1     1.1      Mathematical Preliminaries
2     1.2      Basic Concepts
      2.1      Deterministic Finite Automata (dfa)
3     2.2      Nondeterministic Finite Automata (ndfa)
      2.3      Equivalence of dfa's and ndfa's
4     3.1      Regular Expressions (re)
5     3.2      Connection between re's and Regular Languages
      3.3      Regular Grammars
6     4.1      Closure Properties of Regular Languages
      4.2      Elementary Questions about Regular Languages
7     4.3      Identifying Nonregular Languages
8     5.1      Context-free Grammars (cfg)
      5.2      Parsing and Ambiguity
9     6.2      Chomsky and Greibach Normal Forms for cfg's
      7.1      Nondeterministic Pushdown Automata (pda)
      7.2      Pda's and Context-free Grammars
10    8.1      Pumping Lemma for Context-free Languages
      8.2      Closure Properties of Context-free Languages
11    9.1      Turing Machines
      9.2      More on Turing Machines
12    9.3      Turing's Thesis
      11.1     Recursive and Recursively Enumerable Languages
13    11.4     The Chomsky Hierarchy
      12.1     Unsolvable Problems
14    12.2     Undecidable Problems for Rec. Enum. Languages
15             Final Exam
Chapter 1 Introduction to the Theory of Computing
Introduction to the Theory of Computing: Mathematical Preliminaries and Notation: Sets; Functions and Relations; Graphs and Trees; Proof Techniques.
Mathematical Preliminaries and Notation: Sets: notation: {1, 2, 3, 4, 5}, {1, 2, ..., 5}, {x | 1 ≤ x ≤ 5}; membership: 3 ∈ S, 7 ∉ S, an apple ∉ S.
Mathematical Preliminaries and Notation: Sets: union and intersection: know what ∪ (union) and ∩ (intersection) mean; universal set: understand that the universal set is merely ALL the things that are under discussion (implicit vs. explicit).
Mathematical Preliminaries and Notation: Sets: complement: The complement of S, written as S' (S with a bar over it in the book), is all the elements of the universe not in S; difference: The difference between S and T is S − T = S ∩ T'.
Mathematical Preliminaries and Notation: empty (null) set: The empty set ∅ is the set containing no elements.
Mathematical Preliminaries and Notation: De Morgan's Laws: (S' ∩ T')' = S ∪ T; (S' ∪ T')' = S ∩ T.
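The set identities above can be checked on a small example. The following is a minimal sketch using Python's built-in sets; the universe U and the sets S and T are hypothetical values chosen only for illustration, and a check on one example is of course not a proof.

```python
# Hypothetical universe and sets, chosen only for illustration.
U = set(range(10))
S = {1, 2, 3, 4, 5}
T = {4, 5, 6, 7}

def complement(X):
    """Complement of X relative to the universal set U."""
    return U - X

# De Morgan's laws: (S' ∩ T')' = S ∪ T and (S' ∪ T')' = S ∩ T.
law1 = complement(complement(S) & complement(T)) == S | T
law2 = complement(complement(S) | complement(T)) == S & T

# Difference expressed through the complement: S − T = S ∩ T'.
difference_ok = (S - T) == (S & complement(T))

print(law1, law2, difference_ok)
```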
Mathematical Preliminaries and Notation: subset: S ⊆ T iff every element in S is also in T; S ⊂ T (S is a proper subset of T) if S ⊆ T and there is something in T that is not in S; disjoint: Two sets are disjoint if their intersection is empty, i.e. they have no elements in common.
Mathematical Preliminaries and Notation: infinite vs. finite sets: A set is infinite if its elements cannot all be listed in a finite list. There are two classes of infinite sets: enumerable (countable) and not enumerable (uncountable). This will become very important in this class.
Mathematical Preliminaries and Notation: powerset: The powerset of a set S (written as 2^S) is the set of all the subsets of S. There's a reason for the notation: if S is finite with |S| elements, then 2^S has 2^|S| elements.
Mathematical Preliminaries and Notation: Cartesian product: The Cartesian product of two sets S and T, written as S × T, is the set of all the ordered pairs (s, t) created by choosing one element s of S and one element t of T.
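For finite sets, both constructions can be enumerated directly. This is a small sketch using the standard itertools module; the helper name powerset and the example sets are assumptions made for illustration.

```python
from itertools import chain, combinations, product

def powerset(s):
    """Return all subsets of s as a list of sets."""
    items = list(s)
    all_combos = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [set(c) for c in all_combos]

# Hypothetical example sets.
S = {"a", "b", "c"}
T = {0, 1}

subsets = powerset(S)        # the powerset of S
pairs = set(product(S, T))   # the Cartesian product of S and T

print(len(subsets), len(pairs))
```

The count of subsets doubles with every extra element of S, which is one way to read the powerset notation.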
Mathematical Preliminaries and Notation: Functions: A function f : S → T is a mapping of elements of S to unique elements of T; domain, range.
Mathematical Preliminaries and Notation: Relations: A relation between S and T is a set of ordered pairs (s, t) taken from these sets. A relation is a subset of the Cartesian product of S and T. A function is a special kind of relation. Reflexive, symmetric, transitive: a relation that is reflexive, symmetric and transitive is called an equivalence relation and partitions the underlying set.
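The three properties, and the partition they induce, can be checked mechanically on a finite relation. The sketch below uses congruence modulo 3 on {0, ..., 5} as a hypothetical example; the function names are illustrative, not taken from the text.

```python
# Hypothetical example: x R y iff x and y leave the same remainder mod 3.
S = range(6)
R = {(x, y) for x in S for y in S if (x - y) % 3 == 0}

def is_reflexive(R, S):
    return all((x, x) in R for x in S)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

def equivalence_classes(R, S):
    """Group each element with everything related to it."""
    return {frozenset(y for y in S if (x, y) in R) for x in S}

print(is_reflexive(R, S), is_symmetric(R), is_transitive(R))
print(equivalence_classes(R, S))
```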
Mathematical Preliminaries and Notation: Graphs and Trees: Read the section and learn the notation.
Mathematical Preliminaries and Notation: Proof Techniques: proof by contradiction; proof by mathematical induction.
Mathematical Preliminaries and Notation: Mathematical induction: Prove that the sum of all integers from 1 to n is n(n + 1) / 2.
Mathematical Preliminaries and Notation: Prove that the sum of all integers from 1 to n is n(n + 1) / 2 for all n ≥ 1. Basis: Consider n = 1. The sum of all integers from 1 to 1 is 1, and (1)(2)/2 = 1 also, so the claim holds for this case. Hypothesis: Assume that the sum of all integers from 1 to n is n(n + 1) / 2.
Mathematical Preliminaries and Notation: Prove that the sum of all integers from 1 to n is n(n + 1) / 2 for all n ≥ 1. Induction: We will show that it is true for n + 1, where n is the number from the hypothesis. The sum of the integers from 1 to n + 1 is the same as the sum of the integers from 1 to n, plus n + 1. From the hypothesis, this is n(n + 1)/2 + (n + 1). A little algebra shows us that this is (n(n + 1) + 2(n + 1))/2 = (n + 1)(n + 2)/2. Done!
Introduction to the Theory of Computing: Three Basic Concepts: Languages: An alphabet Σ is a finite set of symbols, e.g. Σ = {a, b}. A string or word is any finite sequence of symbols from the alphabet, e.g. w = abaaa. λ: the empty string. Σ*: the set of all strings on Σ (Σ+ = Σ* − {λ}). A language is any set of words (a subset L of Σ*). Sentence: a string in L.
Languages: A language is a set of strings. If Σ is an alphabet, then a language over Σ is a collection of strings whose components come from Σ. So Σ* is the biggest possible language over Σ, and every other language over Σ is a subset of Σ*.
Examples of languages: Four simple examples of languages over an alphabet Σ are the sets ∅, {λ}, Σ, and Σ*. For example, if Σ = {a} then these four simple languages over Σ are ∅, {λ}, {a}, and {λ, a, aa, aaa, ...}. Recall that λ is the empty string while ∅ is the empty set. Σ* is an infinite set.
Example: English: The alphabet is A = {a, b, c, d, e, ..., x, y, z}. The English language is made of strings formed from A, e.g. fun, excitement. We could define the English language as the set of strings over A which appear in the Oxford English Dictionary (but this is clearly not the only possible definition).
Other Examples: Σ = {a, b}; Σ* = {λ, a, b, aa, ab, ba, bb, aaa, ...}; L1 = {a, aa, aab} (finite language); L2 = {a^n b^n | n ≥ 0} = {λ, ab, aabb, ...}.
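Since Σ* is infinite, a program can only list it up to some length bound. The following sketch enumerates the strings over {a, b} up to a chosen length and filters those in L2 = {a^n b^n | n ≥ 0}; the function names and the bound 4 are assumptions made for illustration, and the empty Python string stands for λ.

```python
from itertools import product

def strings_up_to(sigma, max_len):
    """List the strings over sigma with length <= max_len, shortest first."""
    result = []
    for n in range(max_len + 1):
        for letters in product(sorted(sigma), repeat=n):
            result.append("".join(letters))
    return result

def in_L2(w):
    """Membership test for L2 = {a^n b^n | n >= 0}."""
    n = len(w) // 2
    return w == "a" * n + "b" * n

words = strings_up_to({"a", "b"}, 4)
members = [w for w in words if in_L2(w)]
print(members)
```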
Concatenation: The natural operation of concatenation of strings places two strings in juxtaposition. For example, if Σ = {a, b}, then the concatenation of the two strings aab and ba is the string aabba. Use the name "cat" to denote this operation: cat(aab, ba) = aabba.
Combining Languages: We can also combine two languages L and M by forming the set of all concatenations of strings in L with strings in M.
Products of languages: This new language is called the product of L and M and is denoted by L · M. A formal definition can be given as follows: L · M = {cat(s, t) | s ∈ L and t ∈ M}, or equivalently L1 L2 = {xy | x ∈ L1, y ∈ L2}. For example, if L = {ab, ac} and M = {a, bc, abc}, then the product L · M is the language L · M = {aba, abbc, ababc, aca, acbc, acabc}.
Properties of products: The following simple properties hold for any language L: L · {λ} = {λ} · L = L, and L · ∅ = ∅ · L = ∅. The product is not commutative. In other words, we can find two languages L and M such that L · M ≠ M · L. The product is associative. In other words, if L, M, and N are languages, then L · (M · N) = (L · M) · N.
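For finite languages the product is a one-line set comprehension. The sketch below reproduces the example with L = {ab, ac} and M = {a, bc, abc} and spot-checks the listed properties; the function name language_product and the extra language N are hypothetical, and λ is represented by the empty Python string.

```python
def language_product(L, M):
    """Product of two finite languages: every s in L followed by every t in M."""
    return {s + t for s in L for t in M}

L = {"ab", "ac"}
M = {"a", "bc", "abc"}
N = {"b", "ca"}          # hypothetical third language for the associativity check
LAMBDA = ""              # the empty string

LM = language_product(L, M)
print(sorted(LM))

# Identity, annihilator, non-commutativity, associativity (on these examples only).
identity_ok = language_product(L, {LAMBDA}) == L == language_product({LAMBDA}, L)
empty_ok = language_product(L, set()) == set() == language_product(set(), L)
commutes = LM == language_product(M, L)
assoc_ok = language_product(L, language_product(M, N)) == language_product(LM, N)
```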
Powers of languages: If L is a language, then the product L · L is denoted by L^2. The language product L^n for every n ∈ {0, 1, 2, ...} is defined as follows: L^0 = {λ}; L^n = L · L^(n-1) if n > 0.
Example: For example, if L = {a, bb} then the first few powers of L are L^0 = {λ}; L^1 = L = {a, bb}; L^2 = L · L = {aa, abb, bba, bbbb}; L^3 = L · L^2 = {aaa, aabb, abba, abbbb, bbaa, bbabb, bbbba, bbbbbb}.
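The recursive definition of L^n translates directly into a loop. This is a minimal sketch for finite languages, with λ represented by the empty string; the function name power is an assumption made for illustration.

```python
def power(L, n):
    """Compute L^n using L^0 = {λ} and L^n = L · L^(n-1)."""
    result = {""}                      # L^0 contains only the empty string
    for _ in range(n):
        result = {s + t for s in L for t in result}
    return result

L = {"a", "bb"}
for n in range(4):
    print(n, sorted(power(L, n)))
```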
Languages: Example 2: L = {a^n b^n | n ≥ 0}; L^2 = {a^n b^n a^m b^m | n ≥ 0, m ≥ 0}.
Closure of a language: If L is a language over Σ (i.e. L ⊆ Σ*) then the closure of L is the language denoted by L* and is defined as follows: L* = L^0 ∪ L^1 ∪ L^2 ∪ ... The positive closure of L is the language denoted by L+ and defined as follows: L+ = L^1 ∪ L^2 ∪ L^3 ∪ ...
L* vs. L+: It follows that L* = L+ ∪ {λ}. But it's not necessarily true that L+ = L* − {λ}. For example, if we let our alphabet be Σ = {a} and our language be L = {λ, a}, then L+ = L*.
Properties of Closure: Let L and M be languages over the alphabet Σ. Then: a) {λ}* = ∅* = {λ}; b) L* = L* · L* = (L*)*; c) λ ∈ L if and only if L+ = L*; d) (L* · M*)* = (L* ∪ M*)* = (L ∪ M)*; e) L · (M · L)* = (L · M)* · L.
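Because L* is usually infinite, the comparison of L* and L+ can only be checked up to a length bound. The sketch below collects the strings of L^0 ∪ L^1 ∪ ... (or L^1 ∪ L^2 ∪ ... for the positive closure) that are no longer than max_len; the function name closure_up_to and the bound are assumptions of this sketch, not notation from the text.

```python
def closure_up_to(L, max_len, positive=False):
    """Strings of L* (or of L+ when positive=True) with length <= max_len."""
    result = set() if positive else {""}   # L* always contains λ via L^0
    power_k = {""}                          # starts as L^0
    while True:
        # Next power of L, keeping only strings within the length bound.
        power_k = {s + t for s in power_k for t in L if len(s + t) <= max_len}
        if power_k <= result:               # nothing new can appear any more
            break
        result |= power_k
    return result

with_lambda = {"", "a"}   # the language L = {λ, a} from the slide
without_lambda = {"a"}

print(sorted(closure_up_to(with_lambda, 3)))
print(sorted(closure_up_to(without_lambda, 3, positive=True)))
```

With λ in L the two bounded closures coincide, and without it they differ exactly by λ, which matches property c) on these examples.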
Grammars: A grammar for a natural language tells us whether a particular sentence is well-formed or not. <sentence> → <noun-phrase><predicate>; <noun-phrase> → <article><noun>; <predicate> → <verb>; <article> → a | the; <noun> → boy | dog; <verb> → runs | walks.
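Because this small grammar has no recursion, every sentence it generates can be listed. The following sketch stores the productions in a dictionary and expands the start variable; the dictionary layout and the function name expand are choices made for illustration only.

```python
from itertools import product

# Productions from the slide; keys are variables, values list the alternatives.
GRAMMAR = {
    "<sentence>": [["<noun-phrase>", "<predicate>"]],
    "<noun-phrase>": [["<article>", "<noun>"]],
    "<predicate>": [["<verb>"]],
    "<article>": [["a"], ["the"]],
    "<noun>": [["boy"], ["dog"]],
    "<verb>": [["runs"], ["walks"]],
}

def expand(symbol):
    """Return every word sequence derivable from symbol (non-recursive grammars only)."""
    if symbol not in GRAMMAR:
        return [[symbol]]              # a terminal word derives only itself
    results = []
    for alternative in GRAMMAR[symbol]:
        # Combine every expansion of each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in alternative)):
            results.append([word for part in parts for word in part])
    return results

sentences = [" ".join(words) for words in expand("<sentence>")]
print(sentences)
```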
Three Basic Concepts: Grammars: A grammar is a finite set of rules (called productions) over an alphabet and a set of variables (non-terminals) that defines the structure of the strings in a language. Rule: x → y, where x and y are strings containing symbols from the alphabet and variables from the set of variables. Start symbol: one variable is designated as special; it is called the start symbol.
Grammars: Formal grammar: G = (V, T, S, P); V: finite set of variables; T: finite set of terminal symbols; S ∈ V: start variable; P: finite set of productions.
Productions: A grammar rule x → y is often called a production, and it can be read in any of several ways as follows: "replace x by y", "x produces y", "x rewrites to y", "x reduces to y".
Grammars: Productions: x → y, with x ∈ (V ∪ T)+ and y ∈ (V ∪ T)*. If w = uxv, then w derives z = uyv, written w ⇒ z. w1 ⇒* wn means w1 ⇒ w2 ⇒ ... ⇒ wn or w1 = wn; w1 ⇒+ wn means at least one step is taken.
Other shorthand: The following three symbols with their associated meanings are used quite often in discussing derivations: ⇒ derives in one step, ⇒+ derives in one or more steps, ⇒* derives in zero or more steps.
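A single derivation step w = uxv ⇒ z = uyv amounts to replacing one occurrence of x in w by y. The sketch below returns every z reachable from w in one step; the sample productions S → aSb and S → λ are used here purely as an illustration, with λ written as the empty string and productions stored as (x, y) pairs.

```python
def derive_one_step(w, productions):
    """Return every z with w ⇒ z, i.e. w = u x v and z = u y v for some x → y."""
    results = set()
    for x, y in productions:
        start = w.find(x)
        while start != -1:
            # u = w[:start], v = w[start + len(x):]
            results.add(w[:start] + y + w[start + len(x):])
            start = w.find(x, start + 1)
    return results

# Sample productions for illustration: S → aSb and S → λ.
P = [("S", "aSb"), ("S", "")]

print(derive_one_step("S", P))
print(derive_one_step("aSb", P))
```

Repeating this step zero or more times gives ⇒*, and one or more times gives ⇒+.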
Where to begin: Every grammar has a special grammar symbol called a start symbol, and there must be at least one production with left side consisting of only the start symbol. For example, if S is the start symbol for a grammar, then there must be at least one production of the form S → y.
Grammars: Generated language: for G = (V, T, S, P), L(G) = {w ∈ T* | S ⇒* w}. Derivation: S ⇒ w1 ⇒ w2 ⇒ ... ⇒ wn ⇒ w, with w ∈ L(G). Sentential forms: S, w1, w2, ..., wn (containing variables).
Grammars: Example 3: G = ({S}, {a, b}, S, P), with P: S → aSb, S → λ. Derivation: S ⇒ aSb ⇒ aaSbb ⇒ aabb. aabb: sentence; aaSbb: sentential form.
Grammars: Example 3: G = ({S}, {a, b}, S, P), with P: S → aSb, S → λ. L(G) = {a^n b^n | n ≥ 0}. Question: which grammar G gives L(G) = {a^n b^(n+1) | n ≥ 0}?
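The claim L(G) = {a^n b^n | n ≥ 0} can be spot-checked by exploring derivations from S up to a length bound. This is a sketch for this particular grammar only: it relies on the fact that terminals are never erased here, so sentential forms with too many terminals can be discarded. The function name generate and the bound are assumptions, and λ is the empty string.

```python
from collections import deque

# Productions of Example 3: S → aSb | λ (λ is represented by the empty string).
PRODUCTIONS = [("S", "aSb"), ("S", "")]

def generate(max_len):
    """Collect every sentence of length <= max_len derivable from S."""
    sentences = set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if "S" not in form:
            sentences.add(form)        # no variables left: this is a sentence
            continue
        for lhs, rhs in PRODUCTIONS:
            new_form = form.replace(lhs, rhs, 1)
            # Terminals are never erased, so longer forms cannot shrink back.
            if len(new_form.replace("S", "")) <= max_len:
                queue.append(new_form)
    return sentences

print(sorted(generate(4), key=len))
```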
Grammars: Example 4:
Grammars: Example 5: G2 = ({S}, {a, b}, S, P2), with P2: S → SS, S → λ, S → aSb, S → bSa. L(G2) = {w | n_a(w) = n_b(w)}, where n_a(w) and n_b(w) are the numbers of a's and b's in w.
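The statement about L(G2) can be spot-checked for short strings by exploring leftmost derivations. The sketch below is a bounded search, not a proof: the pruning limits on terminals and on the number of pending S symbols are assumptions of this sketch, chosen generously for small bounds, and the function names are illustrative.

```python
# Right-hand sides of S → SS | λ | aSb | bSa (λ is the empty string).
P2 = ["SS", "", "aSb", "bSa"]

def sentences(max_len):
    """Sentences of length <= max_len found by leftmost derivations from S."""
    found, seen, stack = set(), {"S"}, ["S"]
    while stack:
        form = stack.pop()
        if "S" not in form:
            found.add(form)
            continue
        for rhs in P2:
            new_form = form.replace("S", rhs, 1)      # rewrite the leftmost S
            variables = new_form.count("S")
            terminals = len(new_form) - variables
            # Assumed pruning bounds: keep forms small enough to finish in time.
            if terminals <= max_len and variables <= max_len + 1 and new_form not in seen:
                seen.add(new_form)
                stack.append(new_form)
    return found

def balanced(w):
    """True when w has as many a's as b's."""
    return w.count("a") == w.count("b")

print(sorted(sentences(4), key=lambda w: (len(w), w)))
```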
Example: Let A = {a, b, c}. Then a grammar for the language A* can be described by the following four productions: S → λ, S → aS, S → bS, S → cS. Or in shorthand: S → λ | aS | bS | cS, read as "S can be replaced by either λ, or aS, or bS, or cS."
Sample derivation: With S → λ | aS | bS | cS, we have S ⇒ aS, then S ⇒ aS ⇒ aaS, then S ⇒ aS ⇒ aaS ⇒ aacS ⇒ aacbS, and finally S ⇒ aS ⇒ aaS ⇒ aacS ⇒ aacbS ⇒ aacbλ = aacb. A shorthand way of showing that a derivation exists: S ⇒* aacb (derives in zero or more steps).
A more complex grammar: S → AB, A → λ | aA, B → λ | bB. We can deduce that the grammar's non-terminal symbols are S, A, and B, the start symbol is S, and the language alphabet includes a and b.
Another derivation: Let's consider the string aab. The statement S ⇒+ aab means that there exists a derivation of aab that takes one or more steps. For example, we have S ⇒ AB ⇒ aAB ⇒ aaAB ⇒ aaB ⇒ aabB ⇒ aab.
Finite languages: If the language is finite, then a grammar can consist of all productions of the form S → w for each string w in the language. For example, the language {a, ab} can be described by the grammar S → a | ab.
Infinite languages: If the language is infinite, then some production or sequence of productions must be used repeatedly to construct the derivations. Notice that there is no bound on the length of strings in an infinite language. Therefore there is no bound on the number of derivation steps used to derive the strings. If the grammar has n productions, then any derivation consisting of n + 1 steps must use some production twice.
For example, the infinite language {a^n b | n ≥ 0} can be described by the grammar S → b | aS. To derive the string a^n b, use the production S → aS repeatedly (n times, to be exact) and then stop the derivation by using the production S → b. The production S → aS allows us to say "If S derives w, then it also derives aw."
Some simple grammars
Language                                   Grammar
{a, ab, abb, abbb}                         S → a | ab | abb | abbb
{λ, a, aa, aaa, ...}                       S → aS | λ
{b, bbb, bbbbb, ..., b^(2n+1), ...}        S → bbS | b
{b, abc, aabcc, ..., a^n b c^n, ...}       S → aSc | b
{ac, abc, abbc, ..., a b^n c, ...}         S → aBc, B → bB | λ
Automata: An abstract model of a digital computer: input file, control unit, storage, output.
Automata: Input file: is divided into squares. Input is a string over a given alphabet. Each input square holds a symbol. The symbols are read from left to right, one at a time. The end of the input string can be detected.
Automata: Storage: consists of an unlimited number of cells. Each cell can hold a symbol from an alphabet (which can be different from the input alphabet). The contents of the storage cells can be read and changed.
Automata: Control unit: has a finite number of internal states. Can be in any one of the internal states. Can change state in some defined manner.
Automata: Transition function: (current state, input symbol, storage info) → next state. Output may be produced. Info in the storage may be changed. Configuration: current state, input symbol, and storage info. Move: current configuration → next configuration.
Automata: General types of automata: Accepter: yes/no output. Transducer: string of symbols as output. Deterministic: each configuration allows a single move. Non-deterministic: a configuration may allow multiple moves.
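The simplest instance of this model is a deterministic accepter with no storage at all: the control unit reads the input left to right and the transition function maps (current state, input symbol) to the next state. The sketch below is one such machine, under stated assumptions: the states "even" and "odd", the alphabet {a, b}, and the accepted language (strings with an even number of a's) are hypothetical choices made for illustration.

```python
def accepts(transitions, start, accepting, w):
    """Run a deterministic accepter on w and report its yes/no answer."""
    state = start
    for symbol in w:                       # read the input one symbol at a time
        state = transitions[(state, symbol)]
    return state in accepting              # decision once the input is exhausted

# Hypothetical machine: accept strings over {a, b} with an even number of a's.
TRANSITIONS = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

for w in ["", "ab", "abab", "bbb"]:
    print(repr(w), accepts(TRANSITIONS, "even", {"even"}, w))
```

A transducer would differ by emitting a symbol on each move instead of a single yes/no answer at the end.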
Homework: Exercises: 4, 5, 6, 8, 9, 12, 15, 17 of Section 1.2 in Linz's book. Reading: Section 1.3 in Linz's book.