Decision, Computation and Language Context-free Grammar (CFG) Dr. Muhammad S Khan (mskhan@liv.ac.uk) Ashton Building, Room G22 http://www.csc.liv.ac.uk/~khan/comp218
The Chomsky Hierarchy
Languages exist which are not regular; Noam Chomsky categorised regular and other languages as follows:

Language Class  Grammar            Automaton
3               Regular            NFA or DFA
2               Context-Free       Push-Down Automaton
1               Context-Sensitive  Linear-Bounded Automaton
0               Unrestricted       Turing machine

M S Khan (Univ. of Liverpool) COMP218 Decision, Computation and Language
Type 3 - Regular Languages
A regular language is one which can be:
- represented by a regular grammar,
- described using a regular expression, or
- accepted using an NFA or a DFA.
Type 2 - Context-Free Languages
A Context-Free Grammar (CFG) is one whose production rules are of the form A → α, where A is any single non-terminal and α is any (possibly empty) combination of terminals and non-terminals.
An NFA or DFA cannot recognise every language of this type, since recognising one may require "remembering" an unbounded amount of information (for example, how many a's have been read so far).
A context-free language is accepted by a Push-Down Automaton, which is like a finite automaton except that it is also allowed to use a stack as memory.
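To make the role of the stack concrete, here is a minimal sketch of a pushdown-style recogniser for the language { a^n b^n : n ≥ 0 }. The function name `accepts_anbn` and the two state names are illustrative assumptions, not taken from the slides; the point is only that a finite control plus a stack is enough to match each b against an earlier a.

```python
def accepts_anbn(word: str) -> bool:
    """Pushdown-style recogniser for { a^n b^n : n >= 0 } (illustrative sketch)."""
    stack = []             # the extra memory that an NFA/DFA does not have
    state = "reading_a"    # finite control with two hypothetical states
    for symbol in word:
        if state == "reading_a" and symbol == "a":
            stack.append("A")      # remember one 'a'
        elif symbol == "b" and stack:
            state = "reading_b"    # from now on only 'b' is allowed
            stack.pop()            # match this 'b' against a stored 'a'
        else:
            return False           # wrong order, unknown symbol, or unmatched 'b'
    return not stack               # accept only if every 'a' was matched


print(accepts_anbn("aaabbb"))  # True
print(accepts_anbn("aab"))     # False
```

A finite automaton has no counterpart to `stack`, which is why it cannot keep track of an arbitrary number of a's.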
Type 1 - Context-Sensitive Languages
Context-Sensitive grammars may have more than one symbol on the left-hand side of their production rules (provided that at least one of them is a non-terminal). However, the production rules must obey the following:
- The number of symbols on the left-hand side must not exceed the number of symbols on the right-hand side.
- Rules of the form A → ε are not allowed, unless A is the start symbol and does not occur on the right-hand side of any rule.
Type 0 - Unrestricted (Free) Languages
Free grammars have no restrictions on their grammar rules (except that there must be at least one non-terminal on the left-hand side).
The type of automaton which can recognise such a language is, in essence, a finite automaton with an unboundedly long tape at its disposal to use as a store; this is called a Turing machine.
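The four rule formats can be checked mechanically. The sketch below is an illustration under stated assumptions: symbols are single characters, upper-case letters are non-terminals, ε is written as the empty string, and a "regular" rule is taken to be right-linear (terminals optionally followed by one non-terminal), which is one common definition and may differ from the one used later in these slides. The names `classify` and `is_nonterminal` are hypothetical.

```python
def is_nonterminal(symbol: str) -> bool:
    # Assumption: upper-case letters are non-terminals, everything else is a terminal.
    return symbol.isupper()


def classify(rules, start="S"):
    """Return the highest type number (3, 2, 1 or 0) whose rule format all rules satisfy.

    rules is a list of (lhs, rhs) strings; the empty string stands for epsilon.
    """
    for lhs, _ in rules:
        if not any(is_nonterminal(s) for s in lhs):
            raise ValueError(f"left-hand side {lhs!r} contains no non-terminal")

    single_lhs = all(len(lhs) == 1 for lhs, _ in rules)

    def right_linear(rhs):
        # Terminals only, optionally followed by exactly one non-terminal.
        body = rhs[:-1] if rhs and is_nonterminal(rhs[-1]) else rhs
        return not any(is_nonterminal(s) for s in body)

    if single_lhs and all(right_linear(rhs) for _, rhs in rules):
        return 3
    if single_lhs:
        return 2

    start_on_rhs = any(start in rhs for _, rhs in rules)

    def context_sensitive(lhs, rhs):
        if rhs == "":
            # Only the start symbol may derive epsilon, and only if it never
            # appears on a right-hand side.
            return lhs == start and not start_on_rhs
        return len(lhs) <= len(rhs)

    if all(context_sensitive(lhs, rhs) for lhs, rhs in rules):
        return 1
    return 0


print(classify([("S", "aS"), ("S", "b")]))   # 3
print(classify([("S", "aSb"), ("S", "")]))   # 2
```

The function classifies the grammar as written, not the language: a regular language can also be described by a grammar that this check reports as type 2.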
Context-free Grammars (CFG)
Grammar
Context-free Grammars
- Show how CFGs can be converted into normal forms, i.e. equivalent CFGs that satisfy additional syntactic restrictions.
- Use a normal form to show that pushdown automata are the class of machines that accept CFLs.
- Parsing is the process of checking that a sequence of symbols is generated by a context-free grammar.
- Consider classes of parsing algorithms expressible as restricted classes of pushdown automata (note: general pushdown automata are impractical to implement, unlike finite automata).
Context-free Grammars
Example
Example of using the grammar
Another Example
Example of using the grammar
Finite languages
Notation
Notation
Types of rules in CFG
There are three types of rules in a CFG:
1. Union Rule: S → A | B
2. Production Rule (concatenation): S → AB
3. Closure Rule: S → AS | ε
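The three rule shapes above can be read as recipes for building a new grammar from existing ones. The following is a minimal sketch, assuming a grammar is stored as a dictionary from each non-terminal to a list of alternatives (tuples of symbols, with the empty tuple standing for ε) and that the grammars being combined use disjoint non-terminal names. The helper names `union`, `product` and `closure` are illustrative, not from the slides.

```python
def union(rules_a, start_a, rules_b, start_b, new_start="S"):
    """Union rule: new_start -> start_a | start_b."""
    return {**rules_a, **rules_b, new_start: [(start_a,), (start_b,)]}


def product(rules_a, start_a, rules_b, start_b, new_start="S"):
    """Concatenation rule: new_start -> start_a start_b."""
    return {**rules_a, **rules_b, new_start: [(start_a, start_b)]}


def closure(rules_a, start_a, new_start="S"):
    """Closure rule: new_start -> start_a new_start | epsilon."""
    return {**rules_a, new_start: [(start_a, new_start), ()]}


grammar_a = {"A": [("a",)]}   # A -> a
grammar_b = {"B": [("b",)]}   # B -> b

print(union(grammar_a, "A", grammar_b, "B"))
print(product(grammar_a, "A", grammar_b, "B"))
print(closure(grammar_a, "A"))
```

With these two one-rule grammars, the union generates {a, b}, the product generates {ab}, and the closure generates every string of a's, including the empty one.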
Examples
More Examples
- L = {ε, a, b, aa, bb, aaa, bbb, ..., a^n, b^n : n ≥ 0}
- L = {ε, ab, aabb, aaabbb, ..., a^n b^n : n ≥ 0}
- L = {ε, ab, abab, ababab, ..., (ab)^n : n ≥ 0}
- L = {a^n b^m : n, m ≥ 0}
- L = {a^n b^m c^k : n, m, k ≥ 0}
- Well-balanced parentheses, e.g. (()(()))()
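One way to get a feel for such languages is to write a candidate grammar and enumerate the short strings it derives. The sketch below is illustrative: the grammars are possible choices for some of the languages listed above, not necessarily the ones intended in the lecture, and the helper `generate` is a hypothetical name. It explores leftmost derivations breadth-first and assumes every recursive rule adds at least one terminal, so that the length bound prunes the search.

```python
from collections import deque


def generate(rules, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    rules maps each non-terminal to a list of alternatives (tuples of symbols);
    any symbol that is not a key of rules is treated as a terminal.
    """
    seen = {(start,)}
    queue = deque(seen)
    words = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, if any.
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            words.add("".join(form))
            continue
        for alternative in rules[form[idx]]:
            new_form = form[:idx] + tuple(alternative) + form[idx + 1:]
            terminals = sum(1 for s in new_form if s not in rules)
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return words


anbn = {"S": [("a", "S", "b"), ()]}                    # S -> aSb | epsilon
ab_star = {"S": [("a", "b", "S"), ()]}                 # S -> abS | epsilon
anbm = {"S": [("A", "B")],                             # S -> AB
        "A": [("a", "A"), ()],                         # A -> aA | epsilon
        "B": [("b", "B"), ()]}                         # B -> bB | epsilon
balanced = {"S": [("(", "S", ")", "S"), ()]}           # S -> (S)S | epsilon

print(sorted(generate(anbn, "S", 4), key=len))      # ['', 'ab', 'aabb']
print(sorted(generate(ab_star, "S", 4), key=len))   # ['', 'ab', 'abab']
```

Comparing the enumerated strings with the set description is a quick sanity check on a grammar, although it proves nothing about longer strings.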
Regular Languages are Context-free...
Regular Grammar
An equivalent definition
The above definition is used in Hopcroft and Ullman. The previous definition is used in Tremblay and Sorenson.
Conversion to restricted form
Regular Grammar to NFA
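The construction itself is not recoverable from this slide, so the following is only a sketch of one standard way to turn a regular grammar into an NFA. It assumes the grammar is already in a restricted right-linear form with rules A → aB, A → a, or A → ε, where symbols are single characters. Each non-terminal becomes a state, and one extra accepting state (here the hypothetical name `FINAL`) handles rules of the form A → a.

```python
def grammar_to_nfa(rules, start="S", final="FINAL"):
    """Build an NFA (delta, start state, accepting states) from right-linear rules.

    rules is a list of (lhs, rhs) strings with rhs of the form "aB", "a" or "".
    """
    delta = {}
    accepting = {final}
    for lhs, rhs in rules:
        if rhs == "":                       # A -> epsilon: A is accepting
            accepting.add(lhs)
        elif len(rhs) == 1:                 # A -> a: move to the extra final state
            delta.setdefault((lhs, rhs), set()).add(final)
        elif len(rhs) == 2:                 # A -> aB: move from A to B on a
            delta.setdefault((lhs, rhs[0]), set()).add(rhs[1])
        else:
            raise ValueError(f"rule {lhs} -> {rhs} is not in the assumed restricted form")
    return delta, start, accepting


def nfa_accepts(nfa, word):
    """Simulate the NFA by tracking the set of states reachable so far."""
    delta, start, accepting = nfa
    current = {start}
    for symbol in word:
        current = set().union(*(delta.get((q, symbol), set()) for q in current))
    return bool(current & accepting)


# S -> aS | bA,  A -> bA | epsilon   (generates a* followed by one or more b's)
nfa = grammar_to_nfa([("S", "aS"), ("S", "bA"), ("A", "bA"), ("A", "")])
print(nfa_accepts(nfa, "aabb"))  # True
print(nfa_accepts(nfa, "ba"))    # False
```

Each step of a derivation in the grammar corresponds to one transition of the automaton, which is the intuition behind why the two accept the same strings.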
How to prove the construction is valid
Example
Example of conversion from grammar to FA
Example of conversion from grammar to FA
Example of conversion from grammar to FA
Another Example
Example (contd.)
Example: some arithmetic expressions
Ambiguity
Parsing is the problem of: given a grammar and a string, find a derivation of the string using the grammar (see the example on the previous slide).
Given our claim that programming languages are often described using grammars, this is a key problem for compilers.
We prefer unambiguous CFGs, i.e. ones where all derivations of a string are essentially the same, noting that variables of a CFG often correspond to specific structures of a program, e.g. arithmetic expression, procedure, statement, method, etc.
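Ambiguity can be observed by counting parse trees. The sketch below assumes a small expression grammar, E → E+E | E*E | a, chosen here for illustration (the grammar on the slides may differ), and counts how many distinct parse trees a string has. A count above 1 means the string can be derived in essentially different ways, so the grammar is ambiguous. The function name `count_parse_trees` is hypothetical.

```python
from functools import lru_cache


def count_parse_trees(tokens: str) -> int:
    """Number of parse trees for tokens under the assumed grammar E -> E+E | E*E | a."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of ways E can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        # Try every operator position k as the root of the tree.
        for k in range(i + 1, j - 1):
            if tokens[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("a+a"))    # 1
print(count_parse_trees("a+a*a"))  # 2
```

The two trees for a+a*a correspond to the groupings (a+a)*a and a+(a*a), which is exactly the kind of structural disagreement a compiler cannot tolerate.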
Ambiguity (continued)
Parse Trees (a.k.a. syntax trees, derivation trees)
Parse Trees (continued)
Rewriting a grammar
Rewriting a grammar (contd.)
Leftmost/rightmost derivations
Leftmost/rightmost derivations (contd.)
Another example of a simple ambiguous grammar
Example (contd.)