Context-Free Grammars and Languages

Limitations of finite automata
There are languages, such as $\{0^n 1^n \mid n \ge 0\}$, that cannot be described (specified) by NFAs or REs.
Context-free grammars provide a more powerful mechanism for language specification.
Context-free grammars can describe features that have a recursive structure, which makes them useful beyond finite automata.

Historical notes
Context-free grammars were first used to study human languages.
One way of understanding the relationship between syntactic categories (such as noun, verb, and preposition) and their respective phrases leads to natural recursion.
This is because noun phrases may occur inside verb phrases and vice versa.

Note
Context-free grammars can capture important aspects of these relationships.

Important application
Context-free grammars are used as a basis for compiler design and implementation.
Context-free grammars are used as specification mechanisms for programming languages.
Designers of compilers use such grammars to implement a compiler's components, such as scanners, parsers, and code generators.
The implementation of a programming language is usually preceded by a context-free grammar that specifies it.

Context-free languages
The languages specified by context-free grammars are called context-free languages.
Context-free languages include the regular languages and many others.
Here we will study the formal concepts of context-free grammars and context-free languages.

Notations
Abbreviate the phrase context-free grammar to CFG.
Abbreviate the phrase context-free language to CFL.
Abbreviate the concept of a CFG specification rule to the tuple $lhs \rightarrow rhs$, where $lhs$ stands for left-hand side and $rhs$ stands for right-hand side.

More on specification rules
The $lhs$ of a specification rule is also called a variable and is denoted by a capital letter.
The $rhs$ of a specification rule is also called a specification pattern and consists of a string of variables and constants.
The variables that occur in a specification pattern are also called nonterminal symbols; the constants that occur in a specification pattern are also called terminal symbols.

CFG: Informal
A CFG consists of a collection of specification rules where one variable is designated as the start symbol or axiom.
Example: the CFG $G_1$ has the following specification rules:
$A \rightarrow 0A1$
$A \rightarrow B$
$B \rightarrow \#$

Note
The nonterminals of CFG $G_1$ are $A$ and $B$, and $A$ is the axiom.
The terminals of CFG $G_1$ are 0, 1, and #.

More terminology
The specification rules of a CFG are also called productions or substitution rules.
Nonterminals used in the specification rules defining a CFG may be strings.
Terminals in the specification rules defining a CFG are constant strings.

Terminals
Terminals used in CFG specification rules are analogous to the input alphabet of an automaton.
Example terminals used in CFGs are letters of an alphabet, numbers, special symbols, and strings of such elements.
Strings used to denote terminals in CFG specification rules are quoted.

Language specification
A CFG is used as a language specification mechanism by generating each string of the language in the following manner:
1. Write down the start variable; it is the $lhs$ of one of the specification rules, the top rule unless specified otherwise.
2. Find a variable that is written down and a rule whose $lhs$ is that variable. Replace the written-down variable with the $rhs$ of that rule.
3. Repeat step 2 until no variables remain in the string thus generated.

Example string generation
Using CFG $G_1$ we can generate the string 000#111 as follows:
$A \Rightarrow 0A1 \Rightarrow 00A11 \Rightarrow 000A111 \Rightarrow 000B111 \Rightarrow 000\#111$
Note: The sequence of substitutions used to obtain a string using a CFG is called a derivation and may be represented by a tree called a derivation tree or a parse tree.

Example derivation tree
The derivation tree of the string 000#111 using CFG $G_1$ is shown in Figure 1.
[Figure 1: Derivation tree for 000#111. The root A has children 0, A, 1; the same expansion repeats for three levels; the innermost A has the single child B, whose only child is #.]

Note
All strings of terminals generated in this way constitute the language specified by the grammar.
We write $L(G)$ for the language generated by the grammar $G$. Thus, $L(G_1) = \{0^n \# 1^n \mid n \ge 0\}$.
The language generated by a context-free grammar is called a context-free language (CFL).

More notations
To distinguish nonterminal from terminal strings we often enclose nonterminals in angular parentheses, < and >, and terminals in quotes.
If two or more rules have the same $lhs$, as in the example $A \rightarrow 0A1$ and $A \rightarrow B$, we may compact them using the form $A \rightarrow 0A1 \mid B$, where $\mid$ is used with the meaning of an "or".

Example compaction
The rules $A \rightarrow 0A1$ and $A \rightarrow B$ may be written as $A \rightarrow 0A1 \mid B$.
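
A small Python sketch of the reverse direction, expanding a compacted rule back into its individual rules. The textual notation with `->` and `|` and the function name `expand` are assumptions made for illustration; the slides do not prescribe a machine-readable syntax.

```python
def expand(compact_rule):
    """Split 'lhs -> alt1 | alt2 | ...' into a list of (lhs, rhs) pairs."""
    lhs, alternatives = compact_rule.split("->", 1)
    return [(lhs.strip(), alt.strip()) for alt in alternatives.split("|")]

print(expand("A -> 0A1 | B"))
# prints: [('A', '0A1'), ('A', 'B')]
```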

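CFG $G_2$ specifies a fragment of English
The CFG $G_2$:
<SENTENCE> $\rightarrow$ <NOUN-PHRASE><VERB-PHRASE>
<NOUN-PHRASE> $\rightarrow$ <CMPLX-NOUN> $\mid$ <CMPLX-NOUN><PREP-PHRASE>
<VERB-PHRASE> $\rightarrow$ <CMPLX-VERB> $\mid$ <CMPLX-VERB><PREP-PHRASE>
<PREP-PHRASE> $\rightarrow$ <PREP><CMPLX-NOUN>
<CMPLX-NOUN> $\rightarrow$ <ARTICLE><NOUN>
<CMPLX-VERB> $\rightarrow$ <VERB> $\mid$ <VERB><NOUN-PHRASE>
<ARTICLE> $\rightarrow$ a $\mid$ the
<NOUN> $\rightarrow$ boy $\mid$ girl $\mid$ flower
<VERB> $\rightarrow$ touches $\mid$ likes $\mid$ sees
<PREP> $\rightarrow$ with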

Note
The CFG $G_2$ has ten variables (capitalized and in angular brackets) and 9 terminals (written in the standard English alphabet) plus a space character.
Also, the CFG $G_2$ has 18 rules.
Example strings that belong to $L(G_2)$ are:
a boy sees
the boy sees a flower
a girl with a flower likes the boy

Example derivation with $G_2$

Formal definition of a CFG
A context-free grammar is a 4-tuple $(V, \Sigma, R, S)$ where:
1. $V$ is a finite set of strings called the variables or nonterminals.
2. $\Sigma$ is a finite set of strings, disjoint from $V$, called terminals.
3. $R$ is a finite set of rules (or specification rules) of the form $A \rightarrow w$, where $A \in V$ and $w \in (V \cup \Sigma)^*$.
4. $S \in V$ is the start variable (or grammar axiom).
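
The 4-tuple can be mirrored by a small data structure. The following Python sketch is only one possible encoding: the class name `CFG`, its field names, and the representation of a right-hand side as a tuple of symbols are assumptions, and the example instance uses the rules read off the 000#111 derivation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V
    terminals: frozenset   # Sigma
    rules: tuple           # R, as (lhs, rhs) pairs; rhs is a tuple of symbols
    start: str             # S

    def __post_init__(self):
        # Check the side conditions of the formal definition.
        if self.variables & self.terminals:
            raise ValueError("V and Sigma must be disjoint")
        if self.start not in self.variables:
            raise ValueError("the start variable must belong to V")
        symbols = self.variables | self.terminals
        for lhs, rhs in self.rules:
            if lhs not in self.variables:
                raise ValueError(f"left-hand side {lhs!r} is not a variable")
            if any(sym not in symbols for sym in rhs):
                raise ValueError(f"unknown symbol in rule {lhs} -> {rhs}")

G1 = CFG(
    variables=frozenset({"A", "B"}),
    terminals=frozenset({"0", "1", "#"}),
    rules=(("A", ("0", "A", "1")), ("A", ("B",)), ("B", ("#",))),
    start="A",
)
```

An empty tuple as right-hand side stands for the empty string, so rules of the form $A \rightarrow \epsilon$ fit the same encoding.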

Example CFG grammar where is:

Direct derivation
If $u, v, w \in (V \cup \Sigma)^*$ (i.e., $u$, $v$, $w$ are strings of variables and terminals) and $A \rightarrow w \in R$ (i.e., $A \rightarrow w$ is a rule of the grammar) then we say that $uAv$ yields $uwv$, written $uAv \Rightarrow uwv$.
We may also say that $uwv$ is directly derived from $uAv$ using the rule $A \rightarrow w$.
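
The relation can be made concrete with a short Python sketch that lists every string a given string yields in one step. The single-character symbol encoding and the function name `yields` are assumptions; the rules are the ones read off the 000#111 derivation.

```python
def yields(sentential, rules, variables):
    """Return every string that `sentential` yields in one rewriting step."""
    results = set()
    for pos, symbol in enumerate(sentential):
        if symbol not in variables:
            continue                      # only variables can be rewritten
        for lhs, rhs in rules:
            if lhs == symbol:             # uAv => uwv for the rule A -> w
                results.add(sentential[:pos] + rhs + sentential[pos + 1:])
    return results

RULES = [("A", "0A1"), ("A", "B"), ("B", "#")]
print(sorted(yields("0A1", RULES, {"A", "B"})))
# prints: ['00A11', '0B1']
```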

Derivation
We write $u \Rightarrow^* v$ if $u = v$ or if a sequence $u_1, u_2, \ldots, u_k$ exists, for $k \ge 0$, with $u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \ldots \Rightarrow u_k \Rightarrow v$.
We may also say that $u \Rightarrow u_1 \Rightarrow u_2 \Rightarrow \ldots \Rightarrow u_k \Rightarrow v$ is a derivation of $v$ from $u$.

Language specified by $G$
If $G = (V, \Sigma, R, S)$ is a CFG then the language specified by $G$ (or the language of $G$) is $L(G) = \{w \in \Sigma^* \mid S \Rightarrow^* w\}$.
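
For a small grammar, a finite part of this set can be enumerated by searching over derivations. The Python sketch below is an illustration under stated assumptions: symbols are single characters, no rule has an empty right-hand side (so strings never shrink during a derivation and long ones can be discarded), and the name `language_up_to` is hypothetical. It expands only the leftmost variable, which loses no terminal strings.

```python
from collections import deque

def language_up_to(rules, variables, start, max_len):
    """Collect every terminal string w with S =>* w and len(w) <= max_len.

    Assumes no rule has an empty right-hand side, so any string longer
    than max_len can be dropped without losing shorter results.
    """
    seen = {start}
    queue = deque([start])
    words = set()
    while queue:
        form = queue.popleft()
        pos = next((i for i, s in enumerate(form) if s in variables), None)
        if pos is None:                  # no variables left: a string of L(G)
            words.add(form)
            continue
        for lhs, rhs in rules:           # rewrite the leftmost variable
            if lhs != form[pos]:
                continue
            new = form[:pos] + rhs + form[pos + 1:]
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return words

RULES = [("A", "0A1"), ("A", "B"), ("B", "#")]
print(sorted(language_up_to(RULES, {"A", "B"}, "A", 7), key=len))
# prints: ['#', '0#1', '00#11', '000#111']
```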

Note
Often we specify a grammar by writing down only its rules.
We can identify the variables as the symbols that appear as the $lhs$ of the rules.
Terminals are the remaining strings used in the rules.
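
This convention is easy to mechanize. A minimal Python sketch, assuming rules are given as (lhs, rhs) pairs with the right-hand side as a list of symbols; the name `split_symbols` is hypothetical.

```python
def split_symbols(rules):
    """Recover (variables, terminals) from a list of (lhs, rhs) rules."""
    variables = {lhs for lhs, _ in rules}              # symbols used as an lhs
    used = {sym for _, rhs in rules for sym in rhs}    # symbols used in an rhs
    return variables, used - variables                 # the rest are terminals

RULES = [("A", ["0", "A", "1"]), ("A", ["B"]), ("B", ["#"])]
variables, terminals = split_symbols(RULES)
print(sorted(variables), sorted(terminals))
# prints: ['A', 'B'] ['#', '0', '1']
```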

More examples of CFGs
Consider the grammar $G_3 = (\{S\}, \{a, b\}, R, S)$ where $R$ is:
$S \rightarrow aSb \mid SS \mid \epsilon$
$L(G_3)$ contains strings such as: abab, aaabbb, aababb.
Note: if one thinks of $a$ as ( and of $b$ as ), then one can see that $L(G_3)$ is the language of all strings of properly nested parentheses.
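
The parentheses reading can be checked experimentally. The Python sketch below assumes the grammar has the rules $S \rightarrow aSb \mid SS \mid \epsilon$ (the usual grammar for this language; the rules are stated here as an assumption) and compares a membership test that mirrors those rules with a plain nesting-depth counter. The function names are hypothetical.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def in_language(w):
    """Membership test that mirrors the rules S -> aSb | SS | epsilon."""
    if w == "":
        return True                                    # S -> epsilon
    if w[0] == "a" and w[-1] == "b" and in_language(w[1:-1]):
        return True                                    # S -> aSb
    return any(in_language(w[:i]) and in_language(w[i:])
               for i in range(1, len(w)))              # S -> SS

def balanced(w):
    """Read a as '(' and b as ')' and check proper nesting."""
    depth = 0
    for ch in w:
        depth += 1 if ch == "a" else -1
        if depth < 0:
            return False
    return depth == 0

# Compare both tests on every string over {a, b} of length at most 8.
agree = all(in_language("".join(t)) == balanced("".join(t))
            for n in range(9) for t in product("ab", repeat=n))
print(agree)
```

Agreement on short strings is evidence, not a proof; the general claim needs an induction on the length of the string.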

Arithmetic expressions
Consider the grammar $G_4 = (V, \Sigma, R, E)$ where $V = \{E, T, F\}$, $\Sigma = \{a, +, *, (, )\}$, and $R$ is:
$E \rightarrow E + T \mid T$
$T \rightarrow T * F \mid F$
$F \rightarrow (E) \mid a$
$L(G_4)$ is the language of arithmetic expressions.

Note
The variables and constants in $L(G_4)$ are represented by the terminal $a$.
Arithmetic operations in $L(G_4)$ are addition, represented by +, and multiplication, represented by *.
An example of a derivation using $G_4$ is in Figure 2.

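Example derivation with $G_4$
[Figure 2: Derivation tree for a+a*a. The root E has children E, +, T. The left E derives a through T and F. The right T has children T, *, F, where the inner T derives a through F and the final F derives a.]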
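
A derivation tree like the one in Figure 2 can be built by a small parser. The Python sketch below assumes the rules $E \rightarrow E + T \mid T$, $T \rightarrow T * F \mid F$, $F \rightarrow (E) \mid a$, which match the shape of the figure. It is not a direct transcription of the grammar: the left-recursive rules are handled with loops, a standard recursive-descent workaround, while the nested tuples it returns still follow the left-recursive rules. All function names are hypothetical.

```python
def parse(text):
    """Return a derivation tree for `text` as nested tuples, or raise SyntaxError."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise SyntaxError(f"expected {ch!r} at position {pos}")
        pos += 1

    def parse_E():
        tree = ("E", parse_T())                  # E -> T
        while peek() == "+":
            expect("+")
            tree = ("E", tree, "+", parse_T())   # E -> E + T
        return tree

    def parse_T():
        tree = ("T", parse_F())                  # T -> F
        while peek() == "*":
            expect("*")
            tree = ("T", tree, "*", parse_F())   # T -> T * F
        return tree

    def parse_F():
        if peek() == "(":
            expect("(")
            inner = parse_E()
            expect(")")
            return ("F", "(", inner, ")")        # F -> (E)
        expect("a")
        return ("F", "a")                        # F -> a

    tree = parse_E()
    if pos != len(text):
        raise SyntaxError(f"unexpected {text[pos]!r} at position {pos}")
    return tree

def leaves(tree):
    """Concatenate the terminal leaves of a tree from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[1:])

tree = parse("a+a*a")
print(tree)
```

Reading the leaves of the returned tree from left to right gives back the parsed string, which is a quick sanity check on the tree.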

Designing CFGs
As with the design of automata, the design of CFGs requires creativity.
CFGs are even trickier to construct than finite automata because "we are more accustomed to programming a machine than we are to specifying programming languages".

Design techniques
Many CFLs are unions of simpler CFLs. Hence the suggestion is to construct smaller, simpler grammars first and then to join them into a larger grammar.
The mechanism of grammar combination consists of putting all their rules together and adding the new rule $S \rightarrow S_1 \mid S_2 \mid \ldots \mid S_k$, where the variables $S_i$, $1 \le i \le k$, are the start variables of the individual grammars and $S$ is a new variable.
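
The combination mechanism can be sketched in Python as follows. The encoding (a grammar as a pair of a rule list and a start variable, with right-hand sides as tuples of symbols and the empty tuple for the empty string) and the name `union_grammar` are assumptions. The sketch also assumes that the individual grammars use disjoint variable names; otherwise the variables would have to be renamed first. The two example grammars are hypothetical inputs chosen for illustration.

```python
def union_grammar(grammars, new_start="S"):
    """Combine grammars given as (rules, start) pairs into one grammar.

    Assumes the grammars use pairwise disjoint variable names.
    """
    used = {lhs for rules, _ in grammars for lhs, _ in rules}
    if new_start in used:
        raise ValueError(f"{new_start!r} is already used as a variable")
    combined = [(new_start, (start,)) for _, start in grammars]  # S -> S_i
    for rules, _ in grammars:
        combined.extend(rules)            # put all the original rules together
    return combined, new_start

# Hypothetical inputs: S1 -> 0 S1 1 | epsilon and S2 -> 1 S2 0 | epsilon.
G_A = [("S1", ("0", "S1", "1")), ("S1", ())]
G_B = [("S2", ("1", "S2", "0")), ("S2", ())]

rules, start = union_grammar([(G_A, "S1"), (G_B, "S2")])
for lhs, rhs in rules:
    print(lhs, "->", " ".join(rhs) or "epsilon")
```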

Example grammar design
Design a grammar for the language $\{0^n 1^n \mid n \ge 0\} \cup \{1^n 0^n \mid n \ge 0\}$.
1. Construct the grammar $S_1 \rightarrow 0 S_1 1 \mid \epsilon$ that generates $\{0^n 1^n \mid n \ge 0\}$.
2. Construct the grammar $S_2 \rightarrow 1 S_2 0 \mid \epsilon$ that generates $\{1^n 0^n \mid n \ge 0\}$.
3. Put them together adding the rule $S \rightarrow S_1 \mid S_2$, thus getting:
$S \rightarrow S_1 \mid S_2$
$S_1 \rightarrow 0 S_1 1 \mid \epsilon$
$S_2 \rightarrow 1 S_2 0 \mid \epsilon$

Second design technique
Constructing a CFG for a regular language is easy if one can first construct a DFA for that language.
Conversion procedure:
1. Make a variable $R_i$ for each state $q_i$ of the DFA.
2. Add the rule $R_i \rightarrow a R_j$ to the CFG if $\delta(q_i, a) = q_j$ is a transition in the DFA.
3. Add the rule $R_i \rightarrow \epsilon$ if $q_i$ is an accept state of the DFA.
4. If $q_0$ is the start state of the DFA, make $R_0$ the start variable of the CFG.
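
The conversion procedure can be sketched in Python. The DFA encoding (a transition dictionary keyed by (state, symbol)), the variable naming scheme `R_<state>`, the function names, and the example DFA for binary strings with an even number of 1s are all assumptions made for illustration.

```python
def dfa_to_cfg(states, alphabet, delta, start, accepting):
    """Build CFG rules from a DFA by following the four conversion steps."""
    var = {q: f"R_{q}" for q in states}                      # step 1
    rules = []
    for q in states:
        for a in alphabet:
            rules.append((var[q], (a, var[delta[(q, a)]])))  # step 2
    for q in states:
        if q in accepting:
            rules.append((var[q], ()))                       # step 3: epsilon
    return rules, var[start]                                 # step 4

def derives(rules, start, word):
    """Check start =>* word for the right-linear rules built above."""
    current = start
    for ch in word:
        # The DFA is deterministic, so at most one rule matches.
        nxt = [rhs[1] for lhs, rhs in rules
               if lhs == current and len(rhs) == 2 and rhs[0] == ch]
        if not nxt:
            return False
        current = nxt[0]
    return (current, ()) in rules

# Hypothetical DFA: binary strings with an even number of 1s.
STATES = ["even", "odd"]
DELTA = {("even", "0"): "even", ("even", "1"): "odd",
         ("odd", "0"): "odd", ("odd", "1"): "even"}
rules, start = dfa_to_cfg(STATES, ["0", "1"], DELTA, "even", {"even"})

print(derives(rules, start, "101"), derives(rules, start, "10"))
# prints: True False
```

Each derivation step consumes one input symbol and moves to the variable of the next state, so the derivation traces the run of the DFA on the same string.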

Note
Verify that the CFG constructed by the conversion of a DFA into a CFG generates the language that the DFA recognizes.

Third design technique
Certain CFLs contain strings with two related substrings, as are $0^n$ and $1^n$ in $\{0^n 1^n \mid n \ge 0\}$.
Example of relationship: to recognize such a language a machine would need to remember an unbounded amount of information about one of the substrings.

Note
A CFG that handles this situation uses a rule of the form $R \rightarrow uRv$, which generates strings wherein the portion containing $u$'s corresponds to the portion containing $v$'s.
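
The effect of such a rule can be illustrated with a few lines of Python. The sketch applies the rule $R \rightarrow uRv$ a chosen number of times and then finishes with a rule $R \rightarrow \epsilon$; the function name `pump`, the finishing rule, and the assumption that neither `u` nor `v` contains the character R are all illustrative choices.

```python
def pump(u, v, n):
    """Apply R -> uRv exactly n times, then R -> epsilon."""
    form = "R"
    for _ in range(n):
        form = form.replace("R", u + "R" + v)   # one use of R -> uRv
    return form.replace("R", "")                # finish with R -> epsilon

print(pump("0", "1", 3))
# prints: 000111
```

Every application adds one copy of `u` on the left and one copy of `v` on the right, so the two portions always have matching counts without any explicit counter.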

Fourth design technique
In a complex language, strings may contain certain structures that appear recursively.
Example: in arithmetic expressions, any time the symbol a appears, an entire parenthesized expression may appear instead.

Note
To achieve this effect one needs to place the variable generating the structure ($E$ in the case of $G_4$) in the location of the rule corresponding to where the structure may recursively appear, as in $F \rightarrow (E)$ in the case of $G_4$.