
Grammars. Numeric functions (Chapter 4, Sections 4.6, 4.7)
CmSc 365 Theory of Computation

1. Grammars

Grammars are language generators. They consist of an alphabet of terminal symbols, an alphabet of non-terminal symbols, a starting symbol, and rules. Each language generated by some grammar can be recognized by some automaton. Languages (and the corresponding grammars) can be classified according to the minimal automaton sufficient to recognize them. This classification, known as the Chomsky Hierarchy, was defined by Noam Chomsky, a distinguished linguist with major contributions to linguistics.

The Chomsky Hierarchy comprises four types of languages and their associated grammars and machines.

Type 3: Regular languages. Grammar: regular (right-linear or left-linear). Machine: deterministic or nondeterministic finite-state automata. Example: a*
Type 2: Context-free languages. Grammar: context-free. Machine: nondeterministic pushdown automata. Example: a^n b^n
Type 1: Context-sensitive languages. Grammar: context-sensitive. Machine: linear-bounded automata. Example: a^n b^n c^n
Type 0: Recursive and recursively enumerable languages. Grammar: unrestricted. Machine: Turing machines. Example: any computable function

Regular expressions do not have non-terminal symbols; instead they have rules to describe expressions. Context-free grammars use terminal and non-terminal symbols; their rules have a restriction: only one non-terminal symbol may appear on the left-hand side. Unrestricted grammars do not have this restriction: their left-hand sides may contain any string of terminal and/or non-terminal symbols, provided there is at least one non-terminal symbol.
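To make the Type 3 row concrete, a deterministic finite-state automaton for the example language a* can be sketched in a few lines (the state name and transition table below are illustrative choices, not part of the notes):

```python
# Minimal DFA sketch for the regular language a* (zero or more 'a's).
# The single state q0 is both the start state and the accepting state.
def accepts_a_star(s):
    state = "q0"
    delta = {("q0", "a"): "q0"}          # the only defined transition
    for ch in s:
        state = delta.get((state, ch))   # None models the dead (trap) state
        if state is None:
            return False
    return state == "q0"

print(accepts_a_star(""))     # True
print(accepts_a_star("aaa"))  # True
print(accepts_a_star("aab"))  # False
```

The automaton needs no memory beyond its current state, which is exactly what distinguishes Type 3 from the lower rows of the table.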

The four types of languages form a strict hierarchy; that is, regular ⊂ context-free ⊂ context-sensitive ⊂ recursive ⊂ recursively enumerable. The distinction between the types can be seen by examining the structure of their grammar rules, or the nature of the automata which can be used to identify them.

Type 3 - Regular Languages

As we have discussed, a regular language is one which can be represented by a regular grammar, described using a regular expression, or accepted by an FSA. There are two kinds of regular grammar:

Right-linear (right-regular), with rules of the form A → xB or A → x, where A and B are single non-terminal symbols and x is a terminal symbol. Parse trees with these grammars are right-branching.

Left-linear (left-regular), with rules of the form A → Bx or A → x. Parse trees with these grammars are left-branching.

Examples of regular languages: pattern matching (regular expressions).

Type 2 - Context-Free Languages

A Context-Free Grammar (CFG) is one whose production rules are of the form A → α, where A is any single non-terminal and α is any combination of terminals and non-terminals. The minimal automaton that recognizes context-free languages is the push-down automaton. It uses a stack when expanding non-terminal symbols with the right-hand side of the corresponding grammar rule.

Examples of CFLs: some simple programming languages.

Type 1 - Context-Sensitive Languages

Context-sensitive grammars may have more than one symbol on the left-hand side of their rules, provided that at least one of them is a non-terminal and the number of symbols on the left-hand side does not exceed the number of symbols on the right-hand side. Their rules have the form:
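The push-down recognition idea can be sketched for the standard context-free language a^n b^n: the stack grows by one marker for each a and shrinks by one for each b (a minimal sketch of the stack discipline, not a full PDA simulator):

```python
# Stack-based sketch recognizing { a^n b^n : n >= 0 }.
def accepts_anbn(s):
    stack = []
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:            # an 'a' after a 'b' breaks the pattern
                return False
            stack.append("A")     # push one marker per 'a'
        elif ch == "b":
            seen_b = True
            if not stack:         # more b's than a's
                return False
            stack.pop()           # match one 'a' against this 'b'
        else:
            return False          # symbol outside the alphabet {a, b}
    return not stack              # accept iff every 'a' was matched
```

A finite-state automaton cannot recognize this language: counting an unbounded number of a's requires the unbounded stack.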

αAβ → αγβ, where A is a single non-terminal symbol and α, β, γ are any combinations of terminals and non-terminals (γ non-empty). Since we allow more than one symbol on the left-hand side, we refer to the symbols other than the one being replaced as the context of the replacement.

The automaton which recognizes a context-sensitive language is called a linear-bounded automaton: an FSA with a memory to store symbols in a list. Since the number of symbols on the left-hand side of a rule is always smaller than or equal to the number of symbols on the right-hand side, the length of the derived string never decreases when a grammar rule is applied. During recognition this length is therefore bounded by the length of the input string, so a linear-bounded automaton always needs only a finite list as its store.

Examples of context-sensitive languages: most programming languages.

Type 0 - Unrestricted (Free) Languages

Unrestricted grammars have no restrictions on their rules, except that there must be at least one non-terminal on the left-hand side. The rules have the form α → β, where α and β are arbitrary strings of terminal and non-terminal symbols and α ≠ ε (the empty string). The type of automaton which can recognize such a language is the Turing machine, with an infinitely long memory.

Examples of unrestricted languages: almost all natural languages.

Turing Machines and Grammars

A language is recursively enumerable if there exists a Turing machine that accepts every string of the language, and does not accept strings that are not in the language. "Does not accept" is not the same as "reject" -- the Turing machine could go into an infinite loop instead, and never get around to either accepting or rejecting the string. The languages generated by unrestricted grammars are precisely the recursively enumerable languages.

Theorem. Any language generated by an unrestricted grammar is recursively enumerable.

Theorem. A language is generated by an unrestricted grammar if and only if it is recursively enumerable.
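For the context-sensitive example a^n b^n c^n, a recognizer needs working memory only proportional to the input, in line with the linear-bounded automaton above. A plain counting sketch (not an actual linear-bounded automaton) is:

```python
# Counting sketch recognizing { a^n b^n c^n : n >= 0 },
# a classic context-sensitive language that is not context-free.
import re

def accepts_anbncn(s):
    m = re.fullmatch(r"(a*)(b*)(c*)", s)   # letters must appear in blocks, in order
    if m is None:
        return False
    a, b, c = m.groups()
    return len(a) == len(b) == len(c)      # all three blocks equally long
```

The regular expression checks only the shape of the string; the equality of the three counts is what no context-free grammar can enforce.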

Are all languages recursively enumerable? The answer is no.

Regular and context-free languages are recursive. This means that a Turing machine can decide whether a string belongs to the language or not. The complement of a recursive language is also a recursive language; this follows from the fact that we can reverse the "yes" answer to a "no" answer. Recursive languages are also recursively enumerable: we can change the halting "no" configurations to configurations with non-halting states.

However, there are recursively enumerable languages that are not recursive. They are generated by unrestricted grammars. A Turing machine semidecides such a language: it can say whether a string belongs to the language, but if the string does not belong to the language, the machine never stops. The complement of a recursively enumerable language need not be recursively enumerable; we cannot change a non-existing answer to a "yes" answer.

Languages that are not recursively enumerable cannot be generated by a grammar; there is no grammar that can describe them. Each formal grammar has a finite description and therefore can be considered as a string. Thus, the set of all formal grammars is countably infinite. The set of all languages over an alphabet is the power set of the set of all strings over that alphabet. We have shown that power sets of infinite sets are not countable. Therefore there is no one-to-one match between grammars and languages.

2. Numerical functions

Recursive language: a language that can be decided by a Turing machine.
Recursive function: a function that can be computed by a Turing machine.

Why do we use the word "recursive"? It turns out that functions computable by a Turing machine can be represented by means of very simple basic functions, using composition and recursive definition. Three basic numerical functions are so simple that their computability is obvious:

1. Zero function: maps any tuple to zero:
   z_k(n_1, n_2, ..., n_k) = 0 for any k

2. Identity function: maps a tuple to a number within the tuple:
   id_{j,k}(n_1, n_2, ..., n_k) = n_j, where 0 < j <= k
   Example: id_{3,5}(1, 3, 5, 7, 9) = 5

3. Successor function: defines the natural numbers:
   s(n) = n + 1, so s(0) = 1

Using these three functions we can define more complex functions. Examples:

Addition:
   plus(m, 0) = m
   plus(m, n+1) = s(plus(m, n))

Multiplication:
   mult(m, 0) = 0
   mult(m, n+1) = plus(m, mult(m, n))

It can be proved that all computable functions can be obtained from these primitive functions, and vice versa: all functions that can be obtained this way are computable.

Question: is f(x) = x^2, for x a real number, a computable function? The answer is no. The reason: we cannot represent all real numbers.
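The recursion equations above translate almost line for line into code (a sketch; the name `ident` is used instead of `id` only to avoid shadowing Python's built-in `id`):

```python
def z(*ns):
    """Zero function: z_k(n_1, ..., n_k) = 0 for any k."""
    return 0

def ident(j, *ns):
    """Identity function: id_{j,k}(n_1, ..., n_k) = n_j (1-based j)."""
    return ns[j - 1]

def s(n):
    """Successor function: s(n) = n + 1."""
    return n + 1

def plus(m, n):
    # plus(m, 0) = m;  plus(m, n+1) = s(plus(m, n))
    return m if n == 0 else s(plus(m, n - 1))

def mult(m, n):
    # mult(m, 0) = 0;  mult(m, n+1) = plus(m, mult(m, n))
    return 0 if n == 0 else plus(m, mult(m, n - 1))

print(ident(3, 1, 3, 5, 7, 9))  # 5
print(plus(4, 3))               # 7
print(mult(4, 3))               # 12
```

Note that `plus` and `mult` use only the successor function and recursion on the second argument, mirroring the definitions exactly rather than Python's built-in arithmetic.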