Context-free Grammars

Context-free Grammars: Natural & Programming Languages
Laureats Visit, July 19, 2013

Example of a Programming Language: Go
  designed at Google (version 1.0 released in 2012)
  documentation: specifies the syntax, using a context-free grammar
  tool shipped with Go: YACC
    generates a parser from a grammar
    allows creating, editing, and adapting the syntax of programming languages

Pāṇini (c. 350 BC): Aṣṭādhyāyī
  Sanskrit grammar, about 4000 rules
  formal rules: A → B / C _ D  (rewrite A to B in the context C _ D)
  auxiliary symbols

Chomsky (1956): Three Models for the Description of Language
  1. finite-state automata
  2. phrase-structure grammars
  3. transformational grammars
  Pictured: N. Chomsky

Modeling a language
  a language is modeled as a set of sentences
  syntax vs. semantics:
    The child eats a tomato.
    A tomato eats the child.
    *A tomato the child eats.
  competence vs. performance:
    The child eats a nice tomato.
    The child eats a nice round tomato.
    The child eats a nice red round tomato.
    ...

Constituent Analysis
  [[The child] [eats [a tomato]]].
  [[The child] [eats [a [nice tomato]]]].

Constituent Analysis (ctd.)
  Parse tree of "The child eats a nice tomato", in bracketed form:
  [P [NP [det The] [AP [n child]]]
     [VP [v eats]
         [NP [det a] [AP [adj nice] [AP [n tomato]]]]]]

Context-free Grammars
  Special case of phrase-structure grammars: empty contexts
    P   → NP VP
    NP  → det AP
    VP  → v NP
    AP  → adj AP | n
    det → The | a
    n   → child | tomato
    v   → eats
    adj → nice | red | round
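The grammar on this slide can be tried out with a few lines of Python. The following is only a minimal sketch: the dictionary encoding and the names `GRAMMAR`, `derives` and `accepts` are illustrative assumptions, not part of the slides. It checks whether a sequence of words can be derived from the start symbol P by expanding the leftmost symbol.

```python
from functools import lru_cache

# Toy grammar from the slide: nonterminal -> list of right-hand sides.
# Any symbol that is not a key of GRAMMAR is treated as a terminal (a word).
GRAMMAR = {
    "P": [("NP", "VP")],
    "NP": [("det", "AP")],
    "VP": [("v", "NP")],
    "AP": [("adj", "AP"), ("n",)],
    "det": [("The",), ("a",)],
    "n": [("child",), ("tomato",)],
    "v": [("eats",)],
    "adj": [("nice",), ("red",), ("round",)],
}


@lru_cache(maxsize=None)
def derives(symbols, words):
    """Return True if the symbol sequence can be rewritten into exactly `words`."""
    if not symbols:
        return not words
    # No rule has an empty right-hand side, so each symbol yields at least one word.
    if len(symbols) > len(words):
        return False
    head, rest = symbols[0], symbols[1:]
    if head not in GRAMMAR:  # terminal: must match the next word
        return words[0] == head and derives(rest, words[1:])
    # Nonterminal: try every production for the leftmost symbol.
    return any(derives(rhs + rest, words) for rhs in GRAMMAR[head])


def accepts(sentence):
    """Check whether a whitespace-separated sentence is generated from P."""
    return derives(("P",), tuple(sentence.split()))
```

For instance, `accepts("The child eats a nice red round tomato")` and `accepts("a tomato eats The child")` both succeed, since the grammar only models syntax, while `accepts("a tomato The child eats")` fails.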

Backus (1959); Naur (1960): Algol 60
  ALGOrithmic Language
  standard syntax:
    ⟨statement⟩ ::= ⟨unconditional statement⟩ | ⟨conditional statement⟩
    ⟨unconditional statement⟩ ::= ⟨for statement⟩
    ⟨conditional statement⟩ ::= ⟨if statement⟩ | ⟨if statement⟩ else ⟨statement⟩
    ⟨if statement⟩ ::= ⟨if clause⟩ ⟨unconditional statement⟩
    ⟨if clause⟩ ::= if ⟨boolean expression⟩ then
  Pictured: J. Backus

Ginsburg and Rice (1962): Two families of languages related to ALGOL
  connection between Algol and Chomsky's work
  multidisciplinary research:
    linguistics
    programming languages
    theoretical computer science
  (Chomsky, 1959; Bar-Hillel et al., 1961; Chomsky and Schützenberger, 1963, ...)
  Pictured: Y. Bar-Hillel, M.P. Schützenberger

Pushdown Automata
  Yngve (1960); Oettinger (1961); Chomsky (1962)
  operational model, easy implementation
  expressivity equivalent to that of context-free grammars
  idea of parsing: generate a pushdown automaton from a grammar

Pushdown Automata (ctd.)
  Transitions (state, input read, symbol popped, symbols pushed, new state):
    (q, ε, ⊥, ε, q_f)
    (q, ε, P, NP VP, q)   (q, ε, NP, det AP, q)   ...
    (q, ε, det, The, q)   (q, ε, det, a, q)
    (q, The, The, ε, q)   (q, a, a, ε, q)   ...
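The construction on this slide (one ε-transition per grammar rule, one reading transition per word) can be simulated directly. The sketch below is an illustration under assumptions: the bottom-of-stack marker is written `⊥`, the tuple layout is (state, input or None for ε, popped symbol, pushed symbols, new state), and the names `build_transitions` and `accepts` are hypothetical. The automaton is nondeterministic, so the simulation explores all reachable configurations.

```python
# Toy grammar from the slides: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "P": [("NP", "VP")],
    "NP": [("det", "AP")],
    "VP": [("v", "NP")],
    "AP": [("adj", "AP"), ("n",)],
    "det": [("The",), ("a",)],
    "n": [("child",), ("tomato",)],
    "v": [("eats",)],
    "adj": [("nice",), ("red",), ("round",)],
}
BOTTOM = "⊥"  # assumed bottom-of-stack marker


def build_transitions(grammar):
    """Build (state, input, pop, push, new state) tuples; input None means ε."""
    transitions = [("q", None, BOTTOM, (), "qf")]  # accept once the stack is exhausted
    terminals = set()
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            transitions.append(("q", None, lhs, rhs, "q"))  # expand a nonterminal
            terminals.update(s for s in rhs if s not in grammar)
    for word in sorted(terminals):
        transitions.append(("q", word, word, (), "q"))  # read a predicted word
    return transitions


def accepts(words, grammar=GRAMMAR, start="P"):
    """Explore all configurations (state, input position, stack) of the automaton."""
    transitions = build_transitions(grammar)
    words = tuple(words)
    todo = [("q", 0, (start, BOTTOM))]  # stack top is the first tuple element
    seen = set()
    while todo:
        config = todo.pop()
        if config in seen:
            continue
        seen.add(config)
        state, pos, stack = config
        if state == "qf" and pos == len(words) and not stack:
            return True
        if not stack:
            continue
        top, below = stack[0], stack[1:]
        for src, inp, pop, push, dst in transitions:
            if src != state or pop != top:
                continue
            if inp is None:
                todo.append((dst, pos, push + below))
            elif pos < len(words) and words[pos] == inp:
                todo.append((dst, pos + 1, push + below))
    return False
```

The search terminates on this grammar because it has no left-recursive rule: every chain of expansions soon puts a word on top of the stack, which must then be read from the input.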

Issues
  Floyd (1962b): Algol 60 is not context-free:
    begin real x; y := 3 end
    is only correct if the two identifiers x and y are the same.
    Hence the separation into lexical analysis, parsing, and semantic analysis.
  Cantor (1962); Floyd (1962a): Algol 60 is ambiguous:
    several possible analyses for some programs
    inherently ambiguous languages (Parikh, 1966; Ginsburg and Ullian, 1966)
    undecidable properties
  The first parsers impose very stringent restrictions on grammars (Irons, 1961):
    ideally: deterministic pushdown automata (Ginsburg and Greibach, 1966)
    but these cannot be derived from every grammar
    undecidable properties
  Pictured: R.W. Floyd, R. Parikh, S. Greibach

... and Answers
  parser generators for larger and larger classes of grammars
  Knuth (1965): LR parsing for all the deterministic languages
  DeRemer (1969): simplifications (SLR & LALR)
  YACC (Johnson, 1975): LALR(1) parser generator
  Pictured: D.E. Knuth

Today
  All the mainstream programming languages are shipped with:
    a context-free grammar that specifies their syntax
    a parser generator (most likely a YACC variant) for writing parsers for new languages

Syntax Models
  context-free grammars (rewriting systems)
  pushdown automata (transition systems)
  algebraic equations (equation systems)
  categorial grammars (proof systems)
  dynamic logic on trees (model theory)

Algebraic Equations (Ginsburg and Rice, 1962; Chomsky and Schützenberger, 1963)
  Minimal solutions of a system:
    P   = NP · VP
    NP  = det · AP
    VP  = v · NP
    AP  = adj · AP ∪ n
    det = {The} ∪ {a}
    n   = {child} ∪ {tomato}
    v   = {eats}
    adj = {nice} ∪ {round} ∪ {red}
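The minimal solution of such a system can be approximated by iterating the equations, starting from empty languages, until nothing changes. The sketch below is an illustration under assumptions: since the true solution is infinite, sentences are cut off at a chosen `max_len` words, and the names `concat` and `least_solution` are hypothetical.

```python
def concat(left, right, max_len):
    """Concatenate two languages, keeping only sentences of at most max_len words."""
    return {x + y for x in left for y in right if len(x + y) <= max_len}


def least_solution(max_len):
    """Iterate the equations of the slide from empty sets up to a fixed point."""
    names = ["P", "NP", "VP", "AP", "det", "n", "v", "adj"]
    sol = {name: set() for name in names}
    while True:
        new = {
            "P": concat(sol["NP"], sol["VP"], max_len),
            "NP": concat(sol["det"], sol["AP"], max_len),
            "VP": concat(sol["v"], sol["NP"], max_len),
            "AP": concat(sol["adj"], sol["AP"], max_len) | sol["n"],
            "det": {("The",), ("a",)},
            "n": {("child",), ("tomato",)},
            "v": {("eats",)},
            "adj": {("nice",), ("round",), ("red",)},
        }
        if new == sol:  # fixed point reached: no equation adds a sentence
            return sol
        sol = new


solution = least_solution(5)
print(len(solution["P"]))  # number of sentences of at most 5 words
```

Each round applies every right-hand side to the current approximation. Because the operations are monotone and the length bound keeps the sets finite, the loop is guaranteed to stop.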

Categorial Grammars (Bar-Hillel, 1953; Lambek, 1958)
  Categories built using left and right quotients over a finite set of symbols A:
    γ ::= A | γ1\γ2 | γ1/γ2    (categories)
  Deduction rules:
    (Lexicon)  w ⊢ γ, for each lexicon entry assigning γ to w
    (\)  from w1 ⊢ γ1 and w2 ⊢ γ1\γ2, infer w1 w2 ⊢ γ2
    (/)  from w1 ⊢ γ2/γ1 and w2 ⊢ γ1, infer w1 w2 ⊢ γ2
  Pictured: J. Lambek

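Proofs: Example
  Lexicon: The ⊢ NP/n, a ⊢ NP/n, child ⊢ n, tomato ⊢ n, eats ⊢ (NP\P)/NP
  1. The ⊢ NP/n and child ⊢ n give, by (/), The child ⊢ NP
  2. a ⊢ NP/n and tomato ⊢ n give, by (/), a tomato ⊢ NP
  3. eats ⊢ (NP\P)/NP and a tomato ⊢ NP give, by (/), eats a tomato ⊢ NP\P
  4. The child ⊢ NP and eats a tomato ⊢ NP\P give, by (\), The child eats a tomato ⊢ P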
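The two deduction rules can be applied mechanically by combining adjacent spans of words, in the style of a chart parser. The sketch below is an illustration under assumptions: it follows the rules as stated on the previous slide, where γ1\γ2 consumes a γ1 on its left, so the verb is entered as (NP\P)/NP. The tuple encoding of categories and the names `fwd`, `bwd`, `combine` and `categories` are hypothetical.

```python
def fwd(result, arg):
    """Category result/arg: yields `result` when followed by an `arg`."""
    return ("/", result, arg)


def bwd(arg, result):
    """Category arg\\result: yields `result` when preceded by an `arg`."""
    return ("\\", arg, result)


# Lexicon of the example; atomic categories are plain strings.
LEXICON = {
    "The": {fwd("NP", "n")},
    "a": {fwd("NP", "n")},
    "child": {"n"},
    "tomato": {"n"},
    "eats": {fwd(bwd("NP", "P"), "NP")},
}


def combine(left, right):
    """Apply the (/) and (\\) rules to two adjacent categories."""
    out = set()
    if isinstance(left, tuple) and left[0] == "/" and left[2] == right:
        out.add(left[1])  # (/): left is result/arg and right is arg
    if isinstance(right, tuple) and right[0] == "\\" and right[1] == left:
        out.add(right[2])  # (\): right is arg\result and left is arg
    return out


def categories(words):
    """Return every category derivable for the whole word sequence."""
    n = len(words)
    if n == 0:
        return set()
    chart = {(i, i + 1): set(LEXICON.get(w, ())) for i, w in enumerate(words)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            k = i + width
            cell = set()
            for j in range(i + 1, k):  # split point between the two sub-spans
                for left in chart[i, j]:
                    for right in chart[j, k]:
                        cell |= combine(left, right)
            chart[i, k] = cell
    return chart[0, n]
```

With this lexicon, `categories("The child eats a tomato".split())` contains `"P"`, which mirrors the proof on the slide.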

Logics on Trees (Blackburn et al., 1993; Afanasiev et al., 2005)
  Modal logic on a set of atomic propositions p:
    ϕ ::= ⊤ | p | ¬ϕ | ϕ1 ∧ ϕ2 | ⟨π⟩ϕ    (formulæ)
    π ::= ...                             (relations)
  Pictured: P. Blackburn

Models
  An ordered finite labeled tree t at a node n:
    t,n ⊨ ⊤
    t,n ⊨ p          if the label of n is p
    t,n ⊨ ¬ϕ         if t,n ⊭ ϕ
    t,n ⊨ ϕ1 ∧ ϕ2    if t,n ⊨ ϕ1 and t,n ⊨ ϕ2
    t,n ⊨ ⟨π⟩ϕ       if ∃n′, n π n′ and t,n′ ⊨ ϕ
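The satisfaction relation of this slide translates almost line by line into a recursive check. The sketch below is an illustration under assumptions: trees are nested `(label, children)` pairs, nodes are paths of child indices, and only two atomic relations are provided, `"child"` and `"next"` (next sibling). These two relations and all the names used here are hypothetical choices, not taken from the slides.

```python
# A small ordered labeled tree: (label, list of children).
TREE = ("P", [
    ("NP", [("det", []), ("AP", [])]),
    ("VP", [("v", []), ("NP", [])]),
])


def subtree(tree, node):
    """Follow a path of child indices from the root."""
    for index in node:
        tree = tree[1][index]
    return tree


def successors(tree, node, relation):
    """Nodes related to `node` by an atomic relation (assumed: child, next sibling)."""
    if relation == "child":
        return [node + (i,) for i in range(len(subtree(tree, node)[1]))]
    if relation == "next":
        if not node:  # the root has no sibling
            return []
        parent, i = node[:-1], node[-1] + 1
        return [parent + (i,)] if i < len(subtree(tree, parent)[1]) else []
    raise ValueError(f"unknown relation: {relation}")


def holds(tree, node, formula):
    """Decide whether the formula is satisfied at the node, clause by clause."""
    kind = formula[0]
    if kind == "top":
        return True
    if kind == "prop":  # the label of the node is p
        return subtree(tree, node)[0] == formula[1]
    if kind == "not":
        return not holds(tree, node, formula[1])
    if kind == "and":
        return holds(tree, node, formula[1]) and holds(tree, node, formula[2])
    if kind == "dia":  # some related node satisfies the subformula
        return any(holds(tree, m, formula[2])
                   for m in successors(tree, node, formula[1]))
    raise ValueError(f"unknown formula: {kind}")


# P ∧ ⟨child⟩(NP ∧ ⟨next⟩VP): a P node with an NP child followed by a VP.
RULE = ("and", ("prop", "P"),
        ("dia", "child", ("and", ("prop", "NP"), ("dia", "next", ("prop", "VP")))))
print(holds(TREE, (), RULE))
```

The formula `RULE` is evaluated at the root, given as the empty path `()`. Evaluating it at the VP node, path `(1,)`, fails because that node is not labeled P.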

Formulæ: Example
  [Formula garbled in extraction. Surviving fragments, per node label: P (NP VP), AP (adj AP) (n), det (The) (a), ...]

References

Afanasiev, L., Blackburn, P., Dimitriou, I., Gaiffe, B., Goris, E., Marx, M., and de Rijke, M., 2005. PDL for ordered trees. Journal of Applied Non-Classical Logic, 15(2):115–135. doi:10.3166/jancl.15.115-135.
Aho, A.V., Johnson, S.C., and Ullman, J.D., 1975. Deterministic parsing of ambiguous grammars. Communications of the ACM, 18(8):441–452. doi:10.1145/360933.360969.
Backus, J.W., 1959. The syntax and semantics of the proposed international algebraic language of the Zürich ACM-GAMM Conference. In IFIP Congress, pages 125–131.
Bar-Hillel, Y., 1953. A quasi-arithmetical notation for syntactic description. Language, 29(1):47–58. doi:10.2307/410452.
Bar-Hillel, Y., Perles, M., and Shamir, E., 1961. On formal properties of simple phrase-structure grammars. Zeitschrift für Phonetik, Sprachwissenschaft, und Kommunikationsforschung, 14:143–172.
Blackburn, P., Gardent, C., and Meyer-Viol, W., 1993. Talking about trees. In EACL 93, pages 21–29. ACL Press. doi:10.3115/976744.976748.
Cantor, D.G., 1962. On the ambiguity problem of Backus systems. Journal of the ACM, 9(4):477–479. doi:10.1145/321138.321145.
Chomsky, N., 1956. Three models for the description of language. IEEE Transactions on Information Theory, 2(3):113–124. doi:10.1109/tit.1956.1056813.
Chomsky, N., 1959. On certain formal properties of grammars. Information and Control, 2(2):137–167. doi:10.1016/s0019-9958(59)90362-6.
Chomsky, N., 1962. Context-free grammars and pushdown storage. Quarterly Progress Report 65, Research Laboratory of Electronics, M.I.T.
Chomsky, N. and Schützenberger, M.P., 1963. The algebraic theory of context-free languages. In Braffort, P. and Hirshberg, D., editors, Computer Programming and Formal Systems, volume 35 of Studies in Logic, pages 118–161. North-Holland Publishing. doi:10.1016/s0049-237x(08)72023-8.
DeRemer, F.L., 1969. Practical Translators for LR(k) Languages. PhD thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts. http://www.lcs.mit.edu/publications/pubs/pdf/mit-lcs-tr-065.pdf.
Earley, J., 1975. Ambiguity and precedence in syntax description. Acta Informatica, 4(2):183–192. doi:10.1007/bf00288747.
Floyd, R.W., 1962a. On ambiguity in phrase structure languages. Communications of the ACM, 5(10):526. doi:10.1145/368959.368993.
Floyd, R.W., 1962b. On the nonexistence of a phrase structure grammar for ALGOL 60. Communications of the ACM, 5(9):483–484. doi:10.1145/368834.368898.
Ginsburg, S. and Greibach, S., 1966. Deterministic context-free languages. Information and Control, 9(6):620–648. doi:10.1016/s0019-9958(66)80019-0.
Ginsburg, S. and Rice, H.G., 1962. Two families of languages related to ALGOL. Journal of the ACM, 9(3):350–371. doi:10.1145/321127.321132.
Ginsburg, S. and Ullian, J., 1966. Ambiguity in context free languages. Journal of the ACM, 13(1):62–89. doi:10.1145/321312.321318.
Irons, E.T., 1961. A syntax directed compiler for ALGOL 60. Communications of the ACM, 4(1):51–55. doi:10.1145/366062.366083.
Johnson, S.C., 1975. YACC: yet another compiler compiler. Computing science technical report 32, AT&T Bell Laboratories, Murray Hill, New Jersey.
Knuth, D.E., 1965. On the translation of languages from left to right. Information and Control, 8(6):607–639. doi:10.1016/s0019-9958(65)90426-2.
Lambek, J., 1958. The mathematics of sentence structure. American Mathematical Monthly, 65(3):154–170. doi:10.2307/2310058.
Naur, P., editor, 1960. Report on the algorithmic language ALGOL 60. Communications of the ACM, 3(5):299–314. doi:10.1145/367236.367262.
Oettinger, A.G., 1961. Automatic syntactic analysis and the pushdown store. In Structure of Language and its Mathematical Aspects, volume 12 of Proc. of Symposia in Applied Math., pages 104–129. AMS.
Parikh, R.J., 1966. On context-free languages. Journal of the ACM, 13(4):570–581. doi:10.1145/321356.321364.
Yngve, V.H., 1960. A model and an hypothesis for language structure. Proceedings of the American Philosophical Society, 104(5):444–466.