# Statistical Parsing (11/29/2010)


(The following slides are modified from Prof. Raymond Mooney's slides.)

## Statistical Parsing

Statistical parsing uses a probabilistic model of syntax to assign a probability to each parse tree. It provides a principled approach to resolving syntactic ambiguity, and it allows supervised learning of parsers from treebanks of parse trees provided by human linguists. It also allows unsupervised learning of parsers from unannotated text, although the accuracy of such parsers has been limited.

## Probabilistic Context-Free Grammar (PCFG)

A PCFG is a probabilistic version of a CFG in which each production has a probability. The probabilities of all productions rewriting a given non-terminal must sum to 1, defining a distribution for each non-terminal. String generation is now probabilistic: production probabilities are used to nondeterministically select a production for rewriting a given non-terminal.

## Simple PCFG for ATIS English

Grammar (each production carries a probability; the probabilities for each left-hand side sum to 1):

    S → NP VP
    S → Aux NP VP
    S → VP
    NP → Pronoun
    NP → Proper-Noun
    NP → Det Nominal
    Nominal → Noun
    Nominal → Nominal Noun
    Nominal → Nominal PP
    VP → Verb
    VP → Verb NP
    VP → VP PP
    PP → Prep NP

Lexicon:

    Det → the | a | that | this
    Noun → book | flight | meal | money
    Verb → book | include | prefer
    Pronoun → I | he | she | me
    Proper-Noun → Houston | NWA
    Aux → does              1.0
    Prep → from | to | on | near | through

## Sentence Probability

Assume the productions for each node are chosen independently. The probability of a derivation is then the product of the probabilities of its productions. For the derivation D₁ of "book the flight through Houston", in which the PP "through Houston" attaches to the Nominal "flight", P(D₁) is the product of the probabilities of every grammar and lexicon rule used in the tree.

## Syntactic Disambiguation

Ambiguity is resolved by picking the most probable parse tree. The alternative derivation D₂ attaches the PP "through Houston" to the VP instead; P(D₂) is computed the same way, as the product of its rule probabilities, and the parse with the higher probability is preferred.
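The independence assumption above can be sketched as a running product over the rules used in a derivation. The grammar fragment and all probability values below are invented stand-ins for illustration, not the slide's ATIS figures.

```python
from functools import reduce

# Toy PCFG fragment: (lhs, rhs) -> probability. All values are illustrative.
rules = {
    ("S", ("VP",)): 0.05,
    ("VP", ("Verb", "NP")): 0.20,
    ("NP", ("Det", "Nominal")): 0.20,
    ("Nominal", ("Noun",)): 0.75,
    ("Verb", ("book",)): 0.30,
    ("Det", ("the",)): 0.60,
    ("Noun", ("flight",)): 0.30,
}

def derivation_probability(derivation):
    """Probability of a derivation = product of its rule probabilities,
    assuming each rewrite is chosen independently."""
    return reduce(lambda p, rule: p * rules[rule], derivation, 1.0)

# The rules used in a derivation of "book the flight" under the toy grammar.
d = [("S", ("VP",)), ("VP", ("Verb", "NP")), ("Verb", ("book",)),
     ("NP", ("Det", "Nominal")), ("Det", ("the",)),
     ("Nominal", ("Noun",)), ("Noun", ("flight",))]
print(derivation_probability(d))
```

With these made-up numbers the product is 0.05 · 0.20 · 0.30 · 0.20 · 0.60 · 0.75 · 0.30 = 8.1 × 10⁻⁵.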

## Sentence Probability

The probability of a sentence is the sum of the probabilities of all of its derivations:

    P("book the flight through Houston") = P(D₁) + P(D₂)

## Three Useful Tasks

- Observation likelihood: to classify and order sentences.
- Most likely derivation: to determine the most likely parse tree for a sentence.
- Maximum likelihood training: to train a PCFG to fit empirical training data.

## PCFG: Most Likely Derivation

There is an analog of the Viterbi algorithm for efficiently determining the most probable derivation (parse tree) for a sentence. Given "John liked the dog in the pen", the PP "in the pen" could attach either to the NP "the dog" or to the VP, but the parser returns only the single parse tree with the highest probability.

## Probabilistic CKY

CKY can be modified for PCFG parsing by including in each cell a probability for each non-terminal. Cell[i,j] must retain the most probable derivation of each constituent (non-terminal) covering words i+1 through j, together with its associated probability. When transforming the grammar to Chomsky Normal Form (CNF), production probabilities must be set so as to preserve the probability of derivations.

## Probabilistic Grammar Conversion

Original grammar:

    S → NP VP
    S → Aux NP VP
    S → VP
    NP → Pronoun
    NP → Proper-Noun
    NP → Det Nominal
    Nominal → Noun
    Nominal → Nominal Noun
    Nominal → Nominal PP
    VP → Verb
    VP → Verb NP
    VP → VP PP
    PP → Prep NP

Chomsky Normal Form:

    S → NP VP
    S → X1 VP
    X1 → Aux NP
    S → book | include | prefer
    S → Verb NP
    S → VP PP
    NP → I | he | she | me
    NP → Houston | NWA
    NP → Det Nominal
    Nominal → book | flight | meal | money
    Nominal → Nominal Noun
    Nominal → Nominal PP
    VP → book | include | prefer
    VP → Verb NP
    VP → VP PP
    PP → Prep NP
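The chart-filling procedure described above can be sketched compactly. The CNF grammar and all probabilities below are invented for illustration; each cell keeps, per non-terminal, only the probability of the best derivation over all split points.

```python
from collections import defaultdict

# Toy CNF grammar with illustrative probabilities (not the slide's ATIS values).
unary = {("book",): [("Verb", 0.5), ("VP", 0.2), ("S", 0.01)],
         ("the",): [("Det", 1.0)],
         ("flight",): [("Noun", 0.5), ("Nominal", 0.3)]}
binary = {("Verb", "NP"): [("VP", 0.4), ("S", 0.05)],
          ("Det", "Nominal"): [("NP", 0.3)]}

def cky(words):
    n = len(words)
    # table[i][j] maps nonterminal -> probability of its best derivation
    # covering words[i:j]
    table = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for nt, p in unary.get((w,), []):
            table[i][i + 1][nt] = max(table[i][i + 1][nt], p)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # try every split point
                for b, pb in table[i][k].items():
                    for c, pc in table[k][j].items():
                        for a, pr in binary.get((b, c), []):
                            # retain only the most probable derivation per cell
                            table[i][j][a] = max(table[i][j][a], pr * pb * pc)
    return table[0][n]

print(cky(["book", "the", "flight"]))
```

The returned cell holds the best probability for each root non-terminal spanning the whole sentence; recovering the tree itself would additionally require back-pointers.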

(Slides 13 through 22 fill in the probabilistic CKY chart for the example sentence cell by cell; each cell entry multiplies a rule probability by the probabilities of the two child constituents, e.g. .05 × .5 × .054.)

To pick the most probable parse, take the max when combining the probabilities of multiple derivations of each constituent in each cell.

## PCFG: Observation Likelihood

There is an analog of the Forward algorithm for HMMs, called the Inside algorithm, for efficiently determining how likely a string is to be produced by a PCFG. A PCFG can therefore be used as a language model to choose between alternative sentences for speech recognition or machine translation, e.g. establishing that P(O₂) > P(O₁) for O₁ = "The dog big barked." and O₂ = "The big dog barked."

## Inside Algorithm

Use the probabilistic CKY parsing algorithm, but combine the probabilities of multiple derivations of any constituent using addition instead of max.
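The one-line change from Viterbi CKY to the Inside algorithm is the accumulation operator. The sketch below uses the same chart shape as probabilistic CKY but sums over derivations, yielding the total probability that the grammar generates the string; the tiny ambiguous grammar is invented for illustration.

```python
from collections import defaultdict

def inside(words, unary, binary):
    """Inside algorithm: table[0][n][A] = total probability that A derives
    the whole string, summed over all derivations."""
    n = len(words)
    table = [[defaultdict(float) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for nt, p in unary.get((w,), []):
            table[i][i + 1][nt] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for b, pb in table[i][k].items():
                    for c, pc in table[k][j].items():
                        for a, pr in binary.get((b, c), []):
                            table[i][j][a] += pr * pb * pc   # sum, not max
    return table[0][n]

# Toy ambiguous grammar: "a a" can be an S in three different ways,
# contributing 0.2*0.25 + 0.1*0.25 + 0.3*0.25 in total.
toy_unary = {("a",): [("A", 0.5), ("B", 0.5)]}
toy_binary = {("A", "A"): [("S", 0.2)], ("A", "B"): [("S", 0.1)],
              ("B", "B"): [("S", 0.3)]}
print(inside(["a", "a"], toy_unary, toy_binary)["S"])
```

Swapping `+=` back to `max` recovers the most-likely-derivation computation, which is why the two algorithms share one chart.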

## Example PCFG Chart for Inside Computation

(The same chart is filled again, but now the probabilities of multiple derivations of each constituent in each cell are summed.)

## Supervised PCFG Training

If parse trees are provided for the training sentences, a grammar and its parameters can all be estimated directly from counts accumulated from the treebank (with appropriate smoothing).

## Estimating Production Probabilities

The set of production rules can be taken directly from the set of rewrites in the treebank. Parameters can then be estimated directly from frequency counts in the treebank:

    P(α → β | α) = count(α → β) / count(α)

## PCFG: Maximum Likelihood Training

Given a set of sentences, induce a grammar that maximizes the probability that this data was generated from this grammar. Assume the number of non-terminals in the grammar is specified. Only an unannotated set of sequences generated from the model is needed; correct parse trees for these sentences are not required. In this sense, the training is unsupervised.

Training sentences: "John ate the apple", "A dog bit Mary", "Mary hit the dog", "John gave Mary the cat", ...
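The relative-frequency formula above is a direct counting exercise. This sketch estimates rule probabilities from a list of observed (lhs, rhs) rule occurrences; the tiny "treebank" of rule counts is invented for illustration.

```python
from collections import Counter

# Rule occurrences read off a toy treebank (invented for illustration):
# two S -> NP VP trees, one S -> VP tree, and two NP expansions.
observed_rules = [
    ("S", ("NP", "VP")), ("S", ("NP", "VP")), ("S", ("VP",)),
    ("NP", ("Det", "Noun")), ("NP", ("Pronoun",)),
]

def estimate(rule_occurrences):
    """P(lhs -> rhs) = count(lhs -> rhs) / count(lhs), the maximum-likelihood
    estimate from treebank counts (no smoothing)."""
    rule_counts = Counter(rule_occurrences)
    lhs_counts = Counter(lhs for lhs, _ in rule_occurrences)
    return {(lhs, rhs): c / lhs_counts[lhs]
            for (lhs, rhs), c in rule_counts.items()}

probs = estimate(observed_rules)
print(probs[("S", ("NP", "VP"))])  # 2 of the 3 observed S expansions
```

In practice these raw estimates would be smoothed, as the slide notes, since unseen rules otherwise get probability zero.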

## Inside-Outside

The Inside-Outside algorithm is a version of EM for unsupervised learning of a PCFG, analogous to Baum-Welch (forward-backward) for HMMs. Given the number of non-terminals, construct all possible CNF productions over these non-terminals and the observed terminal symbols, then use EM to iteratively train the probabilities of these productions to locally maximize the likelihood of the data (see the Manning and Schütze text for details). Experimental results are not impressive, but recent work imposes additional constraints to improve unsupervised grammar learning.

## Vanilla PCFG Limitations

Since the probabilities of productions do not depend on specific words or concepts, only general structural disambiguation is possible (e.g. prefer to attach PPs to Nominals). Consequently, vanilla PCFGs cannot resolve syntactic ambiguities that require semantics to resolve, e.g. "ate with fork" vs. "ate with meatballs". To work well, PCFGs must be lexicalized, i.e. productions must be specialized to specific words by including their head word in their left-hand-side non-terminals (e.g. VP-ate).

## Example of the Importance of Lexicalization

A general preference for attaching PPs to NPs rather than to VPs can be learned by a vanilla PCFG. But the desired preference can depend on specific words: for "John put the dog in the pen", the correct parse attaches the PP "in the pen" to the VP, yet a vanilla PCFG that has learned to prefer NP attachment parses it incorrectly.

## Head Words

Syntactic phrases usually have a word in them that is most central to the phrase. Linguists have defined the concept of the lexical head of a phrase. Simple rules can identify the head of any phrase by percolating head words up the parse tree:

- The head of a VP is its main verb.
- The head of an NP is its main noun.
- The head of a PP is its preposition.
- The head of a sentence is the head of its VP.

## Lexicalized Productions

Specialized productions can be generated by including the head word and its POS tag for each non-terminal as part of that non-terminal's symbol. For "John liked the dog in the pen":

    S[liked-VBD] → NP[John-NNP] VP[liked-VBD]
    VP[liked-VBD] → VBD[liked-VBD] NP[dog-NN]
    NP[dog-NN] → DT Nominal[dog-NN]
    Nominal[dog-NN] → Nominal[dog-NN] PP[in-IN]
    PP[in-IN] → IN NP[pen-NN]
    NP[pen-NN] → DT Nominal[pen-NN]
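The percolation rules above can be sketched as a small recursive procedure. The tree encoding and the head-child table below are illustrative assumptions covering only the categories named on the slide, not a full head-finding scheme.

```python
# Head percolation sketch. Trees are (label, children) tuples; a preterminal's
# single child is its word. HEAD_CHILD names the child whose head percolates up.
HEAD_CHILD = {"S": "VP", "VP": "V", "NP": "Noun", "PP": "Prep"}

def head_word(tree):
    """Return the lexical head of a phrase by percolating head words up."""
    label, children = tree
    if isinstance(children[0], str):       # preterminal: its word is the head
        return children[0]
    wanted = HEAD_CHILD.get(label)
    for child in children:
        if child[0] == wanted:
            return head_word(child)
    return head_word(children[0])          # fallback: first child

tree = ("S",
        [("NP", [("Noun", ["John"])]),
         ("VP", [("V", ["put"]),
                 ("NP", [("Noun", ["dog"])])])])
print(head_word(tree))  # the head of S is the head of its VP: the main verb
```

Real parsers use richer per-category head rules (e.g. Collins' head tables), but the percolation mechanism is the same.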

## Lexicalized Productions (continued)

For "John put the dog in the pen", the PP must attach to the VP:

    S[put-VBD] → NP[John-NNP] VP[put-VBD]
    VP[put-VBD] → VP[put-VBD] PP[in-IN]
    VP[put-VBD] → VBD[put-VBD] NP[dog-NN]

## Parameterizing Lexicalized Productions

Accurately estimating parameters for such a large number of very specialized productions could require enormous amounts of treebank data. We need some way of estimating parameters for lexicalized productions that makes reasonable independence assumptions, so that accurate probabilities for very specific rules can be learned.

## Collins' Parser

Collins' (1999) parser assumes a simple generative model of lexicalized productions. It models productions based on the context to the left and right of the head daughter:

    LHS → L_n L_{n−1} … L_1 H R_1 … R_{m−1} R_m

First the head (H) is generated, then the left (L_i) and right (R_i) context symbols are generated repeatedly until the symbol STOP is produced.

## Sample Production Generation

The production

    VP[put-VBD] → VBD[put-VBD] NP[dog-NN] PP[in-IN]

is generated with probability

    P_H(VBD | VP, put-VBD) × P_L(STOP | VP, put-VBD)
      × P_R(NP[dog-NN] | VP, put-VBD) × P_R(PP[in-IN] | VP, put-VBD)
      × P_R(STOP | VP, put-VBD)

Note: the Penn Treebank tends to have fairly flat parse trees that produce long productions.

## Estimating Production Generation Parameters

Estimate the P_H, P_L, and P_R parameters from treebank data:

    P_R(PP[in-IN] | VP, put-VBD) = Count(PP[in-IN] right of the head in a VP[put-VBD] production) / Count(symbols right of the head in a VP[put-VBD] production)

    P_R(NP[dog-NN] | VP, put-VBD) = Count(NP[dog-NN] right of the head in a VP[put-VBD] production) / Count(symbols right of the head in a VP[put-VBD] production)

Smooth these estimates by linearly interpolating with simpler models conditioned on just the POS tag, or on no lexical information at all:

    smoothed P_R(PP[in-IN] | VP, put-VBD) =
        λ₁ P_R(PP[in-IN] | VP, put-VBD)
        + (1 − λ₁) (λ₂ P_R(PP[in-IN] | VP, VBD) + (1 − λ₂) P_R(PP[in-IN] | VP))

## Missed Context Dependence

Another problem with CFGs is that which production is used to expand a non-terminal is independent of its context. This independence is frequently violated in real grammars: for example, NPs that are subjects are more likely to be pronouns than NPs that are objects.
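The interpolation above is a direct weighted sum. In this sketch the three component estimates and the λ weights are invented numbers, purely to show the arithmetic of backing off from the head word, to the head POS, to no lexical information.

```python
def smoothed(p_lex, p_pos, p_unlex, lam1=0.6, lam2=0.7):
    """Linear interpolation of a lexicalized estimate with its backoffs:
    lam1 * P(.|head word) + (1-lam1) * (lam2 * P(.|head POS) + (1-lam2) * P(.))
    Weights here are illustrative; in practice they are tuned on held-out data."""
    return lam1 * p_lex + (1 - lam1) * (lam2 * p_pos + (1 - lam2) * p_unlex)

# Invented component estimates for P_R(PP[in-IN] | VP, ...):
print(smoothed(0.30, 0.20, 0.10))
```

With these numbers the result is 0.6·0.30 + 0.4·(0.7·0.20 + 0.3·0.10) = 0.248; the smoothed estimate always lies between the most and least specific components.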

## Splitting Non-Terminals

To provide more contextual information, non-terminals can be split into multiple new non-terminals based on their parent in the parse tree, using parent annotation. A subject NP becomes NP^S, since its parent node is an S; an object NP becomes NP^VP, since its parent node is a VP.

## Parent Annotation Example

(The slide shows the parse of the example sentence with every node annotated with its parent's label: NNP^NP, VBD^VP, DT^NP, Nominal^NP, PP^Nominal, IN^PP, and so on.)

## Split and Merge

Non-terminal splitting greatly increases the size of the grammar and the number of parameters that must be learned from limited training data. The best approach is to split non-terminals only when doing so improves the accuracy of the grammar. It may also help to merge some non-terminals, removing unhelpful distinctions and learning more accurate parameters for the merged productions. Method: heuristically search for a combination of splits and merges that produces a grammar maximizing the likelihood of the training treebank.

## Treebanks

Penn Treebank: the standard corpus for testing syntactic parsing; it consists of 1.2M words of text from the Wall Street Journal (WSJ). It is typical to train on about 40,000 parsed sentences and test on an additional standard disjoint test set of 2,416 sentences. Chinese Penn Treebank: 100K words from the Xinhua news service. Other corpora exist in many languages; see the Wikipedia article "Treebank".

## First WSJ Sentence

    ( (S (NP-SBJ (NP (NNP Pierre) (NNP Vinken)) (, ,)
         (ADJP (NP (CD 61) (NN years)) (JJ old)) (, ,))
       (VP (MD will)
         (VP (VB join) (NP (DT the) (NN board))
           (PP-CLR (IN as) (NP (DT a) (JJ nonexecutive) (NN director)))
           (NP-TMP (NNP Nov.) (CD 29))))
       (. .)))

## A WSJ Sentence with a Trace (-NONE-)

    ( (S (NP-SBJ (DT The) (NNP Illinois) (NNP Supreme) (NNP Court))
       (VP (VBD ordered) (NP-1 (DT the) (NN commission))
         (S (NP-SBJ (-NONE- *-1))
           (VP (TO to)
             (VP (VP (VB audit)
                   (NP (NP (NNP Commonwealth) (NNP Edison) (POS 's))
                     (NN construction) (NN expenses)))
               (CC and)
               (VP (VB refund) (NP (DT any) (JJ unreasonable) (NN expenses)))))))
       (. .)))
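Parent annotation is a simple tree transform. This sketch (tree encoding assumed, as elsewhere) rewrites each non-root label as `label^parent`:

```python
# Parent annotation: split each non-terminal by its parent's label,
# e.g. an NP under S becomes NP^S. Trees are (label, children) tuples;
# leaves are plain word strings and are left unchanged.
def annotate(tree, parent=None):
    if isinstance(tree, str):
        return tree
    label, children = tree
    new_label = f"{label}^{parent}" if parent else label
    return (new_label, [annotate(c, label) for c in children])

t = ("S", [("NP", ["John"]),
           ("VP", [("V", ["liked"]),
                   ("NP", [("Det", ["the"]), ("Noun", ["dog"])])])])
print(annotate(t))
```

The subject NP comes out as NP^S and the object NP as NP^VP, exactly the distinction the slide motivates; the grammar read off the annotated trees is correspondingly larger.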

## Parsing Evaluation Metrics

PARSEVAL metrics measure the fraction of constituents that match between the computed and human parse trees. If P is the system's parse tree and T is the human ("gold standard") parse tree:

    Recall    = (# correct constituents in P) / (# constituents in T)
    Precision = (# correct constituents in P) / (# constituents in P)

Labeled precision and labeled recall additionally require the non-terminal label on a constituent node to be correct for the constituent to count as correct. F₁ is the harmonic mean of precision and recall.

## Computing Evaluation Metrics

For "book the flight through Houston", compare the correct tree T (PP attached to the Nominal) with the computed tree P (PP attached to the VP). Each tree has 12 constituents, of which 10 in P are correct:

    Recall = 10/12 = 83.3%    Precision = 10/12 = 83.3%    F₁ = 83.3%

## Treebank Results

The results of current state-of-the-art systems on the Penn WSJ treebank are slightly greater than 90% labeled precision and recall.

## Discriminative Parse Reranking

Motivation: even when the top-ranked parse is not correct, the correct parse is frequently among those ranked highly by a statistical parser. So use a discriminative classifier trained to select the best parse from the N-best parses produced by the original parser. The reranker can exploit global features of the entire parse, whereas a PCFG is restricted to making decisions based on local information.

## Two-Stage Reranking Approach

- Adapt the PCFG parser to produce an N-best list of the most probable parses, in addition to the most likely one.
- From each of these parses, extract a set of global features that help determine whether it is a good parse tree.
- Train a discriminative classifier (e.g. logistic regression) using the best parse in each N-best list as a positive example and the others as negatives.

Pipeline: sentence → PCFG parser → N-best parse trees → parse-tree feature extractor → parse-tree descriptions → discriminative parse-tree classifier → best parse tree.
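The PARSEVAL definitions above can be computed directly once each parse is flattened into labeled spans. The span representation and the example constituent sets below are illustrative assumptions, not the slide's trees.

```python
# PARSEVAL scoring over labeled constituents. Each constituent is encoded as
# a (label, start, end) span; the gold and predicted sets here are invented.
def parseval(gold, predicted):
    correct = len(set(gold) & set(predicted))
    recall = correct / len(gold)
    precision = correct / len(predicted)
    f1 = 2 * precision * recall / (precision + recall) if correct else 0.0
    return precision, recall, f1

gold = {("S", 0, 5), ("VP", 1, 5), ("NP", 2, 5), ("Nominal", 3, 5)}
pred = {("S", 0, 5), ("VP", 1, 5), ("NP", 2, 5), ("PP", 3, 5)}
print(parseval(gold, pred))  # 3 of 4 spans match
```

Because the label is part of the span tuple, this computes *labeled* precision and recall; dropping the label from the tuples would give the unlabeled variants.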

## Sample Parse Tree Features

- The probability of the parse from the PCFG.
- The number of parallel conjuncts: "the bird in the tree and the squirrel on the ground" vs. "the bird and the squirrel in the tree".
- The degree to which the parse tree is right-branching: English parses tend to be right-branching (cf. the parse of "Book the flight through Houston").
- The frequency of various tree fragments, i.e. specific combinations of 2 or 3 rules.

## Evaluation of Reranking

Reranking is limited by the oracle accuracy, i.e. the accuracy that results when an omniscient oracle picks the best parse from the N-best list. Typical current oracle accuracy is around F₁ = 97%. Reranking can generally improve the test accuracy of current PCFG models by a percentage point or two.

## Other Discriminative Parsing

There are also parsing models that move from generative PCFGs to a fully discriminative model, e.g. max-margin parsing (Taskar et al., 2004). A more recent model efficiently reranks all of the parses in a complete (compactly encoded) parse forest, avoiding the need to generate an N-best list (forest reranking; Huang, 2008).

## Human Parsing

Computational parsers can be used to predict human reading time, as measured by tracking the time taken to read each word in a sentence. Psycholinguistic studies show that words that are more probable given the preceding lexical and syntactic context are read faster:

- John put the dog in the pen with a lock.
- John put the dog in the pen with a bone in the car.
- John liked the dog in the pen with a bone.

Modeling these effects requires an incremental statistical parser that incorporates one word at a time into a continuously growing parse tree.

## Garden Path Sentences

People are confused by sentences that seem to have a particular syntactic structure but then suddenly violate it, so the listener is led down the garden path:

- "The horse raced past the barn fell." vs. "The horse raced past the barn broke his leg."
- "The complex houses married students."
- "The old man the sea."
- "While Anna dressed the baby spit up on the bed."

Incremental computational parsers can try to predict and explain the problems encountered in parsing such sentences.

## Center Embedding

Nested expressions are hard for humans to process beyond one or two levels of nesting:

- The rat the cat chased died.
- The rat the cat the dog bit chased died.
- The rat the cat the dog the boy owned bit chased died.

Processing them requires remembering and popping incomplete constituents from a stack, which strains human short-term memory. Equivalent tail-embedded (tail-recursive) versions are easier to understand, since no stack is required: "The boy owned a dog that bit a cat that chased a rat that died."

## Dependency Grammars

An alternative to phrase-structure grammar is to define a parse as a directed graph between the words of a sentence, representing dependencies between the words. For "John liked the dog in the pen", the dependency parse makes the verb the root, with "John" and "dog" among its dependents; a typed dependency parse additionally labels each arc (e.g. nsubj, dobj, det).

## Dependency Graph from a Parse Tree

A phrase-structure parse can be converted to a dependency tree by making the head of each non-head child of a node depend on the head of the head child.

## Unification Grammars

To handle agreement issues more effectively, each constituent carries a list of features such as number, person, and gender, which may or may not be specified for a given constituent. For two constituents to combine into a larger constituent, their features must unify, i.e. consistently combine into a merged set of features. Expressive grammars and parsers (e.g. HPSG) have been developed using this approach and have been partially integrated with modern statistical models of disambiguation.

## Mildly Context-Sensitive Grammars

Some grammatical formalisms provide a degree of context-sensitivity that helps capture aspects of natural-language syntax that are not easily handled by CFGs. Tree Adjoining Grammar (TAG) is based on combining tree fragments rather than individual phrases. Combinatory Categorial Grammar (CCG) consists of a categorial lexicon that associates a syntactic and semantic category with each word, and combinatory rules that define how categories combine to form other categories.

## Statistical Parsing Conclusions

Statistical models such as PCFGs allow for probabilistic resolution of ambiguities. PCFGs can be easily learned from treebanks. Lexicalization and non-terminal splitting are required to effectively resolve many ambiguities. Current statistical parsers are quite accurate, but not yet at the level of human-expert agreement.
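The head-based conversion described above can be sketched as a single recursive pass: each non-head child's head word becomes a dependent of the head child's head word. The tree encoding and the head-child preference table are illustrative assumptions covering only the categories in the example, not a full conversion scheme.

```python
# Constituency-to-dependency conversion via head percolation. Trees are
# (label, children) tuples; HEAD lists which child labels can supply the head.
HEAD = {"S": {"VP"}, "VP": {"VBD"}, "NP": {"NN", "NNP"}, "PP": {"IN"}}

def to_dependencies(tree, deps):
    """Return the head word of `tree`, appending (dependent, head) arcs to deps."""
    label, children = tree
    if isinstance(children[0], str):        # preterminal: its word is the head
        return children[0]
    child_heads = [(c[0], to_dependencies(c, deps)) for c in children]
    preferred = HEAD.get(label, set())
    head = next((w for lbl, w in child_heads if lbl in preferred),
                child_heads[0][1])          # fallback: first child's head
    for _, w in child_heads:
        if w != head:
            deps.append((w, head))          # non-head child depends on head
    return head

tree = ("S", [("NP", [("NNP", ["John"])]),
              ("VP", [("VBD", ["liked"]),
                      ("NP", [("DT", ["the"]), ("NN", ["dog"])])])])
deps = []
root = to_dependencies(tree, deps)
print(root, deps)
```

The root of the dependency graph is the head word of the whole sentence (here the main verb), and every other word ends up with exactly one head, as the slide's dependency picture suggests.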

### Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG

Case government vs Case agreement: modelling Modern Greek case attraction phenomena in LFG Dr. Kakia Chatsiou, University of Essex achats at essex.ac.uk Explorations in Syntactic Government and Subcategorisation,

### Basic Syntax. Doug Arnold We review some basic grammatical ideas and terminology, and look at some common constructions in English.

Basic Syntax Doug Arnold doug@essex.ac.uk We review some basic grammatical ideas and terminology, and look at some common constructions in English. 1 Categories 1.1 Word level (lexical and functional)

### An Efficient Implementation of a New POP Model

An Efficient Implementation of a New POP Model Rens Bod ILLC, University of Amsterdam School of Computing, University of Leeds Nieuwe Achtergracht 166, NL-1018 WV Amsterdam rens@science.uva.n1 Abstract

### AQUA: An Ontology-Driven Question Answering System

AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.

### The Smart/Empire TIPSTER IR System

The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of

### Proof Theory for Syntacticians

Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax

### THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING

SISOM & ACOUSTICS 2015, Bucharest 21-22 May THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING MarilenaăLAZ R 1, Diana MILITARU 2 1 Military Equipment and Technologies Research Agency, Bucharest,

### Twitter Sentiment Classification on Sanders Data using Hybrid Approach

IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders

### Three New Probabilistic Models. Jason M. Eisner. CIS Department, University of Pennsylvania. 200 S. 33rd St., Philadelphia, PA , USA

Three New Probabilistic Models for Dependency Parsing: An Exploration Jason M. Eisner CIS Department, University of Pennsylvania 200 S. 33rd St., Philadelphia, PA 19104-6389, USA jeisner@linc.cis.upenn.edu

### Approaches to control phenomena handout Obligatory control and morphological case: Icelandic and Basque

Approaches to control phenomena handout 6 5.4 Obligatory control and morphological case: Icelandic and Basque Icelandinc quirky case (displaying properties of both structural and inherent case: lexically

### Applications of memory-based natural language processing

Applications of memory-based natural language processing Antal van den Bosch and Roser Morante ILK Research Group Tilburg University Prague, June 24, 2007 Current ILK members Principal investigator: Antal

### Universal Grammar 2. Universal Grammar 1. Forms and functions 1. Universal Grammar 3. Conceptual and surface structure of complex clauses

Universal Grammar 1 evidence : 1. crosslinguistic investigation of properties of languages 2. evidence from language acquisition 3. general cognitive abilities 1. Properties can be reflected in a.) structural

### Cross Language Information Retrieval

Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................

### ESSLLI 2010: Resource-light Morpho-syntactic Analysis of Highly

ESSLLI 2010: Resource-light Morpho-syntactic Analysis of Highly Inflected Languages Classical Approaches to Tagging The slides are posted on the web. The url is http://chss.montclair.edu/~feldmana/esslli10/.

### Annotation Projection for Discourse Connectives

SFB 833 / Univ. Tübingen Penn Discourse Treebank Workshop Annotation projection Basic idea: Given a bitext E/F and annotation for F, how would the annotation look for E? Examples: Word Sense Disambiguation

### COMPUTATIONAL COMPLEXITY OF LEFT-ASSOCIATIVE GRAMMAR

COMPUTATIONAL COMPLEXITY OF LEFT-ASSOCIATIVE GRAMMAR ROLAND HAUSSER Institut für Deutsche Philologie Ludwig-Maximilians Universität München München, West Germany 1. CHOICE OF A PRIMITIVE OPERATION The

### What is NLP? CS 188: Artificial Intelligence Spring Why is Language Hard? The Big Open Problems. Information Extraction. Machine Translation

C 188: Artificial Intelligence pring 2006 What is NLP? Lecture 27: NLP 4/27/2006 Dan Klein UC Berkeley Fundamental goal: deep understand of broad language Not just string processing or keyword matching!

### A Version Space Approach to Learning Context-free Grammars

Machine Learning 2: 39~74, 1987 1987 Kluwer Academic Publishers, Boston - Manufactured in The Netherlands A Version Space Approach to Learning Context-free Grammars KURT VANLEHN (VANLEHN@A.PSY.CMU.EDU)

### POS tagging of Chinese Buddhist texts using Recurrent Neural Networks

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks Longlu Qin Department of East Asian Languages and Cultures longlu@stanford.edu Abstract Chinese POS tagging, as one of the most important

### The College Board Redesigned SAT Grade 12

A Correlation of, 2017 To the Redesigned SAT Introduction This document demonstrates how myperspectives English Language Arts meets the Reading, Writing and Language and Essay Domains of Redesigned SAT.

### Lecture 1: Machine Learning Basics

1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3

### Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za

### Learning Computational Grammars

Learning Computational Grammars John Nerbonne, Anja Belz, Nicola Cancedda, Hervé Déjean, James Hammerton, Rob Koeling, Stasinos Konstantopoulos, Miles Osborne, Franck Thollard and Erik Tjong Kim Sang Abstract

### Informatics 2A: Language Complexity and the. Inf2A: Chomsky Hierarchy

Informatics 2A: Language Complexity and the Chomsky Hierarchy September 28, 2010 Starter 1 Is there a finite state machine that recognises all those strings s from the alphabet {a, b} where the difference

### What Can Neural Networks Teach us about Language? Graham Neubig a2-dlearn 11/18/2017

What Can Neural Networks Teach us about Language? Graham Neubig a2-dlearn 11/18/2017 Supervised Training of Neural Networks for Language Training Data Training Model this is an example the cat went to

### SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,

### The Discourse Anaphoric Properties of Connectives

The Discourse Anaphoric Properties of Connectives Cassandre Creswell, Kate Forbes, Eleni Miltsakaki, Rashmi Prasad, Aravind Joshi Λ, Bonnie Webber y Λ University of Pennsylvania 3401 Walnut Street Philadelphia,

### Memory-based grammatical error correction

Memory-based grammatical error correction Antal van den Bosch Peter Berck Radboud University Nijmegen Tilburg University P.O. Box 9103 P.O. Box 90153 NL-6500 HD Nijmegen, The Netherlands NL-5000 LE Tilburg,

### CS Machine Learning

CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing

### DEVELOPMENT OF A MULTILINGUAL PARALLEL CORPUS AND A PART-OF-SPEECH TAGGER FOR AFRIKAANS

DEVELOPMENT OF A MULTILINGUAL PARALLEL CORPUS AND A PART-OF-SPEECH TAGGER FOR AFRIKAANS Julia Tmshkina Centre for Text Techitology, North-West University, 253 Potchefstroom, South Africa 2025770@puk.ac.za

### First Grade Curriculum Highlights: In alignment with the Common Core Standards

First Grade Curriculum Highlights: In alignment with the Common Core Standards ENGLISH LANGUAGE ARTS Foundational Skills Print Concepts Demonstrate understanding of the organization and basic features

### Iterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages

Iterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages Nuanwan Soonthornphisaj 1 and Boonserm Kijsirikul 2 Machine Intelligence and Knowledge Discovery Laboratory Department of Computer

### The Strong Minimalist Thesis and Bounded Optimality

The Strong Minimalist Thesis and Bounded Optimality DRAFT-IN-PROGRESS; SEND COMMENTS TO RICKL@UMICH.EDU Richard L. Lewis Department of Psychology University of Michigan 27 March 2010 1 Purpose of this

### An Interactive Intelligent Language Tutor Over The Internet

An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This

### Multilingual Sentiment and Subjectivity Analysis

Multilingual Sentiment and Subjectivity Analysis Carmen Banea and Rada Mihalcea Department of Computer Science University of North Texas rada@cs.unt.edu, carmen.banea@gmail.com Janyce Wiebe Department

### Good-Enough Representations in Language Comprehension

CURRENT DIRECTIONS IN PSYCHOLOGICAL SCIENCE 11 Good-Enough Representations in Language Comprehension Fernanda Ferreira, 1 Karl G.D. Bailey, and Vittoria Ferraro Department of Psychology and Cognitive Science

### Hyperedge Replacement and Nonprojective Dependency Structures

Hyperedge Replacement and Nonprojective Dependency Structures Daniel Bauer and Owen Rambow Columbia University New York, NY 10027, USA {bauer,rambow}@cs.columbia.edu Abstract Synchronous Hyperedge Replacement

### Specifying a shallow grammatical for parsing purposes

Specifying a shallow grammatical for parsing purposes representation Atro Voutilainen and Timo J~irvinen Research Unit for Multilingual Language Technology P.O. Box 4 FIN-0004 University of Helsinki Finland

### Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models

Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models Richard Johansson and Alessandro Moschitti DISI, University of Trento Via Sommarive 14, 38123 Trento (TN),

### Adapting Stochastic Output for Rule-Based Semantics

Adapting Stochastic Output for Rule-Based Semantics Wissenschaftliche Arbeit zur Erlangung des Grades eines Diplom-Handelslehrers im Fachbereich Wirtschaftswissenschaften der Universität Konstanz Februar

### Distant Supervised Relation Extraction with Wikipedia and Freebase

Distant Supervised Relation Extraction with Wikipedia and Freebase Marcel Ackermann TU Darmstadt ackermann@tk.informatik.tu-darmstadt.de Abstract In this paper we discuss a new approach to extract relational

### Visual CP Representation of Knowledge

Visual CP Representation of Knowledge Heather D. Pfeiffer and Roger T. Hartley Department of Computer Science New Mexico State University Las Cruces, NM 88003-8001, USA email: hdp@cs.nmsu.edu and rth@cs.nmsu.edu

### Python Machine Learning

Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled

### Discriminative Learning of Beam-Search Heuristics for Planning

Discriminative Learning of Beam-Search Heuristics for Planning Yuehua Xu School of EECS Oregon State University Corvallis,OR 97331 xuyu@eecs.oregonstate.edu Alan Fern School of EECS Oregon State University