Sentence Processing Lecture 5 Introduction to Psycholinguistics


Sentence Processing
Lecture 5, Introduction to Psycholinguistics
Matthew W. Crocker, Pia Knoeferle
Department of Computational Linguistics, Saarland University

Reading: Altmann, G. Ambiguity in Sentence Processing. Trends in Cognitive Sciences, 2:4, 1998.
How do the accounts Altmann discusses relate to the notion of linguistic modularity?
What kinds of information are used during processing?
We will return later in the course to:
- theories of ambiguity resolution
- connectionist and constraint-based processing models
Next lecture: Experimental Methods II (PK)

Theories of Sentence Processing
Structure-based theories: disambiguation based on structural heuristics
Grammar-based theories: preferred structure based on grammatical principles
Experience-based theories: structural preferences are based on prior experience
Interactive accounts: disambiguation draws on diverse knowledge sources
Resource-based accounts: preferred structure involves the least resources

Linking Hypotheses
Relate the theory/model to some observed measure; it is typically impossible to predict measures completely.
Theories of parsing typically determine:
- what mechanism is used to construct interpretations?
- which information sources are used by the mechanism?
- which representation is preferred/constructed when ambiguity arises?
Linking Hypothesis: preferred sentence structures should have faster reading times in the disambiguating region than dispreferred ones.

The Garden Path Theory (Frazier)
Prepositional Phrase Attachment:
[S [NP [PN John]] [VP [V saw] [NP [Det the] [N man]] [PP [P with] [NP the telescope]]]]
Which attachment do people initially prefer?

First Strategy: Minimal Attachment
Minimal Attachment: adopt the analysis which requires postulating the fewest nodes.
VP attachment (fewer nodes):
[S [NP [PN John]] [VP [V saw] [NP [Det the] [N man]] [PP [P with] [NP the telescope]]]]
NP attachment (one extra NP node):
[S [NP [PN John]] [VP [V saw] [NP [NP [Det the] [N man]] [PP [P with] [NP the telescope]]]]]
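Minimal Attachment's node counting can be sketched in a few lines of Python (a hypothetical illustration; the tuple encoding of trees and the helper names are my own, not Frazier's):

```python
# A hypothetical sketch of Minimal Attachment as node counting.
# Parses are nested tuples: (label, child, ...); leaves are plain strings.

def count_nodes(tree):
    """Count the syntactic nodes (non-terminal labels) a parse postulates."""
    if isinstance(tree, str):  # lexical leaf, not a postulated node
        return 0
    return 1 + sum(count_nodes(child) for child in tree[1:])

# "John saw the man with the telescope"
# VP attachment: the PP is a daughter of VP.
vp_attach = ("S",
             ("NP", ("PN", "John")),
             ("VP", ("V", "saw"),
                    ("NP", ("Det", "the"), ("N", "man")),
                    ("PP", ("P", "with"),
                           ("NP", ("Det", "the"), ("N", "telescope")))))

# NP attachment: the PP modifies "the man", requiring one extra NP node.
np_attach = ("S",
             ("NP", ("PN", "John")),
             ("VP", ("V", "saw"),
                    ("NP", ("NP", ("Det", "the"), ("N", "man")),
                           ("PP", ("P", "with"),
                                  ("NP", ("Det", "the"), ("N", "telescope"))))))

preferred = min([vp_attach, np_attach], key=count_nodes)
print(count_nodes(vp_attach), count_nodes(np_attach))  # 13 14: VP attachment wins
```

Counting nodes this way makes the strategy's prediction concrete: the VP-attachment parse postulates one node fewer, so it is built first.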

Second Strategy: Late Closure
Late Closure: attach material into the most recently constructed phrase marker.
[S [NP The reporter] [VP [V said] [S [NP the plane] [VP [V crashed] [AdvP last night]]]]]
(last night attaches to the most recent phrase, the VP of crashed, rather than to said)

Summary of Frazier
Parsing preferences are guided by general principles:
- serial structure building
- reanalyze based on syntactic conflict
- reanalyze based on low plausibility ("thematic fit")
Psychological assumptions:
- Modularity: only syntactic (not lexical, not semantic) information is used for initial structure building
- Resources: emphasizes the importance of memory limitations
- Processing strategies are universal, innate

Garden-Path Theory: Frazier (1978)
What architecture is assumed? A modular syntactic processor, with restricted lexical (category) and semantic knowledge.
What mechanism is used to construct interpretations? Incremental, serial parsing, with reanalysis.
What information is used to determine preferred structure? General syntactic principles based on the current phrase structure.
Linking Hypothesis: parse complexity and reanalysis cause increased RTs.

Against linguistic modularity
Empirical evidence from on-line methods: evidence for immediate (very early) interaction effects of animacy, frequency, plausibility, discourse context
- The woman/patient sent the flowers was pleased
Appropriate computational frameworks:
- symbolic constraint-satisfaction systems
- connectionist systems & competitive activation models
Homogeneous/Integrative Linguistic Theory: HPSG, with multiple levels of representation within a unified formalism

NP/S-Complement Ambiguity
The student knew the solution to the problem.
The student knew the solution was incorrect.
NP-complement analysis: [S [NP The student] [VP [V knew] [NP the solution]]] ...
S-complement analysis: [S [NP The student] [VP [V knew] [S [NP the solution] [VP ...]]]]

Grammar-Based Strategies
Not concerned with representation or form, but defined in terms of syntactic content.
Strategies are modular, but knowledge-based.
Motivation: strategies are derived from the purpose of the task, not e.g. computational efficiency.
Closer competence-performance relationship.

Pritchett (1992)
Rather than minimize complexity, maximize role assignment: incrementally establish primary syntactic dependencies.
Theta-Criterion (GB theory, also in LFG and HPSG): each argument must receive exactly one theta-role, and each theta-role must be assigned to exactly one argument.
Theta-Attachment: maximally satisfy the theta-criterion at every point during processing, given the maximal theta-grid of the verb.
Theta-Reanalysis Constraint: reanalysis of a constituent out of its theta-domain results in a conscious garden-path effect.

Theta-Reanalysis: Easy
Reanalysis to a position within the original theta-domain is easy.
Initial analysis: [S [NP The student] [VP [V knew] [NP the solution]]] ...
Reanalysis: [S [NP The student] [VP [V knew] [S [NP the solution] [VP was incorrect]]]]

Theta-Reanalysis: Difficult
Reanalysis to a position outside the original theta-domain is difficult.
After the man left the shop closed
Initial analysis: [PP [P After] [S [NP the man] [VP [V left] [NP the shop]]]] ... then closed arrives with no subject available.
Reanalysis: [S [PP [P After] [S [NP the man] [VP left]]] [NP the shop] [VP closed]]
Reanalyzing the shop out of the theta-domain of left yields a conscious garden path.

Pritchett: Another example
Without her contributions the orphanage closed
Without: a preposition with a single thematic role
her: either the determiner of a yet unseen NP head, or a full NP complement (pronoun), which receives the role [Theta-attach]
contributions: either the head of a new NP, without a theta-role, or build the larger NP with her, which receives the role [Theta-attach]

Well-known local ambiguities
NP/VP Attachment Ambiguity:
The cop [saw [the burglar] [with the binoculars]]
The cop saw [the burglar [with the gun]]
NP/S-Complement Attachment Ambiguity:
The athlete [realised [his goals]] last week
The athlete realised [[his goals] were unattainable]
Clause-boundary Ambiguity:
Since Jay always [jogs [a mile]] [the race doesn't seem very long]
Since Jay always jogs [[a mile] doesn't seem very long]
Reduced Relative-Main Clause Ambiguity:
[The woman [delivered the junkmail on Thursdays]]
[[The woman [delivered the junkmail]] threw it away]
Relative/Complement Clause Ambiguity:
The doctor [told [the woman] [that he was in love with her]]
The doctor [told [the woman [that he was in love with]] [to leave]]

Grammar-Based (cont'd)
Theta-Attachment: reliance on theta-grids means it is head-driven.
OK for English, but not incremental for head-final languages.
The same problem arises for Abney (1989) and other head-driven models.

Pritchett's Theory (1992)
What architecture is assumed? A modular lexico-syntactic processor with syntactic and thematic role features.
What mechanism is used to construct interpretations? Incremental, serial parsing, with reanalysis.
What information is used to determine preferred structure? Grammar principles and thematic role information.
Linking Hypothesis: a TRC violation causes a garden path; reanalysis without a TRC violation is relatively easy.

Experience and non-syntactic constraints
The previous accounts focus on:
- syntactic (and lexico-syntactic) ambiguity
- purely syntactic mechanisms for disambiguation
and assume a modular parser and the primacy of syntax.
Does our prior experience with language determine our preferences for interpreting the sentences we hear?
Tuning hypothesis: disambiguate structure based on how it has been most frequently disambiguated in the past.
Non-syntactic constraints: to what extent do semantics, intonation, and context influence our resolution of ambiguity?

Multiple constraints in ambiguity resolution
The doctor told the woman that...
... story/diet was unhealthy
... he was in love with her husband
... he was in love with to leave
... story was about to leave
Prosody: intonation can assist disambiguation
Lexical preference: that = {Comp, Det, RelPro}
Subcat: told = { [ _ NP NP], [ _ NP S], [ _ NP S'], [ _ NP Inf] }
Semantics: referential context, plausibility
- Reference may determine argument attachment over modifier attachment
- Plausibility of story versus diet as indirect object

Probabilistic Theories of Processing
Task of comprehension: recover the correct interpretation.
Goal: determine the most likely analysis for a given input: argmax P(s_i) for all s_i ∈ I.
P can hide a multitude of sins:
- P corresponds to the degree of belief in an interpretation
- influenced by recent utterances, experience, context
Implementation: P is determined by frequencies in corpora or completions.
To compare probabilities (of the s_i), assume parallelism.
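The argmax formulation can be made concrete with a toy sketch (the candidate analyses and belief values below are invented for illustration, not taken from any experiment):

```python
# A toy illustration of comprehension as argmax over interpretations.
# The candidate structures and their degrees of belief are made up.

beliefs = {            # P(s_i): degree of belief in each candidate structure
    "NP attachment": 0.35,
    "VP attachment": 0.65,
}

# argmax P(s_i) over all candidates s_i
preferred = max(beliefs, key=beliefs.get)
print(preferred)  # VP attachment
```

Whatever determines the numbers (corpus frequencies, completions, context), the comprehension step itself is just this maximization over the live candidates, which is why comparing probabilities presupposes some form of parallelism.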

Implementation
Interpretation of probabilities: likelihood of a structure occurring; P can be determined by frequencies in corpora or human completions.
Estimation of probabilities: infinite structural possibilities = sparse data; associate probabilities with a (finite) grammar, e.g. PCFGs.
What mechanisms are required:
- incremental structure building and estimation of probabilities
- comparison of probabilities entails parallelism

Probabilistic Grammars
Context-free rules annotated with probabilities:
- the probabilities of all rules with the same LHS sum to one;
- the probability of a parse is the product of the probabilities of all rules applied in the parse.
Example (Manning and Schütze 1999):
S → NP VP [1.0]     NP → NP PP [0.4]
PP → P NP [1.0]     NP → astronomers [0.1]
VP → V NP [0.7]     NP → ears [0.18]
VP → VP PP [0.3]    NP → saw [0.04]
P → with [1.0]      NP → stars [0.18]
V → saw [1.0]       NP → telescopes [0.1]
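The product-of-rules definition can be sketched directly (a minimal illustration: the rule probabilities are those of the Manning & Schütze example, while the tuple encoding of trees and the function name are my own):

```python
# A minimal PCFG sketch: the probability of a parse is the product of the
# probabilities of all rules applied in it (Manning & Schütze's rule set).
from math import prod

RULES = {
    ("S",  ("NP", "VP")): 1.0,
    ("PP", ("P", "NP")):  1.0,
    ("VP", ("V", "NP")):  0.7,
    ("VP", ("VP", "PP")): 0.3,
    ("NP", ("NP", "PP")): 0.4,
    ("P",  ("with",)):    1.0,
    ("V",  ("saw",)):     1.0,
    ("NP", ("astronomers",)): 0.1,
    ("NP", ("ears",)):    0.18,
    ("NP", ("saw",)):     0.04,
    ("NP", ("stars",)):   0.18,
    ("NP", ("telescopes",)): 0.1,
}

def parse_prob(tree):
    """Multiply the probabilities of all rules applied in the parse."""
    if isinstance(tree, str):       # word: contributes no rule of its own
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULES[(label, rhs)] * prod(parse_prob(c) for c in children)

# "astronomers saw stars with ears": PP attached inside the object NP ...
t1 = ("S", ("NP", "astronomers"),
           ("VP", ("V", "saw"),
                  ("NP", ("NP", "stars"),
                         ("PP", ("P", "with"), ("NP", "ears")))))
# ... versus PP attached to VP
t2 = ("S", ("NP", "astronomers"),
           ("VP", ("VP", ("V", "saw"), ("NP", "stars")),
                  ("PP", ("P", "with"), ("NP", "ears"))))

print(parse_prob(t1), parse_prob(t2))  # ≈ 0.0009072 vs ≈ 0.0006804
```

With this grammar the NP-attachment parse comes out more probable, so a probability-ranking parser would prefer it.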

Parse Ranking
(Figures: the candidate parses ranked by their probabilities.)

Jurafsky (1996)
A probabilistic model of lexical and syntactic disambiguation that exploits concepts from computational linguistics: PCFGs, Bayesian modeling, frame probabilities.
Overview of issues:
- data to be modeled: frame preferences, garden paths;
- architecture: serial, parallel, limited parallel;
- probabilistic CFGs, frame probabilities;
- examples for frame preferences, garden paths;
- comparison with other models;
- problems and issues.

Frame Preferences
The women discussed the dogs on the beach.
a. The women discussed them (the dogs) while on the beach. (10%)
b. The women discussed the dogs which were on the beach. (90%)

Frame Preferences (2)
The women kept the dogs on the beach.
a. The women kept the dogs which were on the beach.
b. The women kept them (the dogs) while on the beach.

Modeling Garden Paths
Reduced relative clauses often cause irrecoverable difficulty, but not always:
- The horse raced past the barn fell (irrecoverable)
- The bird found died (recoverable)
We can use probabilities to distinguish the two cases, in a way a purely structural account (Frazier, or Pritchett) cannot.
Assume a bounded, parallel parser:
- the parse with the highest probability is preferred;
- only those parses which are within some beam of the preferred parse are kept; others are discarded.

The horse raced past the barn fell (figure: incremental parse probabilities)
The bird found died (figure: incremental parse probabilities)

The Jurafsky Model
Setting the beam width:
- The horse raced past the barn fell: probability ratio ≈ 82:1
- The bird found died: probability ratio ≈ 4:1
Jurafsky assumes a garden path occurs (i.e. a parse is pruned) if its probability ratio with the best parse is greater than 5:1.
Open issues: Where do we get the probabilities? Does the model work for other languages?

Garden-Path Theory: Jurafsky (1996)
What architecture is assumed? A modular lexico-syntactic processor with lexical (category and subcategory) but no semantic knowledge.
What mechanism is used to construct interpretations? Incremental, bounded parallel parsing, with reranking.
What information is used to determine preferred structure? Lexical and structural probabilities.
Linking Hypothesis: parse reranking causes increased RTs; if the correct parse has been eliminated, predict a garden path.
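The pruning criterion can be sketched as follows (a toy illustration: the 5:1 threshold and the 82:1 and 4:1 ratios come from the slides; the function and variable names are my own):

```python
# A sketch of Jurafsky-style beam pruning: a parse falls out of the beam,
# predicting a garden path if it later turns out to be correct, when the
# best parse is more than BEAM_RATIO times as probable.

BEAM_RATIO = 5.0  # threshold assumed from the slides (5:1)

def predicts_garden_path(ratio_best_to_correct):
    """True if the ultimately correct parse would be pruned."""
    return ratio_best_to_correct > BEAM_RATIO

print(predicts_garden_path(82))  # True:  "The horse raced past the barn fell"
print(predicts_garden_path(4))   # False: "The bird found died"
```

The single threshold thus separates the irrecoverable reduced relative from the recoverable one, which a purely structural account cannot do.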

A Problem for Likelihood?
NP/S-Complement Ambiguity: The athlete realised his goals...
Direct-object analysis: [S [NP1 The athlete] [VP [V realised] [NP2 his goals]]]
S-complement analysis: [S [NP1 The athlete] [VP [V realised] [S2 [NP2 his goals] [VP were out of reach]]]]
Evidence for object attachment (Pickering, Traxler & Crocker 2000):
- Despite the S-complement bias of the verb, the NP is attached as direct object.
- An ideal likelihood model, and Jurafsky, predict the opposite: realised is initially tagged as S-complement biased, but the simpler direct-object analysis is then given higher probability when the NP is found.