Language as communication. Ted Gibson 9.59J/24.905J

Overview
  Language information sources and constraints: lexicon; syntax; world knowledge; working memory; context; pragmatics; prosody
  Language as communication
  Ambiguity? Words; sentences
  Communication-based models of language evolution and processing

Language information sources and constraints
  Lexicon; syntax; world knowledge; context; working memory; pragmatics; prosody

Language: Information sources and constraints
Lexical (word) information, e.g., frequency
  Unambiguous words: more frequent words are accessed faster: "class" vs. "caste"
  Ambiguity: more frequent usages are preferred
    # The old man the boats.
  Syntactic argument structure frequencies
    # I put the candy on the table into my mouth.
    The verb "put" prefers to have a locative goal prepositional phrase (like "on")
    The noun "candy" has no bias to have a locative prepositional phrase

The existence of garden-path effects provides evidence:
  That the relevant information factor(s) play a role in human language processing (e.g., lexical frequency, syntactic phrase structure frequency, etc.)
  And more generally:
    That language is processed on-line, as it is heard or read
    That the human parser is not unlimited parallel. Rather, it must be ranked parallel or serial.

Language: Information sources and constraints
Syntax / word order / sentence structure: giving rise to the literal predicate-argument meaning of a phrase / sentence
  The cat is watching the mouse.
  ?? mouse cat the is the watching.
Compositional rules: the meaning of the larger phrase is formed from the meaning of the parts:
  NP → Det Noun; S → NP VP; VP → Verb NP
The syntax of a language makes some interpretations available:
  The dog bit the boy. vs. The boy bit the dog.
Ambiguity: multiple syntactic interpretations
  The boy saw the man with the telescope.
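A minimal sketch of how the three phrase rules on this slide can yield a literal predicate-argument meaning. The lexicon, the function names (parse_np, parse_vp, parse_sentence), and the (predicate, subject, object) output format are all illustrative assumptions, not part of the course materials.

```python
# Toy lexicon (an assumption for illustration only).
LEXICON = {
    "the": "Det",
    "dog": "Noun", "boy": "Noun",
    "bit": "Verb",
}

def parse_np(tags, words, i):
    """NP -> Det Noun. Return (head noun, next index), or None."""
    if i + 1 < len(tags) and tags[i] == "Det" and tags[i + 1] == "Noun":
        return words[i + 1], i + 2
    return None

def parse_vp(tags, words, i):
    """VP -> Verb NP. Return ((verb, object noun), next index), or None."""
    if i < len(tags) and tags[i] == "Verb":
        obj = parse_np(tags, words, i + 1)
        if obj is not None:
            return (words[i], obj[0]), obj[1]
    return None

def parse_sentence(sentence):
    """S -> NP VP. Return (predicate, subject, object), or None if no parse."""
    words = sentence.lower().rstrip(".").split()
    if any(w not in LEXICON for w in words):
        return None
    tags = [LEXICON[w] for w in words]
    subj = parse_np(tags, words, 0)
    if subj is None:
        return None
    vp = parse_vp(tags, words, subj[1])
    # The VP must consume the rest of the sentence.
    if vp is None or vp[1] != len(words):
        return None
    (verb, obj), _ = vp
    return (verb, subj[0], obj)

print(parse_sentence("The dog bit the boy."))   # word order decides who bit whom
print(parse_sentence("The boy bit the dog."))
print(parse_sentence("dog the bit boy the"))    # scrambled order: no parse
```

The same words in a different order give a different predicate-argument structure, and a scrambled order gives no structure at all under these rules.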

Language: Information sources and constraints
Syntax / word order / sentence structure, giving rise to the literal predicate-argument meaning of a phrase / sentence
The rules for assigning the meaning of a phrase like "the dog with the white fur" are context-independent (so-called context-free rules):
  Subject position of sentence (the noun phrase to the left of the verb):
    The dog with the white fur chased the black squirrel into the home of the grey cat.
  Direct object position of sentence (first noun phrase to the right of the verb):
    The grey cat chased the dog with the white fur into the home of the black squirrel.
  Direct object position of a preposition (first noun phrase to the right of a preposition):
    The grey cat chased the black squirrel into the home of the dog with the white fur.

Language: Information sources and constraints
More frequent phrase rules, easier processing (Jurafsky, 1996; Hale, 2001; Levy, 2008):
  Ambiguity
    The defendant examined: S → NP VP vs. NP → NP RC
    The defendant examined the evidence.
    ?? The defendant examined by the lawyer turned out to be unreliable.
  Unambiguous syntax
    John was smoking.
    ? That John was smoking bothered me.
    ?? John's face needs washed.
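One common way to turn "more frequent, easier to process" into a number is surprisal, the negative log probability of a continuation (the quantity used in Hale, 2001 and Levy, 2008). The sketch below applies it to the two analyses of "The defendant examined". The probabilities are hypothetical placeholders chosen only to show the direction of the effect; they are not corpus estimates.

```python
import math

# Hypothetical probabilities of the two analyses (assumed values, not measured).
ANALYSIS_PROB = {
    "main verb (S -> NP VP)": 0.9,
    "reduced relative (NP -> NP RC)": 0.1,
}

def surprisal_bits(p):
    """Surprisal in bits: -log2 of the probability."""
    return -math.log2(p)

for analysis, p in ANALYSIS_PROB.items():
    print(f"{analysis}: {surprisal_bits(p):.2f} bits")
```

Under these assumed values the rarer reduced-relative analysis carries more surprisal, which is the pattern the slide associates with harder processing.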

Language: Information sources and constraints
World knowledge
  Unambiguous examples:
    The dog bit the boy. vs. The boy bit the dog.
  Ambiguity (Trueswell, Tanenhaus & Garnsey, 1994):
    The defendant examined by the lawyer turned out to be unreliable.
    The evidence examined by the lawyer turned out to be unreliable.
Methods: (1) Eye-tracking during reading; (2) Self-paced reading

Reading time studies
Compare target to its control:
  Temporary ambiguity: The defendant examined by the lawyer turned out to be unreliable.
  Unambiguous control: The defendant that was examined by the lawyer turned out to be unreliable.
Target regions: "examined", "by the lawyer"
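A sketch of the target-versus-control comparison, computed per region. The function name ambiguity_effect and the data layout are assumptions, and the reading times are made-up numbers used only to exercise the code; they do not come from any experiment.

```python
from statistics import mean

def ambiguity_effect(ambiguous_rts, control_rts):
    """Mean reading time (ms) in the ambiguous condition minus the control, per region."""
    return {
        region: mean(ambiguous_rts[region]) - mean(control_rts[region])
        for region in ambiguous_rts
    }

# Made-up reading times in ms, keyed by target region (illustration only).
ambiguous = {"examined": [400, 420], "by the lawyer": [520, 560]}
control = {"examined": [390, 410], "by the lawyer": [430, 450]}

effects = ambiguity_effect(ambiguous, control)
print(effects)
```

A positive difference in a region means the temporarily ambiguous sentence was read more slowly there than its unambiguous control.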

Information sources and constraints: Modularity / Information-encapsulation
Two kinds of questions:
  WHAT are the information sources that people are sensitive to? (And how are they organized in the brain? We don't know this well yet.)
  WHEN are information constraints applied?
Fodor (1983) proposed modularity / information-encapsulation of words and syntax
One concrete idea: people compute the literal meanings of compositional language first, and then make inferences about what might have been meant
Non-literal language: inferences about the intended meaning: PRAGMATICS
  Some of the students passed the test. → Not all the students passed the test.
  JOHN went to the store. → Only John went to the store.
  Can you please pass the salt? → Pass the salt.
  I am cold. (next to an open window) → Close the window.

Information sources and constraints: Modularity / Information-encapsulation
WHEN are information constraints applied?
Fodor (1983) proposed modularity / information-encapsulation of words and syntax
Another idea: people use syntactic disambiguation rules to decide among choices, independent of their meaning: choose the simplest syntactic option (e.g., the most frequent syntax)
  Thus the choice between the Main-Verb and Relative-Clause structure of "the defendant / evidence examined" would not depend on the meanings
  Thus people should favor the simpler structure, independent of meaning.
This is what Ferreira & Clifton (1986) found for the "evidence examined" case. But there were serious confounds in their materials, which undermined their interpretation.

Language: Information sources and constraints
Current context (Crain & Steedman, 1985; Altmann & Steedman, 1988; Tanenhaus et al., 1995): visual or linguistic
  Ambiguity: There were two defendants, one of whom the lawyer ignored entirely, and the other of whom the lawyer interrogated for two hours. The defendant examined by the lawyer turned out to be unreliable.

Monitoring visual eye-movements while listening to spoken instructions (Tanenhaus et al., 1995; Trueswell et al., 1999)
  1-referent context: Put the hippo on the towel in the basket.
  Many looks to the incorrect target

Monitoring visual eye-movements while listening to spoken instructions (Tanenhaus et al., 1995; Trueswell et al., 1999)
  2-referent context: Put the bear on the plate into the box.
  No looks to the incorrect target

Language: Information sources and constraints
Working memory: longer-distance dependencies are harder to process than more local ones
Dependencies between a verb and its post-verbal objects:
  Short NP object:
    Local particle: Joe threw out the documents.
    Non-local particle: Joe threw the documents out.
  Long NP object:
    Local particle: Joe threw out the very important documents that he brought home.
    Non-local particle: Joe threw the very important documents that he brought home out.

Information processing: Working memory
Working memory: local connections are easier to make than long-distance ones (Gibson, 1998, 2000; Grodner & Gibson, 2005; Warren & Gibson, 2002; Lewis & Vasishth, 2005; Hawkins, 1994)
  Ambiguous attachments:
    The bartender told the detective that the suspect left the country yesterday.
    "yesterday" is preferred as modifying "left" rather than "told" (Frazier & Rayner, 1982; Gibson et al., 1996; Altmann et al., 1998; Pearlmutter & Gibson, 2001)
  Unambiguous connections:
    The reporter wrote an article.
    The reporter from the newspaper wrote an article.
    The reporter who was from the newspaper wrote an article.

Retrieval / Integration-based theories
Integration: connecting the current word into the structure built thus far. Local integrations are easier than longer-distance integrations.
  The Dependency Locality Theory (DLT) (Gibson, 1998; 2000): intervening discourse referents cause retrieval difficulty (also in production)
  Activation-based memory theory: similarity-based interference (Lewis & Vasishth, 2005; Vasishth & Lewis, 2006; Lewis, Vasishth & Van Dyke, 2006): intervening similar elements cause retrieval difficulty
  Production: Hawkins (1994; 2004): word-based distance metric.
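A deliberately simplified sketch of a DLT-style integration cost: it counts the discourse referents that lie strictly between a word and the element it must connect to. The published theory has more detail (for example, how the current word itself is counted), so the function integration_cost, the referent tagging, and the choice of endpoints below are assumptions made for illustration.

```python
def integration_cost(tokens, dependent, head):
    """Count discourse referents strictly between two word positions."""
    lo, hi = sorted((dependent, head))
    return sum(1 for _, is_referent in tokens[lo + 1:hi] if is_referent)

# Each token is (word, introduces a new discourse referent?).
# Tagging nouns and verbs as referents is an assumption of this sketch.
SRC = [("the", False), ("reporter", True), ("who", False),
       ("attacked", True), ("the", False), ("senator", True)]
ORC = [("the", False), ("reporter", True), ("who", False),
       ("the", False), ("senator", True), ("attacked", True)]

# Cost of connecting "who" (index 2) to "attacked" in each clause type.
src_cost = integration_cost(SRC, dependent=2, head=3)
orc_cost = integration_cost(ORC, dependent=2, head=5)
print(src_cost, orc_cost)
```

In the subject-extracted clause nothing intervenes between "who" and "attacked"; in the object-extracted clause "senator" intervenes, so the count is higher.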

Dependency Length Minimization
Futrell, Mahowald & Gibson, 2015, PNAS
Corpora from 37 languages parsed into dependencies, from NLP sources: the HamleDT and UDT; cf. WALS (Dryer 2013)
Family / Region: Indo-European (IE)/West-Germanic; IE/North-Germanic; IE/Romance; IE/Greek; IE/West Slavic; IE/South Slavic; IE/East Slavic; IE/Iranian; IE/Indic; Finno-Ugric/Finnic; Finno-Ugric/Ugric; Turkic; West Semitic; Dravidian; Austronesian; East Asian Isolate (2); Other Isolate (1)
Result: All languages minimize dependency distances (cf. Hawkins, 1994; Gibson, 1998)

[Figure: alternative word orders for one sentence (Futrell, Mahowald, & Gibson, 2015, PNAS)]
  the girl kicks the ball
  the girl the ball kicks
  the ball the girl kicks
  girl the kicks the ball
  ball the girl the kicks
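A small sketch of how total dependency length can be compared across word orders for "the girl kicks the ball". The dependency arcs, the labels "the1" and "the2" for the two determiners, and the all-permutations baseline are assumptions of this sketch; it is not the procedure used in the paper.

```python
from itertools import permutations
from math import factorial

# Dependencies as (dependent, head); "the1"/"the2" distinguish the two determiners.
DEPS = [("the1", "girl"), ("girl", "kicks"), ("the2", "ball"), ("ball", "kicks")]
WORDS = ["the1", "girl", "kicks", "the2", "ball"]

def total_dependency_length(order):
    """Sum of the linear distances between each dependent and its head."""
    pos = {word: i for i, word in enumerate(order)}
    return sum(abs(pos[dep] - pos[head]) for dep, head in DEPS)

svo = total_dependency_length(["the1", "girl", "kicks", "the2", "ball"])
sov = total_dependency_length(["the1", "girl", "the2", "ball", "kicks"])

# Baseline: average over every possible ordering of the five words.
baseline = sum(total_dependency_length(p) for p in permutations(WORDS)) / factorial(len(WORDS))

print(svo, sov, baseline)
```

Both attested-looking orders come out shorter than the average over all orderings, which is the kind of comparison that "minimizing dependency distances" refers to.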

Dependency Length Minimization
Futrell, Mahowald & Gibson, 2015, PNAS
Courtesy of National Academy of Sciences, U.S.A. Used with permission.
Source: Futrell, Richard, Kyle Mahowald, and Edward Gibson. "Large-scale evidence of dependency length minimization in 37 languages." Proceedings of the National Academy of Sciences 112, no. 33 (2015): 10336-10341. Copyright 2015 National Academy of Sciences, U.S.A.

Potential project
Result to replicate: subject-extractions in relative clauses (RCs) are easier to process than object-extractions:
  Subj-RC: The reporter who attacked the senator admitted the error.
  Obj-RC: The reporter who the senator attacked admitted the error.
  RTs are faster at "attacked" in the SRC than in the ORC
  Two explanations: ORCs are rare, and longer-distance
Extension: evaluate other kinds of extraction in English:
  Dative extractions: infrequent, long-distance
    The boy who the girl gave the book to admitted the error.
    The boy to whom the girl gave the book admitted the error.
  Genitive extractions: infrequent, short-distance
    The girl whose friend invited the kids to the party was kind.

9.59J/24.905J Lab in Psycholinguistics Spring 2017 For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.