Detecting novel metaphor using selectional preference information

Hessel Haagsma and Johannes Bjerva
University of Groningen, The Netherlands
17/06/2016

Outline
1. Types of metaphor
2. Selectional preference violation
3. Approach & implementation
4. Evaluation & results
5. Analysis & discussion

A definition of metaphor
A lexical unit is metaphorical if it has a more basic contemporary meaning in other contexts than in the current context.
Wide range of metaphor:
1. Do the Greeks have a word for it?
2. only little scientific evidence supports the link

Degrees of metaphoricity
1. None: literal meaning, most basic, in lexicon
2. Conventional: metaphorical meaning, non-basic, in lexicon
3. Novel: metaphorical meaning, non-basic, not in lexicon

Examples
1. No metaphor: The scientists eat their sandwiches. eat#1 (take in solid food)
2. Conventional metaphor: Firefox is eating my memory. eat#5 (use up (resources or materials))
3. Novel metaphor: You wanted to eat up my sadness. eat#? (take away/cure/remove)
Senses from WordNet 3.1

Metaphor processing and word sense disambiguation (WSD)
Problem: what is the meaning of this ambiguous word/phrase in this specific context?
WSD and metaphor processing overlap on conventional metaphors
Novel metaphor is outside the scope of WSD
Improved handling of metaphor can benefit WSD

Selectional preference violation
Selectional preferences capture intuitive knowledge about what fits in a certain domain
Metaphor combines a source and a target domain
A violation of selectional preferences indicates two distinct domains, and hence metaphor

Examples
1. No metaphor: The scientists eat their sandwiches. eat#1 (take in solid food)
2. Conventional metaphor: Firefox is eating my RAM. eat#5 (use up (resources or materials))
3. Novel metaphor: You wanted to eat up my sadness. eat#? (take away/cure/remove)
Senses from WordNet 3.1

Novel metaphor
Automatically acquired selectional preferences capture frequency, not basicness
Conventional metaphor is sometimes more frequent than literal use, e.g. uncover a treasure vs. uncover a secret
Assumption: novel metaphors are always infrequent

Approach
Gather verb-subject and verb-object pairs from a large, parsed English corpus
Extract selectional preference metrics
Generalize over co-occurrence counts
Use as features in a logistic regression classifier to detect metaphors in the VUAMC

Selectional preference information
Word-level verb metaphor detection
Parse Wikipedia dump (1.6B words), extract and count verb-noun pairs
Calculate conditional probability (CP), log probability (LP), selectional association (SA) and selectional preference strength (SPS)
CP, LP and SA represent the likelihood of a verb-noun pair
SPS represents the selectivity of a verb
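The four metrics named on this slide can be sketched from raw verb-noun counts. The slides do not give formulas, so the definitions below are assumptions: CP is taken as P(noun | verb), LP as its natural logarithm, and SPS and SA follow Resnik's standard definitions, computed here over nouns rather than noun classes. The function name `preference_metrics` and the toy counts are hypothetical.

```python
import math
from collections import defaultdict


def preference_metrics(pair_counts):
    """Compute CP, LP, SPS and SA for every (verb, noun) pair.

    pair_counts maps (verb, noun) -> co-occurrence count.
    Assumed definitions (not taken from the slides):
      CP(v, n)  = P(n | v)
      LP(v, n)  = log P(n | v)
      SPS(v)    = sum_n P(n | v) * log(P(n | v) / P(n))
      SA(v, n)  = P(n | v) * log(P(n | v) / P(n)) / SPS(v)
    """
    verb_totals = defaultdict(int)
    noun_totals = defaultdict(int)
    for (verb, noun), count in pair_counts.items():
        verb_totals[verb] += count
        noun_totals[noun] += count
    total = sum(pair_counts.values())

    # First pass: per-pair contributions and their per-verb sum (SPS).
    contribution = {}
    sps = defaultdict(float)
    for (verb, noun), count in pair_counts.items():
        cp = count / verb_totals[verb]
        prior = noun_totals[noun] / total
        contribution[(verb, noun)] = cp * math.log(cp / prior)
        sps[verb] += contribution[(verb, noun)]

    # Second pass: assemble the feature dictionary for each pair.
    metrics = {}
    for (verb, noun), count in pair_counts.items():
        cp = count / verb_totals[verb]
        strength = sps[verb]
        # A verb with no preference at all has SPS 0; SA is then set to 0.
        sa = contribution[(verb, noun)] / strength if strength > 0 else 0.0
        metrics[(verb, noun)] = {"CP": cp, "LP": math.log(cp),
                                 "SPS": strength, "SA": sa}
    return metrics


# Hypothetical toy counts, only to show the call.
toy_counts = {("eat", "sandwich"): 8, ("eat", "memory"): 2, ("use", "memory"): 10}
toy_metrics = preference_metrics(toy_counts)
```

Under these definitions, a pair that is rare for its verb relative to the noun's overall frequency gets a low or negative SA, which is the kind of signal a preference violation would produce.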

Generalization
Generalization helps to go from word-word pairs to domain-domain pairs
Three approaches:
1. Pre-trained Brown clusters, from Derczynski et al. (2015), 80-5120 clusters
2. K-means clustered GloVe embeddings (300D/840B), 400k vocabulary, 80-5120 clusters
3. Neural net predictor of LP, based on embeddings, single hidden layer, 600 units, Adam, dropout
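The step from word-word pairs to domain-domain pairs can be sketched as a re-keying of the co-occurrence counts. This assumes a hard clustering in which each word belongs to exactly one cluster, as with Brown or k-means clusters; the function name `generalize_counts`, the cluster labels and the `<unk>` fallback for unclustered words are hypothetical.

```python
from collections import Counter


def generalize_counts(pair_counts, cluster_of, unknown="<unk>"):
    """Sum (verb, noun) counts into (verb cluster, noun cluster) counts.

    cluster_of maps a word to its cluster ID; words without a cluster
    fall into a single `unknown` bucket (an assumption of this sketch).
    """
    clustered = Counter()
    for (verb, noun), count in pair_counts.items():
        key = (cluster_of.get(verb, unknown), cluster_of.get(noun, unknown))
        clustered[key] += count
    return clustered


# Hypothetical clusters, only to show the call.
word_counts = {("eat", "sandwich"): 3, ("devour", "pizza"): 2, ("eat", "sadness"): 1}
clusters = {"eat": "V_consume", "devour": "V_consume",
            "sandwich": "N_food", "pizza": "N_food"}
cluster_counts = generalize_counts(word_counts, clusters)
```

The clustered counts can then be fed to the same metric computation as the word-level counts, so a pair unseen at the word level may still receive a score at the cluster level.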

Training data
Verb      Subj.   Obj.   CP-s   LP-s   SPS-s  SA-s  Label
maintain  couple  link   0.005  -7.51  0.93   6.20  1
need      we      pilot  0.05   -2.98  0.73   0.17  0

Evaluation data
VU Amsterdam Metaphor Corpus (VUAMC), parsed
Extract all verbs
Verb-subject-object: 5,539
Verb-subject: 13,466
Verb-object: 3,913
Downside: broad definition of metaphor, highly conventionalized metaphors dominate
Manual inspection of metaphor type

Classifier
Logistic regression with L2 regularization
10-fold cross-validation
Separate classifier per dataset
Back-off to majority class (non-metaphor)
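Two parts of this set-up can be sketched in a few lines: the 10-fold split and the back-off. The slides do not say when the back-off triggers; the sketch assumes it applies when no selectional preference features are available for an instance, represented here as `None`. The names `k_fold_indices`, `predict_with_backoff` and the threshold classifier are hypothetical.

```python
def k_fold_indices(n_examples, k=10):
    """Yield (train_indices, test_indices) for k round-robin folds."""
    for fold in range(k):
        test = [i for i in range(n_examples) if i % k == fold]
        train = [i for i in range(n_examples) if i % k != fold]
        yield train, test


def predict_with_backoff(features, classify, majority_label=0):
    """Use the classifier when features exist, else the majority class.

    Label 0 stands for non-metaphor, the majority class named on the slide.
    """
    if features is None:
        return majority_label
    return classify(features)


# Hypothetical stand-in for a trained classifier: flags low-SA pairs.
def toy_classifier(features):
    return 1 if features["SA"] < 0.0 else 0


labels = [predict_with_backoff(f, toy_classifier)
          for f in ({"SA": -0.7}, {"SA": 1.6}, None)]
```

Round-robin folds are only one way to split the data; a stratified split would keep the class proportions equal across folds, which may matter given the class imbalance.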

Re-weighting
Re-weighting of examples to counter class imbalance
Subject-verb: 13.0%
Verb-object: 34.7%
Subject-verb-object: 36.4%
Assign more weight to minority class examples
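A minimal sketch of re-weighting combined with L2-regularized logistic regression follows. The slides do not state the weighting scheme; inverse class frequency is assumed here, and plain gradient descent stands in for whatever solver was actually used. The names `balanced_weights` and `train_logreg`, the hyperparameters and the toy data are all hypothetical.

```python
import math


def balanced_weights(labels):
    """Weight each example by n / (n_classes * size of its class)."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return [n / (len(counts) * counts[y]) for y in labels]


def train_logreg(X, y, weights, l2=0.01, lr=0.5, epochs=200):
    """Weighted logistic regression with an L2 penalty on w (not on b)."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    total = sum(weights)
    for _ in range(epochs):
        grad_w = [l2 * wj for wj in w]  # gradient of (l2 / 2) * ||w||^2
        grad_b = 0.0
        for xi, yi, si in zip(X, y, weights):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            err = si * (p - yi) / total  # weighted log-loss gradient
            grad_b += err
            for j in range(dim):
                grad_w[j] += err * xi[j]
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


# Toy imbalanced data: 8 non-metaphor (0) and 2 metaphor (1) examples.
toy_X = [[-1.0]] * 8 + [[1.0]] * 2
toy_y = [0] * 8 + [1] * 2
toy_w, toy_b = train_logreg(toy_X, toy_y, balanced_weights(toy_y))
```

With these weights both classes contribute the same total mass to the loss, so the classifier is no longer rewarded for always predicting the majority class.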

Results (1)

Without re-weighting of training data
Data     BL    CP    LP    Pred-LP  SPS   SA    All
Subject  23.0  0.0   0.0   0.0      0.0   0.0   1.3
Object   50.8  0.0   3.2   1.4      0.0   0.0   2.4
Both     53.4  0.0   18.1  0.7      0.0   2.3   32.1

With re-weighting of training data
Data     BL    CP    LP    Pred-LP  SPS   SA    All
Subject  23.0  24.5  24.5  23.2     20.9  26.4  33.6
Object   50.8  53.4  45.6  49.2     49.0  51.2  47.6
Both     53.4  54.2  44.3  50.0     50.5  63.8  57.8

Results (2)

With Brown clustering
Data     BL    80    160   320   640   1280  2560  5120
Subject  23.0  26.3  28.8  27.9  25.9  26.3  26.6  25.3
Object   50.8  48.7  47.7  45.3  46.9  44.7  44.6  46.2
Both     53.4  52.7  52.8  53.7  54.3  53.5  54.3  54.5

With k-means clustering
Data     BL    80    160   320   640   1280  2560  5120
Subject  23.0  24.2  23.5  30.7  28.6  24.4  23.6  22.9
Object   50.8  40.4  44.8  45.8  44.2  48.9  48.8  49.8
Both     53.4  49.8  48.2  50.4  49.2  47.6  50.4  49.5

Generalization
In the current set-up, generalization does not work (Brown clusters, k-means clusters, LP prediction)
No clear effect of cluster size
Information loss outweighs generalization gain
Clusters do not form coherent domains

Error analysis
Large number of (unresolved) pronouns
True positives contain many light verbs (take, have, make, put)
Logistic regression exploits corpus distribution
One example of novel metaphor: [...] Adam might have escaped the file memories for years, [...]

Conclusion
Is selectional preference information useful for detecting novel metaphors?
Better evaluation data is needed
Annotate novel/OOV senses in VUAMC
Annotate metaphor on a scale, not binary
Use selectional preference violation to discover novel metaphors