Word Sense Disambiguation

D. De Cao, R. Basili
Web Mining and Retrieval course (Corso di Web Mining e Retrieval), a.a. 2008-9
May 21, 2009
Excerpt of the R. Mihalcea and T. Pedersen AAAI 2005 Tutorial, at: http://www.d.umn.edu/~tpederse/tutorials/advances-in-wsd-aaai-2005.ppt

Definitions
Word sense disambiguation is the problem of selecting a sense for a word from a set of predefined possibilities.
  The sense inventory usually comes from a dictionary or thesaurus.
  Approaches: knowledge-intensive methods, supervised learning, and (sometimes) bootstrapping.
Word sense discrimination is the problem of dividing the usages of a word into different meanings, without regard to any particular existing sense inventory.
  Approaches: unsupervised techniques.

Computers versus Humans
Polysemy: most words have many possible meanings.
A computer program has no basis for knowing which one is appropriate, even if it is obvious to a human...
Ambiguity is rarely a problem for humans in their day-to-day communication, except in extreme cases...
Example:
  The fisherman jumped off the bank and into the water.
  The bank down the street was robbed!

Brief Historical Overview
Noted as a problem for Machine Translation (Weaver, 1949)
  A word can often only be translated if you know the specific sense intended (a bill in English could be a pico or a cuenta in Spanish)
Bar-Hillel (1960) posed the following:
  Little John was looking for his toy box. Finally, he found it. The box was in the pen. John was very happy.
  Is pen a writing instrument or an enclosure where children play?
  ... declared it unsolvable, left the field of MT!

Brief Historical Overview
1970s - 1980s: Rule-based systems
  Rely on hand-crafted knowledge sources
1990s: Corpus-based approaches
  Dependence on sense-tagged text
  (Ide and Veronis, 1998) give an overview of the history from the early days to 1998.
2000s: Hybrid systems
  Minimizing or eliminating use of sense-tagged text
  Taking advantage of the Web

Practical Applications
Machine Translation
  Translate bill from English to Spanish
  Is it a pico or a cuenta? Is it a bird jaw or an invoice?
Information Retrieval
  Find all Web pages about cricket
  The sport or the insect?
Question Answering
  What is George Miller's position on gun control?
  The psychologist or the US congressman?
Knowledge Acquisition
  Add to KB: Herb Bergson is the mayor of Duluth.
  Minnesota or Georgia?

Overview of the Problem
Many words have several meanings (homonymy / polysemy)
  Ex: chair - furniture or person
  Ex: child - young person or human offspring
Determine which sense of a word is used in a specific sentence
Note:
  often, the different senses of a word are closely related
    Ex: title - right of legal ownership / document that is evidence of the legal ownership
  sometimes, several senses can be activated in a single context (co-activation)
    Ex: "This could bring competition to the trade" - competition: the act of competing / the people who are competing

Word Senses
The meaning of a word in a given context
Word sense representations
  With respect to a dictionary
    chair = a seat for one person, with a support for the back; "he put his coat over the back of the chair and sat down"
    chair = the position of professor; "he was awarded an endowed chair in economics"
  With respect to the translation in a second language
    chair = chaise
    chair = directeur
  With respect to the context where it occurs (discrimination)
    Sit on a chair
    Take a seat on this chair
    The chair of the Math Department
    The chair of the meeting

Approaches to Word Sense Disambiguation
Knowledge-Based Disambiguation
  use of external lexical resources such as dictionaries and thesauri
  discourse properties
Supervised Disambiguation
  based on a labeled training set
  the learning system has: a training set of feature-encoded inputs AND their appropriate sense label (category)
Unsupervised Disambiguation
  based on unlabeled corpora
  the learning system has: a training set of feature-encoded inputs BUT NOT their appropriate sense label (category)

All Words Word Sense Disambiguation
Attempt to disambiguate all open-class words in a text
  He put his suit over the back of the chair
Knowledge-based approaches
  Use information from dictionaries
    Definitions / examples for each meaning
    Find similarity between definitions and current context
  Position in a semantic network
    Find that table is closer to chair/furniture than to chair/person
  Use discourse properties
    A word exhibits the same sense in a discourse / in a collocation

All Words Word Sense Disambiguation
Minimally supervised approaches
  Learn to disambiguate words using small annotated corpora
  E.g. SemCor - corpus where all open-class words are disambiguated
    200,000 running words
  Most frequent sense
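
The most-frequent-sense idea above can be sketched in a few lines. This is a minimal illustration only: the tiny annotated sample below is invented and merely stands in for a sense-tagged corpus such as SemCor, and the helper name `train_mfs` is hypothetical.

```python
from collections import Counter, defaultdict

# Toy sense-annotated sample standing in for a corpus such as SemCor.
# The (word, sense) pairs are invented for illustration only.
annotated = [
    ("chair", "furniture"), ("chair", "furniture"), ("chair", "person"),
    ("bank", "institution"), ("bank", "institution"), ("bank", "river"),
]

def train_mfs(pairs):
    """Return a dict mapping each word to its most frequent sense."""
    counts = defaultdict(Counter)
    for word, sense in pairs:
        counts[word][sense] += 1
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

mfs = train_mfs(annotated)
print(mfs["chair"])  # furniture
```

Every occurrence of a word then receives the same sense, regardless of its context.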

Targeted Word Sense Disambiguation
Disambiguate one target word
  Take a seat on this chair
  The chair of the Math Department
WSD is viewed as a typical classification problem
  use machine learning techniques to train a system
Training:
  Corpus of occurrences of the target word, each occurrence annotated with the appropriate sense
  Build feature vectors: a vector of relevant linguistic features that represents the context (ex: a window of words around the target word)
Disambiguation:
  Disambiguate the target word in new unseen text

Targeted Word Sense Disambiguation
Take a window of n words around the target word
Encode information about the words around the target word
  typical features include: words, root forms, POS tags, frequency, ...
Example (target word: bass):
  An electric guitar and bass player stand off to one side, not really part of the scene, just as a sort of nod to gringo expectations perhaps.
Surrounding context (local features)
  [ (guitar, NN1), (and, CJC), (player, NN1), (stand, VVB) ]
Frequent co-occurring words (topical features)
  [fishing, big, sound, player, fly, rod, pound, double, runs, playing, guitar, band]
  [0,0,0,1,0,0,0,0,0,0,1,0]
Other features:
  [followed by "player", contains "show" in the sentence, ... ]
  [yes, no, ... ]
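
The local and topical encodings above can be reproduced with a short sketch. The function names `local_features` and `topical_features` are hypothetical, the window size of 2 is an assumption chosen to match the four neighbours shown, and only the fragment of the sentence around the target is used.

```python
def local_features(tagged, target_index, window=2):
    """Return the (word, POS) pairs within `window` positions of the target."""
    left = tagged[max(0, target_index - window):target_index]
    right = tagged[target_index + 1:target_index + 1 + window]
    return left + right

def topical_features(words, vocabulary):
    """Binary vector: 1 if the vocabulary word occurs in the context, else 0."""
    present = set(words)
    return [1 if v in present else 0 for v in vocabulary]

# Fragment of the example sentence with the POS tags listed above.
# "bass" is the target; its own tag is not given, so it is left as None.
tagged = [("guitar", "NN1"), ("and", "CJC"), ("bass", None),
          ("player", "NN1"), ("stand", "VVB")]
vocabulary = ["fishing", "big", "sound", "player", "fly", "rod", "pound",
              "double", "runs", "playing", "guitar", "band"]

local = local_features(tagged, target_index=2)
topical = topical_features([w for w, _ in tagged], vocabulary)
print(local)
print(topical)  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
```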

Unsupervised Disambiguation
Disambiguate word senses:
  without supporting tools such as dictionaries and thesauri
  without a labeled training text
Without such resources, word senses are not labeled
  We cannot say chair/furniture or chair/person
We can:
  Cluster/group the contexts of an ambiguous word into a number of groups
  Discriminate between these groups without actually labeling them

Unsupervised Disambiguation
Hypothesis: same senses of words will have similar neighboring words
Disambiguation algorithm
  Identify context vectors corresponding to all occurrences of a particular word
  Partition them into regions of high density
  Assign a sense to each such region
Example contexts:
  Sit on a chair
  Take a seat on this chair
  The chair of the Math Department
  The chair of the meeting
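
As a rough illustration of grouping contexts without sense labels, the sketch below builds bag-of-words context vectors for the four example contexts and groups them greedily by cosine similarity. The greedy threshold rule is a simple stand-in for density-based partitioning, not the method used in any particular system; the threshold of 0.3 and all function names are assumptions. On such short contexts the grouping leans on function words, which a real system would weight or filter.

```python
import math
from collections import Counter

def context_vector(sentence, target):
    """Bag-of-words counts for the sentence, leaving out the target word."""
    return Counter(w for w in sentence.lower().split() if w != target)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def group_contexts(vectors, threshold=0.3):
    """Greedy grouping: join the first group that has a similar enough member."""
    groups = []   # each group is a list of vector indices
    labels = []   # group index assigned to each vector
    for i, vec in enumerate(vectors):
        for g, members in enumerate(groups):
            if any(cosine(vec, vectors[j]) >= threshold for j in members):
                members.append(i)
                labels.append(g)
                break
        else:
            groups.append([i])
            labels.append(len(groups) - 1)
    return labels

contexts = ["Sit on a chair", "Take a seat on this chair",
            "The chair of the Math Department", "The chair of the meeting"]
vectors = [context_vector(c, "chair") for c in contexts]
labels = group_contexts(vectors)
print(labels)  # [0, 0, 1, 1]
```

The two groups are numbered, not named: nothing here says which one is chair/furniture and which is chair/person.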

Evaluating Word Sense Disambiguation
Metrics:
  Precision = percentage of words that are tagged correctly, out of the words addressed by the system
  Recall = percentage of words that are tagged correctly, out of all words in the test set
Special tags are possible:
  Unknown
  Proper noun
  Multiple senses
Compare to a gold standard
  SEMCOR corpus, SENSEVAL corpus, ...
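
The two metrics differ only in their denominator, which a small sketch makes concrete. The instance ids, sense labels, and the function name `precision_recall` are invented for illustration; instances the system declines to tag are simply absent from `predicted`.

```python
def precision_recall(gold, predicted):
    """gold: id -> correct sense; predicted: id -> sense, attempted instances only."""
    correct = sum(1 for i, sense in predicted.items() if gold.get(i) == sense)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

gold = {1: "furniture", 2: "person", 3: "furniture", 4: "person"}
predicted = {1: "furniture", 2: "furniture", 3: "furniture"}  # instance 4 left untagged

p, r = precision_recall(gold, predicted)
print(round(p, 3), r)  # 0.667 0.5
```

A system that attempts every instance has equal precision and recall.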

Evaluating Word Sense Disambiguation
Difficulty in evaluation:
  Nature of the senses to distinguish has a huge impact on results
Coarse versus fine-grained sense distinction
  chair = a seat for one person, with a support for the back; "he put his coat over the back of the chair and sat down"
  chair = the position of professor; "he was awarded an endowed chair in economics"
  bank = a financial institution that accepts deposits and channels the money into lending activities; "he cashed a check at the bank"; "that bank holds the mortgage on my home"
  bank = a building in which commercial banking is transacted; "the bank is on the corner of Nassau and Witherspoon"
Sense maps
  Cluster similar senses
  Allow for both fine-grained and coarse-grained evaluation
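
A sense map can be sketched as a dictionary from fine-grained senses to coarse clusters, so the same predictions can be scored at both granularities. The sense identifiers (`bank#1`, ...) and cluster names are hypothetical, not taken from any real inventory.

```python
# Hypothetical sense map: two fine-grained financial senses share one cluster.
sense_map = {
    "bank#1": "bank/finance",   # the financial institution
    "bank#2": "bank/finance",   # the building
    "bank#3": "bank/river",
}

def accuracy(gold, predicted, mapping=None):
    """Fraction of matching labels, optionally after mapping to coarse senses."""
    def coarse(sense):
        return mapping.get(sense, sense) if mapping else sense
    hits = sum(1 for g, p in zip(gold, predicted) if coarse(g) == coarse(p))
    return hits / len(gold)

gold = ["bank#1", "bank#2", "bank#3"]
predicted = ["bank#2", "bank#2", "bank#3"]

fine = accuracy(gold, predicted)
coarse_grained = accuracy(gold, predicted, sense_map)
print(round(fine, 3), coarse_grained)  # 0.667 1.0
```

Confusing the institution with its building counts as an error only under fine-grained scoring.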

Knowledge-based Methods for Word Sense Disambiguation
Knowledge-based WSD = class of WSD methods relying (mainly) on knowledge drawn from dictionaries and/or raw text
Resources
  Yes: Machine Readable Dictionaries, raw corpora
  No: Manually annotated corpora
Scope
  All open-class words

Machine Readable Dictionaries
In recent years, most dictionaries have been made available in Machine Readable format (MRD)
  Oxford English Dictionary
  Collins
  Longman Dictionary of Contemporary English (LDOCE)
Thesauruses - add synonymy information
  Roget Thesaurus
Semantic networks - add more semantic relations
  WordNet
  EuroWordNet

MRD - A Resource for Knowledge-based WSD
For each word in the language vocabulary, an MRD provides:
  A list of meanings
  Definitions (for all word meanings)
  Typical usage examples (for most word meanings)
WordNet definitions/examples for the noun plant
  1 buildings for carrying on industrial labor; "they built a large plant to manufacture automobiles"
  2 a living organism lacking the power of locomotion
  3 something planted secretly for discovery by another; "the police used a plant to trick the thieves"; "he claimed that the evidence against him was a plant"
  4 an actor situated in the audience whose acting is rehearsed but seems spontaneous to the audience

MRD - A Resource for Knowledge-based WSD
A thesaurus adds:
  An explicit synonymy relation between word meanings
WordNet synsets for the noun plant
  1 plant, works, industrial plant
  2 plant, flora, plant life
A semantic network adds:
  Hypernymy/hyponymy (IS-A), meronymy/holonymy (PART-OF), antonymy, entailment, etc.
WordNet related concepts for the meaning plant life - {plant, flora, plant life}
  hypernym: {organism, being}
  hyponym: {house plant}, {fungus}, ...
  meronym: {plant tissue}, {plant part}
  holonym: {Plantae, kingdom Plantae, plant kingdom}

Lesk Algorithm
(Michael Lesk 1986): Identify senses of words in context using definition overlap
Algorithm:
  1 Retrieve from MRD all sense definitions of the words to be disambiguated
  2 Determine the definition overlap for all possible sense combinations
  3 Choose senses that lead to highest overlap

Lesk Algorithm: Example
Disambiguate PINE CONE
PINE
  1 kinds of evergreen tree with needle-shaped leaves
  2 waste away through sorrow or illness
CONE
  1 solid body which narrows to a point
  2 something of this shape whether solid or hollow
  3 fruit of certain evergreen trees
Overlap:
  Pine#1 ∩ Cone#1 = 0    Pine#2 ∩ Cone#1 = 0
  Pine#1 ∩ Cone#2 = 1    Pine#2 ∩ Cone#2 = 0
  Pine#1 ∩ Cone#3 = 2    Pine#2 ∩ Cone#3 = 0
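
The pairwise overlap computation can be sketched as follows. This is a minimal sketch, not Lesk's original implementation: the stopword list, the crude plural stripping, and the names `tokens` and `lesk_pair` are all assumptions made for the example.

```python
from itertools import product

# Assumed stopword list, just large enough for these definitions.
STOPWORDS = {"a", "of", "or", "to", "with", "which", "this", "whether", "through"}

def tokens(definition):
    """Lowercase content words with a crude plural strip (trees -> tree)."""
    out = set()
    for w in definition.lower().split():
        if w in STOPWORDS:
            continue
        if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
            w = w[:-1]
        out.add(w)
    return out

def lesk_pair(senses_a, senses_b):
    """Score every sense pair by definition overlap; return (best pair, scores)."""
    scores = {
        (a, b): len(tokens(def_a) & tokens(def_b))
        for (a, def_a), (b, def_b) in product(senses_a.items(), senses_b.items())
    }
    best = max(scores, key=scores.get)
    return best, scores

PINE = {1: "kinds of evergreen tree with needle-shaped leaves",
        2: "waste away through sorrow or illness"}
CONE = {1: "solid body which narrows to a point",
        2: "something of this shape whether solid or hollow",
        3: "fruit of certain evergreen trees"}

best, scores = lesk_pair(PINE, CONE)
print(best, scores[best])  # (1, 3) 2
```

Because function words are dropped here, the Pine#1/Cone#2 pair scores 0 in this sketch: the only word those two definitions share is "of".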

Lesk Algorithm for More than Two Words?
I saw a man who is 98 years old and can still walk and tell jokes
nine open-class words: see(26), man(11), year(4), old(8), can(5), still(4), walk(10), tell(8), joke(3)
43,929,600 sense combinations! How to find the optimal sense combination?
Simulated annealing (Cowie, Guthrie, Guthrie 1992)
  Define a function E = combination of word senses in a given text.
  Find the combination of senses that leads to highest definition overlap (redundancy)
    1 Start with E = the most frequent sense for each word
    2 At each iteration, replace the sense of a random word in the set with a different sense, and measure E
    3 Stop iterating when there is no change in the configuration of senses
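
The size of the search space is the product of the per-word sense counts, which is why exhaustive search is impractical. A quick check of the figure above (requires Python 3.8+ for `math.prod`):

```python
import math

# Number of senses per open-class word, as listed above.
sense_counts = {"see": 26, "man": 11, "year": 4, "old": 8, "can": 5,
                "still": 4, "walk": 10, "tell": 8, "joke": 3}

combinations = math.prod(sense_counts.values())
print(f"{combinations:,}")  # 43,929,600
```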

Lesk Algorithm: A Simplified Version
Original Lesk definition: measure overlap between sense definitions for all words in context.
  Identify simultaneously the correct senses for all words in context
Simplified Lesk (Kilgarriff & Rosenzweig 2000): measure overlap between sense definitions of a word and current context
  Identify the correct sense for one word at a time
  Search space significantly reduced
Algorithm for simplified Lesk:
  1 Retrieve from MRD all sense definitions of the word to be disambiguated
  2 Determine the overlap between each sense definition and the current context
  3 Choose the sense that leads to highest overlap

Example of simplified Lesk
Disambiguate PINE in "Pine cones hanging in a tree"
PINE
  1 kinds of evergreen tree with needle-shaped leaves
  2 waste away through sorrow or illness
Overlap with the sentence:
  Pine#1 ∩ Sentence = 1
  Pine#2 ∩ Sentence = 0
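The pine example can be reproduced with a short sketch of simplified Lesk. The tokenizer and the small stopword list are assumptions for illustration; a real system would read the definitions from a machine-readable dictionary rather than a hard-coded table.

```python
import re

# Assumed stopword list; function words should not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "in", "or", "with", "and", "through"}


def tokenize(text):
    """Lowercased word tokens (hyphenated words kept whole), minus stopwords."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def simplified_lesk(definitions, context):
    """Score each sense by the number of words its definition shares with the context."""
    context_words = tokenize(context)
    overlaps = {sense: len(tokenize(gloss) & context_words)
                for sense, gloss in definitions.items()}
    # On a tie, max() keeps the first sense listed.
    best = max(overlaps, key=overlaps.get)
    return best, overlaps


PINE = {
    "pine#1": "kinds of evergreen tree with needle-shaped leaves",
    "pine#2": "waste away through sorrow or illness",
}

best, overlaps = simplified_lesk(PINE, "Pine cones hanging in a tree")
print(best, overlaps)  # pine#1 {'pine#1': 1, 'pine#2': 0}
```

The only shared word is "tree", so the first sense wins with an overlap of 1. Because ties fall back to the first sense listed, ordering senses by frequency gives a most-frequent-sense back-off for free.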

Evaluations of Lesk Algorithm
Initial evaluation by M. Lesk: 50-70% on short samples of manually annotated text, with respect to the Oxford Advanced Learner's Dictionary
Simulated annealing: 47% on 50 manually annotated sentences
Evaluation on Senseval-2 all-words data, with back-off to random sense (Mihalcea & Tarau 2004)
  Original Lesk: 35%
  Simplified Lesk: 47%
Evaluation on Senseval-2 all-words data, with back-off to most frequent sense (Vasilescu, Langlais, Lapalme 2004)
  Original Lesk: 42%
  Simplified Lesk: 58%

Semantic Similarity
Words in a discourse must be related in meaning for the discourse to be coherent (Halliday and Hasan, 1976)
Use this property for WSD: identify related meanings for words that share a common context
Context span:
  1 Local context: semantic similarity between pairs of words
  2 Global context: lexical chains

Semantic Similarity in a Local Context
Similarity determined between pairs of concepts, or between a word and its surrounding context
Relies on similarity metrics on semantic networks (Rada et al. 1989)
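One simple metric on a semantic network is the number of is-a edges on the shortest path between two concepts. The sketch below illustrates that idea on a hypothetical toy hierarchy; the `HYPERNYM` table and the `1 / (1 + distance)` conversion from distance to similarity are assumptions for the example, not taken from the slides.

```python
# Hypothetical is-a hierarchy: each concept maps to its single parent.
HYPERNYM = {
    "pine": "tree",
    "oak": "tree",
    "tree": "plant",
    "plant": "organism",
    "dog": "animal",
    "animal": "organism",
    "organism": "entity",
}


def path_to_root(concept):
    """Concept followed by its ancestors, up to the root."""
    path = [concept]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return path


def path_distance(a, b):
    """Edges on the shortest path from a to b through their lowest common ancestor."""
    depth_from_a = {c: i for i, c in enumerate(path_to_root(a))}
    for steps_from_b, concept in enumerate(path_to_root(b)):
        if concept in depth_from_a:
            return depth_from_a[concept] + steps_from_b
    return None  # no common ancestor


def path_similarity(a, b):
    """Map distance to a score in (0, 1]; unrelated concepts score 0."""
    distance = path_distance(a, b)
    return 0.0 if distance is None else 1.0 / (1.0 + distance)


print(path_distance("pine", "oak"))   # 2
print(path_distance("pine", "dog"))   # 5
print(path_similarity("pine", "pine"))  # 1.0
```

For WSD, the same function can be applied to each candidate sense of the target word and each sense of a neighboring word, keeping the pair with the smallest distance.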

Decision List for WSD (Yarowsky, 1994)
Identify collocational features from sense-tagged data.
Word immediately to the left or right of target:
  I have my bank/1 statement.
  The river bank/2 is muddy.
Pair of words to immediate left or right of target:
  The world's richest bank/1 is here in New York.
  The river bank/2 is muddy.
Words found within k positions to left or right of target, where k is often 10-50:
  My credit is just horrible because my bank/1 has made several mistakes with my account and the balance is very low.
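The three feature types listed above could be extracted roughly as follows. This is a sketch under stated assumptions: the input is already tokenized, the target position is known, and the feature names (`w-1`, `w+1`, `window`, and so on) are made up for the example.

```python
def collocation_features(tokens, i, k=10):
    """Collocational features for the target word at position i."""
    feats = set()
    # Word immediately to the left / right of the target.
    if i > 0:
        feats.add(("w-1", tokens[i - 1]))
    if i + 1 < len(tokens):
        feats.add(("w+1", tokens[i + 1]))
    # Pair of words to the immediate left / right of the target.
    if i > 1:
        feats.add(("w-2,w-1", tokens[i - 2], tokens[i - 1]))
    if i + 2 < len(tokens):
        feats.add(("w+1,w+2", tokens[i + 1], tokens[i + 2]))
    # Any word within k positions of the target, position ignored.
    for j in range(max(0, i - k), min(len(tokens), i + k + 1)):
        if j != i:
            feats.add(("window", tokens[j]))
    return feats


tokens = "the river bank is muddy".split()
features = collocation_features(tokens, tokens.index("bank"))
print(sorted(features))
```

Each feature is a tuple, so the same word yields distinct features depending on where it occurs: "river" as the left neighbor and "river" somewhere in the window are counted separately.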

Decision List for WSD (Yarowsky, 1994)
Sort order of collocation tests using log of conditional probabilities.
Words most indicative of one sense (and not the other) will be ranked highly.
\[
\left| \log \frac{p(s = 1 \mid F_i = \mathrm{Collocation}_i)}{p(s = 2 \mid F_i = \mathrm{Collocation}_i)} \right|
\]
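A decision list built from this ranking can be sketched as below. The bag-of-words features, the add-alpha smoothing (needed so that a collocation seen with only one sense does not produce a division by zero), and the majority-sense default are assumptions for the example; the training sentences are the ones used earlier for "bank".

```python
import math
from collections import Counter, defaultdict


def context_features(sentence, target="bank"):
    """Assumed feature set: every lowercased word in the sentence except the target."""
    return set(sentence.lower().split()) - {target}


def train_decision_list(examples, alpha=0.1):
    """Rank each feature by |log(p(s=1|f) / p(s=2|f))|, strongest evidence first."""
    counts = defaultdict(Counter)
    for feats, sense in examples:
        for f in feats:
            counts[f][sense] += 1
    rules = []
    for f, c in counts.items():
        # The shared denominator cancels, so the ratio reduces to smoothed counts.
        llr = math.log((c[1] + alpha) / (c[2] + alpha))
        if llr != 0:
            rules.append((abs(llr), f, 1 if llr > 0 else 2))
    rules.sort(key=lambda rule: rule[0], reverse=True)
    return rules


def classify(rules, feats, default):
    """Apply only the highest-ranked test that matches the context."""
    for _, feature, sense in rules:
        if feature in feats:
            return sense
    return default


TRAINING = [
    ("I have my bank statement", 1),
    ("The river bank is muddy", 2),
    ("The world's richest bank is here in New York", 1),
    ("My credit is just horrible because my bank has made several mistakes "
     "with my account and the balance is very low", 1),
]
examples = [(context_features(text), sense) for text, sense in TRAINING]
rules = train_decision_list(examples)
default_sense = Counter(sense for _, sense in examples).most_common(1)[0][0]

print(classify(rules, context_features("the muddy river bank"), default_sense))  # 2
print(classify(rules, context_features("my bank account"), default_sense))       # 1
```

In the first test sentence "the" weakly favors sense 1, but "river" and "muddy" rank higher in the list, so the classifier stops at one of them and returns sense 2. Using only the single strongest matching test, rather than combining all evidence, is what distinguishes a decision list from a classifier that sums feature weights.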

References I
(Gale, Church and Yarowsky 1992) Gale, W., Church, K., and Yarowsky, D. Estimating upper and lower bounds on the performance of word-sense disambiguation programs. ACL 1992.
(Miller et al., 1994) Miller, G., Chodorow, M., Landes, S., Leacock, C., and Thomas, R. Using a semantic concordance for sense identification. ARPA Workshop 1994.
(Miller, 1995) Miller, G. WordNet: A lexical database. Communications of the ACM, 38(11), 1995.
(Senseval) Senseval evaluation exercises. http://www.senseval.org
(Agirre and Rigau, 1995) Agirre, E. and Rigau, G. A proposal for word sense disambiguation using conceptual distance. RANLP 1995.
(Banerjee and Pedersen 2002) Banerjee, S. and Pedersen, T. An adapted Lesk algorithm for word sense disambiguation using WordNet. CICLING 2002.

References II
(Cowie, Guthrie and Guthrie 1992) Cowie, J., Guthrie, J. A., and Guthrie, L. Lexical disambiguation using simulated annealing. COLING 1992.
(Jiang and Conrath 1997) Jiang, J. and Conrath, D. Semantic similarity based on corpus statistics and lexical taxonomy. COLING 1997.
(Lesk, 1986) Lesk, M. Automatic sense disambiguation using machine readable dictionaries: How to tell a pine cone from an ice cream cone. SIGDOC 1986.
(Lin 1998) Lin, D. An information theoretic definition of similarity. ICML 1998.
(Mihalcea, Tarau, Figa 2004) Mihalcea, R., Tarau, P., and Figa, E. PageRank on Semantic Networks with Application to Word Sense Disambiguation. COLING 2004.
(Patwardhan, Banerjee, and Pedersen 2003) Patwardhan, S., Banerjee, S., and Pedersen, T. Using Measures of Semantic Relatedness for Word Sense Disambiguation. CICLING 2003.

References III
(Resnik 1995) Resnik, P. Using information content to evaluate semantic similarity. IJCAI 1995.
(Vasilescu, Langlais, Lapalme 2004) Vasilescu, F., Langlais, P., and Lapalme, G. Evaluating variants of the Lesk approach for disambiguating words. LREC 2004.
(Yarowsky, 1994) Yarowsky, D. Decision lists for lexical ambiguity resolution: Application to accent restoration in Spanish and French. In Proceedings of ACL, pp. 88-95.
(Yarowsky, 2000) Yarowsky, D. Hierarchical decision lists for word sense disambiguation. Computers and the Humanities, 34.