Effects of Using Simple Semantic Similarity on Textual Entailment Recognition

TEAM ID: u_tokyo
Ken-ichi Yokote, Shohei Tanaka and Mitsuru Ishizuka
Department of Information and Communication Eng., School of Information Science and Technology, The University of Tokyo
{yokote, tanaka, ishizuka}@mi.ci.i.u-tokyo.ac.jp

Abstract

We applied various WordNet-based similarity measures to the RTE (Recognizing Textual Entailment) task in order to compare their effects on textual entailment recognition. Although the improvements over a baseline system are small, many of the measures show positive effects.

1. Introduction

In RTE (Recognizing Textual Entailment) tasks it is effective to consider the semantic similarity between the given sentences -- T (the preceding text) and H (the hypothesis) -- whereas many current systems mainly employ word-level matching. However, the definition of semantic similarity is ambiguous, and it is unclear what the best way is to measure similarity for textual entailment. In our research, we therefore applied various WordNet-based similarity measures to the RTE task in order to compare their effects on textual entailment recognition. We used WordNet::Similarity [WordNet similarity], a freely available software package that measures the semantic similarity and relatedness between a pair of concepts (synsets).

2. Our RTE System

The following figure shows an overview of our RTE system, which is roughly divided into three stages.
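The three stages described in the subsections below can also be sketched in code. The following is a minimal reconstruction under hypothetical names (semantic_score, classify_terms, weight, entails are our own), with sim standing in for a WordNet::Similarity call; it is not the authors' implementation:

```python
import math

def semantic_score(h, t_words, sim):
    # Stage 1: score(h) = max over t in T of WordNet-Similarity(h, t)
    return max((sim(h, t) for t in t_words), default=0.0)

def classify_terms(h_words, t_words, sim, cutoff):
    # Stage 1: keep the terms of H whose best similarity to T reaches a cutoff
    return [h for h in h_words if semantic_score(h, t_words, sim) >= cutoff]

def weight(term, topic_sentences):
    # Stage 2: w(t) = log2(|T| / (textfreq(t) + 1)), an IDF-like weight,
    # where |T| is the number of sentences in the topic
    textfreq = sum(term in sent for sent in topic_sentences)
    return math.log2(len(topic_sentences) / (textfreq + 1))

def entails(h_words, classified, topic_sentences, threshold):
    # Stage 3: cosine similarity between the weighted H vector and the
    # classified-words vector, compared against a decision threshold
    w = {t: weight(t, topic_sentences) for t in set(h_words) | set(classified)}
    dot = sum(w[t] ** 2 for t in set(h_words) & set(classified))
    norm_h = math.sqrt(sum(w[t] ** 2 for t in set(h_words)))
    norm_c = math.sqrt(sum(w[t] ** 2 for t in set(classified)))
    cosine = dot / (norm_h * norm_c) if norm_h and norm_c else 0.0
    return cosine >= threshold
```

With an exact-match stand-in for sim, classify_terms reduces to the lexical baseline; plugging in a WordNet-based similarity function yields the semantic classifier.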

2.1 Stage 1 -- Classifying terms in H

The system classifies the terms in H into two groups: those closely related to T, and the rest. (The former are called the classified words in the figure.) We employed two criteria in this classification. One is the Lexical Classifier, which is based on lexical coincidence. The other is the Semantic and Syntactic Classifier, which is based on POS (part-of-speech) coincidence and a Semantic Score. Here, the Semantic Score of h (h ∈ H) is defined as:

    score(h) = max{ WordNet-Similarity(h, t) : t ∈ T }

2.2 Stage 2 -- Calculating the term weights

After the term classification in Stage 1, the system calculates a weight for every term t in H (including the classified words) as follows:

    w(t) = log2( |T| / (textfreq(t) + 1) )

where |T| is the number of sentences in the topic. This is almost equivalent to IDF (Inverse Document Frequency).

2.3 Stage 3 -- Judging textual entailment

First, the system constructs feature vectors for H and for the set of classified words, where each feature component corresponds to one word. Then the Entailment Recognizer judges whether the entailment is YES or NO by comparing the cosine similarity between H and the classified words against a threshold. (This similarity can be approximated by the degree of overlap between H and the classified words.)

3. Experimental Results

3.1 Baseline system

As the baseline system, we used only the Lexical Classifier in Stage 1. For the development data set, it gave its best result, shown below, when the threshold was 0.7:

    DEVELOPMENT-SET
    Recall       43.58
    Precision    61.92

    F-measure (macro)    50.6    (threshold = 0.7)

Using this threshold value, the experimental results for the test data set were as follows:

    TEST-SET
    Recall               41.36
    Precision            50.00
    F-measure (macro)    45.27   (threshold = 0.7)

3.2 Applying WordNet Similarity Functions

We applied various WordNet similarity functions [WordNet similarity] to the classifier and obtained the following performance on the development data set, where the threshold in each case was chosen to attain the best result:

    DEVELOPMENT-SET                        F-measure (macro)
    Path Similarity                        51.0
    Res (Resnik) Similarity                50.1
    Wup (Wu-Palmer) Similarity             50.8
    Lin Similarity                         51.2
    Lch (Leacock-Chodorow) Similarity      51.2
    Jcn (Jiang-Conrath) Similarity         51.7

Applying the same thresholds, we obtained the experimental results for the test data set. Only the top two cases are shown below:

    TEST-SET                               F-measure (macro)
    Jcn (Jiang-Conrath) Similarity         46.78
    Lch (Leacock-Chodorow) Similarity      46.04

If we multiply these two similarity measures to generate a new measure, a slightly better result is obtained:

    Jcn x Lch Similarity                   46.87

where the threshold was determined by multiplying the thresholds of the Jcn and Lch cases.
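The combined measure can be sketched as follows; this is our own illustration, and the jcn and lch functions below are toy stand-ins with arbitrary values, not real WordNet similarity scores:

```python
def product_measure(sim_a, sim_b):
    # Combine two similarity measures by multiplication, as done above for
    # Jcn x Lch; the decision threshold for the combined measure is likewise
    # the product of the two individual thresholds.
    def combined(w1, w2):
        return sim_a(w1, w2) * sim_b(w1, w2)
    return combined

# Toy stand-ins for the Jcn and Lch measures (values chosen for illustration)
jcn = lambda a, b: 0.8 if a == b else 0.2
lch = lambda a, b: 0.9 if a == b else 0.1

combined = product_measure(jcn, lch)
combined_threshold = 0.7 * 0.7  # product of the two individual thresholds
```

Because both factors must be high for the product to be high, the combined measure only accepts word pairs that both underlying measures consider similar.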

4. Discussion and Conclusion

The experimental results to date show that, among the WordNet similarity functions, Jcn (Jiang-Conrath) Similarity is the most effective for the RTE-7 task. There is room for further improvement by applying several WordNet similarity functions simultaneously. We are also interested in applying more comprehensive measures of semantic similarity.

Acknowledgments

We are grateful to Kai Ishikawa, Masaaki Tsuchida and Toshi-ichi Fukushima (NEC Corp.) for their advice and help.

References

[WordNet similarity] http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html