Improving Data Driven Dependency Parsing Using Clausal Information


Karan Jindal, Samar Husain, Dipti Misra Sharma, Rajeev Sangal
Language Technologies Research Centre, International Institute of Information Technology, Hyderabad, India
May 24, 2010

Outline
1 Data Driven Dependency Parsing
2
3 Baseline, Clausal Information, Results
4 Dependency Accuracy Vs Distance, Non-projective Dependencies
5

Parsing

Clause
Traditionally, a clause is a group of words that consists of a subject and a predicate.
Example: I went to the market yesterday, where I found a beautiful watch.
The exact definition used here is given in the experiments section.

Motivation for using Clausal Information
Most of the dependencies of words appear inside the same clause.
The dependencies of the words are mostly localized to the clause boundary.
Parsing: finding the correct parent/child of a word in the sentence
Use of the clause boundary information
  Reduces the search space of the parser to find the dependent
  Makes the parser less prone to errors?

Does it really work?
Indian Languages
  Relatively free word order languages
  Dependency framework is best suited
  Paninian framework proved to be helpful (Bharati et al., 1993, 1995, etc.)

Dependency Distance Vs Clause

Dependency Label Vs Clause

Clause
Bharati et al. (1993) proposed a two-stage method in which
  only intra-clausal dependencies are resolved in Stage 1
  only inter-clausal dependencies are resolved in Stage 2
Successfully tried for Indian languages (Bharati et al., 2008, 2009)
Husain et al. (2009) proposed data-driven two-stage parsing
The Stage 1 parse of Husain et al. is used as the clausal information provider
For us, a clause is a group of words having a single verb, unless the verb is a child of another verb
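The working definition of a clause can be read as a simple procedure over a dependency tree. The sketch below is one possible reading, not the authors' implementation: it assumes that a verb opens a new clause unless its parent is also a verb, and that every other word belongs to the clause of its nearest clause-opening verb ancestor. The function name `assign_clauses`, the 0-based indexing and the `-1` root convention are all hypothetical choices made for illustration.

```python
from typing import List


def assign_clauses(heads: List[int], is_verb: List[bool]) -> List[int]:
    """Map each token to the index of the verb heading its clause.

    heads[i] is the parent index of token i, or -1 for the root.
    Assumption of this sketch: a verb opens a clause unless its
    parent is also a verb. Tokens with no clause-opening verb
    above them are mapped to -1.
    """
    def opens_clause(i: int) -> bool:
        parent = heads[i]
        return is_verb[i] and (parent == -1 or not is_verb[parent])

    clause_of = []
    for i in range(len(heads)):
        node = i
        # Climb towards the root until a clause-opening verb is found.
        while node != -1 and not opens_clause(node):
            node = heads[node]
        clause_of.append(node)
    return clause_of


# Toy tree for: I(0) went(1) market(2) where(3) I(4) found(5) watch(6)
toy_heads = [1, -1, 1, 5, 5, 2, 5]
toy_verbs = [False, True, False, False, False, True, False]
print(assign_clauses(toy_heads, toy_verbs))
```

In the toy tree, "found" hangs off the noun "market", so it opens a second clause; had it been attached directly to "went", this reading would merge both verbs into one clause.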

Details
To do the Stage 1 parsing, Husain et al. (2009):
  add a dummy node
  attach the clauses to it by dummy relations
  convert the treebank to this format by rules
  train MSTParser on this to get the Stage 1 model
Here, we use MaltParser instead of MSTParser
The output is post-processed to get the clausal information
A figure needs to be included here which makes the process clear.
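The treebank conversion described above can be sketched roughly as follows. The slides do not give the actual conversion rules, so this is only an assumed approximation: every arc that stays inside a clause is kept, and every other word (the root and any word whose parent lies in a different clause) is re-attached to a dummy node with a dummy relation. The name `to_stage1`, the label `_DUMMY_` and the per-token clause ids are hypothetical.

```python
from typing import List, Tuple


def to_stage1(heads: List[int], labels: List[str], clause_of: List[int],
              dummy_label: str = "_DUMMY_") -> Tuple[List[int], List[str]]:
    """Rewrite a gold tree into a Stage 1 style tree (illustrative only).

    heads[i] is the parent of token i (-1 for the root) and
    clause_of[i] is a clause id for token i. The dummy node
    receives the index len(heads).
    """
    dummy = len(heads)
    new_heads, new_labels = [], []
    for i, (head, label) in enumerate(zip(heads, labels)):
        intra_clausal = head != -1 and clause_of[head] == clause_of[i]
        if intra_clausal:
            # Arc stays inside one clause: keep it unchanged.
            new_heads.append(head)
            new_labels.append(label)
        else:
            # Root or inter-clausal arc: hang the word on the dummy node.
            new_heads.append(dummy)
            new_labels.append(dummy_label)
    return new_heads, new_labels


# Toy input; the relation labels are placeholders, not treebank output.
heads = [1, -1, 1, 5, 5, 2, 5]
labels = ["k1", "main", "k7p", "k7p", "k1", "nmod__relc", "k2"]
clause_of = [1, 1, 1, 5, 5, 5, 5]
print(to_stage1(heads, labels, clause_of))
```

Reading clause information back from a Stage 1 parse is then the reverse step: each subtree hanging off the dummy node is one clause, and its topmost word is the clause head.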

Data, Parser
Hindi dataset released as part of the ICON09 parsing contest
Training: 1500, Development: 150, Testing: 150 sentences
Sentences are annotated using syntactico-semantic relations based on the Paninian framework (Begum et al., 2008)
Dependency relations exist between chunks
MaltParser is used
  Arc-eager
  Turkish SVM settings

Baseline Features and Accuracy
Data specific features
  Tense, Aspect, Modality for verbs
  Vibhakti (post-position) for nouns
General features
  Lexical items (Stack, Input), window size: ?
  POS, Chunk tags (Stack, Input), window size: ?

Clausal Features    Precision   Recall
Clause Boundary     84.83       91.23
Clause Head         92.42       99.40

            LAS     UAS     LS
Baseline    73.62   91.00   76.04

Why and How?
As said earlier, clause boundary information reduces the search space of the parser
But clausal information spans across many words
  Hard to encode as a boolean feature
Modified the code of MSTParser to handle the following features
  Whether two words (Stack[0] and Input[0]) are in the same clause or not (boolean)
  The head/non-head information of each word in a clause (H or NH)
Figure showing the feature clearly
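The two clausal features can be illustrated independently of any parser's code. The sketch below assumes that each word already carries a clause id and a clause-head flag (for example, read off a Stage 1 parse); it does not use any real parser API, and the function and key names are hypothetical.

```python
from typing import Dict, List, Union


def clausal_features(stack_top: int, input_front: int,
                     clause_id: List[int],
                     is_clause_head: List[bool]) -> Dict[str, Union[bool, str]]:
    """Compute the clausal features for one parser configuration.

    stack_top and input_front are the token indices playing the roles
    of Stack[0] and Input[0]. clause_id and is_clause_head are
    per-token annotations assumed to be available beforehand.
    """
    def head_tag(i: int) -> str:
        return "H" if is_clause_head[i] else "NH"

    return {
        # Boolean feature: are the two words inside the same clause?
        "same_clause": clause_id[stack_top] == clause_id[input_front],
        # Head / non-head status of each of the two words.
        "stack0_head": head_tag(stack_top),
        "input0_head": head_tag(input_front),
    }


clause_id = [0, 0, 0, 1, 1, 1, 1]
is_clause_head = [False, True, False, False, False, True, False]
print(clausal_features(1, 2, clause_id, is_clause_head))
print(clausal_features(2, 3, clause_id, is_clause_head))
```

Because the same-clause test is computed per word pair, clause information that spans many words collapses to a single boolean for each parser configuration.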

Results
            LAS     UAS     LS
Baseline    73.62   91.00   76.04
F1          72.66   91.00   74.74
F2          72.66   91.00   74.74
F3          74.39   91.87   76.21

F1: Only boundary
F2: Only head info.
F3: Both boundary and head info.
Improvement of F3 over the baseline: LAS 0.77, UAS 0.87
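For readers unfamiliar with the three columns, the sketch below shows how such scores are commonly computed: UAS counts tokens with the correct head, LS counts tokens with the correct label, and LAS counts tokens with both correct. It assumes one (head, label) pair per token and is not the evaluation script used for these experiments; the function name is hypothetical.

```python
from typing import Dict, List, Tuple

Arc = Tuple[int, str]  # (head index, relation label)


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Dict[str, float]:
    """Return LAS, UAS and LS as percentages over all tokens."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    n = len(gold)
    uas = sum(g[0] == p[0] for g, p in zip(gold, pred))  # head correct
    ls = sum(g[1] == p[1] for g, p in zip(gold, pred))   # label correct
    las = sum(g == p for g, p in zip(gold, pred))        # both correct
    return {"LAS": 100 * las / n, "UAS": 100 * uas / n, "LS": 100 * ls / n}


gold = [(1, "k1"), (-1, "main"), (1, "k2"), (1, "k7p")]
pred = [(1, "k1"), (-1, "main"), (1, "k7p"), (2, "k7p")]
print(attachment_scores(gold, pred))
```

Since LAS requires both the head and the label to be right, it can never exceed either UAS or LS, which matches the ordering in the table.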

Dependency Accuracy Vs Distance
One can see that
  the accuracy improvement increases as the distance increases
  this shows that the clausal features help in distinguishing and identifying long-distance dependencies
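An accuracy-versus-distance breakdown of this kind can be produced along the following lines. The slides do not say how distance was measured, so the sketch assumes it is the absolute difference between the positions of a word and its gold head, and it scores head attachment only. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def accuracy_by_distance(gold_heads: List[int],
                         pred_heads: List[int]) -> Dict[int, float]:
    """Head-attachment accuracy (in percent) per gold dependency distance.

    Heads are 0-based token indices; -1 marks the root, which is
    skipped because it has no linear distance to a head.
    """
    total = defaultdict(int)
    correct = defaultdict(int)
    for i, (gold, pred) in enumerate(zip(gold_heads, pred_heads)):
        if gold == -1:
            continue
        distance = abs(i - gold)
        total[distance] += 1
        correct[distance] += int(gold == pred)
    return {d: 100 * correct[d] / total[d] for d in sorted(total)}


print(accuracy_by_distance([1, -1, 1, 1, 3], [1, -1, 1, 2, 3]))
```

Running this once for the baseline output and once for the clausal-feature output, then subtracting per distance, gives the kind of comparison the slide refers to.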

Dependency Accuracy for Non-projective Dependencies
Most of the non-projectivities exist in between the clauses (Mannem et al., 2009)
So the head features should guide the parser to identify non-projectivities
The following table shows this clearly.

             F1 (%)   F4 (%)
Precision    41.1     50
Recall       30.5     39.2
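Precision and recall on non-projective dependencies can be computed roughly as sketched below. It uses the standard definition of a non-projective arc (some word lying between head and dependent is not dominated by the head) and counts a predicted non-projective arc as correct only if the same arc is non-projective in the gold tree. Whether the figures in the table were obtained exactly this way is not stated in the slides, so treat the matching criterion as an assumption; the function names are hypothetical.

```python
from typing import List, Set, Tuple


def nonprojective_arcs(heads: List[int]) -> Set[Tuple[int, int]]:
    """Return all (head, dependent) arcs that are non-projective.

    heads[i] is the parent of token i, or -1 for the root.
    """
    def dominated_by(node: int, ancestor: int) -> bool:
        while node != -1:
            if node == ancestor:
                return True
            node = heads[node]
        return False

    arcs = set()
    for dep, head in enumerate(heads):
        if head == -1:
            continue
        lo, hi = min(head, dep), max(head, dep)
        # Non-projective if any word in between is outside head's subtree.
        if any(not dominated_by(k, head) for k in range(lo + 1, hi)):
            arcs.add((head, dep))
    return arcs


def nonprojective_pr(gold_heads: List[int],
                     pred_heads: List[int]) -> Tuple[float, float]:
    """Precision and recall (percent) over non-projective arcs."""
    gold = nonprojective_arcs(gold_heads)
    pred = nonprojective_arcs(pred_heads)
    hits = len(gold & pred)
    precision = 100 * hits / len(pred) if pred else 0.0
    recall = 100 * hits / len(gold) if gold else 0.0
    return precision, recall


gold = [2, 3, -1, 2]  # the arc 3 -> 1 crosses the root at position 2
print(nonprojective_arcs(gold))
print(nonprojective_pr(gold, [2, 2, -1, 2]))
```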

Clausal features help dependency parsing, especially when there is a dependency and label bias toward the clause.

Future Work
