SE367A Project Report Complex Predicates in Hindi

By: Sachet Chavan (Dept. of HSS), Pranav Kumar (Dept. of Electrical Engineering)
Guide: Prof. Amitabh Mukherjee

Abstract: Complex predicates are more common in South Asian languages than in European languages. We analyze different complex predicates in Hindi and try to gauge their acceptability rates. A complex predicate that the majority of speakers find acceptable could then be stored in WordNet or a dictionary.

Introduction: A complex predicate is a multi-word compound that functions as a single verb. Complex predicates can be formed by two types of combination:

Noun + Verb
Verb + Verb

Example: राम किताब पढ़ रहा है ("Ram is reading a book")

For this project we consider only the Verb + Verb type. This type is formed by combining a heavy verb (HV) with a light verb (LV):

Heavy Verb + Light Verb: निकल गया, हँस पड़ा

Heavy verb: The heavy verb carries most of the meaning of the compound. E.g. in निकल गया, निकल is the heavy verb.

Light verb: The light verb contributes finer aspects of temporal meaning.

Verb + Verb combinations are in turn of two types, depending on whether the HV or the LV occurs first.

Standard aspectual complex predicate construction: formed by combining HV + LV (in that order).
राम ने श्याम को तमाचा मार दिया ("Ram gave Shyam a slap")

Reverse aspectual complex predicate construction: formed by combining LV + HV (in that order).
राम ने श्याम को तमाचा दे मारा

Standard aspectual complex predicates are more common than reverse aspectual ones. With a few exceptions, every Verb + Verb complex predicate can be replaced by some inflection of its HV; e.g. निकल गया can be replaced by निकला. Also, not every HV + LV combination gives a complex predicate; e.g. निकल डाला is not found acceptable by most people.
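The distinction between the standard (HV + LV) and reverse (LV + HV) orders can be sketched in a few lines of Python. This is only an illustration, not part of our experiment: the set of light-verb stems is an assumed, non-exhaustive list, the function name `classify_order` is hypothetical, and the verbs are assumed to have been reduced to their stems by hand (e.g. दिया to दे).

```python
# Assumed, non-exhaustive set of light-verb stems (hypothetical, for illustration).
LIGHT_STEMS = {"जा", "दे", "डाल", "उठ", "चल"}

def classify_order(first_stem, second_stem, light_stems=LIGHT_STEMS):
    """Classify a two-verb sequence by the position of the light verb."""
    first_light = first_stem in light_stems
    second_light = second_stem in light_stems
    if second_light and not first_light:
        return "standard"      # HV + LV
    if first_light and not second_light:
        return "reverse"       # LV + HV
    return "undetermined"      # both or neither stem is in the light-verb set

print(classify_order("मार", "दे"))   # stems of मार दिया
print(classify_order("दे", "मार"))   # stems of दे मारा
```

A stem such as चल can act as either the heavy or the light verb, so a pair in which both stems (or neither) are in the set is reported as undetermined rather than guessed.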

Related Works: The HSS department of IIT Bombay has also conducted research on this subject. Their work is motivated primarily by the need to automatically augment lexical networks such as the Princeton WordNet. Another paper from IITB presents their experience in constructing lexical knowledge bases for Indian languages, with special attention to Hindi. Their work deals with the question of storing versus deriving complex predicates, both linguistically and computationally.

Our Experiment: Complex predicates are lexical compound verbs. Because they are part of our day-to-day language, we are very much accustomed to them. But a person with no knowledge of Hindi will not be able to decipher the meaning of these compound verbs, because the heavy verb and the light verb each mean something different on their own, and the meaning changes when they are put together as a complex predicate. Also, not every heavy verb and light verb combination is a complex predicate; only a limited number of combinations are prevalent enough to be accepted as part of the general vocabulary.

Our experiment used two methods to obtain the same kind of results: a gaze-tracking study and a survey-based study. In the gaze-tracking part, subjects read a text made up of sentences containing standard aspectual complex predicates, reverse aspectual complex predicates and plain verbs. For the survey part, we circulated a grid of 5 heavy verbs and 5 light verbs, giving 25 different combinations, and asked the participants to vote for the ones they found appropriate.

Following are the sentences from the reading text used in the gaze-tracking part (we gave standard aspectual, reverse aspectual, and inflected forms of the same HV):

लीला ने श्याम को चिट्ठी लिखी
शीला ने राम को चिट्ठी मार लिखी
घनश्याम ने विवेक को खत लिख मारा
राम गाना गा पड़ा
राम ने गाना गाया
कुत्ते को देख कर शिवम भाग पड़ा
बंदर को देख कर भीम भागा
बाज़ परिंदे पे झपट गया
चूहे को देख बिल्ली उसपे झपटी
गुस्सा आने पर सचिन चीख पड़ा
अपनी माँ की डाँट सुन कर बच्चा रो डाला
मिठाई न मिलने पर बच्चा रोया
घड़ी देख कर रोहित निकल उठा
चुटकुला सुनकर दिनेश हँस चला
'वजय न घर क द ब च ग ल
फर्श पे नरेश फिसल पड़ा
शाम के सात बजते ही रोहन निकल चला
सुबह के आठ बज गए और वो चल निकला

Our main aim was to track the eye movements of the subjects while they read these sentences.

For the survey part we circulated a Google Doc in our hostel. Since hostel residents are students from different parts of the country, this kept the survey from being biased towards one dialect and gave us a more general result. In the document shown below, we created a grid-like structure and asked participants to vote for the combinations they found acceptable in their vocabulary. The grid had 5 heavy verbs, namely निकल, कह, रो, हँस, बोल, and 5 light verbs, namely गया, चल, पड़, डाल, उठ. So there were 25 different choices, and participants could vote for as many as they wanted.
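The survey grid and the acceptability rate it yields can be sketched as follows. This is a minimal illustration under stated assumptions: the verb spellings are restored Devanagari forms, the heavy/light labels follow the definition in the Introduction (under which निकल is a heavy verb), the function names are hypothetical, and the vote counts are made up and are not the survey's results.

```python
from itertools import product

# Verb forms from the survey grid (assumed restored spellings).
HEAVY_VERBS = ["निकल", "कह", "रो", "हँस", "बोल"]
LIGHT_VERBS = ["गया", "चल", "पड़", "डाल", "उठ"]

def build_grid(heavy, light):
    """Return every HV + LV combination as (heavy, light) pairs."""
    return list(product(heavy, light))

def acceptability_rates(votes, n_participants):
    """Map each combination to the fraction of participants who voted for it."""
    return {combo: count / n_participants for combo, count in votes.items()}

grid = build_grid(HEAVY_VERBS, LIGHT_VERBS)

# Hypothetical vote counts, for illustration only; NOT the survey's results.
votes = {combo: 0 for combo in grid}
votes[("निकल", "गया")] = 42
votes[("कह", "उठ")] = 21

rates = acceptability_rates(votes, n_participants=84)

# Show the three combinations with the highest acceptability rate.
for (hv, lv), rate in sorted(rates.items(), key=lambda kv: -kv[1])[:3]:
    print(hv, lv, f"{rate:.2f}")
```

Dividing by the number of participants rather than by the total number of votes keeps the rates comparable across combinations, since each participant could vote for as many combinations as they wanted.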

Results: Following are the results of the survey (84 subjects)

The above graphs show the number of votes each light verb received in combination with each heavy verb. Following are the results of the gaze-tracking experiment for three different subjects (the radius of each circle is proportional to the saccade duration).

Conclusion: From the survey results we can infer that the combinations with a higher number of votes have a high acceptability rate and are good candidates for inclusion in WordNet. The votes are subjective to the participants, but they do give a rough idea of which combinations are prevalent and which are not; for example, निकल डाल was not found acceptable by any of the subjects. The gaze-tracking results are also consistent with the HV carrying most of the meaning: the circles around HVs (e.g. निकल) are much larger than those around their LV counterparts (e.g. चल).

Future Work:
(a) Run the survey on a larger population, so that the data are more reliable and we can study the (expected) demographic relation to the acceptability of HV + LV combinations.
(b) Analyze the saccade-time data for standard aspectual, reverse aspectual and inflected forms of the same HV, and compute the ratio of saccade times across these cases, to get an idea of the relative use of these forms in day-to-day usage.
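The ratio proposed in item (b) could be computed along the following lines. This is only a sketch: the record format (construction type, duration in milliseconds), the function names and the numbers are all assumptions made for illustration, not measurements from our gaze-tracking sessions.

```python
from statistics import mean

def mean_duration_by_type(records):
    """Average duration (ms) for each construction type."""
    by_type = {}
    for construction, duration_ms in records:
        by_type.setdefault(construction, []).append(duration_ms)
    return {construction: mean(values) for construction, values in by_type.items()}

def ratios_to_baseline(means, baseline):
    """Express each mean duration as a multiple of the baseline type's mean."""
    return {construction: value / means[baseline] for construction, value in means.items()}

# Hypothetical durations, for illustration only; NOT experimental data.
records = [
    ("standard", 200.0), ("standard", 240.0),
    ("reverse", 300.0), ("reverse", 360.0),
    ("inflected", 100.0), ("inflected", 120.0),
]

means = mean_duration_by_type(records)
ratios = ratios_to_baseline(means, baseline="inflected")
print(ratios)
```

Using the plain inflected form of the HV as the baseline is one possible choice; it makes each ratio read as the relative cost of the complex predicate compared with the simple verb.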

References:
D. Chakravarti, H. Mandalia, R. Priya, V. Sharma, P. Bhattacharya (2008). Hindi Compound Verbs and their Automatic Extraction. IITB.
Shakthi Poornima, Jean-Pierre Koenig (2009). Hindi Aspectual Complex Predicates. State University of New York at Buffalo.