CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 37 Semantics; Universal Networking Language)

CS460/626: Natural Language Processing/Speech, NLP and the Web
(Lecture 37: Semantics; Universal Networking Language)
Pushpak Bhattacharyya
CSE Dept., IIT Bombay
12th April, 2011

Semantics (Wikipedia)
Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning. It typically focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata.

Computational Semantics (Wikipedia)
Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions. Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution. Methods employed usually draw from formal semantics or statistical semantics. Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving). Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.

A hurdle: signifier-denotata dichotomy
- There is a divide between a word and what it stands for: the word "red" is NOT itself red in colour.
- "Red wine", "red rose" and "he is in the red" denote very different senses of the word.
- Translation into another language reveals this difference.

A Perspective
[Figure: stack of processing levels: Discourse, Pragmatics, Semantics, Syntax, Lexicon, Morphology]

Our tryst with semantics: Universal Networking Language (UNL)

Motivation
- Extraction of semantics, i.e., deep meaning, is important for many applications: Machine Translation, meaning-based IR, CLIR.
- Robust, scalable and efficient methods of knowledge extraction are required.
- Machine Translation and Cross-Lingual IR are a need of the hour for crossing the language barrier.

Interlingua: a vehicle for machine translation
[Figure: English, Hindi, French and Chinese connected to the Interlingua (UNL) through analysis and generation]

UNL: a United Nations project
- Started in 1996
- 10-year program
- 15 research groups across continents
- First goal: generators
- Next goal: analysers (needs solving various ambiguity problems)
- Current active language groups:
  - UNL_French (GETA-CLIPS, IMAG)
  - UNL_English+Hindi
  - UNL_Italian (Univ. of Pisa)
  - UNL_Portuguese (Univ. of São Paulo, Brazil)
  - UNL_Russian (Institute of Linguistics, Moscow)
  - UNL_Spanish (UPM, Madrid)

World-wide Universal Networking Language (UNL) Project
[Figure: Marathi, English, Russian, Japanese, Spanish, Hindi and other languages arranged around UNL]
Language-independent meaning representation.

The UNL MT System: an Overview

NLP@IITB

Foundations and Applications
UNL Foundations
- Semantic Relations
- Universal Words
- Attributes
- How to write UNL expressions
UNL Applications
- Machine Translation: rule based and statistical
- Search
- Text Entailment
- Sentiment Analysis

Language Processing & Understanding
- Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization
- IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback
- Machine Learning: Semantic Role labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks
- Machine Translation: Statistical, Interlingua Based; English / Indian languages, Indian languages / Indian languages
- Indowordnet
Resources: http://www.cfilt.iitb.ac.in
Publications: http://www.cse.iitb.ac.in/~pb
Linguistics is the eye and computation the body

UNL represents knowledge: John eats rice with a spoon
[Figure: UNL graph of the sentence, with labels marking universal words, semantic relations and attributes]
Repository of 42 Semantic Relations and 84 attribute labels

Sentence embeddings
Deepa claimed that she had composed a poem.
[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[\UNL]
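The relation lines of the embedded-sentence example can be read mechanically. Below is a minimal sketch, assuming each line has the shape rel[:scope](arg1, arg2) seen on the slide; the names RELATION_RE, Relation and parse_relation are illustrative and not part of any UNL tool. The regular expression splits on the first comma, so it would not handle a constraint list that itself contains commas.

```python
import re
from typing import NamedTuple, Optional

# Assumed textual shape, taken from the slide: rel[:scope](arg1, arg2)
RELATION_RE = re.compile(r"^\s*(\w+)(?::(\w+))?\s*\(\s*(.+?)\s*,\s*(.+?)\s*\)\s*$")


class Relation(NamedTuple):
    label: str             # relation label, e.g. "agt"
    scope: Optional[str]   # "01" if the relation belongs to scope :01, else None
    source: str            # first argument, attributes included
    target: str            # second argument, may be a scope reference like ":01"


def parse_relation(line: str) -> Relation:
    """Parse one relation line; raise ValueError if it does not fit the shape."""
    match = RELATION_RE.match(line)
    if match is None:
        raise ValueError(f"not a UNL relation: {line!r}")
    label, scope, source, target = match.groups()
    return Relation(label, scope, source, target)


lines = [
    "agt(claim.@entry.@past, Deepa)",
    "obj(claim.@entry.@past, :01)",
    "agt:01(compose.@past.@entry.@complete, she)",
    "obj:01(compose.@past.@entry.@complete, poem.@indef)",
]
relations = [parse_relation(line) for line in lines]

# Relations of the embedded clause "she had composed a poem"
embedded = [r for r in relations if r.scope == "01"]
```

The object of "claim" is the scope reference :01, and the two relations tagged :01 make up the embedded clause.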

Constituents of Universal Networking Language
- Universal Words (UWs)
- Relations
- Attributes
- Knowledge Base

UNL Graph
He forwarded the mail to the minister.
[Figure: graph with entry node forward(icl>send).@entry.@past and three outgoing arcs labelled agt, obj and gol; the other nodes are he(icl>person), mail(icl>collection).@def and minister(icl>person).@def]

UNL Expression
agt(forward(icl>send).@entry.@past, he(icl>person))
obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)
gol(forward(icl>send).@entry.@past, minister(icl>person).@def)
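A UNL expression is a list of labelled arcs, so it maps directly onto a graph data structure. The sketch below is one possible in-memory form, not the representation used by any UNL system; split_node, build_graph and entry_node are hypothetical helper names. The triples take the mail as the object and the minister as the goal, following the sentence "He forwarded the mail to the minister".

```python
from collections import defaultdict


def split_node(node):
    """Split 'mail(icl>collection).@def' into the UW and its attribute list."""
    uw, *attrs = node.split(".@")
    return uw, ["@" + attr for attr in attrs]


def build_graph(triples):
    """Return (edges, attributes): UW -> [(relation, UW)] and UW -> [attributes]."""
    edges = defaultdict(list)
    attributes = {}
    for label, source, target in triples:
        src_uw, src_attrs = split_node(source)
        tgt_uw, tgt_attrs = split_node(target)
        attributes.setdefault(src_uw, src_attrs)
        attributes.setdefault(tgt_uw, tgt_attrs)
        edges[src_uw].append((label, tgt_uw))
    return dict(edges), attributes


def entry_node(attributes):
    """Return the single UW carrying @entry (the head of the sentence)."""
    entries = [uw for uw, attrs in attributes.items() if "@entry" in attrs]
    if len(entries) != 1:
        raise ValueError(f"expected exactly one @entry node, found {entries}")
    return entries[0]


triples = [
    ("agt", "forward(icl>send).@entry.@past", "he(icl>person)"),
    ("obj", "forward(icl>send).@entry.@past", "mail(icl>collection).@def"),
    ("gol", "forward(icl>send).@entry.@past", "minister(icl>person).@def"),
]
edges, attributes = build_graph(triples)
head = entry_node(attributes)
```

Here every arc leaves the entry node, which is the main predicate of the sentence.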

What is a Universal Word (UW)?
- Words of UNL
- Constitute the UNL vocabulary, the syntactic-semantic units used to form UNL expressions
- A UW represents a concept
  - Basic UW (an English word/compound word/phrase with no restrictions or Constraint List)
  - Restricted UW (with a Constraint List)
- Examples: crane(icl>device), crane(icl>bird)

The Lexicon
Format of the dictionary entry:
  [headword] {} Universal word (Attribute list);
e.g.,
  [minister] {} minister(icl>person) (N,ANIMT,PHSCL,PRSN);
  (head word: minister; universal word: minister(icl>person); attributes: N,ANIMT,PHSCL,PRSN)
Attributes
- Morphological: Pl (plural), V_ed (past tense form)
- Syntactic: V (verb), VOA (verb of action)
- Semantic: ANIMT (animate), PLACE, TIME

The Lexicon (cntd)
He forwarded the mail to the minister.
Content words (headword, universal word, attributes):
  [forward] {} forward(icl>send) (V,VOA) <E,0,0>;
  [mail] {} mail(icl>message) (N,PHSCL,INANI) <E,0,0>;
  [minister] {} minister(icl>person) (N,ANIMT,PHSCL,PRSN) <E,0,0>;

The Lexicon (cntd)
He forwarded the mail to the minister.
Function words (headword, universal word, attributes):
  [he] {} he (PRON,SUB,SING,3RD) <E,0,0>;
  [the] {} the (ART,THE) <E,0,0>;
  [to] {} to (PRE,#TO) <E,0,0>;
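The dictionary entry format is regular enough to parse with one pattern. The following is a rough sketch under the assumption that every entry looks like "[headword] {} UW (attr,attr,...) <flags>;" as on the slides. The meaning of the <E,0,0> field is not explained on the slides, so it is kept as an opaque list of strings. ENTRY_RE, LexiconEntry and parse_entry are illustrative names.

```python
import re
from dataclasses import dataclass
from typing import List

ENTRY_RE = re.compile(
    r"^\[(?P<headword>[^\]]+)\]\s*\{\}\s*"   # [headword] {}
    r"(?P<uw>\S+)\s*"                        # universal word, no inner spaces
    r"\((?P<attrs>[^()]*)\)\s*"              # (attribute list)
    r"<(?P<flags>[^>]*)>\s*;$"               # <E,0,0>;
)


@dataclass
class LexiconEntry:
    headword: str
    universal_word: str
    attributes: List[str]
    flags: List[str]  # opaque: the slides do not say what these fields mean


def parse_entry(line: str) -> LexiconEntry:
    """Parse one dictionary line; raise ValueError if it does not fit the format."""
    match = ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a lexicon entry: {line!r}")
    return LexiconEntry(
        headword=match["headword"],
        universal_word=match["uw"],
        attributes=match["attrs"].split(","),
        flags=match["flags"].split(","),
    )


content = parse_entry("[mail] {} mail(icl>message) (N,PHSCL,INANI) <E,0,0>;")
function = parse_entry("[to] {} to (PRE,#TO) <E,0,0>;")
```

A content word carries a restricted UW such as mail(icl>message), while a function word like "to" has a bare headword as its UW.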

Hindi example: dictionary entries (1/2)
Headword | Language | Universal word | Attributes
farmer | E | farmer(icl>creator) | N,ANIMT,FAUNA,MML,PRSN
शेतकरी | M | farmer(icl>creator) | N,M,ANIMT,FAUNA,MML,PRSN
किसान | H | farmer(icl>creator) | N,M,ANIMT,FAUNA,MML,PRSN,Na

The Features of a UW
- Every concept existing in any language must correspond to a UW
- The constraint list should be as small as necessary to disambiguate the headword
- Every UW should be defined in the UNL Knowledge-Base

Restricted UWs
Examples:
- He will hold office until the spring of next year.
- The spring was broken.
Restricted UWs are headwords with a constraint list, for example:
- spring(icl>season)
- spring(icl>device)
- spring(icl>jump)
- spring(icl>fountain)
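A restricted UW can be taken apart into its headword and its constraint list. This is a minimal sketch, assuming the headword(rel>value, ...) notation shown above; parse_uw is a hypothetical helper, and only the icl constraint from the slides is exercised.

```python
from collections import defaultdict


def parse_uw(uw):
    """Split a UW into (headword, [(relation, value), ...]).

    A basic UW has no constraint list and yields an empty list.
    """
    headword, paren, rest = uw.partition("(")
    if not paren:
        return headword, []
    if not rest.endswith(")"):
        raise ValueError(f"unbalanced constraint list in {uw!r}")
    constraints = []
    for item in rest[:-1].split(","):
        relation, _, value = item.strip().partition(">")
        constraints.append((relation, value))
    return headword, constraints


uws = [
    "spring(icl>season)",
    "spring(icl>device)",
    "spring(icl>jump)",
    "spring(icl>fountain)",
]

# Group the senses that share one headword
senses = defaultdict(list)
for uw in uws:
    headword, constraints = parse_uw(uw)
    senses[headword].append(dict(constraints)["icl"])
```

The four UWs share the headword "spring" and differ only in the constraint, which is what disambiguates "the spring of next year" from "the spring was broken".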

How to create UWs?
- Pick up a concept: the concept of "crane" as "a device for lifting heavy loads" or as "a long-legged bird that wades in water in search of food".
- Choose an English word for the concept. In the case of "crane", since it is a word of English, the corresponding word should be "crane".
- Choose a constraint list for the word: crane(icl>device), crane(icl>bird).

How to create UNL expressions

English sentences: basic structure
A <verb> B: John eats bread
  agt(eat.@entry, John)
  obj(eat.@entry, bread)
A <verb>: John sleeps
  aoj(sleep.@entry, John)
A <be> B: John is good
  aoj(good.@entry, John)
[Figure: graph schemas for the three patterns, with nodes A, B and verb and arcs R1, R2 and aoj]

Hindi sentences: basic structure
A B <verb>: John roti khaataa hai
  agt(eat.@entry, John)
  obj(eat.@entry, bread)
A <verb>: John sotaa hai
  aoj(sleep.@entry, John)
A <be> B: John acchaa hai
  aoj(good.@entry, John)
[Figure: graph schemas for the three patterns, with nodes A, B and verb and arcs R1, R2 and aoj]
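The point of the two slides above is that English (verb in the middle) and Hindi (verb at the end) map to the same UNL relations. A toy sketch of that idea follows. It covers only the three patterns on the slides, takes lemmatised tokens as input, and uses a four-word gloss table (HINDI_GLOSS) invented for this illustration; none of this is the actual UNL analyser.

```python
def basic_unl(subject, predicate, obj=None):
    """Emit relations for the slide patterns.

    With an object: agt + obj (A <verb> B).
    Without one: aoj (A <verb>, or A <be> B with B as the predicate).
    """
    head = f"{predicate}.@entry"
    if obj is None:
        return [f"aoj({head}, {subject})"]
    return [f"agt({head}, {subject})", f"obj({head}, {obj})"]


# Hypothetical gloss table, just enough for the slide examples
HINDI_GLOSS = {"roti": "bread", "khaataa": "eat", "sotaa": "sleep", "acchaa": "good"}


def from_english(tokens):
    """English order: subject, verb, optional object (tokens already lemmatised)."""
    subject, verb, *rest = tokens
    return basic_unl(subject, verb, rest[0] if rest else None)


def from_hindi(tokens):
    """Hindi order: subject, optional object, predicate, then the auxiliary 'hai'."""
    content = [HINDI_GLOSS.get(tok, tok) for tok in tokens if tok != "hai"]
    subject, *middle, predicate = content
    return basic_unl(subject, predicate, middle[0] if middle else None)


english = from_english(["John", "eat", "bread"])
hindi = from_hindi("John roti khaataa hai".split())
```

Both calls yield the same two relations, which is the sense in which UNL is a language-independent representation.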

Complex English sentences: use recursion on the basic structure
A <verb> B
John, who is a good boy, eats bread which is toasted
agt(eat.@entry, :01)
obj(eat.@entry, :02)
aoj:01(boy, John.@entry)
mod:01(boy, good)
obj:02(toast, bread.@entry.@focus)
[Figure: graph in which eat is linked by agt to scope :01 (boy, John, good) and by obj to scope :02 (toast, bread)]
Red arrows indicate entry nodes
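The recursion can be made concrete by replacing each scope reference with the entry node of that scope, which recovers the basic A <verb> B skeleton. The sketch below assumes the toast relation belongs to scope :02, as the ":02 toast" label on the slide suggests; group_by_scope, scope_entry and resolve are illustrative names.

```python
from collections import defaultdict

# (scope, relation, source, target); scope None marks the main sentence
relations = [
    (None, "agt", "eat.@entry", ":01"),
    (None, "obj", "eat.@entry", ":02"),
    ("01", "aoj", "boy", "John.@entry"),
    ("01", "mod", "boy", "good"),
    ("02", "obj", "toast", "bread.@entry.@focus"),
]


def group_by_scope(relations):
    """Collect the relations of each scope under its id."""
    scopes = defaultdict(list)
    for scope, label, source, target in relations:
        scopes[scope].append((label, source, target))
    return dict(scopes)


def scope_entry(scope_relations):
    """Return the UW marked @entry inside one scope."""
    for _, source, target in scope_relations:
        for node in (source, target):
            uw, *attrs = node.split(".@")
            if "entry" in attrs:
                return uw
    raise ValueError("scope has no @entry node")


def resolve(node, scopes):
    """Replace a scope reference such as ':01' by that scope's entry UW."""
    if node.startswith(":"):
        return scope_entry(scopes[node[1:]])
    return node.split(".@")[0]


scopes = group_by_scope(relations)
skeleton = [
    (label, resolve(source, scopes), resolve(target, scopes))
    for label, source, target in scopes[None]
]
```

With the scopes collapsed to their entry nodes, the main sentence reads "John eats bread", and the relations inside :01 and :02 add the modifiers.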