CS460/626: Natural Language Processing/Speech, NLP and the Web (Lecture 37: Semantics; Universal Networking Language) Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 12th April, 2011
Semantics: Wikipedia
Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning. It typically focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata.
Computational Semantics: Wikipedia
Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions. Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution. Methods employed usually draw from formal semantics or statistical semantics. Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving). Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.
A hurdle: signifier-denotata dichotomy
Divide between a word and what it stands for: the word "red" is NOT red in colour.
"red wine", "red rose" and "he is in the red" denote very different senses of the word.
Translation into another language reveals this difference.
A Perspective: Discourse, Pragmatics, Semantics, Syntax, Lexicon, Morphology
Our tryst with semantics: Universal Networking Language (UNL)
Motivation
Extraction of semantics, i.e., deep meaning, is important for many applications: Machine Translation, Meaning-based IR, CLIR.
Robust, scalable and efficient methods of knowledge extraction are required.
Machine Translation and Cross-Lingual IR: a need of the hour for crossing the language barrier.
Interlingua: a vehicle for machine translation
[Figure: English, Hindi, French and Chinese each connect to the interlingua (UNL) through analysis and generation.]
UNL: a United Nations project
Started in 1996; 10-year program; 15 research groups across continents.
First goal: generators.
Next goal: analysers (needs solving various ambiguity problems).
Current active language groups: UNL_French (GETA-CLIPS, IMAG); UNL_English+Hindi; UNL_Italian (Univ. of Pisa); UNL_Portuguese (Univ. of Sao Paulo, Brazil); UNL_Russian (Institute of Linguistics, Moscow); UNL_Spanish (UPM, Madrid).
World-wide Universal Networking Language (UNL) Project
[Figure: UNL at the centre, linked to Marathi, English, Russian, Japanese, Spanish, Hindi and others.]
Language-independent meaning representation.
The UNL MT System: an Overview
NLP@IITB
Foundations and Applications
UNL Foundations: Semantic Relations; Universal Words; Attributes; How to write UNL expressions.
UNL Applications: Machine Translation (rule based and statistical); Search; Text Entailment; Sentiment Analysis.
Language Processing & Understanding
Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization
IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback
Machine Learning: Semantic Role Labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks
Machine Translation: Statistical, Interlingua Based; English to Indian languages; Indian languages to Indian languages; Indowordnet
Resources: http://www.cfilt.iitb.ac.in
Publications: http://www.cse.iitb.ac.in/~pb
Linguistics is the eye and computation the body
UNL represents knowledge: "John eats rice with a spoon"
[Figure: UNL graph of the sentence, labelling its universal words, semantic relations and attributes.]
Repository of 42 semantic relations and 84 attribute labels.
Sentence embeddings
Deepa claimed that she had composed a poem.
[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[\UNL]
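As a rough illustration of how the embedded clause is kept apart from the main clause, the relations for "Deepa claimed that she had composed a poem" can be read into triples and grouped by scope id. This is a minimal sketch, not part of any UNL toolkit: `RELATION` and `parse_unl` are hypothetical names, and the pattern assumes binary relations whose universal words contain no commas.

```python
import re
from collections import defaultdict

# Matches e.g. "agt:01(compose.@past.@entry, she)"; the ":01" scope id is optional.
# Assumption: universal words contain no commas (true for the slide examples).
RELATION = re.compile(r"^(\w+)(?::(\w+))?\(\s*(.+?)\s*,\s*(.+?)\s*\)$")

def parse_unl(lines):
    """Group (relation, source, target) triples by scope id ('' = main clause)."""
    scopes = defaultdict(list)
    for line in lines:
        m = RELATION.match(line.strip())
        if m is None:
            raise ValueError(f"not a binary UNL relation: {line!r}")
        rel, scope, src, tgt = m.groups()
        scopes[scope or ""].append((rel, src, tgt))
    return dict(scopes)

expr = [
    "agt(claim.@entry.@past, Deepa)",
    "obj(claim.@entry.@past, :01)",
    "agt:01(compose.@past.@entry.@complete, she)",
    "obj:01(compose.@past.@entry.@complete, poem.@indef)",
]
parsed = parse_unl(expr)
```

The main clause ends up under the empty scope key, and its `obj` target `:01` names the scope that holds the two `compose` relations.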
Constituents of Universal Networking Language: Universal Words (UWs), Relations, Attributes, Knowledge Base
UNL Graph
He forwarded the mail to the minister.
[Figure: the entry node forward(icl>send).@entry.@past has three outgoing arcs: agt to he(icl>person), obj to mail(icl>collection).@def, and gol to minister(icl>person).@def.]
UNL Expression
agt(forward(icl>send).@entry.@past, he(icl>person))
obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)
gol(forward(icl>send).@entry.@past, minister(icl>person).@def)
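The link between the graph and its textual expression can be sketched in a few lines of Python for "He forwarded the mail to the minister", taking the mail as the object (obj) and the minister as the goal (gol). The `Node` class and `to_unl` function are hypothetical names chosen for this illustration, not part of any UNL tool.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    uw: str            # universal word, e.g. "forward(icl>send)"
    attrs: tuple = ()  # attribute labels without the leading "@"

    def __str__(self):
        # Attributes are appended to the universal word as ".@label".
        return self.uw + "".join(f".@{a}" for a in self.attrs)

def to_unl(edges):
    """Render (relation, source, target) edges as UNL expression lines."""
    return [f"{rel}({src}, {tgt})" for rel, src, tgt in edges]

forward = Node("forward(icl>send)", ("entry", "past"))
he = Node("he(icl>person)")
mail = Node("mail(icl>collection)", ("def",))
minister = Node("minister(icl>person)", ("def",))

# One labelled arc per semantic relation, all leaving the entry node.
edges = [("agt", forward, he), ("obj", forward, mail), ("gol", forward, minister)]
lines = to_unl(edges)
```

Each arc of the graph becomes exactly one line of the expression, so the two views carry the same information.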
What is a Universal Word (UW)?
Words of UNL.
Constitute the UNL vocabulary, the syntactic-semantic units to form UNL expressions.
A UW represents a concept.
Basic UW: an English word/compound word/phrase with no restrictions or Constraint List.
Restricted UW: with a Constraint List.
Examples: crane(icl>device), crane(icl>bird)
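The split between a basic and a restricted UW can be made concrete by separating the headword from its constraint list. The sketch below is an assumption-laden illustration: `UW_PATTERN` and `parse_uw` are hypothetical names, and nested constraints (a constraint that itself contains parentheses) are deliberately not handled.

```python
import re

# Headword, optionally followed by one parenthesised, comma-separated constraint list.
# Assumption: constraints are flat "relation>value" pairs with no nested parentheses.
UW_PATTERN = re.compile(r"^(?P<head>[^()]+?)(?:\((?P<constraints>[^()]*)\))?$")

def parse_uw(text):
    """Return (headword, [(relation, value), ...]); an empty list means a basic UW."""
    m = UW_PATTERN.match(text.strip())
    if m is None:
        raise ValueError(f"unsupported universal word: {text!r}")
    constraints = []
    if m.group("constraints"):
        for item in m.group("constraints").split(","):
            rel, _, value = item.strip().partition(">")
            constraints.append((rel, value))
    return m.group("head"), constraints

device = parse_uw("crane(icl>device)")
bird = parse_uw("crane(icl>bird)")
basic = parse_uw("crane")
```

Both restricted UWs share the headword "crane"; only the constraint list tells the two concepts apart.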
The Lexicon
Format of the dictionary entry: [headword] {} Universal word (Attribute list);
e.g., [minister] {} minister(icl>person) (N,ANIMT,PHSCL,PRSN);
Here "minister" is the head word, "minister(icl>person)" the universal word, and "N,ANIMT,PHSCL,PRSN" the attributes.
Attributes:
Morphological: Pl (plural), V_ed (past tense form)
Syntactic: V (verb), VOA (verb of action)
Semantic: ANIMT (animate), PLACE, TIME
The Lexicon (contd.)
He forwarded the mail to the minister.
Content words (headword, universal word, attributes):
[forward] {} forward(icl>send) (V,VOA) <E,0,0>;
[mail] {} mail(icl>message) (N,PHSCL,INANI) <E,0,0>;
[minister] {} minister(icl>person) (N,ANIMT,PHSCL,PRSN) <E,0,0>;
The Lexicon (contd.)
He forwarded the mail to the minister.
Function words (headword, universal word, attributes):
[he] {} he (PRON,SUB,SING,3RD) <E,0,0>;
[the] {} the (ART,THE) <E,0,0>;
[to] {} to (PRE,#TO) <E,0,0>;
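The dictionary entry format used on these slides is regular enough to be read with a single pattern. The following is a minimal sketch under stated assumptions: `ENTRY` and `parse_entry` are hypothetical names, the `{}` field is assumed to be empty as in every entry shown, and the trailing `<E,0,0>` field is kept as raw strings because the slides do not explain its parts.

```python
import re

# [headword] {} universal-word (attribute list) <flags>;
ENTRY = re.compile(
    r"^\[(?P<headword>[^\]]+)\]\s*\{\}\s*"  # [headword] and the empty {} field
    r"(?P<uw>\S+)\s*"                       # universal word, no internal spaces
    r"\((?P<attrs>[^)]*)\)\s*"              # comma-separated attribute list
    r"<(?P<flags>[^>]*)>;$"                 # trailing <...>; field, left uninterpreted
)

def parse_entry(line):
    """Split one lexicon line into headword, universal word, attributes and flags."""
    m = ENTRY.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised lexicon entry: {line!r}")
    return {
        "headword": m["headword"],
        "uw": m["uw"],
        "attributes": [a.strip() for a in m["attrs"].split(",")],
        "flags": [f.strip() for f in m["flags"].split(",")],
    }

content = parse_entry("[minister] {} minister(icl>person) (N,ANIMT,PHSCL,PRSN) <E,0,0>;")
function = parse_entry("[to] {} to (PRE,#TO) <E,0,0>;")
```

Content words and function words go through the same parser; they differ only in whether the universal word carries a constraint list and in which attributes are listed.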
Hindi example (1/2)
Columns: headword, universal word, attributes (E, M and H mark the English, Marathi and Hindi entries)
E: farmer; farmer(icl>creator); N,ANIMT,FAUNA,MML,PRSN
M: शेतकरी; N,M,ANIMT,FAUNA,MML,PRSN
H: किसान; N,M,ANIMT,FAUNA,MML,PRSN,Na
The Features of a UW
Every concept existing in any language must correspond to a UW.
The constraint list should be as small as necessary to disambiguate the headword.
Every UW should be defined in the UNL Knowledge-Base.
Restricted UWs
Examples: "He will hold office until the spring of next year." "The spring was broken."
Restricted UWs are headwords with a constraint list, for example: spring(icl>season), spring(icl>device), spring(icl>jump), spring(icl>fountain)
How to create UWs?
Pick up a concept: the concept of "crane" as "a device for lifting heavy loads" or as "a long-legged bird that wades in water in search of food".
Choose an English word for the concept. In the case of "crane", since it is a word of English, the corresponding word should be "crane".
Choose a constraint list for the word: crane(icl>device), crane(icl>bird)
How to create UNL expressions
English sentences: basic structure
A <verb> B: John eats bread
agt(eat.@entry, John)
obj(eat.@entry, bread)
A <verb>: John sleeps
aoj(sleep.@entry, John)
A <be> B: John is good
aoj(good.@entry, John)
[Figure: the verb node links to A and B through relations R1 and R2; for A <be> B, node B links to A through aoj.]
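Hindi sentences: basic structure
A B <verb>: John roti khaataa hai (John eats bread)
agt(eat.@entry, John)
obj(eat.@entry, bread)
A <verb>: John sotaa hai (John sleeps)
aoj(sleep.@entry, John)
A <be> B: John acchaa hai (John is good)
aoj(good.@entry, John)
[Figure: the verb node links to A and B through relations R1 and R2; for A <be> B, node B links to A through aoj.]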
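The point of the two slides, that English (verb-medial) and Hindi (verb-final) sentences map to the same UNL relations, can be illustrated with a toy analyser. Everything in this sketch is an assumption made for illustration: the word tables `UW` and `SUBJECT_REL`, the `AUXILIARIES` set, and the `analyse` function cover only the slide examples and are not a real analysis method.

```python
# Toy surface-word -> universal-word table (hypothetical; slide examples only).
UW = {
    "eats": "eat", "khaataa": "eat",
    "sleeps": "sleep", "sotaa": "sleep",
    "good": "good", "acchaa": "good",
    "bread": "bread", "roti": "bread",
}
# Relation linking each predicate to its subject, as used on the slides.
SUBJECT_REL = {"eat": "agt", "sleep": "aoj", "good": "aoj"}
AUXILIARIES = {"is", "hai"}  # dropped before analysis

def analyse(sentence, verb_final):
    """Map a three-pattern toy sentence to UNL relations, ignoring word order."""
    words = [w for w in sentence.split() if w not in AUXILIARIES]
    subject, rest = words[0], words[1:]
    if verb_final:
        rest = rest[::-1]  # put the predicate first, as in the verb-medial order
    predicate = UW[rest[0]]
    relations = [f"{SUBJECT_REL[predicate]}({predicate}.@entry, {subject})"]
    if len(rest) > 1:
        relations.append(f"obj({predicate}.@entry, {UW[rest[1]]})")
    return relations

english = analyse("John eats bread", verb_final=False)
hindi = analyse("John roti khaataa hai", verb_final=True)
```

Once word order and language-specific vocabulary are factored out, both sentences yield the same two relations, which is what makes UNL usable as an interlingua.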
Complex English sentences: use recursion on the basic structure A <verb> B
John who is a good boy eats bread which is toasted
agt(eat.@entry, :01)
obj(eat.@entry, :02)
aoj:01(boy, John.@entry)
mod:01(boy, good)
obj:02(toast, bread.@entry.@focus)
[Figure: eat has an agt arc to scope :01 and an obj arc to scope :02. In :01, boy has an aoj arc to John and a mod arc to good; in :02, toast has an obj arc to bread. Red arrows indicate entry nodes.]
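The recursion on the basic structure can be sketched as a flattening step: wherever A or B is itself a clause, it is given a scope id and its relations are emitted with that id attached. This is a minimal illustration with hypothetical names (`flatten`, `sentence`); the nested-list input format is an assumption of the sketch, and each relation inside a scope is labelled with the id of the scope it belongs to.

```python
from itertools import count

def flatten(triples, scope_id="", ids=None):
    """Flatten nested (relation, source, target) triples into scoped UNL lines.

    A target that is a list is an embedded clause: it gets the next scope id,
    is referenced as ":NN", and its own triples are emitted with that id.
    """
    ids = ids if ids is not None else count(1)
    tag = f":{scope_id}" if scope_id else ""
    lines, nested = [], []
    for rel, src, tgt in triples:
        if isinstance(tgt, list):
            child_id = f"{next(ids):02d}"
            nested.append((child_id, tgt))
            tgt = f":{child_id}"
        lines.append(f"{rel}{tag}({src}, {tgt})")
    for child_id, child in nested:
        lines.extend(flatten(child, child_id, ids))  # recurse into the embedded clause
    return lines

# "John who is a good boy eats bread which is toasted"
sentence = [
    ("agt", "eat.@entry", [("aoj", "boy", "John.@entry"), ("mod", "boy", "good")]),
    ("obj", "eat.@entry", [("obj", "toast", "bread.@entry.@focus")]),
]
unl_lines = flatten(sentence)
```

The main clause keeps the plain A <verb> B shape, with `:01` and `:02` standing in for A and B; the same function handles deeper nesting because it calls itself on every embedded clause.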