Semantic Textual Similarity & more on Alignment

1 Semantic Textual Similarity & more on Alignment CMSC 723 / LING 723 / INST 725 MARINE CARPUAT

2 2 topics today P3 task: Semantic Textual Similarity Including Monolingual alignment Beyond IBM word alignment Synchronous CFGs

3 Semantic Textual Similarity Series of tasks at international workshop on semantic evaluations (SemEval), since

4 What is Semantic Textual Similarity? [Figure: short snippets in several languages, embedded in filler text and linked by semantic similarity. Legible snippets: "Welcome to my world, trust me you will never be disappointed"; "Yo! Come over here, you will be pleasantly surprised"; Korean, roughly: "Hello, I called you but it was no use... I hope you were enjoying your time"; Arabic, roughly: "Come along, never mind, you will have the best time"; Russian: "Welcome to my world, believe me, you will never be disappointed".] Quantitative: graded similarity score, confidence score. Principled interpretability: which semantic components/features led to the results (hopefully leading to a better understanding of semantics).

5 Why Semantic Textual Similarity? Most NLP applications need some notion of semantic similarity to overcome brittleness and sparseness Provides evaluation beyond surface text processing A hub for semantic processing as a black box in applications beyond NLP Lends itself to an extrinsic evaluation of scattered semantic components

6 What is STS? The graded process by which two snippets of text (t1 and t2) are deemed equivalent semantically, i.e. bear the same meaning An STS system will quantifiably inform us on how similar t1 and t2 are, resulting in a similarity score An STS system will tell us why t1 and t2 are similar giving a nuanced interpretation of similarity based on semantic components contributions

7 What is STS? Word similarity has been relatively well studied. For example, according to WN (pairs ordered from least to most similar):
cord / smile: 0.02
rooster / voyage: 0.04
noon / string: 0.04
fruit / furnace
hill / woodland: 1.48
car / journey: 1.55
cemetery / mound
cemetery / graveyard: 3.88
automobile / car: 3.92

8 What is STS? Fewer datasets for similarity between sentences A forest is a large area where trees grow close together. VS. The coast is an area of land that is next to the sea. [0.25]

9 What is STS? Fewer datasets for similarity between sentences A forest is a large area where trees grow close together. VS. Woodland is land with a lot of trees. [2.51]

10 What is STS? Fewer datasets for similarity between sentences Once there was a Czar who had three lovely daughters. VS. There were three beautiful girls, whose father was a Czar. [4.3]

11 Related tasks Paraphrase detection Are 2 sentences equivalent in meaning? Textual Entailment Does premise P entail hypothesis H? STS provides graded similarity judgments

14 Annotation: crowd-sourcing English annotation process: pairs annotated in batches of 20; annotators paid $1 per batch; 5 annotations per pair; workers need to have the MTurk Master qualification. Defining gold standard judgments: median value of annotations, after filtering low-quality annotators (<0.80 correlation with leave-one-out gold & <0.20 Kappa).
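The gold-standard step can be sketched in a few lines. This is only an illustration under simplifying assumptions: every annotator rates every pair, only the 0.80 leave-one-out correlation filter is applied (the Kappa filter is omitted), and the names `pearson`, `gold_scores`, and `ratings` are hypothetical, not taken from the slides.

```python
from statistics import median


def pearson(xs, ys):
    """Pearson correlation of two equal-length score lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def gold_scores(ratings, min_corr=0.80):
    """ratings maps annotator name -> list of scores, one per sentence pair.

    An annotator is kept if their scores correlate at least min_corr with the
    median of all *other* annotators (the leave-one-out gold). The final gold
    score of each pair is the median over the kept annotators.
    """
    n_pairs = len(next(iter(ratings.values())))
    kept = []
    for name, scores in ratings.items():
        others = [r for other, r in ratings.items() if other != name]
        loo_gold = [median(r[p] for r in others) for p in range(n_pairs)]
        if pearson(scores, loo_gold) >= min_corr:
            kept.append(scores)
    return [median(r[p] for r in kept) for p in range(n_pairs)]


# Five hypothetical annotators, four sentence pairs; "e" disagrees with everyone.
ratings = {
    "a": [0, 2, 4, 5],
    "b": [1, 2, 4, 5],
    "c": [0, 3, 4, 5],
    "d": [0, 2, 5, 5],
    "e": [5, 0, 3, 1],
}
print(gold_scores(ratings))
```

Using the median rather than the mean keeps a single outlying judgment from shifting the gold score of a pair.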

15 Diverse data sources

16 Evaluation: a shared task Subset of 2016 results (Score: Pearson correlation)

17 STS models from word to sentence vectors Can we perform STS by comparing sentence vector representation? This approach works well for word level similarity But can we capture the meaning of a sentence in a single vector?

18 Composing by averaging: g("shots fired at residence") = (v_shots + v_fired + v_at + v_residence) / 4, where v_w is the vector of word w [Tai et al. 2015, Wieting et al. 2016]
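A minimal sketch of composition by averaging, followed by a cosine comparison of the two sentence vectors. The two-dimensional vectors below are made up for illustration; a real system would load trained embeddings, and the helper names are assumptions.

```python
import math


def average_vector(tokens, vectors):
    """Average the vectors of the tokens that have an entry in `vectors`."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has a vector")
    dim = len(known[0])
    return [sum(v[d] for v in known) / len(known) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy, hand-written vectors (hypothetical values, not trained embeddings).
VECTORS = {
    "shots": [0.9, 0.1],
    "fired": [0.8, 0.3],
    "at": [0.1, 0.1],
    "residence": [0.2, 0.9],
    "home": [0.3, 0.8],
}

s1 = average_vector("shots fired at residence".split(), VECTORS)
s2 = average_vector("shots fired at home".split(), VECTORS)
print(cosine(s1, s2))
```

Averaging ignores word order, so "shots fired at residence" and "residence fired at shots" receive the same vector; that is the price of the simplicity.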

19 How can we induce word vectors for composition? Training pairs (x1, x2) can come from:
English paraphrases [Wieting et al. 2016]: x1 = "By our fellow members", x2 = "By our colleagues"
Bilingual sentence pairs [Hermann & Blunsom 2014]: x1 = "Thus in fact by our fellow members", x2 = "Así que podríamos nuestra colega disputado"
Bilingual phrase pairs: x1 = "by our fellow member", x2 = "de nuestra colega"

20 STS models: monolingual alignment

21 Idea One (of many) approaches to monolingual entailment Exploit not only similarity between words But also similarity between their contexts See Sultan et al
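The idea of combining word similarity with context similarity can be illustrated with a toy aligner. This is a loose sketch, not the method of Sultan et al.: the synonym table, the 0.9 synonym score, the window size, the greedy one-to-one matching, and the final "proportion of aligned words" score are all assumptions made for the example.

```python
def lexical_sim(a, b, synonyms):
    """1.0 for identical words, 0.9 for a listed synonym pair, else 0.0."""
    if a == b:
        return 1.0
    return 0.9 if frozenset((a, b)) in synonyms else 0.0


def context_sim(s1, i, s2, j, synonyms, window=2):
    """Fraction of the neighbours of s1[i] that have a similar neighbour around s2[j]."""
    ctx1 = s1[max(0, i - window):i] + s1[i + 1:i + 1 + window]
    ctx2 = s2[max(0, j - window):j] + s2[j + 1:j + 1 + window]
    if not ctx1 or not ctx2:
        return 0.0
    hits = sum(1 for a in ctx1 if any(lexical_sim(a, b, synonyms) > 0 for b in ctx2))
    return hits / len(ctx1)


def align(s1, s2, synonyms, weight=0.5):
    """Greedy one-to-one alignment, best-scoring candidate pairs first."""
    candidates = []
    for i, a in enumerate(s1):
        for j, b in enumerate(s2):
            lex = lexical_sim(a, b, synonyms)
            if lex > 0:
                ctx = context_sim(s1, i, s2, j, synonyms)
                candidates.append((weight * lex + (1 - weight) * ctx, i, j))
    candidates.sort(reverse=True)
    used1, used2, pairs = set(), set(), []
    for _, i, j in candidates:
        if i not in used1 and j not in used2:
            used1.add(i)
            used2.add(j)
            pairs.append((i, j))
    return sorted(pairs)


def sts_score(s1, s2, synonyms):
    """Proportion of words (over both sentences) that take part in the alignment."""
    return 2 * len(align(s1, s2, synonyms)) / (len(s1) + len(s2))


SYNONYMS = {frozenset(("forest", "woodland"))}  # hypothetical lexicon
a = "a forest has many trees".split()
b = "a woodland has many trees".split()
print(align(a, b, SYNONYMS), sts_score(a, b, SYNONYMS))
```

The context term matters when a word occurs more than once: the candidate pair whose neighbours also match is aligned first.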

22 2 topics today P3 task: Semantic Textual Similarity Including Monolingual alignment Beyond IBM word alignment Synchronous CFGs

23 Aligning words & constituents Alignment: mapping between spans of text in lang1 and spans of text in lang2 Sentences in document pairs Words in sentence pairs Syntactic constituents in sentence pairs Today: 2 methods for aligning constituents Parse and match biparse

24 Parse & Match

25 Parse(-Parse)-Match Idea Align spans that are consistent with existing structure Pros Builds on existing NLP tools Cons Assume availability of lots of resources Assume that representations can be matched

26 Aligning words & constituents 2 methods for aligning constituents: Parse and match assume existing parses and alignment Biparse alignment = structure

27 A straw man hypothesis: All languages have same grammar

31 The biparsing hypothesis: All languages have nearly the same grammar

33 Example for the biparsing hypothesis: All languages have nearly the same grammar

35 The biparsing hypothesis: All languages have nearly the same grammar Dekai Wu and Pascale Fung, IJCNLP-2005 HKUST Human Language Technology Center

38 The biparsing hypothesis: All languages have nearly the same grammar. Four equivalent notations for a straight rule and an inverted rule:
Permuted SDTG/SCFG notation: VP → VV PP ; 1 2 and VP → VV PP ; 2 1
Indexed SDTG/SCFG notation: VP → VV(1) PP(2), VV(1) PP(2) and VP → VV(1) PP(2), PP(2) VV(1)
SDTG/SCFG notation: VP → VV PP, VV PP and VP → VV PP, PP VV
ITG shorthand: VP → [ VV PP ] and VP → ⟨ VV PP ⟩
Dekai Wu and Pascale Fung, IJCNLP-2005 HKUST Human Language Technology Center

39 Synchronous Context Free Grammars Context free grammars (CFG) Common way of representing syntax in (monolingual) NLP Synchronous context free grammars (SCFG) Generate pairs of strings Align sentences by parsing them Translate sentences by parsing them Key algorithm: how to parse with SCFGs?

40 SCFG trade off Expressiveness SCFGs cannot represent all sentence pairs in all languages Efficiency SCFGs let us view alignment as parsing & benefit from well-studied formalism

41 Synchronous parsing cannot represent all sentence pairs

44 A subclass of SCFGs: Inversion Transduction Grammars. ITGs are the subclass of SDTGs/SCFGs with only straight and inverted transduction rules; equivalently, with only transduction rules of rank ≤ 2; equivalently, with only transduction rules of rank ≤ 3. ITGs are context-free (like SCFGs).

45 For length-4 phrases (or frames), ITGs can express 22 out of 24 permutations!
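The 22-out-of-24 figure can be checked by brute force. The sketch below assumes the usual characterisation: a permutation is ITG-expressible if it can be split into two contiguous blocks that are in straight or inverted order relative to each other, with each block recursively expressible. The function names are hypothetical.

```python
from itertools import permutations


def itg_expressible(perm):
    """True if perm can be built from straight and inverted binary combinations."""
    n = len(perm)
    if n <= 1:
        return True
    for split in range(1, n):
        left, right = perm[:split], perm[split:]
        straight = max(left) < min(right)   # left block entirely below right block
        inverted = min(left) > max(right)   # left block entirely above right block
        if (straight or inverted) and itg_expressible(left) and itg_expressible(right):
            return True
    return False


def count_itg_permutations(n):
    """Number of permutations of 1..n that an ITG can express."""
    return sum(itg_expressible(p) for p in permutations(range(1, n + 1)))


print(count_itg_permutations(4))
print([p for p in permutations(range(1, 5)) if not itg_expressible(p)])
```

The two permutations that fail for length 4 are (2, 4, 1, 3) and (3, 1, 4, 2), the so-called inside-out patterns: no split point separates them into two blocks of consecutive values.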

46 ITGs enable efficient DP algorithms [Wu 1995] [Figure: alignment grid with one axis labelled e0 ... e7 and the other c0 ... c6]

53 Biparsing with CKY. Given the following SCFG:
A -> fat, gordos
A -> thin, delgados
N -> cats, gatos
VP -> eats, comen
NP -> A (1) N (2), N (2) A (1)
S -> NP (1) VP (2), NP (1) VP (2)
Let's parse a sentence pair: "fat cats eats" / "gatos gordos comen". Example by Matt Post (JHU)

54 Biparsing with CKY. The chart now enumerates pairs of spans. English axis: fat (1), cats (2), eats (3). Spanish axis: gatos (1), gordos (2), comen (3).

55 Biparsing with CKY. Apply lexical rules: A ((1,1), (2,2)); N ((2,2), (1,1)); VP ((3,3), (3,3)). Each item lists the English span, then the Spanish span.

56 Biparsing with CKY. For each block, apply straight & inverted rules. Items in the chart: A ((1,1), (2,2)); N ((2,2), (1,1)); VP ((3,3), (3,3)); NP ((1,2), (1,2)); S ((1,3), (1,3)).

57 Biparsing with CKY. Running time: O(G N^3 M^3)
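The chart-filling procedure from the preceding slides can be sketched as a small recogniser for the example grammar. Assumptions: spans are 0-based and half-open here (the slides use 1-based inclusive spans), binary rules are stored with a flag saying whether the target side is inverted, and G is read as the number of rules with N and M the two sentence lengths. The names `LEXICAL`, `BINARY`, and `biparse` are hypothetical.

```python
from itertools import product

# Lexical rules: (English word, Spanish word) -> nonterminal.
LEXICAL = {
    ("fat", "gordos"): "A",
    ("thin", "delgados"): "A",
    ("cats", "gatos"): "N",
    ("eats", "comen"): "VP",
}

# Binary rules: (lhs, first child, second child, inverted on the Spanish side?).
BINARY = [
    ("NP", "A", "N", True),    # NP -> A(1) N(2), N(2) A(1)
    ("S", "NP", "VP", False),  # S  -> NP(1) VP(2), NP(1) VP(2)
]


def biparse(english, spanish):
    """Return all chart items (X, i, j, k, l): X spans english[i:j] and spanish[k:l]."""
    n, m = len(english), len(spanish)
    chart = set()
    # Lexical rules fill the 1x1 cells.
    for i, k in product(range(n), range(m)):
        lhs = LEXICAL.get((english[i], spanish[k]))
        if lhs:
            chart.add((lhs, i, i + 1, k, k + 1))
    # Larger span pairs, by increasing English length so children exist already.
    for e_len in range(2, n + 1):
        for f_len in range(2, m + 1):
            for i, k in product(range(n - e_len + 1), range(m - f_len + 1)):
                j, l = i + e_len, k + f_len
                for s, t in product(range(i + 1, j), range(k + 1, l)):
                    for lhs, y, z, inverted in BINARY:
                        if inverted:
                            found = (y, i, s, t, l) in chart and (z, s, j, k, t) in chart
                        else:
                            found = (y, i, s, k, t) in chart and (z, s, j, t, l) in chart
                        if found:
                            chart.add((lhs, i, j, k, l))
    return chart


chart = biparse("fat cats eats".split(), "gatos gordos comen".split())
print(("S", 0, 3, 0, 3) in chart)
```

The loops range over two English endpoints, two Spanish endpoints, one split point on each side, and every rule, which is where the O(G N^3 M^3) bound comes from.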

58 Aligning words & constituents 2 different ways of looking at this problem: parse-parse-match assume existing parses and alignment biparse alignment = structure


More information

Statistical NLP: linguistic essentials. Updated 10/15

Statistical NLP: linguistic essentials. Updated 10/15 Statistical NLP: linguistic essentials Updated 10/15 Parts of Speech and Morphology syntactic or grammatical categories or parts of Speech (POS) are classes of word with similar syntactic behavior Examples

More information

Introduction to NLP and Text Mining Tutor: Rahmad Mahendra

Introduction to NLP and Text Mining Tutor: Rahmad Mahendra Introduction to NLP and Text Mining Tutor: Rahmad Mahendra Natural Language Processing & Text Mining Short Course Pusat Ilmu Komputer UI 22 26 Agustus 2016 References Jurafsky and Martin, Speech and Language

More information

CS 598 Natural Language Processing

CS 598 Natural Language Processing CS 598 Natural Language Processing Natural language is everywhere Natural language is everywhere Natural language is everywhere Natural language is everywhere!"#$%&'&()*+,-./012 34*5665756638/9:;< =>?@ABCDEFGHIJ5KL@

More information

A Comparison of WordNet and Roget's Taxonomy for Measuring Semantic Similarity

A Comparison of WordNet and Roget's Taxonomy for Measuring Semantic Similarity A Comparison of WordNet and Roget's Taxonomy for Measuring Semantic Similarity Michael L. Mc Hale Intelligent Information Systems Air Force Research Laboratory 525 Brooks Road 13441 Rome, NY, USA, mchale@ai.rl.af.mil

More information

Constituency, Trees, Context-free Grammar

Constituency, Trees, Context-free Grammar Constituency, Trees, Context-free Grammar Weiwei Sun Institute of Computer Science and Technology Peking University March 18, 2015 Administration Grading: Regular attendance of the lectures is required

More information

Björn Gambäck 1 October 2013

Björn Gambäck 1 October 2013 Course Examination Computational 1. Natural Language Processing and Communication Oral presentation (15-20 min), in November [not graded] Short essay (½ -2 pages) on the same topic [not graded] Björn Gambäck

More information

Generating Disambiguating Paraphrases for Structurally Ambiguous Sentences

Generating Disambiguating Paraphrases for Structurally Ambiguous Sentences Generating Disambiguating Paraphrases for Structurally Ambiguous Sentences Manjuan Duan, Ethan Hill, Michael White August 11-12, 2016, LAW-X The Ohio State University Department of Linguistics 1 Joint

More information

Context Free Grammar (CFG) Analysis for simple Kannada sentences

Context Free Grammar (CFG) Analysis for simple Kannada sentences 32 Context Free Grammar (CFG) Analysis for simple Kannada sentences B M Sagar Asst Prof, Information Science, RVCE Bangalore, India sagar.bm@gmail.com Abstract When Computational Linguistic is concerns

More information

Creativity of linguistic meaning

Creativity of linguistic meaning Some putative hallmarks of human language Creativity of linguistic meaning! 50k words in a language Open class words Concrete: dog, woman, house Abstract: belief, believe, jealousy Closed class words the,

More information

(Sub)Gradient Descent

(Sub)Gradient Descent (Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include

More information

Automatic Category Label Coarsening for Syntax-Based Machine Translation

Automatic Category Label Coarsening for Syntax-Based Machine Translation Automatic Category Label Coarsening for Syntax-Based Machine Translation Greg Hanneman and Alon Lavie Language Technologies Institute Carnegie Mellon University Fifth Workshop on Syntax and Structure in

More information

Slot Grammar. Zahra Solgi. June 18, Universität Tübingen

Slot Grammar. Zahra Solgi. June 18, Universität Tübingen 1 Slot Grammar Zahra Solgi Universität Tübingen June 18, 2016 2 Slot Grammar Overview Slot Grammar? what is the use of that? 3 Slot Grammar Overview Slot Grammar? what is the use of that? Slot Grammar

More information

Lectures Machine Translation

Lectures Machine Translation Lectures 19 20 Machine Translation Nathan Schneider (with slides by Philipp Koehn, Chris Dyer) ANLP 15, 20 November 2017 A Clear Plan 5 Interlingua Lexical Transfer Source Target Philipp Koehn Machine

More information

Syntax: The Sentence Patterns of Language WEEK 4 DAY 1

Syntax: The Sentence Patterns of Language WEEK 4 DAY 1 Syntax: The Sentence Patterns of Language WEEK 4 DAY 1 Contents Last lecture: Morphology (Other Morphological Processes, Back Formations, Compounds, Pullet Surprises ) Today: What the Syntax Rules Do We

More information

Exploiting Parallel Treebanks in Phrase-Based SMT. Statistical Machine Translation

Exploiting Parallel Treebanks in Phrase-Based SMT. Statistical Machine Translation Exploiting Parallel Treebanks in Phrase-Based Statistical Machine Translation John Tinsley National Centre for Language Technology Dublin City University Ireland Collaborators: Mary Hearne and Andy Way

More information

Introduction F-structures LFG. Syntactic Theory Winter Semester 2009/2010. Antske Fokkens. Department of Computational Linguistics Saarland University

Introduction F-structures LFG. Syntactic Theory Winter Semester 2009/2010. Antske Fokkens. Department of Computational Linguistics Saarland University LFG Syntactic Theory Winter Semester 2009/2010 Antske Fokkens Department of Computational Linguistics Saarland University 17 November 2009 Antske Fokkens Syntax Lexical Functional Grammar 1 / 33 Outline

More information

Semantic Analysis. Computational Semantics Chapter 18. Compositional Analysis. Example. Compositional Analysis. Compositional Semantics

Semantic Analysis. Computational Semantics Chapter 18. Compositional Analysis. Example. Compositional Analysis. Compositional Semantics Semantic Analysis Computational Semantics Chapter 18 Lecture #12 November 2012 We will not do all of this Semantic analysis is the process of taking in some linguistic input and producing a meaning representation

More information

INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous) Dundigal, Hyderabad

INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous) Dundigal, Hyderabad INSTITUTE OF AERONAUTICAL ENGINEERING (Autonomous) Dundigal, Hyderabad - 500 043 INFORMATION TECHNOLOGY TUTORIAL QUESTION BANK Name INFORMATION RETRIEVAL SYSTEM Code A70533 Class IV B. Tech I Semester

More information

Natural Language Inference via Dependency Tree Mapping: An Application to Question Answering

Natural Language Inference via Dependency Tree Mapping: An Application to Question Answering Natural Language Inference via Dependency Tree Mapping: An Application to Question Answering Vasin Punyakanok Dan Roth Wen-tau Yih Department of Computer Science University of Illinois at Urbana-Champaign

More information

Simple, Effective, Robust Semi-Supervised Learning, Thanks To Google N-grams. Shane Bergsma Johns Hopkins University

Simple, Effective, Robust Semi-Supervised Learning, Thanks To Google N-grams. Shane Bergsma Johns Hopkins University Simple, Effective, Robust Semi-Supervised Learning, Thanks To Google N-grams Shane Bergsma Johns Hopkins University Hissar, Bulgaria September 15, 2011 Research Vision Robust processing of human language

More information

The Prague Bulletin of Mathematical Linguistics NUMBER 91 JANUARY Grammar based statistical MT on Hadoop

The Prague Bulletin of Mathematical Linguistics NUMBER 91 JANUARY Grammar based statistical MT on Hadoop The Prague Bulletin of Mathematical Linguistics NUMBER 91 JANUARY 2009 67 78 Grammar based statistical MT on Hadoop An end-to-end toolkit for large scale PSCFG based MT Ashish Venugopal, Andreas Zollmann

More information

Vector Space Models (VSM) and Information Retrieval (IR)

Vector Space Models (VSM) and Information Retrieval (IR) Vector Space Models (VSM) and Information Retrieval (IR) T-61.5020 Statistical Natural Language Processing 24 Feb 2016 Mari-Sanna Paukkeri, D. Sc. (Tech.) Lecture 3: Agenda Vector space models word-document

More information

CSCI-GA Compiler Construction Lecture 6: Syntax Analysis. Mohamed Zahran (aka Z)

CSCI-GA Compiler Construction Lecture 6: Syntax Analysis. Mohamed Zahran (aka Z) CSCI-GA.2130-001 Compiler Construction Lecture 6: Syntax Analysis Mohamed Zahran (aka Z) mzahran@cs.nyu.edu Context-Free Grammars Precise syntactic specifications of a programming language For some classes,

More information

Syntactic Theory. Introduction. Yi Zhang & Antske Fokkens. October 15, Department of Computational Linguistics Saarland University

Syntactic Theory. Introduction. Yi Zhang & Antske Fokkens. October 15, Department of Computational Linguistics Saarland University Syntactic Theory Introduction Yi Zhang & Antske Fokkens Department of Computational Linguistics Saarland University October 15, 2009 Syntax: What does it mean? We can view syntax/syntactic theory in a

More information

Vector Representations of Word Meaning in Context

Vector Representations of Word Meaning in Context Vector Representations of Word Meaning in Context Lea Frermann Universität des Saarlandes May 23, 2011 Lea Frermann (Universität des Saarlandes) Vector Representation of Word Semantics May 23, 2011 1 /

More information

Course Overview for Final May 8 (Tuesday), 8-10 AM (Bartlett Room 301)

Course Overview for Final May 8 (Tuesday), 8-10 AM (Bartlett Room 301) Course Overview for Final May 8 (Tuesday), 8-10 AM (Bartlett Room 301) Course goals: o Learn about the methodology (and formalisms) used by linguists to explore the following questions: What do you know

More information

THE problem of calculating the semantic similarity between

THE problem of calculating the semantic similarity between IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING 1 Calculating the similarity between words and sentences using a lexical database and corpus statistics Atish Pawar, Vijay Mago arxiv:1802.05667v2 [cs.cl]

More information

Ed nburgh University of Edinburgh NLP. Understanding Visual Scences. Dependency Graphs, Word Senses, and Multimodal Embeddings

Ed nburgh University of Edinburgh NLP. Understanding Visual Scences. Dependency Graphs, Word Senses, and Multimodal Embeddings Understanding Visual Scences Dependency Graphs, Word Senses, and Multimodal Embeddings Mirella Lapata School of Informatics University of Edinburgh Ed nburgh University of Edinburgh NLP Natural Language

More information

Syntactic Reordering of Source Sentences for Statistical Machine Translation

Syntactic Reordering of Source Sentences for Statistical Machine Translation Syntactic Reordering of Source Sentences for Statistical Machine Translation Mohammad Sadegh Rasooli Columbia University rasooli@cs.columbia.edu April 9, 2013 M. S. Rasooli (Columbia University) Syntactic

More information

Linear Models Continued: Perceptron & Logistic Regression

Linear Models Continued: Perceptron & Logistic Regression Linear Models Continued: Perceptron & Logistic Regression CMSC 723 / LING 723 / INST 725 Marine Carpuat Slides credit: Graham Neubig, Jacob Eisenstein Linear Models for Classification Feature function

More information

Introduction to Tactical Generation with HPSG

Introduction to Tactical Generation with HPSG Introduction to Tactical Generation with HPSG Woodley Packard University of Washington March 5, 2013 Introduction Natural Language Generation: the task of automatically producing natural language utterances

More information

Natural Language Generation, Non- Metric Methods, Probabilistic Context Free Grammar, Parsing Algorithms, NLP Tools

Natural Language Generation, Non- Metric Methods, Probabilistic Context Free Grammar, Parsing Algorithms, NLP Tools Natural Language Generation, Non- Metric Methods, Probabilistic Context Free Grammar, Parsing Algorithms, NLP Tools Sameer Maskey Week 4, Sept 26, 2012 *animation slides on parsing obtained from Prof Raymond

More information

Opinion Mining and Sentiment Analysis

Opinion Mining and Sentiment Analysis Opinion Mining and Sentiment Analysis She Feng Shanghai Jiao Tong University sjtufs@gmail.com April 15, 2016 Outline What & Why? Data Tasks Interesting methods Topic Model Neural Network 2 What is Opinion

More information