Semantics 3/3 (Lexical semantics)
1 Semantics 3/3 (Lexical semantics)
Ing. Roberto Tedesco, PhD
NLP AA, Prof. L. Sbattella
Slides based on Jurafsky and Martin, Speech and Language Processing
2 Lexical semantics
- The linguistic study of:
  - The meaning of words
  - Relations among words and their meanings
- Tools:
  - Resources: lexical databases (e.g. WordNet)
  - Technologies: Word Sense Disambiguation
3 Lexical Semantics: How to represent word meanings
4 Some basic definitions
- Lexeme: the smallest unit with an orthographic form, a phonological form and a meaning
  - Orthographic form (the written form)
  - Phonological form (the spoken form)
  - Sense (the meaning)
- The orthographic form is usually given in its base form: the lemma
- Lexicon: a collection of lexemes (including special forms like compound nouns)
5 Lexical relations among lexemes
- Most used:
  - Polysemy / Homonymy
  - Synonymy
  - Antonymy
  - Hyponymy / Hypernymy
  - Meronymy / Holonymy
- Others exist
6 Polysemy / Homonymy
- Polysemy: a lexeme with several related senses
  - The bank is constructed from red brick (the building)
  - I withdrew the money from the bank (the financial establishment)
  - Frequent words tend to be polysemous, especially verbs: to get, to put, ...
- Homonymy: different lexemes with the same form, but with distinct, unrelated senses
  - bank (a financial establishment)
  - bank (the land alongside or sloping down to a river or lake)
- So, we have two bank lexemes, with 4 senses
7 Homographs and homophones
- All the polysemous senses of a lexeme share the same orthographic and phonological form
- For homonym lexemes, instead, we can have:
  - Homographs: lexemes with the same orthographic form
    - conduct (noun) [ˈkänˌdəkt], conduct (verb) [kənˈdəkt]
  - Homophones: lexemes with the same phonological form
    - E.g. write and right; piece and peace
  - Perfect homonyms: homograph + homophone
    - bank (a financial establishment), bank (the land alongside or sloping down to a river or lake)
8 Problems related to homonymy and polysemy
- The following problems are stated for homographs and homophones; of course, they also hold for polysemy
- Text-To-Speech is affected by homographs with different phonological forms
  - conduct (noun) [ˈkänˌdəkt] and conduct (verb) [kənˈdəkt]
  - bass (noun: a voice in the lowest range) [bās] and bass (noun: the common European freshwater perch) [bas]
- Information Retrieval is affected by homographs
  - Query "bat care" → which bat?
    - bat as an implement with a handle and a solid surface, usually of wood, used for hitting the ball
    - bat as a mainly nocturnal mammal capable of sustained flight
9 Problems related to homonymy and polysemy
- Spelling correction is affected by homophones
  - People tend to confuse homophones while writing (malapropism): weather → whether
  - This leads to real-word spelling errors
- Speech recognition is affected by homophones (to, too, two) but also by perfect homonyms
  - bank has two senses, which occur in different contexts
  - Speech recognition is based on statistical models of word co-occurrences
  - In these models, the two meanings of bank are conflated
  - As a result, words co-occurring with the wrong sense are also considered
10 Metaphor and Metonymy
- Special kinds of polysemy
- Metaphor: constructs an analogy between two things or ideas; the analogy is conveyed by using a metaphorical word in place of some other word
  - Germany will pull Slovenia out of its economic slump
- Metonymy: a concept is denoted by naming some other concept closely related to it
  - The White House announced yesterday
  - This chapter talks about part-of-speech tagging
11 Synonymy
- Different lexemes with the same meaning
  - youth / adolescent
  - big / large
  - automobile / car
- What does it mean for two lexemes to mean the same thing?
  - Practical definition: two lexemes are considered synonyms if they can be substituted for one another in sentences without changing the meaning of the sentence (substitutability)
12 Synonymy
- Perfect synonyms are rare
  - Lexemes rarely share all their senses
- E.g. big and large?
  - That's my big sister
  - That's my large sister
  - The substitution fails because big has, among its senses, the notion of being older, while large lacks it
13 Antonymy
- Lexemes with opposite senses
- Opposite but related!
  - dark / light
  - boy / girl
  - hot / cold
  - up / down
  - in / out
14 Hypernymy/hyponymy
- Hyponymy: a hyponym lexeme denotes a subclass of another lexeme
- Hypernymy: a hypernym lexeme denotes a superclass of another lexeme
- E.g., since dogs are canids:
  - dog is a hyponym of canid
  - canid is a hypernym of dog
15 Meronymy/holonymy
- Meronymy: a meronym lexeme denotes a constituent part of, or a member of, another lexeme
- Holonymy: a holonym lexeme denotes the whole of which another lexeme denotes a part
- E.g., since trees have a trunk and limbs:
  - trunk and limb are meronyms of tree
  - tree is a holonym of both trunk and limb
16 Lexical Databases
- Model senses and the relationships among them
- Model a language lexicon
- A sense:
  - Represents a specific meaning
  - Is a collection of synonym terms
- Relationships are a predefined set:
  - Hyponym/hypernym: the subclass relationship
  - Meronym/holonym: the part-of relationship
  - Synonym/antonym
17 Lexical Databases
- Node: word; arc: lexical relationship
[Figure: graph of words connected by lexical relationships]
18 A Lexical Database: WordNet
- English lexicon database
  - About terms: nouns, verbs, adjectives, adverbs
- Terms are organized in sets called synsets:
  - A synset contains synonym lexemes
  - A synset carries a specific sense, a meaning
  - A synset has a gloss, explaining the carried meaning
  - A lexeme can appear in several synsets (homonymy/polysemy)
- Synsets or single lexemes are connected by a set of predefined relations:
  - So-called semantic relations: connect synsets
  - So-called lexical relations: connect lexemes
  - NB: this is WordNet terminology; they are both lexical relations!
19 WordNet: Number of senses
[Figure: axes labeled "# senses" and "# verbs"]
20 WordNet: Synsets
[Figure not preserved in the transcription]
21 WordNet: Structure
- Nouns and verbs:
  - Two taxonomies of synsets
- Adjectives:
  - Pairs of opposite lexemes form a group
  - Each adjective is connected to synonym lexemes
- Adverbs:
  - Connected to the related adjectives
- NB: WordNet is not a dictionary; it does not contain:
  - Pronouns, articles, particles (e.g. prepositions)
  - I.e., WordNet does not contain the closed vocabulary (the "keywords") of English
  - WordNet contains the open vocabulary of English
22 WordNet: Relations
- Main semantic relations for nouns:
  - X is hypernym of Y: X is a superclass of Y
  - Y is hyponym of X: Y is a subclass of X
  - X is holonym of Y: X is the whole and Y is part of it
  - Y is meronym of X: Y is part of X
  - X is coordinated with Y: X and Y have a common hypernym
- Main semantic relations for verbs:
  - hypernym, hyponym, coordinated
  - X is troponym of Y: X is a particular way to do Y (e.g. X = to walk, Y = to move)
  - X implies Y: action X implies action Y (e.g. X = to snore implies Y = to sleep)
    - Actually, it is a logical relationship
23 WordNet: Relations
- Main semantic relations for adjectives:
  - similar to
- ... and adverbs:
  - pertainym: connects the adverb to the related adjective
- Some lexical relations:
  - antonymy: opposite adjectives
  - synonymy (lexemes in the same synset are implicitly connected by the synonymy relation)
27 WordNet: Domains
- Labels that identify usage domains
  - Associated to synsets, for nouns and verbs
  - Associated to lexemes, for adjectives and adverbs
- Domains are actually synset names
- E.g. domains associated to adjective light (1 of 15 senses of light):
  - Sense 1: light, visible light, visible radiation
    TOPIC->(noun) physics#2, natural philosophy#1
28 WordNet: Domains
29 WordNet: Instance of
- Usually, ontologies do that
- The latest WordNet versions distinguish between classes and instances
- It is not easy to separate classes and instances
  - A typical problem when one tries to define an ontology
  - E.g. is Nero d'Avola a class or an instance?
    - An instance, because it is an element of the set Wine
    - A class, because, in turn, it is a set of bottles
  - It depends on the goal
- WordNet approach: proper nouns are usually instances (this is just a general rule)
30 WordNet: Attributes
- Usually, ontologies define that
- Adjectives can represent values associated to a noun
  - The noun weight has attribute adjectives light, heavy
  - The adjective light is an attribute of the nouns weight, value, light
31 Multi-language lexical databases: MultiWordNet
- Translates WordNet into many languages
  - One-to-one synset translation
  - Is this a sound approach?
- Adds to synsets the so-called semantic domain
  - A taxonomy of hundreds of labels, denoting usage domains
  - The goal is similar to that of WordNet Domains
32 MultiWordNet
33 Multi-language lexical databases: EuroWordNet
34 MultiWordNet vs EuroWordNet
- MultiWordNet
  - Quick and dirty approach: implies a one-to-one matching among senses in different languages
    - But this assumption is not true!
  - It is easy to add new languages
- EuroWordNet
  - Sound approach: each language defines its own network
  - The ILI structure is the intermediate language and makes it possible to connect all the languages
  - It is not easy to add new languages
35 Lexical Databases and NLP
- Semantic similarity among words $W_1$ and $W_2$
  - Distance (possibly a weighted distance) in terms of the relations connecting the two words
    - Using hypernym/hyponym (path in a tree)
    - Using all the relations (path in a graph)
  - WordNet is composed of synsets, then:

$$d_{SN}(W_1, W_2) = \min_{S_1 \in \mathrm{synsetsof}(W_1),\; S_2 \in \mathrm{synsetsof}(W_2)} d_{SYN}(S_1, S_2)$$

$$d_{SYN}(S_1, S_2) = \min \mathrm{path}(S_1, S_2)$$
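The two distances can be sketched in code. The following is only an illustration under stated assumptions: the graph, the synset identifiers and the word-to-synset table are hypothetical toy data (not real WordNet content), every arc has weight 1, and the shortest path is found with a breadth-first search.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SynsetDistance {
    // Undirected toy graph: synset id -> neighbouring synset ids (hypothetical data).
    static final Map<String, Set<String>> GRAPH = new HashMap<>();
    // Word -> synsets that contain it (hypothetical data).
    static final Map<String, List<String>> SYNSETS_OF = new HashMap<>();

    static void link(String a, String b) {
        GRAPH.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        GRAPH.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    static {
        link("dog.n.1", "canid.n.1");
        link("wolf.n.1", "canid.n.1");
        link("canid.n.1", "carnivore.n.1");
        link("cat.n.1", "feline.n.1");
        link("feline.n.1", "carnivore.n.1");
        link("dog.n.2", "person.n.1"); // an unrelated second sense of "dog"
        SYNSETS_OF.put("dog", List.of("dog.n.1", "dog.n.2"));
        SYNSETS_OF.put("wolf", List.of("wolf.n.1"));
        SYNSETS_OF.put("cat", List.of("cat.n.1"));
    }

    // d_SYN: number of arcs on the shortest path between two synsets, or -1 if unreachable.
    static int dSyn(String s1, String s2) {
        Map<String, Integer> dist = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        dist.put(s1, 0);
        queue.add(s1);
        while (!queue.isEmpty()) {
            String cur = queue.poll();
            if (cur.equals(s2)) return dist.get(cur);
            for (String next : GRAPH.getOrDefault(cur, Set.of())) {
                if (!dist.containsKey(next)) {
                    dist.put(next, dist.get(cur) + 1);
                    queue.add(next);
                }
            }
        }
        return -1;
    }

    // d_SN: minimum of d_SYN over all pairs of synsets of the two words, or -1 if no pair is connected.
    static int dSn(String w1, String w2) {
        int best = -1;
        for (String s1 : SYNSETS_OF.getOrDefault(w1, List.of())) {
            for (String s2 : SYNSETS_OF.getOrDefault(w2, List.of())) {
                int d = dSyn(s1, s2);
                if (d >= 0 && (best < 0 || d < best)) best = d;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(dSn("dog", "wolf")); // 2
        System.out.println(dSn("dog", "cat"));  // 4
    }
}
```

Restricting `link` calls to hypernym/hyponym arcs gives the "path in a tree" variant; adding the other relations gives the "path in a graph" variant.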
36 Lexical Databases and NLP
- Clustering
  - Divide similar words into clusters, using the distance
  - Divide similar documents into clusters, using distances among their words
- Advanced search engines
  - Search for a word and its synonyms, hyponyms, etc.
  - Search for an adjective and the derived adverb
- Lemmatization: from the inflected form to the base form (trees → tree; running → to run)
  - Depends on the particular SN: WordNet uses heuristics
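37 A library for WordNet: JWNL

    import java.io.FileInputStream;
    import net.didion.jwnl.JWNL;
    import net.didion.jwnl.data.IndexWord;
    import net.didion.jwnl.data.POS;
    import net.didion.jwnl.data.Synset;
    import net.didion.jwnl.data.Word;
    import net.didion.jwnl.dictionary.Dictionary;

    // The argument of FileInputStream (the JWNL configuration file) is elided in the source
    JWNL.initialize(new FileInputStream( ));
    Dictionary dic = Dictionary.getInstance();
    IndexWord idxword = dic.lookupIndexWord(POS.NOUN, "wine");
    if (idxword != null) {
        // Senses are numbered starting from 1
        for (int i = 1; i <= idxword.getSenseCount(); i++) {
            Synset synset = idxword.getSense(i);
            // Print the lemma of every word in the synset
            for (Word w : synset.getWords()) {
                System.out.println(w.getLemma());
            }
        }
    }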
38 Internal structure of words
- Thematic roles: roles associated with verbal arguments
- Selectional restriction: constraints that verbs pose on their arguments
- Primitive decomposition: decomposing words into primitive parts
- Semantic fields: take into account the background information that lexemes may share
  - See MultiWordNet's Semantic Domains, or WordNet's Domains
39 Thematic roles
- He opened a door
  $\exists e, x, y\; Isa(e, Opening) \land Opener(e, he) \land OpenedThing(e, y) \land Isa(y, door)$
- Houston's Billy Hatcher broke a bat
  $\exists e, x, y\; Isa(e, Breaking) \land Breaker(e, BillyHatcher) \land BrokenThing(e, y) \land Isa(y, bat)$
- Semantic deep roles: Opener, OpenedThing, Breaker, BrokenThing
- Opener and Breaker have something in common
  - They are both volitional actors, often animate, and they cause an event to happen → AGENT
- OpenedThing and BrokenThing have something in common
  - Inanimate objects affected by the action → THEME
40 Thematic roles
- Commonly-used thematic roles
[Table not preserved in the transcription]
41 Thematic roles: examples
42 Linking theory
- Thematic roles as an intermediate level:
  - Semantic deep role (e.g. Breaker)
  - Thematic role (e.g. AGENT)
  - Grammatical realization (e.g. subject)
- Example:

  surface form              Houston's Billy Hatcher   broke   a bat
  grammatical realization   subject                   verb    dir-obj
  thematic roles            AGENT                             THEME
  semantic deep roles       Breaker                           BrokenThing
43 Issues with linking theory
- Such thematic roles only apply to arguments of verbs
- But other parts of speech have arguments, too:
  - E.g. nouns: destruction of the city; father of the bride
- Linking theory does not consider them
44 FrameNet
- An English lexicon listing the syntactic and thematic combinations of each word (not only verbs)
- Each word (Lexical Unit, LU) is defined inside a frame
- Each frame has Frame Elements (FEs)
  - The thematic roles, very specific
  - With various possible grammatical realizations
- FEs are arranged in Patterns
- Frames are connected to each other by means of particular relationships
- VerbNet is another English verb lexicon
45 FrameNet
- Valence Patterns: appreciate.v (Judgment)
- Thematic roles: Cognizer, Evaluee, Reason
  - The Cognizer makes the judgment
  - The Evaluee is the person or thing about whom/which a judgment is made
  - Reason: typically, there is a constituent expressing the reason for the Judge's judgment
- Grammatical realizations (Phrase Type.Grammatical Function)
  - E.g. NP.Obj: Noun Phrase.Object
46 Selectional restrictions
- A semantic constraint imposed by a lexeme on the concepts that can fill the argument roles associated with it
- Remember the sentence: I wanna eat someplace that's close to Politecnico
  - Try to interpret it using the transitive version of eat
- The transitive version of eat has AGENT and THEME roles:
  - I (AGENT) wanna eat someplace that's close to Politecnico (THEME)
  - Semantic ill-formedness (unless you are Godzilla)
  - The THEME should be edible, for the transitive form of eat
  - Selectional restriction violation
47 Representing selectional restrictions
- I want to eat a hamburger
- Representation with roles:
  $\exists e, y\; Eating(e) \land Agent(e, Speaker) \land Theme(e, y) \land Isa(y, Hamburger)$
- Adding restrictions:
  $\exists e, y\; Eating(e) \land Agent(e, Speaker) \land Theme(e, y) \land Isa(y, Hamburger) \land Isa(y, EdibleThing)$
- Using WordNet it is possible to derive that a word is edible
  - By following the hypernym taxonomy
48 Hamburger is edible
- Hypothesis: I must know that the word food means something edible
- I must map EdibleThing to food
  - Actually, to the synset containing food
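A minimal sketch of this check, assuming a toy hypernym table: the chain below is hypothetical illustration data, each entry has a single hypernym (a real taxonomy may list several), and EdibleThing is mapped to the entry named "food".

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class EdibleCheck {
    // Hypothetical fragment of a hypernym taxonomy: entry -> its direct hypernym.
    static final Map<String, String> HYPERNYM_OF = Map.of(
        "hamburger", "sandwich",
        "sandwich", "snack food",
        "snack food", "dish",
        "dish", "nutriment",
        "nutriment", "food",
        "food", "substance",
        "gold", "noble metal",
        "noble metal", "metal");

    // True if `target` is reached by following hypernym links upward from `entry`.
    static boolean hasHypernym(String entry, String target) {
        Set<String> seen = new HashSet<>(); // guards against cycles in malformed data
        String cur = entry;
        while (cur != null && seen.add(cur)) {
            if (cur.equals(target)) return true;
            cur = HYPERNYM_OF.get(cur);
        }
        return false;
    }

    // The selectional restriction Isa(y, EdibleThing), mapped to the "food" entry.
    static boolean isEdible(String entry) {
        return hasHypernym(entry, "food");
    }

    public static void main(String[] args) {
        System.out.println(isEdible("hamburger")); // true
        System.out.println(isEdible("gold"));      // false
    }
}
```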
49 Primitive decomposition
- So far, words seem to represent atomic symbols carrying semantic information
- But words have internal structure
- For verbs: CD (Conceptual Dependency)
  - Eleven primitive predicates
  - Used to represent all predicate-like language expressions
  - Each verb is a combination of such primitive predicates
- The waiter brought Mary the check
  - brought: physical movement of an object + change of possession/control of an object
50 Primitive decomposition
- The waiter brought Mary the check
  $\exists x, y\; Atrans(x) \land Actor(x, Waiter) \land Object(x, Check) \land To(x, Mary) \land Ptrans(y) \land Actor(y, Waiter) \land Object(y, Check) \land To(y, Mary)$
51 Lexical Semantics: Word Sense Disambiguation (WSD)
52 1) WSD & selectional restrictions
- WSD as a side effect of semantic analysis
- Restrictions eliminate ill-formed components
- As a result, the right meanings survive
- If the predicate is unambiguous:
  - I wash dishes (wash requires something washable)
  - I eat this dish (eat requires an edible thing)
  - The predicate selects the correct sense of its argument (dish)
- If the argument is unambiguous:
  - Which airlines serve Denver? (Denver is a location)
  - Which ones serve breakfast? (breakfast is edible)
  - The argument selects the sense of the predicate (serve)
53 WSD & selectional restrictions
- If both predicate and arguments have multiple senses:
  - I'm looking for a restaurant that serves vegetarian dishes
  - Both are ambiguous; several sense combinations
  - But, in this case, only one sense combination does not violate the selectional constraints (serve as serving food and dish as edible thing)
- Limitations of this approach:
  - What kind of dishes do you recommend? (dish as?)
  - You can't eat gold for lunch if you're hungry! (gold is not edible. Violation? No, because of the can't)
  - Mr. Kulkarni ate glass on an empty stomach (for Mr. Kulkarni, glass is edible!)
  - Inter FC will eat Milan AC at the next match (metaphor)
54 2) WSD & Machine Learning
- Classify words by means of a stochastic model
  - Classes: the meanings
- Input:
  - The word to classify (the so-called target word)
  - The portion of text where it is embedded (context)
  - Usually, the POS of the words (target and context)
  - Often, morphological analysis is performed on words
  - Less often, some form of parsing is used
- Output:
  - The right class (i.e., the right meaning)
55 Features
- Input is transformed into a set of features
- Common features for WSD:
  - The target word itself
  - The target word collocations
  - The target word co-occurrences
- Representation:
  - For each word, a vector of feature name/value pairs is computed
  - Such vectors are used to train, test, and run the model
- First of all, we need to choose the window that represents the context of the word to classify
56 Window
- Sentence: An electric guitar and bass player stand off to one side not really part of the scene, just as a sort of nod to gringo expectations perhaps
- Target word: bass
- Window: +/- 2 words
  - guitar and bass player stand
57 Features
- The target word (not the lemma!)
- Collocation
  - About context words (usually, in base form) in specific positions around the word to classify
- Co-occurrence
  - Whether a given word (usually, its base form) appears in the context of the target word, or not
58 Collocation
- About context words in specific positions around the target word
  - E.g. word base form and POS: [..., word_{n-2}, POS_{n-2}, word_{n-1}, POS_{n-1}, word_{n+1}, POS_{n+1}, ...]
- Representation: a vector
  - Using window = +/-2: guitar and bass player stand
  - [guitar, NN, and, CJC, player, NN, stand, VVB]
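The collocation vector can be built as in the following sketch. It assumes the sentence is already tokenized and POS-tagged (the tag given to the target word is a guess and is not used), and it fills positions outside the sentence with a hypothetical "<pad>" marker.

```java
import java.util.ArrayList;
import java.util.List;

class CollocationFeatures {
    // Builds [word(n-w), POS(n-w), ..., word(n-1), POS(n-1), word(n+1), POS(n+1), ..., word(n+w), POS(n+w)]
    // for the target at index `target`. Positions outside the sentence become "<pad>" (hypothetical marker).
    static List<String> features(String[] words, String[] tags, int target, int window) {
        List<String> out = new ArrayList<>();
        for (int offset = -window; offset <= window; offset++) {
            if (offset == 0) continue; // skip the target word itself
            int i = target + offset;
            boolean inside = i >= 0 && i < words.length;
            out.add(inside ? words[i] : "<pad>");
            out.add(inside ? tags[i] : "<pad>");
        }
        return out;
    }

    public static void main(String[] args) {
        String[] words = {"guitar", "and", "bass", "player", "stand"};
        String[] tags = {"NN", "CJC", "NN", "NN", "VVB"}; // tag of "bass" is an assumption
        System.out.println(features(words, tags, 2, 2));
        // [guitar, NN, and, CJC, player, NN, stand, VVB]
    }
}
```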
59 Co-occurrence
- Whether a given word (usually, its base form) appears in the context of the target word, or not
  - Preliminary step: for each target word, collect the n most frequent co-occurring words, according to a corpus
  - Feature calculation: mark the collected words that appear in the window
- Representation: a vector
  - Using window = +/-2: e.g., guitar and bass player stand
  - E.g., collect the n = 12 most frequent co-occurring words in sentences with the target word bass (every meaning): [fishing, big, sound, player, fly, rod, pound, double, runs, playing, guitar, band]
  - Then, the feature vector for the target word bass is [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0] (the two 1s correspond to player and guitar)
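A minimal sketch of the co-occurrence feature computation, using the 12-word list given for bass. As simplifying assumptions, tokens are compared in lowercase and no lemmatization is applied.

```java
import java.util.Arrays;
import java.util.List;

class CooccurrenceFeatures {
    // The n = 12 most frequent co-occurring words for "bass", in the order listed on the slide.
    static final List<String> VOCAB = List.of(
        "fishing", "big", "sound", "player", "fly", "rod",
        "pound", "double", "runs", "playing", "guitar", "band");

    // Returns a 0/1 vector: position i is 1 if VOCAB.get(i) occurs within +/- window tokens of the target.
    static int[] features(List<String> tokens, int targetIndex, int window) {
        int[] v = new int[VOCAB.size()];
        int from = Math.max(0, targetIndex - window);
        int to = Math.min(tokens.size() - 1, targetIndex + window);
        for (int i = from; i <= to; i++) {
            if (i == targetIndex) continue; // the target word is not part of its own context
            int pos = VOCAB.indexOf(tokens.get(i).toLowerCase());
            if (pos >= 0) v[pos] = 1;
        }
        return v;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "An electric guitar and bass player stand off to one side".split(" "));
        int target = tokens.indexOf("bass");
        System.out.println(Arrays.toString(features(tokens, target, 2)));
        // [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0]
    }
}
```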
60 Example: bass
- Sense $s \in \{1, 2, 3, 4, 5, 6, 7, 8\}$
61 Supervised machine learning
- Such models undergo a training phase:
  - Input: a training set
  - Output: the trained model
- Training set: a (usually huge) set of samples
  - A sample: (list of features; right class)
  - E.g.: ([guitar, NN, and, CJC, player, NN, stand, VVB], [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0], bass; right class: 2)
- Popular models: Naïve Bayes, Decision lists/trees, Neural Nets, Support Vector Machines, etc.
62 Naïve Bayes
- $P(s)$: sense prior probability
- $v_j$: j-th feature
- $P(v_j \mid s)$: probability of feature $v_j$, given sense $s$
- Use a tagged corpus to calculate these values (tags: the right senses)
- A sample: guitar and bass player stand
  - $v_1$: [guitar, and, player, stand]
  - $v_2$: [NN, CJC, NN, VVB]
  - $v_3$: [0,0,0,1,0,0,0,0,0,0,1,0]
  - $v_4$: bass
  - $s$: 7 (tag: the right sense)
63 Naïve Bayes
- Having n features, we want to find:

$$\hat{s} = \operatorname*{argmax}_{s \in S} P(s \mid v_1, v_2, \ldots, v_n)$$

- Using Bayes:

$$\hat{s} = \operatorname*{argmax}_{s \in S} \frac{P(v_1, v_2, \ldots, v_n \mid s)\, P(s)}{P(v_1, v_2, \ldots, v_n)}$$

- The denominator does not depend on $s$ → it does not modify the result of argmax → it can be deleted:

$$\hat{s} = \operatorname*{argmax}_{s \in S} P(v_1, v_2, \ldots, v_n \mid s)\, P(s)$$

- Finally, assuming independence of features:

$$\hat{s} = \operatorname*{argmax}_{s \in S} P(s) \prod_{j=1}^{n} P(v_j \mid s)$$
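The final formula can be sketched as a small classifier. Several choices here are assumptions that the slides do not specify: each feature is treated as a plain string token, probabilities are estimated by counting with add-one smoothing, scores are summed in log space to avoid underflow, and the sense labels "music" and "fish" and the training sentences are made up for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NaiveBayesWsd {
    // counts.get(sense).get(feature) = how often the feature was seen with that sense.
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    private final Map<String, Integer> senseCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int samples = 0;

    // Adds one tagged sample (list of features; right sense) to the counts.
    void train(List<String> features, String sense) {
        samples++;
        senseCounts.merge(sense, 1, Integer::sum);
        Map<String, Integer> c = counts.computeIfAbsent(sense, k -> new HashMap<>());
        for (String f : features) {
            c.merge(f, 1, Integer::sum);
            vocabulary.add(f);
        }
    }

    // Returns argmax over senses of log P(s) + sum_j log P(v_j | s).
    String classify(List<String> features) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String s : senseCounts.keySet()) {
            Map<String, Integer> c = counts.get(s);
            int total = 0;
            for (int n : c.values()) total += n;
            double score = Math.log((double) senseCounts.get(s) / samples); // log P(s)
            for (String f : features) {
                int n = c.getOrDefault(f, 0);
                // Add-one smoothing (an assumption): unseen features do not zero the product.
                score += Math.log((n + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        NaiveBayesWsd nb = new NaiveBayesWsd();
        nb.train(List.of("guitar", "and", "player", "stand"), "music");
        nb.train(List.of("play", "the", "guitar", "loudly"), "music");
        nb.train(List.of("fishing", "rod", "river", "caught"), "fish");
        System.out.println(nb.classify(List.of("guitar", "player"))); // music
        System.out.println(nb.classify(List.of("fishing", "river"))); // fish
    }
}
```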
64 Bootstrapping approaches
- Common issue: a large corpus is needed!
- Bootstrap:
  - Start with a small number of instances of each sense for each lexeme (seeds)
  - Train a classifier
  - Use the classifier to label a larger set of words
  - Check the correctness of the labeling
  - Repeat
- Selecting seeds:
  - Hand-label a subset of the corpus, using the one sense per collocation approach
65 The one sense per collocation approach
- For each sense of the target lexeme, discover word(s) that frequently co-occur with it
- Use sentences where such words appear as a seeding set for the target lexeme
- E.g. bass
  - Assume play occurs with the music sense and fish occurs with the fish sense
  - Select sentences containing either play or fish, not both!
- How to select co-occurring words and the related sense?
  - By hand (examining the co-occurring words and the target lexeme)
  - Using a lexical database
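The seed selection for bass can be sketched as follows. The cue words play and fish come from the example; the sense labels, the sample sentences, and the exact-token matching (no lemmatization, so "played" would not match) are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SeedSelector {
    // Returns "music" or "fish" when exactly one cue word occurs in the sentence,
    // and null when the sentence contains neither cue or both cues.
    static String seedLabel(String sentence) {
        Set<String> tokens = new HashSet<>(Arrays.asList(sentence.toLowerCase().split("\\W+")));
        boolean music = tokens.contains("play");
        boolean fish = tokens.contains("fish");
        if (music == fish) return null; // uninformative or ambiguous: not a seed
        return music ? "music" : "fish";
    }

    // Keeps only the sentences that can serve as seeds, mapped to their sense label.
    static Map<String, String> selectSeeds(List<String> sentences) {
        Map<String, String> seeds = new LinkedHashMap<>();
        for (String s : sentences) {
            String label = seedLabel(s);
            if (label != null) seeds.put(s, label);
        }
        return seeds;
    }

    public static void main(String[] args) {
        List<String> corpus = List.of(
            "We play bass in a band",
            "He caught a bass and another fish",
            "They play songs about fish and bass",
            "The bass was loud");
        // Only the first two sentences are kept as seeds.
        selectSeeds(corpus).forEach((s, label) -> System.out.println(label + ": " + s));
    }
}
```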
66 REFERENCES
67 On lexical databases
- WordNet
- WordNet Domains
- MultiWordNet
- EuroWordNet
- Global WordNet
68 On verbal frames
- FrameNet
- VerbNet
- PropNet
- Unified Verb Index
69 Unifying lexical resources
- SemLink
Word Sense Disambiguation
Word Sense Disambiguation D. De Cao R. Basili Corso di Web Mining e Retrieval a.a. 2008-9 May 21, 2009 Excerpt of the R. Mihalcea and T. Pedersen AAAI 2005 Tutorial, at: http://www.d.umn.edu/ tpederse/tutorials/advances-in-wsd-aaai-2005.ppt
More informationPart III: Semantics. Notes on Natural Language Processing. Chia-Ping Chen
Part III: Semantics Notes on Natural Language Processing Chia-Ping Chen Department of Computer Science and Engineering National Sun Yat-Sen University Kaohsiung, Taiwan ROC Part III: Semantics p. 1 Introduction
More informationVocabulary Usage and Intelligibility in Learner Language
Vocabulary Usage and Intelligibility in Learner Language Emi Izumi, 1 Kiyotaka Uchimoto 1 and Hitoshi Isahara 1 1. Introduction In verbal communication, the primary purpose of which is to convey and understand
More informationA Bayesian Learning Approach to Concept-Based Document Classification
Databases and Information Systems Group (AG5) Max-Planck-Institute for Computer Science Saarbrücken, Germany A Bayesian Learning Approach to Concept-Based Document Classification by Georgiana Ifrim Supervisors
More informationCS 598 Natural Language Processing
CS 598 Natural Language Processing Natural language is everywhere Natural language is everywhere Natural language is everywhere Natural language is everywhere!"#$%&'&()*+,-./012 34*5665756638/9:;< =>?@ABCDEFGHIJ5KL@
More information2.1 The Theory of Semantic Fields
2 Semantic Domains In this chapter we define the concept of Semantic Domain, recently introduced in Computational Linguistics [56] and successfully exploited in NLP [29]. This notion is inspired by the
More informationLEXICAL COHESION ANALYSIS OF THE ARTICLE WHAT IS A GOOD RESEARCH PROJECT? BY BRIAN PALTRIDGE A JOURNAL ARTICLE
LEXICAL COHESION ANALYSIS OF THE ARTICLE WHAT IS A GOOD RESEARCH PROJECT? BY BRIAN PALTRIDGE A JOURNAL ARTICLE Submitted in partial fulfillment of the requirements for the degree of Sarjana Sastra (S.S.)
More informationChunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.
NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and
More informationAQUA: An Ontology-Driven Question Answering System
AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.
More informationLanguage Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus
Language Acquisition Fall 2010/Winter 2011 Lexical Categories Afra Alishahi, Heiner Drenhaus Computational Linguistics and Phonetics Saarland University Children s Sensitivity to Lexical Categories Look,
More informationConstraining X-Bar: Theta Theory
Constraining X-Bar: Theta Theory Carnie, 2013, chapter 8 Kofi K. Saah 1 Learning objectives Distinguish between thematic relation and theta role. Identify the thematic relations agent, theme, goal, source,
More informationThe MEANING Multilingual Central Repository
The MEANING Multilingual Central Repository J. Atserias, L. Villarejo, G. Rigau, E. Agirre, J. Carroll, B. Magnini, P. Vossen January 27, 2004 http://www.lsi.upc.es/ nlp/meaning Jordi Atserias TALP Index
More informationSINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,
More informationContext Free Grammars. Many slides from Michael Collins
Context Free Grammars Many slides from Michael Collins Overview I An introduction to the parsing problem I Context free grammars I A brief(!) sketch of the syntax of English I Examples of ambiguous structures
More informationDerivational: Inflectional: In a fit of rage the soldiers attacked them both that week, but lost the fight.
Final Exam (120 points) Click on the yellow balloons below to see the answers I. Short Answer (32pts) 1. (6) The sentence The kinder teachers made sure that the students comprehended the testable material
More informationTHE VERB ARGUMENT BROWSER
THE VERB ARGUMENT BROWSER Bálint Sass sass.balint@itk.ppke.hu Péter Pázmány Catholic University, Budapest, Hungary 11 th International Conference on Text, Speech and Dialog 8-12 September 2008, Brno PREVIEW
More informationLoughton School s curriculum evening. 28 th February 2017
Loughton School s curriculum evening 28 th February 2017 Aims of this session Share our approach to teaching writing, reading, SPaG and maths. Share resources, ideas and strategies to support children's
More informationObjectives. Chapter 2: The Representation of Knowledge. Expert Systems: Principles and Programming, Fourth Edition
Chapter 2: The Representation of Knowledge Expert Systems: Principles and Programming, Fourth Edition Objectives Introduce the study of logic Learn the difference between formal logic and informal logic
More informationOntologies vs. classification systems
Ontologies vs. classification systems Bodil Nistrup Madsen Copenhagen Business School Copenhagen, Denmark bnm.isv@cbs.dk Hanne Erdman Thomsen Copenhagen Business School Copenhagen, Denmark het.isv@cbs.dk
More informationTarget Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data
Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se
More informationCompositional Semantics
Compositional Semantics CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu Words, bag of words Sequences Trees Meaning Representing Meaning An important goal of NLP/AI: convert natural language
More informationThe Internet as a Normative Corpus: Grammar Checking with a Search Engine
The Internet as a Normative Corpus: Grammar Checking with a Search Engine Jonas Sjöbergh KTH Nada SE-100 44 Stockholm, Sweden jsh@nada.kth.se Abstract In this paper some methods using the Internet as a
More informationAspectual Classes of Verb Phrases
Aspectual Classes of Verb Phrases Current understanding of verb meanings (from Predicate Logic): verbs combine with their arguments to yield the truth conditions of a sentence. With such an understanding
More informationA Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many
Schmidt 1 Eric Schmidt Prof. Suzanne Flynn Linguistic Study of Bilingualism December 13, 2013 A Minimalist Approach to Code-Switching In the field of linguistics, the topic of bilingualism is a broad one.
More informationOn document relevance and lexical cohesion between query terms
Information Processing and Management 42 (2006) 1230 1247 www.elsevier.com/locate/infoproman On document relevance and lexical cohesion between query terms Olga Vechtomova a, *, Murat Karamuftuoglu b,
More informationCopyright 2017 DataWORKS Educational Research. All rights reserved.
Copyright 2017 DataWORKS Educational Research. All rights reserved. No part of this work may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic or mechanical,
More information11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation
tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each
More informationLeveraging Sentiment to Compute Word Similarity
Leveraging Sentiment to Compute Word Similarity Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya Dept. of Computer Science and Engineering, IIT Bombay 6th International Global
More informationControlled vocabulary
Indexing languages 6.2.2. Controlled vocabulary Overview Anyone who has struggled to find the exact search term to retrieve information about a certain subject can benefit from controlled vocabulary. Controlled
More informationAutomatic Extraction of Semantic Relations by Using Web Statistical Information
Automatic Extraction of Semantic Relations by Using Web Statistical Information Valeria Borzì, Simone Faro,, Arianna Pavone Dipartimento di Matematica e Informatica, Università di Catania Viale Andrea
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More informationReading Grammar Section and Lesson Writing Chapter and Lesson Identify a purpose for reading W1-LO; W2- LO; W3- LO; W4- LO; W5-
New York Grade 7 Core Performance Indicators Grades 7 8: common to all four ELA standards Throughout grades 7 and 8, students demonstrate the following core performance indicators in the key ideas of reading,
More informationSEMAFOR: Frame Argument Resolution with Log-Linear Models
SEMAFOR: Frame Argument Resolution with Log-Linear Models Desai Chen or, The Case of the Missing Arguments Nathan Schneider SemEval July 16, 2010 Dipanjan Das School of Computer Science Carnegie Mellon
More informationIntra-talker Variation: Audience Design Factors Affecting Lexical Selections