Word Sense Disambiguation


1 Word Sense Disambiguation Carlo Strapparava FBK-Irst Istituto per la ricerca scientifica e tecnologica I Povo, Trento, ITALY strappa@fbk.eu The problem of WSD What is the idea of word sense disambiguation? Many words have several meanings or senses. For such words (taken out of context) there is ambiguity about how they should be interpreted. WSD is the task of examining word tokens in context and specifying exactly which sense of each word is being used. 1

2 Computers versus Humans Polysemy most words have many possible meanings. A computer program has no basis for knowing which one is appropriate, even if it is obvious to a human Ambiguity is rarely a problem for humans in their day to day communication, except in extreme cases Ambiguity for a Computer The fisherman jumped off the bank and into the water. The bank down the street was robbed! Back in the day, we had an entire bank of computers devoted to this problem. The bank in that road is entirely too steep and is really dangerous. The plane took a bank to the left, and then headed off towards the mountains. 2

3 Other examples Two examples: 1. In my office there are 2 tables and 4 chairs. 2. This year the chair of the ACL conference is prof. W. D. For humans it is not a problem: Ex. 1: chair = in the sense of furniture; Ex. 2: chair = in the sense of the role covered by a person. For machines it is not trivial: one of the hardest problems in NLP. Brief Historical Overview 1970s-1980s: rule-based systems, relying on hand-crafted knowledge sources. 1990s: corpus-based approaches, depending on sense-tagged text; (Ide and Veronis, 1998) overview the history from the early days to the 1990s. 2000s: hybrid systems, minimizing or eliminating the use of sense-tagged text and taking advantage of the Web. 3

4 Interdisciplinary Connections Cognitive Science & Psychology: Quillian (1968), Collins and Loftus (1975): spreading activation; Hirst (1987) developed a marker-passing model. Linguistics: Fodor & Katz (1963): selectional preferences; Resnik (1993) pursued them statistically. Philosophy of Language: Wittgenstein (1958): meaning as use: "For a large class of cases - though not for all - in which we employ the word 'meaning' it can be defined thus: the meaning of a word is its use in the language." Why do we need the senses? Sense disambiguation is an intermediate task (Wilks and Stevenson, 1996). It is necessary to deal with most natural language tasks that involve language understanding (e.g. message understanding, man-machine communication, etc.). 4

5 Where WSD could be useful Machine translation Information retrieval and hypertext navigation Content and thematic analysis Grammatical analysis Speech processing Text processing WSD for machine translation Sense disambiguation is essential for the proper translation of words For example the French word grille, depending on the context, can be translated as railings, gate, bar, grid, scale, schedule, etc 5

6 WSD for information retrieval WSD could be useful for information retrieval and hypertext navigation. When searching for specific keywords, it is desirable to eliminate documents where the words are used in an inappropriate sense. For example, when searching for judicial references, discard documents where court is associated with royalty rather than with law. Voorhees (1999); Krovetz (1997, 2002) => more benefits in Cross-Language IR. WSD for grammatical analysis Sense disambiguation can be useful in part-of-speech tagging. Ex. L'étagère plie sous les livres [the shelf is bending under (the weight of) the books]: livres (which can mean books or pounds) is masculine in the former sense, feminine in the latter. Prepositional phrase attachment (Hindle and Rooth, 1993). In general it can restrict the space of parses (Alshawi and Carter, 1994). 6

7 WSD for speech processing Sense disambiguation is required for the correct phonetization of words in speech synthesis. Ex. the word conjure: He conjured up an image vs. I conjure you to help me (Yarowsky, 1996). It is also needed for word segmentation and homophone discrimination in speech recognition. WSD for content representation WSD is useful to obtain a content-based representation of documents. Ex. a user model for multilingual news web sites: a content-based technique represents documents as a starting point to build a model of the user's interests; the user model is built using a word-sense-based representation of the visited documents; a filtering procedure dynamically predicts new documents on the basis of the user's interest model. 7

8 A simple idea: selectional restriction-based disambiguation Examples from the Wall Street Journal: "... washing dishes."; "Ms. Chen works well, stir-frying several simple dishes." Two senses: Physical Objects vs. Meals or Recipes. The selection can be based on restrictions imposed by wash or stir-fry on their PATIENT role: the object of stir-fry should be something edible; the sense of meal conflicts with the restrictions imposed by wash. Selectional restrictions (2) However, there are some limitations. Sometimes the available selectional restrictions are too general -> what kind of dishes do you recommend? There are violations of selectional restrictions in perfectly well-formed and interpretable sentences (e.g. metaphors, metonymy). The approach requires a huge knowledge base. Some attempts: FrameNet (Berkeley); LCS (Lexical Conceptual Structure) - Jackendoff 83. 8

9 WSD methodology A WSD task involves two steps: 1. Sense repository -> the determination of all the different senses for every word relevant to the text under consideration. 2. Sense assignment -> a means to assign each occurrence of a word to the appropriate sense. Sense repositories Distinguish word senses in texts with respect to a dictionary, thesaurus, etc. (WordNet, LDOCE, Roget's thesaurus). A word is assumed to have a finite number of discrete senses. However, often a word has somewhat related senses, and it is unclear whether and where to draw the lines between them -> the issue of a coarse-grained task. 9

10 Word Senses (ex. WordNet) The noun title has 10 senses (first 7 from tagged texts) 1. title, statute title -- (a heading that names a statute or legislative bill; gives a brief summary of the matters it deals with; "Title 8 provided federal help for schools") 2. title -- (the name of a work of art or literary composition etc.; "he looked for books with the word `jazz' in the title"; "he refused to give titles to his paintings"; "I can never remember movie titles") 3. title -- (a general or descriptive heading for a section of a written work; "the novel had chapter titles") 4. championship, title -- (the status of being a champion; "he held the title for two years") 5. deed, deed of conveyance, title -- (a legal document signed and sealed and delivered to effect a transfer of property and to show the legal right to possess it; "he signed the deed"; "he kept the title to his car in the glove compartment") 6. title -- (an identifying appellation signifying status or function: e.g. Mr. or General; "the professor didn't like his friends to use his formal title") 7. title, claim -- (an established or recognized right: "a strong legal claim to the property"; "he had no documents confirming his title to his father's estate") 8. title -- ((usually plural) written material introduced into a movie or TV show to give credits or represent dialogue or explain an action; "the titles go by faster than I can read") 9. title -- (an appellation signifying nobility; "`your majesty' is the appropriate title to use in addressing a king") 10. claim, title -- (an informal right to something: "his claim on her attentions"; "his title to fame") The verb title has 1 sense (first 1 from tagged texts) 1. entitle, title -- (give a title to) Sense repositories - a different approach Word Sense Discrimination Sense discrimination divides the occurrences of a word into a number of classes by determining for any two occurrences whether they belong to the same sense or not. 
We need only determine which occurrences have the same meaning and not what the meaning actually is. 10

11 Word Sense Discrimination Discrimination Clustering word senses in a text Pros: no need for a-priori dictionary definitions agglomerative clustering is a well studied field Cons: sense inventory varies from one text to another hard to evaluate hard to standardize (Schütze 98) Automatic Word Sense Discrimination ACL 98 WSD methodology A WSD task involves two steps 1. Sense repository -> the determination of all the different senses for every word relevant to the text under consideration 2. Sense assignment -> a means to assign each occurrence of a word to the appropriate sense 11

12 Sense assignment The assignment of words to senses relies on two major sources of information: the context of the word to be disambiguated, in the broad sense: this includes info contained within the text in which the word appears; external knowledge sources, including lexical, encyclopedic, etc. resources as well as (possibly) hand-devised knowledge sources. Sense assignment (2) All disambiguation work involves matching the context of the word to be disambiguated with either information from an external knowledge source (knowledge-driven WSD) OR information about the contexts of previously disambiguated instances of the word derived from corpora (corpus-based WSD) 12

13 Some approaches Corpus based approaches Supervised algorithms: Exemplar-Based Learning (Ng & Lee 96) Naïve Bayes Semi-supervised algorithms: bootstrapping approaches (Yarowsky 95) Dictionary based approaches Lesk 86 Hybrid algorithms (supervised + dictionary) Mihalcea 00 Brief review: What is Supervised Learning? Collect a set of examples that illustrate the various possible classifications or outcomes of an event Identify patterns in the examples associated with each particular class of the event Generalize those patterns into rules Apply the rules to classify a new event 13

14 Learn from these examples: when do I go to the university? A small training table with one row per day, the class label (Go to University?) and three features: F1 Hot Outside?, F2 Slept Well?, F3 Ate Well?; each of the four example days is labeled YES/NO on the class and on each feature. 14

15 Supervised WSD Supervised WSD: class of methods that induces a classifier from manually sense-tagged text using machine learning techniques. Resources: Sense Tagged Text; Dictionary (implicit source of sense inventory); Syntactic Analysis (POS tagger, Chunker, Parser, ...). Scope: typically one target word per context; part of speech of target word resolved; lends itself to the targeted word formulation. Looking at WSD as a classification problem where a target word is assigned the most appropriate sense from a given set of possibilities based on the context in which it occurs. An Example of Sense Tagged Text Bonnie and Clyde are two really famous criminals, I think they were bank/1 robbers. My bank/1 charges too much for an overdraft. I went to the bank/1 to deposit my check and get a new ATM card. The University of Minnesota has an East and a West Bank/2 campus right on the Mississippi River. My grandfather planted his pole in the bank/2 and got a great big catfish! The bank/2 is pretty muddy, I can't walk there. 15

16 Two Bags of Words (co-occurrences in the window of context) FINANCIAL_BANK_BAG: a an and are ATM Bonnie card charges check Clyde criminals deposit famous for get I much My new overdraft really robbers the they think to too two went were RIVER_BANK_BAG: a an and big campus cant catfish East got grandfather great has his I in is Minnesota Mississippi muddy My of on planted pole pretty right River The the there University walk West Simple Supervised Approach Given a sentence S containing bank: for each word W_i in S: if W_i is in FINANCIAL_BANK_BAG then Sense_1 = Sense_1 + 1; if W_i is in RIVER_BANK_BAG then Sense_2 = Sense_2 + 1; if Sense_1 > Sense_2 then print Financial; else if Sense_2 > Sense_1 then print River; else print Can't Decide; 16
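The counting scheme above can be sketched directly in Python; the two bags below are abbreviated versions of the ones listed on this slide, and the tokenization is deliberately crude.

```python
# Minimal sketch of the slide's bag-of-words counting scheme.
# The two bags are abbreviated versions of the ones listed above.
FINANCIAL_BANK_BAG = {"atm", "card", "charges", "check", "deposit",
                      "overdraft", "robbers"}
RIVER_BANK_BAG = {"campus", "catfish", "mississippi", "muddy", "pole",
                  "river", "walk"}

def classify_bank(sentence):
    """Count context words found in each bag; the larger count wins."""
    words = sentence.lower().replace(",", " ").replace(".", " ").split()
    sense_1 = sum(1 for w in words if w in FINANCIAL_BANK_BAG)
    sense_2 = sum(1 for w in words if w in RIVER_BANK_BAG)
    if sense_1 > sense_2:
        return "Financial"
    if sense_2 > sense_1:
        return "River"
    return "Can't Decide"
```

On the tagged sentences above, "I went to the bank to deposit my check" hits two financial-bag words and no river-bag words, so it is labeled Financial.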

17 General Supervised Methodology Create a sample of training data where a given target word is manually annotated with a sense from a predetermined set of possibilities. One tagged word per instance/lexical sample disambiguation. Select a set of features with which to represent context: co-occurrences, collocations, POS tags, verb-obj relations, etc... Convert sense-tagged training instances to feature vectors. Apply a machine learning algorithm to induce a classifier: form = structure or relation among features; parameters = strength of feature interactions. Convert a held-out sample of test data into feature vectors. Apply the classifier to test instances to assign a sense tag. From Text to Feature Vectors My/pronoun grandfather/noun used/verb to/prep fish/verb along/adv the/det banks/shore of/prep the/det Mississippi/noun River/noun. (S1) The/det bank/finance issued/verb a/det check/noun for/prep the/det amount/noun of/prep interest/noun. (S2)

        P-2  P-1  P+1   P+2  fish  check  river  interest  SENSE TAG
    S1  adv  det  prep  det  Y     N      Y      N         SHORE
    S2  -    det  verb  det  N     Y      N      Y         FINANCE

17
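The conversion step can be sketched as below; `to_feature_vector` and its argument names are illustrative, not from the original tutorial.

```python
def to_feature_vector(tagged, target_index, keywords):
    """tagged: list of (word, POS) pairs. Returns the POS of the two
    neighbors on each side of the target plus binary keyword features,
    mirroring the P-2..P+2 / fish/check/river/interest table above."""
    def pos_at(i):
        return tagged[i][1] if 0 <= i < len(tagged) else "-"
    words = {w.lower() for w, _ in tagged}
    vector = {"P-2": pos_at(target_index - 2),
              "P-1": pos_at(target_index - 1),
              "P+1": pos_at(target_index + 1),
              "P+2": pos_at(target_index + 2)}
    for k in keywords:
        vector[k] = "Y" if k in words else "N"
    return vector

# Sentence S2 above: "The bank issued a check for the amount of interest."
s2 = [("The", "det"), ("bank", "noun"), ("issued", "verb"), ("a", "det"),
      ("check", "noun"), ("for", "prep"), ("the", "det"),
      ("amount", "noun"), ("of", "prep"), ("interest", "noun")]
v = to_feature_vector(s2, 1, ["fish", "check", "river", "interest"])
```

Since "bank" opens S2, there is no P-2 word and the sketch emits "-" for that slot.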

18 Supervised Learning Algorithms Once data is converted to feature vector form, any supervised learning algorithm can be used. Many have been applied to WSD with good results: Support Vector Machines, Nearest Neighbor Classifiers, Decision Trees, Decision Lists, Naïve Bayesian Classifiers, Perceptrons, Neural Networks, Graphical Models, Log Linear Models. Summing up: Supervised algorithms In ML approaches, systems are trained on a set of labeled instances to perform the task of WSD. What is learned is a classifier that can be used to assign unseen examples to senses. These approaches vary in: the nature of the training material; how much material is needed; the kind of linguistic knowledge used. 18

19 Summing up: Supervised algorithms All approaches put emphasis on acquiring the knowledge needed for the task from data (rather than from humans). The question is about scaling up: is it possible (or realistic) to apply these methodologies to the entire lexicon of the language? Manually building sense-tagged corpora is extremely costly. Inputs: feature vectors The input consists of: the target word (i.e. the word to be disambiguated); the context (i.e. a portion of the text in which it is embedded). The input is normally part-of-speech tagged. The context consists of larger or smaller segments surrounding the target word. Often some kind of morphological processing is performed on all words of the context. Seldom, some form of parsing is performed to find out grammatical roles and relations. 19

20 Feature vectors Two steps: selecting the relevant linguistic features; encoding them in a suitable way for a learning algorithm. A feature vector consists of numerical or nominal values encoding the selected linguistic information. Linguistic features The linguistic features used in training WSD systems can be divided into two classes: Syntagmatic features - two words are syntagmatically related when they frequently appear in the same syntagm (e.g. when one of them frequently follows or precedes the other). Paradigmatic features - two words are paradigmatically related when their meanings are closely related (e.g. synonyms, hyponyms, words belonging to the same semantic domains). 20

21 Syntagmatic features Typical syntagmatic features are collocational features, encoding information about the lexical inhabitants of specific positions on the left or on the right of the target word: the word, the root form of the word, the word's part-of-speech. Ex. An electric guitar and bass player stand off... : 2 words on the right, 2 on the left, with POS: [guitar, NN1, and, CJC, player, NN1, stand, VVB] Paradigmatic features Features that are effective at capturing the general topic of the discourse in which the target word has occurred: bag of words, domain labels. In BOW, co-occurrences of words are counted ignoring their exact position - the value of the feature is the number of times the word occurs in a region surrounding the target word. 21
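The collocational features in the guitar/bass example above can be extracted with a short helper; the function name and the POS tags on the words outside the window are assumptions for illustration.

```python
def collocational_features(tagged, i, window=2):
    """Words and POS tags of the `window` positions to the left and
    right of the target at index i, in left-to-right order."""
    feats = []
    for j in range(i - window, i + window + 1):
        if j == i or not (0 <= j < len(tagged)):
            continue
        feats.extend(tagged[j])
    return feats

# "An electric guitar and bass player stand off ..." with target "bass"
sent = [("An", "AT0"), ("electric", "AJ0"), ("guitar", "NN1"),
        ("and", "CJC"), ("bass", "NN1"), ("player", "NN1"),
        ("stand", "VVB"), ("off", "AVP")]
features = collocational_features(sent, 4)
```

With a window of 2 around "bass", this reproduces the feature list on the slide.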

22 Ng and Lee (1996): LEXAS Exemplar-based learning (in practice, k-nearest-neighbor learning). LEXical Ambiguity-resolving System. A set of features is extracted from disambiguated examples. When a new untagged example is encountered, it is compared with each of the training examples, using a distance function. LEXAS: the features Feature extraction: [L3, L2, L1, R1, R2, R3, K1,..., Km, C1,..., C9, V] Part of speech and morphological form: part of speech of the words to the left (L3, L2, L1) and right (R1, R2, R3). Unordered set of surrounding words: keywords K1,..., Km that co-occur frequently with the word w. Local collocations C1,..., C9, determined by left and right offset, e.g. [-1, 1]: national interest of. Verb-object syntactic relations (V), used only for nouns. 22

23 LEXAS: the metric Distance among feature vectors: the distance between two vectors is the sum of the distances among their features; the distance between two values v1 and v2 of a feature f is

    delta(v1, v2) = sum_{i=1}^{n} | C_{1,i}/C_1 - C_{2,i}/C_2 |

C_{1,i} represents the number of training examples with value v1 for the feature f that are classified with sense i in the training corpus; C_1 is the number with value v1 in any sense; C_{2,i} and C_2 denote the analogous quantities for v2; n is the total number of senses for the word. LEXAS: the algorithm Training phase: build the training examples. Testing: for a new untagged occurrence of word w, measure the distance to the training examples and choose the sense from the example which provides the minimum distance. 23
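The per-feature value-difference metric above can be sketched as follows; `value_difference` and the example encoding are hypothetical names, and the sketch assumes both values actually occur in the training data (otherwise C_1 or C_2 would be zero).

```python
from collections import defaultdict

def value_difference(examples, feature, v1, v2, senses):
    """Distance between two values of one feature:
    sum over senses i of |C1,i/C1 - C2,i/C2| (the metric above).
    examples: list of (feature_dict, sense) training pairs."""
    c1, c2 = defaultdict(int), defaultdict(int)
    for feats, sense in examples:
        if feats.get(feature) == v1:
            c1[sense] += 1
        if feats.get(feature) == v2:
            c2[sense] += 1
    n1, n2 = sum(c1.values()), sum(c2.values())
    return sum(abs(c1[s] / n1 - c2[s] / n2) for s in senses)
```

Two values whose sense distributions are identical get distance 0; values that predict disjoint senses get the maximum distance.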

24 LEXAS: evaluation Two datasets: The interest corpus by Bruce and Wiebe (1994): 2,639 sentences from the Wall Street Journal, each containing the noun interest. Sense repository: one of the six senses from LDOCE. Lexas: 89%; Yarowsky: 72%; Bruce & Wiebe: 79%. 192,800 occurrences of highly ambiguous words: 121 nouns (e.g. action, board) [~7.8 senses per word] and 70 verbs (e.g. add, build) [~12 senses per word] (from Brown and WSJ). Sense repository: WordNet. Lexas: 68.6%; Most Frequent Sense: 63.7%. Intermezzo: Evaluating a WSD system Numerical evaluation is meaningless without specifying how difficult the task is. Ex. 90% accuracy is easy for a POS tagger but beyond the ability of any Machine Translation system. Estimating upper and lower bounds makes sense of the performance of an algorithm. 24

25 Evaluating a WSD system: Upper bound The upper bound is usually human performance. In the case of WSD, if humans disagree on sense disambiguation, we cannot expect a WSD system to do better. Inter-judge agreement is higher for words with clear sense distinctions, ex. bank (95% and higher), lower for polysemous words, ex. title (65% to 70%). Evaluating a WSD system: Inter-judge agreement To correctly compare the extent of inter-judge agreement, we need to correct for the expected chance agreement. This depends on the number of senses being distinguished. This is done by the Kappa statistic (Carletta, 1996). 25
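The chance correction can be made concrete with the standard Cohen's kappa formula; the judge lists below are hypothetical.

```python
def kappa(judge_a, judge_b, senses):
    """Cohen's kappa: observed agreement corrected for the agreement
    expected by chance given each judge's sense distribution."""
    n = len(judge_a)
    p_observed = sum(a == b for a, b in zip(judge_a, judge_b)) / n
    p_chance = sum((judge_a.count(s) / n) * (judge_b.count(s) / n)
                   for s in senses)
    return (p_observed - p_chance) / (1 - p_chance)
```

Kappa is 1 for perfect agreement and 0 when the observed agreement is exactly what chance predicts, which is why it depends on the number (and skew) of the senses being distinguished.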

26 Evaluating a WSD system: Lower bound The lower bound or baseline is the performance of the simplest possible algorithm, usually the assignment of the most frequent sense. 90% accuracy is a good result for a word with two equiprobable senses, a trivial result for a word with two senses in a 9-to-1 frequency ratio. Another possible baseline: random choice. Evaluating a WSD system: Scoring Precision and Recall: precision is defined as the proportion of classified instances that were correctly classified; recall as the proportion of all instances (classified or not) that were correctly classified. These allow for the possibility of an algorithm choosing not to classify a given instance. For the sense-tagging task, accuracy is reported as recall. The coverage of a system (i.e., the percentage of items for which the system guesses some sense tag) can be computed by dividing recall by precision. 26
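The relationship between precision, recall and coverage described above can be checked with a small helper; `None` marks items the system declined to classify, and the names are illustrative.

```python
def score(system_tags, gold_tags):
    """Precision over attempted items, recall over all items;
    coverage = recall / precision = fraction of items attempted."""
    attempted = [(s, g) for s, g in zip(system_tags, gold_tags)
                 if s is not None]
    correct = sum(s == g for s, g in attempted)
    precision = correct / len(attempted)
    recall = correct / len(gold_tags)
    return precision, recall, recall / precision
```

If a system attempts 3 of 4 items and gets 2 right, precision is 2/3, recall is 2/4, and coverage is (2/4)/(2/3) = 3/4, the fraction attempted.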

27 Evaluation frameworks SENSEVAL A competition with various WSD tasks and on different languages: all-words task, lexical sample task. Senseval 1: 1999 [Hector dictionary] ~10 teams. Senseval 2: 2001 [WordNet dictionary] ~30 teams. Senseval 3: 2004 [mainly WordNet dictionary] ~60 teams, many different tasks. Senseval 4 -> Semeval 1: 2007. Naïve Bayes A premise: choosing the best sense for an input vector is choosing the most probable sense given that vector:

    s^ = argmax_{s in S} P(s|V)

Re-writing in the usual Bayesian manner:

    s^ = argmax_{s in S} P(V|s) P(s) / P(V)

But the data available that associates specific vectors with senses is too sparse. What is largely available in the training set is information about individual feature-value pairs for a specific sense. 27

28 Naïve Bayes (2) Naïve assumption: the features are independent:

    P(V|s) ~ prod_{j=1}^{n} P(v_j|s)

    s^ = argmax_{s in S} P(s) prod_{j=1}^{n} P(v_j|s)

P(V) is the same for all possible senses, so it does not affect the final ranking of senses. Training a naïve Bayes classifier consists of collecting the individual feature-value statistics wrt each sense of the target word in a sense-tagged training corpus. In practice, considerations about smoothing apply. Semi-supervised approaches Problems with supervised algorithms => the need of a large sense-tagged training set. Bootstrapping approach (Yarowsky, 1995): a small number of labeled instances (seeds) is used to train an initial classifier in a supervised way. This classifier is then used to extract a large training set from an untagged corpus. Iterating this process results in a series of more accurate classifiers. 28
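The training and argmax steps can be sketched as below; add-one smoothing stands in for the slide's "considerations about smoothing", and all names are illustrative.

```python
import math
from collections import Counter, defaultdict

def train_nb(examples):
    """examples: list of (set_of_feature_values, sense).
    Collects the counts behind P(s) and P(v_j|s)."""
    sense_counts, fv_counts, vocab = Counter(), defaultdict(Counter), set()
    for feats, sense in examples:
        sense_counts[sense] += 1
        for fv in feats:
            fv_counts[sense][fv] += 1
            vocab.add(fv)
    return sense_counts, fv_counts, vocab

def predict_nb(model, feats):
    """argmax_s P(s) * prod_j P(v_j|s), in log space, add-one smoothed."""
    sense_counts, fv_counts, vocab = model
    total = sum(sense_counts.values())
    best, best_lp = None, float("-inf")
    for s, c in sense_counts.items():
        lp = math.log(c / total)
        denom = sum(fv_counts[s].values()) + len(vocab)
        for fv in feats:
            lp += math.log((fv_counts[s][fv] + 1) / denom)
        if lp > best_lp:
            best, best_lp = s, lp
    return best
```

Working in log space avoids underflow from multiplying many small probabilities, and dropping P(V) is safe because it is constant across senses.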

29 One sense per constraints There are constraints between different occurrences of an ambiguous word within a corpus that can be exploited for disambiguation: One sense per discourse: the sense of a target word is highly consistent within any given document. e.g. He planted the pansy seeds himself, buying them from a pansy specialist. These specialists have done a great deal of work to improve the size and health of the plants and the resulting flowers. Their seeds produce vigorous blooming plants half again the size of the unimproved strains. One sense per collocation: nearby words provide strong and consistent clues to the sense of a target word, conditional on relative distance, order and syntactic relationship. e.g. industrial plant -- same meaning for plant regardless of where this collocation occurs. One sense per constraints Summing up: One sense per discourse: a word tends to preserve its meaning across all its occurrences in a given discourse (Gale, Church, Yarowsky 1992). One sense per collocation: a word tends to preserve its meaning when used in the same collocation (Yarowsky 1993); strong for adjacent collocations, weaker as the distance between words increases. 29

30 Bootstrapping (Yarowsky 95) Simplification: binary sense assignment. Step 1: identify in a corpus all examples for a given polysemous word. Step 2: identify a small set of representative examples for each sense. Step 3: a) train a classification algorithm on the Sense-A/Sense-B seed sets; b) apply the resulting classifier to the rest of the corpus and add the new examples to the seed sets; c) repeat iteratively. Step 4: apply the classification algorithm to the testing set. Bootstrapping (Yarowsky 95) Example: word plant. Sense-A: living organism; Sense-B: factory. Seed collocations: life and manufacturing. Extract examples containing these seeds. Decision list:

    LogL  Collocation                           Sense
    8.10  plant life                            A
    7.58  manufacturing plant                   B
    7.39  life (within ± 2-10 words)            A
    7.20  manufacturing (within ± 2-10 words)   B
    6.27  animal (within ± 2-10 words)          A
    4.70  equipment (within ± 2-10 words)       B
    4.36  employee (within ± 2-10 words)        B

30
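A sketch of how the log-likelihood scores of such a decision list could be computed from labeled occurrences; the smoothing constant and function name are assumptions, not Yarowsky's exact procedure.

```python
import math
from collections import Counter, defaultdict

def build_decision_list(examples, smoothing=0.1):
    """examples: (collocation, sense) pairs with senses 'A' or 'B'.
    Scores each collocation by |log(P(A|coll) / P(B|coll))| and sorts
    the rules from most to least reliable."""
    counts = defaultdict(Counter)
    for coll, sense in examples:
        counts[coll][sense] += 1
    rules = []
    for coll, c in counts.items():
        a, b = c["A"] + smoothing, c["B"] + smoothing
        rules.append((abs(math.log(a / b)), coll, "A" if a > b else "B"))
    return sorted(rules, reverse=True)
```

Collocations that almost always co-occur with one sense get a large score and end up at the top of the list, which is what makes the first matching rule a reliable classifier.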

31 Bootstrapping (Yarowsky 95) Use this decision list to classify new examples. Repeat until no more examples can be classified. Test: apply the decision list on the testing set. Option for training seeds: use words in dictionary definitions, a single defining collocate (e.g. bird and machine for the word crane). Results: Yarowsky 96.5%; Most frequent sense 63.9%. It works well only for distinct senses of words. Sometimes this is not the case: 1. bass -- (the lowest part of the musical range) 3. bass, basso -- (an adult male singer with the lowest voice) Some approaches Corpus based approaches Supervised algorithms: Exemplar-Based Learning (Ng & Lee 96) Naïve Bayes Semi-supervised algorithms: bootstrapping approaches (Yarowsky 95) Dictionary based approaches Lesk 86 Hybrid algorithms (supervised + dictionary) Mihalcea 00 31

32 Lesk algorithm (1986) It is one of the first algorithms developed for the semantic disambiguation of all words in open text. The only resource required is a set of dictionary entries (definitions). The most likely sense for a word in a given context is decided by a measure of the overlap between the definitions of the target word and those of the words in the current context. Lesk algorithm - dictionary based The main idea of the original version of the algorithm is to disambiguate words by finding the overlap among their sense definitions: (1) for each sense i of W1 (2) for each sense j of W2 (3) determine Overlap_ij as the number of common occurrences between the definitions of sense i of W1 and sense j of W2 (4) find i and j for which Overlap_ij is maximum (5) assign sense i to W1 and sense j to W2 32
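Steps (1)-(5) translate almost directly into code; this sketch tokenizes definitions by whitespace and uses the pine/cone dictionary entries as example data.

```python
def lesk_pair(defs_w1, defs_w2):
    """Return the (i, j) sense pair maximizing the number of words
    shared by a definition of W1 and a definition of W2 (steps 1-5)."""
    best, best_overlap = (0, 0), -1
    for i, d1 in enumerate(defs_w1):
        words1 = set(d1.lower().split())
        for j, d2 in enumerate(defs_w2):
            overlap = len(words1 & set(d2.lower().split()))
            if overlap > best_overlap:
                best, best_overlap = (i, j), overlap
    return best, best_overlap

pine = ["kinds of evergreen tree with needle-shaped leaves",
        "waste away through sorrow or illness"]
cone = ["solid body which narrows to a point from a round flat base",
        "something of this shape whether solid or hollow",
        "fruit of certain evergreen trees"]
```

Here `lesk_pair(pine, cone)` picks the pair (0, 2), i.e. pine#1 and cone#3 in 1-based dictionary numbering, via the shared words "evergreen" and "of"; a real implementation would drop stopwords like "of" before counting.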

33 Lesk algorithm - an example Select the appropriate sense of cone and pine in the phrase pine cone given the following definitions pine 1. kinds of evergreen tree with needle-shaped leaves 2. waste away through sorrow or illness cone 1. solid body which narrows to a point from a round flat base 2. something of this shape whether solid or hollow 3. fruit of certain evergreen trees Select pine#1 and cone#3 because they have two words in common Lesk algorithm - corpus based A corpus-based variation: take into consideration also additional tagged examples (1) for each sense s of the word W 1 (2) set weight(s) to zero (3) for each unique word w in surrounding context of W 1 (4) for each sense s, (5) if w occurs in the training examples / dictionary definitions for sense s, (6) add weight(w) to weight(s) (7) choose sense with greatest weight(s) weight(w) = IDF = -log(p(w)) p(w) is estimated over the examples and dictionary definitions 33
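The corpus-based variation above, with weight(w) = -log p(w), might look like the sketch below; pooling each sense's definition words and tagged-example words into one list is a simplifying assumption.

```python
import math
from collections import Counter

def corpus_lesk(sense_texts, context):
    """sense_texts: {sense: list of words from its dictionary definition
    and training examples}. Each sense is scored by summing the IDF-like
    weights -log p(w) of context words occurring in its text."""
    pool = [w for words in sense_texts.values() for w in words]
    freq, total = Counter(pool), len(pool)
    scores = {}
    for sense, words in sense_texts.items():
        word_set = set(words)
        scores[sense] = sum(-math.log(freq[w] / total)
                            for w in set(context) if w in word_set)
    return max(scores, key=scores.get)
```

The weighting downplays words like "bank" that appear in every sense's text (their -log p(w) tends toward 0) while rare, sense-specific words dominate the score.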

34 Lesk algorithm - evaluation Corpus-based variation is one of the best performing baselines in comparative evaluation of WSD systems In Senseval-2 => 51.2% precision compared to 64.2% achieved by the best supervised system Problems of this approach: The dictionary entries are relatively short Combinatorial explosion when applied to more than two words Some approaches Corpus based approaches Supervised algorithms: Exemplar-Based Learning (Ng & Lee 96) Naïve Bayes Semi-supervised algorithms: bootstrapping approaches (Yarowsky 95) Dictionary based approaches Lesk 86 Hybrid algorithms (supervised + dictionary) Mihalcea 00 34

35 Hybrid algorithms (Mihalcea 2000) It combines two sources of information: WordNet and a sense-tagged corpus (SemCor). It is based on: WordNet definitions; WordNet ISA relations; rules acquired from SemCor. Hybrid algorithm (Mihalcea 2000) It was developed for the purpose of improving Information Retrieval with WSD techniques: disambiguation of the words in the input IR query; disambiguation of words in the documents. The algorithm determines a set of nouns and verbs that can be disambiguated with high precision. Several procedures (8) are called iteratively in the main algorithm. 35

36 Procedure 1 The system uses a Named Entity recognizer In particular person names, organizations and locations Identify Named Entities in text and mark them with sense #1 Examples: Bush => PER => person#1 Trento => LOC => location#1 IBM => ORG => organization#1 Procedure 2 Exploiting the monosemous words Identify words having only one sense in WordNet and mark them with that sense Example: The noun subcommittee has only one sense in WordNet 36

37 Procedure 3 Exploiting the contextual clues about the usage of the words. Given a clue (collocation), search it in SemCor and mark the occurrences with the corresponding sense from SemCor. Example: disambiguate approval in approval of => 4 examples in SemCor: with the approval#1 of the Farm Credit Association; subject to the approval#1 of the Secretary of State; administrative approval#1 of the reclassification; recommended approval#1 of the 1-A classification. In all these occurrences the sense of approval is approval#1. Procedure 4 Using SemCor, for a given noun N in the text, determine the noun-context of each of its senses. Noun-context: list of nouns which occur more often within the context of N. Find common words between the current context and the noun-context. Example: diameter has 2 senses, hence 2 noun-contexts: diameter#1: {property, hole, ratio} diameter#2: {form} 37

38 Procedure 5 Find words that are semantically connected to already disambiguated words Connected means there is a relation in WordNet If they belong to the same synset => connection distance = 0 Example: authorize and clear in a text to be disambiguated Knowing: authorize#1 disambiguated with procedure 2 It results: clear#4 - because they are synonyms in WordNet Procedure 6 Find words that are semantically connected and for which the connection distance is 0 Weaker than procedure 5: none of the words considered are already disambiguated Example: measure and bill, both are ambiguous bill has 10 senses, measure 9 bill#1 and measure#4 belong to the same synset 38

39 Procedures 7-8 Similar to procedures 5-6, but they use other semantic relations (connection distance = 1): hypernymy/hyponymy = ISA relations. Example: subcommittee and committee: subcommittee#1 disambiguated with procedure 2; committee#1 because it is a hypernym of subcommittee#1. (Mihalcea 2000) - Evaluation The procedures presented above are applied iteratively. This allows us to identify a set of nouns and verbs that can be disambiguated with high precision. Tests on six randomly selected files from SemCor: the algorithm disambiguates 55% of the nouns and verbs with 92% precision. 39


Cross Language Information Retrieval

Cross Language Information Retrieval Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview

More information

Web as Corpus. Corpus Linguistics. Web as Corpus 1 / 1. Corpus Linguistics. Web as Corpus. web.pl 3 / 1. Sketch Engine. Corpus Linguistics

Web as Corpus. Corpus Linguistics. Web as Corpus 1 / 1. Corpus Linguistics. Web as Corpus. web.pl 3 / 1. Sketch Engine. Corpus Linguistics (L615) Markus Dickinson Department of Linguistics, Indiana University Spring 2013 The web provides new opportunities for gathering data Viable source of disposable corpora, built ad hoc for specific purposes

More information

Chunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.

Chunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence. NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and

More information

Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities

Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities Enhancing Unlexicalized Parsing Performance using a Wide Coverage Lexicon, Fuzzy Tag-set Mapping, and EM-HMM-based Lexical Probabilities Yoav Goldberg Reut Tsarfaty Meni Adler Michael Elhadad Ben Gurion

More information

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za

More information

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,

More information

Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2

Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2 Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2 Ted Pedersen Department of Computer Science University of Minnesota Duluth, MN, 55812 USA tpederse@d.umn.edu

More information

Twitter Sentiment Classification on Sanders Data using Hybrid Approach

Twitter Sentiment Classification on Sanders Data using Hybrid Approach IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders

More information

Vocabulary Usage and Intelligibility in Learner Language

Vocabulary Usage and Intelligibility in Learner Language Vocabulary Usage and Intelligibility in Learner Language Emi Izumi, 1 Kiyotaka Uchimoto 1 and Hitoshi Isahara 1 1. Introduction In verbal communication, the primary purpose of which is to convey and understand

More information

Combining a Chinese Thesaurus with a Chinese Dictionary

Combining a Chinese Thesaurus with a Chinese Dictionary Combining a Chinese Thesaurus with a Chinese Dictionary Ji Donghong Kent Ridge Digital Labs 21 Heng Mui Keng Terrace Singapore, 119613 dhji @krdl.org.sg Gong Junping Department of Computer Science Ohio

More information

Speech Recognition at ICSI: Broadcast News and beyond

Speech Recognition at ICSI: Broadcast News and beyond Speech Recognition at ICSI: Broadcast News and beyond Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 The DARPA Broadcast News task Aspects of ICSI

More information

Lecture 1: Machine Learning Basics

Lecture 1: Machine Learning Basics 1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3

More information

Rule Learning With Negation: Issues Regarding Effectiveness

Rule Learning With Negation: Issues Regarding Effectiveness Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United

More information

OCR for Arabic using SIFT Descriptors With Online Failure Prediction

OCR for Arabic using SIFT Descriptors With Online Failure Prediction OCR for Arabic using SIFT Descriptors With Online Failure Prediction Andrey Stolyarenko, Nachum Dershowitz The Blavatnik School of Computer Science Tel Aviv University Tel Aviv, Israel Email: stloyare@tau.ac.il,

More information

11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation

11/29/2010. Statistical Parsing. Statistical Parsing. Simple PCFG for ATIS English. Syntactic Disambiguation tatistical Parsing (Following slides are modified from Prof. Raymond Mooney s slides.) tatistical Parsing tatistical parsing uses a probabilistic model of syntax in order to assign probabilities to each

More information

Matching Similarity for Keyword-Based Clustering

Matching Similarity for Keyword-Based Clustering Matching Similarity for Keyword-Based Clustering Mohammad Rezaei and Pasi Fränti University of Eastern Finland {rezaei,franti}@cs.uef.fi Abstract. Semantic clustering of objects such as documents, web

More information

The stages of event extraction

The stages of event extraction The stages of event extraction David Ahn Intelligent Systems Lab Amsterdam University of Amsterdam ahn@science.uva.nl Abstract Event detection and recognition is a complex task consisting of multiple sub-tasks

More information

(Sub)Gradient Descent

(Sub)Gradient Descent (Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include

More information

The MEANING Multilingual Central Repository

The MEANING Multilingual Central Repository The MEANING Multilingual Central Repository J. Atserias, L. Villarejo, G. Rigau, E. Agirre, J. Carroll, B. Magnini, P. Vossen January 27, 2004 http://www.lsi.upc.es/ nlp/meaning Jordi Atserias TALP Index

More information

MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY

MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY Chen, Hsin-Hsi Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan E-mail: hh_chen@csie.ntu.edu.tw Abstract

More information

Python Machine Learning

Python Machine Learning Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled

More information

CS Machine Learning

CS Machine Learning CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing

More information

Using dialogue context to improve parsing performance in dialogue systems

Using dialogue context to improve parsing performance in dialogue systems Using dialogue context to improve parsing performance in dialogue systems Ivan Meza-Ruiz and Oliver Lemon School of Informatics, Edinburgh University 2 Buccleuch Place, Edinburgh I.V.Meza-Ruiz@sms.ed.ac.uk,

More information

Rule Learning with Negation: Issues Regarding Effectiveness

Rule Learning with Negation: Issues Regarding Effectiveness Rule Learning with Negation: Issues Regarding Effectiveness Stephanie Chua, Frans Coenen, and Grant Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX

More information

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words, A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994

More information

Leveraging Sentiment to Compute Word Similarity

Leveraging Sentiment to Compute Word Similarity Leveraging Sentiment to Compute Word Similarity Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya Dept. of Computer Science and Engineering, IIT Bombay 6th International Global

More information

SEMAFOR: Frame Argument Resolution with Log-Linear Models

SEMAFOR: Frame Argument Resolution with Log-Linear Models SEMAFOR: Frame Argument Resolution with Log-Linear Models Desai Chen or, The Case of the Missing Arguments Nathan Schneider SemEval July 16, 2010 Dipanjan Das School of Computer Science Carnegie Mellon

More information

A Comparison of Two Text Representations for Sentiment Analysis

A Comparison of Two Text Representations for Sentiment Analysis 010 International Conference on Computer Application and System Modeling (ICCASM 010) A Comparison of Two Text Representations for Sentiment Analysis Jianxiong Wang School of Computer Science & Educational

More information

Word Segmentation of Off-line Handwritten Documents

Word Segmentation of Off-line Handwritten Documents Word Segmentation of Off-line Handwritten Documents Chen Huang and Sargur N. Srihari {chuang5, srihari}@cedar.buffalo.edu Center of Excellence for Document Analysis and Recognition (CEDAR), Department

More information

THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING

THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING SISOM & ACOUSTICS 2015, Bucharest 21-22 May THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING MarilenaăLAZ R 1, Diana MILITARU 2 1 Military Equipment and Technologies Research Agency, Bucharest,

More information

Memory-based grammatical error correction

Memory-based grammatical error correction Memory-based grammatical error correction Antal van den Bosch Peter Berck Radboud University Nijmegen Tilburg University P.O. Box 9103 P.O. Box 90153 NL-6500 HD Nijmegen, The Netherlands NL-5000 LE Tilburg,

More information

METHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS

METHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS METHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS Ruslan Mitkov (R.Mitkov@wlv.ac.uk) University of Wolverhampton ViktorPekar (v.pekar@wlv.ac.uk) University of Wolverhampton Dimitar

More information

Short Text Understanding Through Lexical-Semantic Analysis

Short Text Understanding Through Lexical-Semantic Analysis Short Text Understanding Through Lexical-Semantic Analysis Wen Hua #1, Zhongyuan Wang 2, Haixun Wang 3, Kai Zheng #4, Xiaofang Zhou #5 School of Information, Renmin University of China, Beijing, China

More information

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments

Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Specification and Evaluation of Machine Translation Toy Systems - Criteria for laboratory assignments Cristina Vertan, Walther v. Hahn University of Hamburg, Natural Language Systems Division Hamburg,

More information

Introduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions.

Introduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions. to as a linguistic theory to to a member of the family of linguistic frameworks that are called generative grammars a grammar which is formalized to a high degree and thus makes exact predictions about

More information

BYLINE [Heng Ji, Computer Science Department, New York University,

BYLINE [Heng Ji, Computer Science Department, New York University, INFORMATION EXTRACTION BYLINE [Heng Ji, Computer Science Department, New York University, hengji@cs.nyu.edu] SYNONYMS NONE DEFINITION Information Extraction (IE) is a task of extracting pre-specified types

More information

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz

More information

Which verb classes and why? Research questions: Semantic Basis Hypothesis (SBH) What verb classes? Why the truth of the SBH matters

Which verb classes and why? Research questions: Semantic Basis Hypothesis (SBH) What verb classes? Why the truth of the SBH matters Which verb classes and why? ean-pierre Koenig, Gail Mauner, Anthony Davis, and reton ienvenue University at uffalo and Streamsage, Inc. Research questions: Participant roles play a role in the syntactic

More information

Assignment 1: Predicting Amazon Review Ratings

Assignment 1: Predicting Amazon Review Ratings Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for

More information

Switchboard Language Model Improvement with Conversational Data from Gigaword

Switchboard Language Model Improvement with Conversational Data from Gigaword Katholieke Universiteit Leuven Faculty of Engineering Master in Artificial Intelligence (MAI) Speech and Language Technology (SLT) Switchboard Language Model Improvement with Conversational Data from Gigaword

More information

Iterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages

Iterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages Iterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages Nuanwan Soonthornphisaj 1 and Boonserm Kijsirikul 2 Machine Intelligence and Knowledge Discovery Laboratory Department of Computer

More information

Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures

Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures Ulrike Baldewein (ulrike@coli.uni-sb.de) Computational Psycholinguistics, Saarland University D-66041 Saarbrücken,

More information

Project in the framework of the AIM-WEST project Annotation of MWEs for translation

Project in the framework of the AIM-WEST project Annotation of MWEs for translation Project in the framework of the AIM-WEST project Annotation of MWEs for translation 1 Agnès Tutin LIDILEM/LIG Université Grenoble Alpes 30 october 2014 Outline 2 Why annotate MWEs in corpora? A first experiment

More information

Parsing of part-of-speech tagged Assamese Texts

Parsing of part-of-speech tagged Assamese Texts IJCSI International Journal of Computer Science Issues, Vol. 6, No. 1, 2009 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814 28 Parsing of part-of-speech tagged Assamese Texts Mirzanur Rahman 1, Sufal

More information

Universiteit Leiden ICT in Business

Universiteit Leiden ICT in Business Universiteit Leiden ICT in Business Ranking of Multi-Word Terms Name: Ricardo R.M. Blikman Student-no: s1184164 Internal report number: 2012-11 Date: 07/03/2013 1st supervisor: Prof. Dr. J.N. Kok 2nd supervisor:

More information

ESSLLI 2010: Resource-light Morpho-syntactic Analysis of Highly

ESSLLI 2010: Resource-light Morpho-syntactic Analysis of Highly ESSLLI 2010: Resource-light Morpho-syntactic Analysis of Highly Inflected Languages Classical Approaches to Tagging The slides are posted on the web. The url is http://chss.montclair.edu/~feldmana/esslli10/.

More information

Accuracy (%) # features

Accuracy (%) # features Question Terminology and Representation for Question Type Classication Noriko Tomuro DePaul University School of Computer Science, Telecommunications and Information Systems 243 S. Wabash Ave. Chicago,

More information

Context Free Grammars. Many slides from Michael Collins

Context Free Grammars. Many slides from Michael Collins Context Free Grammars Many slides from Michael Collins Overview I An introduction to the parsing problem I Context free grammars I A brief(!) sketch of the syntax of English I Examples of ambiguous structures

More information

The Role of the Head in the Interpretation of English Deverbal Compounds

The Role of the Head in the Interpretation of English Deverbal Compounds The Role of the Head in the Interpretation of English Deverbal Compounds Gianina Iordăchioaia i, Lonneke van der Plas ii, Glorianna Jagfeld i (Universität Stuttgart i, University of Malta ii ) Wen wurmt

More information

EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar

EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar Chung-Chi Huang Mei-Hua Chen Shih-Ting Huang Jason S. Chang Institute of Information Systems and Applications, National Tsing Hua University,

More information

Outline. Web as Corpus. Using Web Data for Linguistic Purposes. Ines Rehbein. NCLT, Dublin City University. nclt

Outline. Web as Corpus. Using Web Data for Linguistic Purposes. Ines Rehbein. NCLT, Dublin City University. nclt Outline Using Web Data for Linguistic Purposes NCLT, Dublin City University Outline Outline 1 Corpora as linguistic tools 2 Limitations of web data Strategies to enhance web data 3 Corpora as linguistic

More information

The Smart/Empire TIPSTER IR System

The Smart/Empire TIPSTER IR System The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of

More information

The taming of the data:

The taming of the data: The taming of the data: Using text mining in building a corpus for diachronic analysis Stefania Degaetano-Ortlieb, Hannah Kermes, Ashraf Khamis, Jörg Knappen, Noam Ordan and Elke Teich Background Big data

More information

Developing a TT-MCTAG for German with an RCG-based Parser

Developing a TT-MCTAG for German with an RCG-based Parser Developing a TT-MCTAG for German with an RCG-based Parser Laura Kallmeyer, Timm Lichte, Wolfgang Maier, Yannick Parmentier, Johannes Dellert University of Tübingen, Germany CNRS-LORIA, France LREC 2008,

More information

First Grade Curriculum Highlights: In alignment with the Common Core Standards

First Grade Curriculum Highlights: In alignment with the Common Core Standards First Grade Curriculum Highlights: In alignment with the Common Core Standards ENGLISH LANGUAGE ARTS Foundational Skills Print Concepts Demonstrate understanding of the organization and basic features

More information

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should

More information

Prediction of Maximal Projection for Semantic Role Labeling

Prediction of Maximal Projection for Semantic Role Labeling Prediction of Maximal Projection for Semantic Role Labeling Weiwei Sun, Zhifang Sui Institute of Computational Linguistics Peking University Beijing, 100871, China {ws, szf}@pku.edu.cn Haifeng Wang Toshiba

More information

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering

More information

Characteristics of the Text Genre Realistic fi ction Text Structure

Characteristics of the Text Genre Realistic fi ction Text Structure LESSON 14 TEACHER S GUIDE by Oscar Hagen Fountas-Pinnell Level A Realistic Fiction Selection Summary A boy and his mom visit a pond and see and count a bird, fish, turtles, and frogs. Number of Words:

More information

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks POS tagging of Chinese Buddhist texts using Recurrent Neural Networks Longlu Qin Department of East Asian Languages and Cultures longlu@stanford.edu Abstract Chinese POS tagging, as one of the most important

More information

1. Introduction. 2. The OMBI database editor

1. Introduction. 2. The OMBI database editor OMBI bilingual lexical resources: Arabic-Dutch / Dutch-Arabic Carole Tiberius, Anna Aalstein, Instituut voor Nederlandse Lexicologie Jan Hoogland, Nederlands Instituut in Marokko (NIMAR) In this paper

More information

Developing Grammar in Context

Developing Grammar in Context Developing Grammar in Context intermediate with answers Mark Nettle and Diana Hopkins PUBLISHED BY THE PRESS SYNDICATE OF THE UNIVERSITY OF CAMBRIDGE The Pitt Building, Trumpington Street, Cambridge, United

More information

CEFR Overall Illustrative English Proficiency Scales

CEFR Overall Illustrative English Proficiency Scales CEFR Overall Illustrative English Proficiency s CEFR CEFR OVERALL ORAL PRODUCTION Has a good command of idiomatic expressions and colloquialisms with awareness of connotative levels of meaning. Can convey

More information

Compositional Semantics

Compositional Semantics Compositional Semantics CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu Words, bag of words Sequences Trees Meaning Representing Meaning An important goal of NLP/AI: convert natural language

More information

The College Board Redesigned SAT Grade 12

The College Board Redesigned SAT Grade 12 A Correlation of, 2017 To the Redesigned SAT Introduction This document demonstrates how myperspectives English Language Arts meets the Reading, Writing and Language and Essay Domains of Redesigned SAT.

More information

arxiv:cmp-lg/ v1 22 Aug 1994

arxiv:cmp-lg/ v1 22 Aug 1994 arxiv:cmp-lg/94080v 22 Aug 994 DISTRIBUTIONAL CLUSTERING OF ENGLISH WORDS Fernando Pereira AT&T Bell Laboratories 600 Mountain Ave. Murray Hill, NJ 07974 pereira@research.att.com Abstract We describe and

More information

Learning From the Past with Experiment Databases

Learning From the Past with Experiment Databases Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University

More information

Analysis: Evaluation: Knowledge: Comprehension: Synthesis: Application:

Analysis: Evaluation: Knowledge: Comprehension: Synthesis: Application: In 1956, Benjamin Bloom headed a group of educational psychologists who developed a classification of levels of intellectual behavior important in learning. Bloom found that over 95 % of the test questions

More information

Ensemble Technique Utilization for Indonesian Dependency Parser

Ensemble Technique Utilization for Indonesian Dependency Parser Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id

More information

Distant Supervised Relation Extraction with Wikipedia and Freebase

Distant Supervised Relation Extraction with Wikipedia and Freebase Distant Supervised Relation Extraction with Wikipedia and Freebase Marcel Ackermann TU Darmstadt ackermann@tk.informatik.tu-darmstadt.de Abstract In this paper we discuss a new approach to extract relational

More information

Corpus Linguistics (L615)

Corpus Linguistics (L615) (L615) Basics of Markus Dickinson Department of, Indiana University Spring 2013 1 / 23 : the extent to which a sample includes the full range of variability in a population distinguishes corpora from archives

More information

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many

A Minimalist Approach to Code-Switching. In the field of linguistics, the topic of bilingualism is a broad one. There are many Schmidt 1 Eric Schmidt Prof. Suzanne Flynn Linguistic Study of Bilingualism December 13, 2013 A Minimalist Approach to Code-Switching In the field of linguistics, the topic of bilingualism is a broad one.

More information

Beyond the Pipeline: Discrete Optimization in NLP

Beyond the Pipeline: Discrete Optimization in NLP Beyond the Pipeline: Discrete Optimization in NLP Tomasz Marciniak and Michael Strube EML Research ggmbh Schloss-Wolfsbrunnenweg 33 69118 Heidelberg, Germany http://www.eml-research.de/nlp Abstract We

More information

arxiv: v1 [cs.cl] 2 Apr 2017

arxiv: v1 [cs.cl] 2 Apr 2017 Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings Junki Matsuo and Mamoru Komachi Graduate School of System Design, Tokyo Metropolitan University, Japan matsuo-junki@ed.tmu.ac.jp,

More information

Detecting English-French Cognates Using Orthographic Edit Distance

Detecting English-French Cognates Using Orthographic Edit Distance Detecting English-French Cognates Using Orthographic Edit Distance Qiongkai Xu 1,2, Albert Chen 1, Chang i 1 1 The Australian National University, College of Engineering and Computer Science 2 National

More information

Learning Methods in Multilingual Speech Recognition

Learning Methods in Multilingual Speech Recognition Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex

More information

Part III: Semantics. Notes on Natural Language Processing. Chia-Ping Chen

Part III: Semantics. Notes on Natural Language Processing. Chia-Ping Chen Part III: Semantics Notes on Natural Language Processing Chia-Ping Chen Department of Computer Science and Engineering National Sun Yat-Sen University Kaohsiung, Taiwan ROC Part III: Semantics p. 1 Introduction

More information

Procedia - Social and Behavioral Sciences 141 ( 2014 ) WCLTA Using Corpus Linguistics in the Development of Writing

Procedia - Social and Behavioral Sciences 141 ( 2014 ) WCLTA Using Corpus Linguistics in the Development of Writing Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 141 ( 2014 ) 124 128 WCLTA 2013 Using Corpus Linguistics in the Development of Writing Blanka Frydrychova

More information

Applications of memory-based natural language processing

Applications of memory-based natural language processing Applications of memory-based natural language processing Antal van den Bosch and Roser Morante ILK Research Group Tilburg University Prague, June 24, 2007 Current ILK members Principal investigator: Antal

More information

THE VERB ARGUMENT BROWSER

THE VERB ARGUMENT BROWSER THE VERB ARGUMENT BROWSER Bálint Sass sass.balint@itk.ppke.hu Péter Pázmány Catholic University, Budapest, Hungary 11 th International Conference on Text, Speech and Dialog 8-12 September 2008, Brno PREVIEW

More information

CROSS-LANGUAGE INFORMATION RETRIEVAL USING PARAFAC2

CROSS-LANGUAGE INFORMATION RETRIEVAL USING PARAFAC2 1 CROSS-LANGUAGE INFORMATION RETRIEVAL USING PARAFAC2 Peter A. Chew, Brett W. Bader, Ahmed Abdelali Proceedings of the 13 th SIGKDD, 2007 Tiago Luís Outline 2 Cross-Language IR (CLIR) Latent Semantic Analysis

More information

Software Maintenance

Software Maintenance 1 What is Software Maintenance? Software Maintenance is a very broad activity that includes error corrections, enhancements of capabilities, deletion of obsolete capabilities, and optimization. 2 Categories

More information

Grade 6: Correlated to AGS Basic Math Skills

Grade 6: Correlated to AGS Basic Math Skills Grade 6: Correlated to AGS Basic Math Skills Grade 6: Standard 1 Number Sense Students compare and order positive and negative integers, decimals, fractions, and mixed numbers. They find multiples and

More information

Constructing Parallel Corpus from Movie Subtitles

Constructing Parallel Corpus from Movie Subtitles Constructing Parallel Corpus from Movie Subtitles Han Xiao 1 and Xiaojie Wang 2 1 School of Information Engineering, Beijing University of Post and Telecommunications artex.xh@gmail.com 2 CISTR, Beijing

More information

Loughton School s curriculum evening. 28 th February 2017

Loughton School s curriculum evening. 28 th February 2017 Loughton School s curriculum evening 28 th February 2017 Aims of this session Share our approach to teaching writing, reading, SPaG and maths. Share resources, ideas and strategies to support children's

More information

Artificial Neural Networks written examination

Artificial Neural Networks written examination 1 (8) Institutionen för informationsteknologi Olle Gällmo Universitetsadjunkt Adress: Lägerhyddsvägen 2 Box 337 751 05 Uppsala Artificial Neural Networks written examination Monday, May 15, 2006 9 00-14

More information

Chinese Language Parsing with Maximum-Entropy-Inspired Parser

Chinese Language Parsing with Maximum-Entropy-Inspired Parser Chinese Language Parsing with Maximum-Entropy-Inspired Parser Heng Lian Brown University Abstract The Chinese language has many special characteristics that make parsing difficult. The performance of state-of-the-art

More information