Unsupervised Large-Vocabulary Word Sense Disambiguation with Graph-based Algorithms for Sequence Data Labeling

Size: px
Start display at page:

Download "Unsupervised Large-Vocabulary Word Sense Disambiguation with Graph-based Algorithms for Sequence Data Labeling"

Transcription

1 Unsupervised Large-Vocabuary Word Sense Disambiguation with Graph-based Agorithms for Sequence Data Labeing Rada Mihacea Department of Computer Science University of North Texas Abstract This paper introduces a graph-based agorithm for sequence data abeing, using random waks on graphs encoding abe dependencies. The agorithm is iustrated and tested in the context of an unsupervised word sense disambiguation probem, and shown to significanty outperform the accuracy achieved through individua abe assignment, as measured on standard senseannotated data sets. 1 Introduction Many natura anguage processing tasks consist of abeing sequences of words with inguistic annotations, e.g. word sense disambiguation, part-of-speech tagging, named entity recognition, and others. Typica abeing agorithms attempt to formuate the annotation task as a traditiona earning probem, where the correct abe is individuay determined for each word in the sequence using a earning process, usuay conducted independent of the abes assigned to the other words in the sequence. Such agorithms do not have the abiity to encode and thereby expoit dependencies across abes corresponding to the words in the sequence, which potentiay imits their performance in appications where such dependencies can infuence the seection of the correct set of abes. In this paper, we introduce a graph-based sequence data abeing agorithm we suited for such natura anguage annotation tasks. The agorithm simutaneousy annotates a the words in a sequence by expoiting reations identified among word abes, using random waks on graphs encoding abe dependencies. The random waks are mathematicay modeed through iterative graph-based agorithms, which are appied on the abe graph associated with the given sequence of words, resuting in a stationary distribution over abe probabiities. These probabiities are then used to simutaneousy seect the most probabe set of abes for the words in the input sequence. The annotation method is iustrated and tested on an unsupervised word sense disambiguation probem, targeting the annotation of a open-cass words in unrestricted text using information derived excusivey from dictionary definitions. The graph-based sequence data abeing agorithm significanty outperforms the accuracy achieved through individua data abeing, resuting in an error reduction of 10.7%, as measured on standard sense-annotated data sets. The method is aso shown to exceed the performance of other previousy proposed unsupervised word sense disambiguation agorithms. 2 Iterative Graphica Agorithms for Sequence Data Labeing In this section, we introduce the iterative graphica agorithm for sequence data abeing. The agorithm is succincty iustrated using a sampe sequence for a generic annotation probem, with a more extensive iustration and evauation provided in Section 3. Given a sequence of words, each word with corresponding admissibe abes # "!, we define a abe graph G = (V,E) such *) that there is a vertex $&%(' for every possibe abe, + -,/., 0 1,324. Dependencies between pairs of abes are represented as directed or indirected edges 56%87, defined over the set of vertex pairs ':9;'. Such abe dependencies can be earned from annotated data, or derived by other means, as iustrated ater. Figure 1 shows an exampe of a graph-

2 ' 3 w1 2 w1 1 w1 w 1 [0.86] w4 3 w [1.12] [1.38] [1.05] w2 0.2 w [1.39] w2 [1.13] w3 [1.56] w4 [0.40] w 2 w 3 w 4 [0.48] [0.58] Figure 1: Sampe graph buit on the set of possibe abes (shaded nodes) for a sequence of four words (unshaded nodes). Labe dependencies are indicated as edge weights. Scores computed by the graph-based agorithm are shown in brackets, next to each abe. ica structure derived over the set of abes for a sequence of four words. Note that the graph does not have to be fuy connected, as not a abe pairs can be reated by a dependency. Given such a abe graph associated with a sequence of words, the ikeihood of each abe can be recursivey determined using an iterative graph-based ranking agorithm, which runs over the graph of abes and identifies the importance of each abe (vertex) in the graph. The iterative graphica agorithm is modeing a random wak, eading to a stationary distribution over abe probabiities, represented as scores attached to vertices in the graph. These scores are then used to identify the most probabe abe for each word, resuting in the annotation of a the words in the input sequence. For instance, for the graph drawn in Figure 1, the word wi be assigned with abe, since the score associated with this abe (, ) is the maximum among the scores assigned to a admissibe abes associated with this word. A remarkabe property that makes these iterative graphica agorithms appeaing for sequence data abeing is the fact that they take into account goba information recursivey drawn from the entire graph, rather than reying on oca vertex-specific information. Through the random wak performed on the abe graph, these iterative agorithms attempt to coectivey expoit the dependencies drawn between a abes in the graph, which makes them superior to other approaches that rey ony on oca information, individuay derived for each word in the sequence. 2.1 Graph-based Ranking The basic idea impemented by an iterative graphbased ranking agorithm is that of voting or recommendation. When one vertex inks to another one, it is basicay casting a vote for that other vertex. The higher the number of votes that are cast for a vertex, the higher the importance of the vertex. Moreover, the importance of the vertex casting a vote determines how important the vote itsef is, and this information is aso taken into account by the ranking agorithm. Whie there are severa graph-based ranking agorithms previousy proposed in the iterature, we focus on ony one such agorithm, namey PageRank (Brin and Page, 1998), as it was previousy found successfu in a number of appications, incuding Web ink anaysis, socia networks, citation anaysis, and more recenty in severa text processing appications.. Given a graph 7, et ' be the set of vertices that point to vertex ' (predecessors), and et ' be the set of vertices that vertex ' points to (successors). The PageRank score associated with the vertex ' is then defined using a recursive function that integrates the scores of its predecessors: ',!#"%$& ('!*)*+ '-,. ',. / 0 / (1) where is a parameter that is set between 0 and 1 1. This vertex scoring scheme is based on a random wak mode, where a waker takes random steps on the graph, with the wak being modeed as a Markov process that is, the decision on what edge to foow is soey based on the vertex where the waker is currenty ocated. Under certain conditions, this mode converges to a stationary distribution of probabiities, associated with vertices in the graph. 
Based on the Ergodic theorem for Markov chains (Grimmett and Stirzaker, 1989), the agorithm is guaranteed to converge if the graph is both aperiodic and irreducibe. The first condition is achieved for any graph that is a non-bipartite graph, whie the second condition hods for any strongy connected graph property achieved by PageRank through the random jumps introduced by the,01 factor. In matrix notation, the PageRank vector of stationary probabiities is the principa eigenvector for the matrix 20354, which is obtained from the adjacency matrix 2 representing the graph, with a rows normaized to sum to 1: ( ). Intuitivey, the stationary probabiity associated with a vertex in the graph represents the probabiity 1 The typica vaue for 7 is 0.85 (Brin and Page, 1998), and this is the vaue we are aso using in our impementation.

3 5, X ^ Y i D D of finding the waker at that vertex during the random wak, and thus it represents the importance of the vertex within the graph. In the context of sequence data abeing, the random wak is performed on the abe graph associated with a sequence of words, and thus the resuting stationary distribution of probabiities can be used to decide on the most probabe set of abes for the given sequence. 2.2 Ranking on Weighted Graphs In a weighted graph, the decision on what edge to foow during a random wak is aso taking into account the weights of outgoing edges, with a higher ikeihood of foowing an edge that has a arger weight. The weighted version of the ranking agorithm is particuary usefu for sequence data abeing, since the dependencies between pairs of abes are more naturay modeed through weights indicating their strength, rather than binary vaues. Given a set of weights, associated with edges connecting vertices ' and ',, the weighted PageRank score is determined as: ) 7 7! " ) " #$%'&)(*! " +, " 2.3 Agorithm for Sequence Data Labeing Given a sequence of words with their corresponding admissibe abes, the agorithm for sequence data abeing seeks to identify a graph of abe dependencies on which a random wak can be performed, resuting in a set of scores that can be used for abe assignment. Agorithm 1 shows the pseudocode for the abeing process. The agorithm consists of three main steps: (1) construction of abe dependencies graph; (2) abe scoring using graph-based ranking agorithms; (3) abe assignment. First, a weighted graph of abe dependencies is buit by adding a vertex for each admissibe abe, and an edge for each pair of abes for which a dependency is identified. A maximum aowabe distance can be set (-/ ), indicating a constraint over the distance between words for which a abe dependency is sought. For instance, if is set to, no edges wi be drawn between abes corresponding to words that are more than three words apart, counting a running words.. Labe.:9<; dependencies are determined through the 1 5$8 5 function, whose definition depends on the appication and type of resources avaiabe (see Section 2.4). Next, scores are assigned to vertices using a graphbased ranking agorithm. Current experiments are (2) Agorithm 1 Graph-based Sequence Data Labeing Input: Sequence ABA CED Input: Admissibe abes F!G >=)H (? I!G ABA <ABA C, Output: Sequence of abes F ABA CED, with abe H!G corresponding to word! from the input Buid graph G of abe dependencies 1: for to C do 2: to C do 3: if K P[Z then 4: W4X<Y 5: endi if 6: for U to C!G do 7: for to C!]\ do Y YbYc 7YdcfeVg H (!hgvi Hkj 9: Y q then H (! Gdi H!h\ j Y 10: o 7#7p 7 11: end if 12: end for 13: end for 14: end for 15: end for Score vertices in G 1: repeat! \ i! 2: for a )ts Iu@ U q YdX evy )< v do 3: 7 f " 7]w! " ), " yx "! " + < # %'&)(* 4: end for 5: unti convergence of scores, Labe assignment 1: forh a/pto C do 2:!G 3: end for ^z{p[r =, H (!G? I <ABA C!G i!m based on PageRank, but other ranking agorithms can be used as we. Finay, the most ikey set of abes is determined by identifying for each word the abe that has the highest score. Note that a admissibe abes corresponding to the words in the input sequence are assigned with a score, and thus the seection of two or more most ikey abes for a word is aso possibe. 2.4 Labe Dependencies Labe dependencies can be defined in various ways, depending on the appication at hand and on the knowedge sources that are avaiabe. 
If an annotated corpus is avaiabe, dependencies can be defined as abe co-occurrence probabiities approximated with y} frequency counts, or as conditiona probabiities # / y}. Optionay, these dependencies can be exicaized by taking into account the corre- / } 49 sponding words in the sequence, e.g. /. In the absence of an annotated corpus, dependencies can be derived by other means, e.g. part-

4 " " " 9 of-speech probabiities can be approximated from a raw corpus as in (Cutting et a., 1992), word-sense dependencies can be derived as definition-based simiarities, etc. Labe dependencies are set as weights on the arcs drawn between corresponding abes. Arcs can be directed or undirected for joint probabiities or simiarity measures, and are usuay directed for conditiona probabiities. 2.5 Labeing Exampe Consider again the exampe from Figure 1, consisting of a sequence of four words, and their possibe corresponding abes. In the first step of the agorithm, abe dependencies are determined, and et us assume that the vaues for these dependencies are as indicated through the edge weights in Figure 1. Next, vertices in the graph are scored using an iterative ranking agorithm, resuting in a score attached to each abe, shown in brackets next to each vertex. Finay, the most probabe abe for each word is seected. Word is thus assigned with abe, since the score of this abe (, ) is the maximum among the scores associated with a its possibe abes (,,,*,, ). Simiary, word is assigned with abe, with abe, and receives abe. 2.6 Efficiency Considerations For a sequence of words 24, each word with admissibe abes, the running time of the graph-based sequence data abeing agorithm } 2 2 is proportiona with O( ) ) (the time spent in buiding the abe graph and iterating the agorithm for a constant number of times ). This is order of magnitudes better than the running time 2 of O( ) for agorithms that attempt to seect the best sequence of abes by searching through the entire space of possibe abe combinations, athough it can be significanty higher than the running time of 24 O( ) for individua data abeing. 2.7 Other Agorithms for Sequence Data Labeing It is interesting to contrast our agorithm with previousy proposed modes for sequence data abeing, e.g. Hidden Markov Modes, Maximum Entropy Markov Modes, or Conditiona Random Fieds. Athough they differ in the mode used (generative, discriminative, or dua), and the type of probabiities invoved (joint or conditiona), these previous agorithms are a parameterized agorithms that typicay require parameter training through maximization of ikeihood on training exampes. In these modes, parameters that maximize sequence probabiities are earned from a corpus during a training phase, and then appied to the annotation of new unseen data. Instead, in the agorithm proposed in this paper, the ikeihood of a sequence of abes is determined during test phase, through random waks performed on the abe graph buit for the data to be annotated. Whie current evauations of our agorithm are performed on an unsupervised abeing task, future work wi consider the evauation of the agorithm in the presence of an annotated corpus, which wi aow for direct comparison with these previousy proposed modes for sequence data abeing. 3 Experiments in Word Sense Disambiguation The agorithm for sequence data abeing is iustrated and tested on an a-words word sense disambiguation probem. Word sense disambiguation is a abeing task consisting of assigning the correct meaning to each open-cass word in a sequence (usuay a sentence). Most of the efforts for soving this probem were concentrated so far toward targeted supervised earning, where each sense tagged occurrence of a particuar word is transformed into a feature vector used in an automatic earning process. 
The appicabiity of such supervised agorithms is however imited to those few words for which sense tagged data is avaiabe, and their accuracy is strongy connected to the amount of abeed data avaiabe at hand. Instead, agorithms that attempt to disambiguate a-words in unrestricted text have received significanty ess attention, as the deveopment and success of such agorithms has been hindered by both (a) ack of resources (training data), and (b) efficiency aspects resuting from the arge size of the probem. 3.1 Graph-based Sequence Data Labeing for Unsupervised Word Sense Disambiguation To appy the graph-based sequence data abeing agorithm to the disambiguation of an input text, we need information on abes (word senses) and dependencies (word sense dependencies). Word senses can be easiy obtained from any sense inventory, e.g. WordNet or LDOCE. Sense dependencies can be derived in various ways, depending on the type of resources avaiabe for the anguage and/or domain at hand. In this paper, we expore the unsupervised derivation of sense

5 dependencies using information drawn from machine readabe dictionaries, which is genera and can be appied to any anguage or domain for which a sense inventory is avaiabe. Reying excusivey on a machine readabe dictionary, a sense dependency can be defined as a measure of simiarity between word senses. There are severa metrics that can be used for this purpose, see for instance (Budanitsky and Hirst, 2001) for an overview. However, most of them rey on measures of semantic distance computed on semantic networks, and thus they are imited by the avaiabiity of expicity encoded semantic reations (e.g. is-a, part-of). To maintain the unsupervised aspect of the agorithm, we chose instead to use a measure of simiarity based on sense definitions, which can be computed on any dictionary, and can be evauated across different parts-ofspeech. Given two word senses and their corresponding definitions, the sense simiarity is determined as a function of definition overap, measured as the number of common tokens between the two definitions, after running them through a simpe fiter that eiminates a stop-words. To avoid promoting ong definitions, we aso use a normaization factor, and divide the content overap of the two definitions with the ength of each definition. This sense simiarity measure is inspired by the definition of the Lesk agorithm (Lesk, 1986). Starting with a sense inventory and a function for computing sense dependencies, the appication of the sequence data abeing agorithm to the unsupervised disambiguation of a new text proceeds as foows. First, for the given text, a abe graph is buit by adding a vertex for each possibe sense for a opencass words in the text. Next, weighted edges are drawn using the definition-based semantic simiarity measure, computed for a pairs of senses for words found within a certain distance ( , as defined in Agorithm 1). Once the graph is constructed, the graph-based ranking agorithm is appied, and a score is determined for a word senses in the graph. Finay, for each open-cass word in the text, we seect the vertex in the abe graph which has the highest score, and abe the word with the corresponding word sense. 3.2 An Exampe Consider the task of assigning senses to the words in the text The church bes no onger rung on Sundays 2. For the purpose of iustration, we assume at 2 Exampe drawn from the data set provided during the SENSEVAL-2 Engish a-words task. Manua sense annotations The church bes no onger rung on Sundays. church 1: one of the groups of Christians who have their own beiefs and forms of worship 2: a pace for pubic (especiay Christian) worship 3: a service conducted in a church be 1: a hoow device made of meta that makes a ringing sound when struck 2: a push button at an outer door that gives a ringing or buzzing signa when pushed 3: the sound of a be ring 1: make a ringing sound 2: ring or echo with sound 3: make (bes) ring, often for the purposes of musica edification Sunday 1: first day of the week; observed as a day of rest and worship by most Christians [0.96] [2.56] [0.99] [1.46] s3 s2 S1 be S3 s s1 ring [0.73] [0.93] s3 S2 s1 church [0.42] [0.63] [0.58] S1 [0.67] Sunday Figure 2: The abe graph for assigning senses to words in the sentence The church bes no onger rung on Sundays. most three senses for each word, which are shown in Figure 2. Word senses and definitions are obtained from the WordNet sense inventory (Mier, 1995). 
A word senses are added as vertices in the abe graph, and weighted edges are drawn as dependencies among word senses, derived using the definition-based simiarity measure (no edges are drawn between word senses with a simiarity of zero). The resuting abe graph is an undirected weighted graph, as shown in Figure 2. After running the ranking agorithm, scores are identified for each word-sense in the graph, indicated between brackets next to each node. Seecting for each word the sense with the argest score resuts in the foowing sense assignment: The church#2 bes#1 were aso made avaiabe for this data.

6 , no onger rung#3 on Sundays#1, which is correct according to annotations performed by professiona exicographers. 3.3 Resuts and Discussion The agorithm was primariy evauated on the SENSEVAL-2 Engish a-words data set, consisting of three documents from Penn Treebank, with 2,456 open-cass words (Pamer et a., 2001). Unike other sense-annotated data sets, e.g. SENSEVAL-3 or Sem- Cor, SENSEVAL-2 is the ony testbed for a-words word sense disambiguation that incudes a sense map, which aows for additiona coarse-grained sense evauations. Moreover, there is a arger body of previous work that was evauated on this data set, which can be used as a base of comparison. The performance of our agorithm is compared with the disambiguation accuracy obtained with a variation of the Lesk agorithm 3 (Lesk, 1986), which seects the meaning of an open-cass word by finding the word sense that eads to the highest overap between the corresponding dictionary definition and the current context. Simiar to the definition simiarity function used in the graph-based disambiguation agorithm (Section 3.1), the overap measure used in the Lesk impementation does not take into account stop-words, and it is normaized with the ength of each definition to avoid promoting onger definitions. We are thus comparing the performance of sequence data abeing, which takes into account abe dependencies, with individua data abeing, where a abe is seected independent of the other abes in the text. Note that both agorithms rey on the same knowedge source, i.e. dictionary definitions, and thus they are directy comparabe. Moreover, none of the agorithms take into account the dictionary sense order (e.g. the most frequent sense provided by WordNet), and therefore they are both fuy unsupervised. Tabe 1 shows precision and reca figures 4 for a 3 Given a sequence of words, the origina Lesk agorithm attempts to identify the combination of word senses that maximizes the redundancy (overap) across a corresponding definitions. The agorithm was ater improved through a method for simuated anneaing (Cowie et a., 1992), which soved the combinatoria exposion of word senses, whie sti finding an optima soution. However, recent comparative evauations of different variants of the Lesk agorithm have shown that the performance of the origina agorithm is significanty exceeded by an agorithm variation that reies on the overap between word senses and current context (Vasiescu et a., 2004). We are thus using this atter Lesk variant in our impementation. 4 Reca is particuary ow for each individua part-of-speech because it is cacuated with respect to the entire data set. The overa precision and reca figures coincide, refecting the 100% coverage of the agorithm. context size (-/ V3 ) equa to the ength of each sentence, using: (a) sequence data abeing with iterative graph-based agorithms; (b) individua data abeing with a version of the Lesk agorithm; (c) random baseine. Evauations are run for both fine-grained and coarse-grained sense distinctions, to determine the agorithm performance under different cassification granuarities. The accuracy of the graph-based sequence data abeing agorithm exceeds by a arge margin the individua data abeing agorithm, resuting in 10.7% error rate reduction for fine-grained sense distinctions, which is statisticay significant (8 TT, paired t-test). Performance improvements are equay distributed across a parts-of-speech, with comparabe improvements obtained for nouns, verbs, and adjectives. 
A simiar error rate reduction of 11.0% is obtained for coarse-grained sense distinctions, which suggests that the performance of the graph-based sequence data abeing agorithm does not depend on cassification granuarity, and simiar improvements over individua data abeing can be obtained regardess of the average number of abes per word. We aso measured the variation of performance with context size, and evauated the disambiguation accuracy for both agorithms for a window size ranging from two words to an entire sentence. The window size parameter imits the number of surrounding words considered when seeking abe dependencies (sequence data abeing), or the words counted in the measure of definition context overap (individua data abeing). Figure 3 pots the disambiguation accuracy of the two agorithms as a function of context size. As seen in the figure, both agorithms benefit from arger contexts, with a steady increase in performance observed for increasingy arger window sizes. Athough the initia growth observed for the sequence data abeing agorithm is somewhat sharper, the gap between the two curves stabiizes for window sizes arger than five words, which suggests that the improvement in performance achieved with sequence data abeing over individua data abeing does not depend on the size of avaiabe context. The agorithm was aso evauated on two other data sets, SENSEVAL-3 Engish a-words data (Snyder and Pamer, 2004) and a subset of SemCor (Mier et a., 1993), athough ony fine-grained sense evauations coud be conducted on these test sets. The disambiguation precision on the SENSEVAL-3 data was measured at 52.2% using sequence data abeing, compared to 48.1% obtained with individua

7 Fine-grained sense distinctions Coarse-grained sense distinctions Random Individua Sequence Random Individua Sequence Part-of baseine (Lesk) (graph-based) baseine (Lesk) (graph-based) speech P R P R P R P R P R P R Noun 41.4% 19.4% 50.3% 23.6% 57.5% 27.0% 42.7% 20.0% 51.4% 24.1% 58.8% 27.5% Verb 20.7% 3.9% 30.5% 5.7% 36.5% 6.9% 22.8% 4.3% 31.9% 6.0% 37.9% 7.1% Adjective 41.3% 9.3% 49.1% 11.0% 56.7% 12.7% 42.6% 42.6% 49.8% 11.2% 57.6% 12.9% Adverb 44.6% 5.2% 64.6% 7.6% 70.9% 8.3% 40.7% 4.8% 65.3% 7.7% 71.9% 8.5% ALL 37.9% 37.9% 48.7% 48.7% 54.2% 54.2% 38.7% 38.7% 49.8% 49.8% 55.3% 55.3% Tabe 1: Precision and reca for graph-based sequence data abeing, individua data abeing, and random baseine, for fine-grained and coarse-grained sense distinctions. Disambiguation precision (%) Window size sequence individua random Figure 3: Disambiguation resuts using sequence data abeing, individua abeing, and random baseine, for various context sizes. data abeing, and 34.3% achieved through random sense assignment. The average disambiguation figure obtained on a the words in a random subset of 10 SemCor documents, covering different domains, was 56.5% for sequence data abeing, 47.4% for individua abeing, and 35.3% for the random baseine. Comparison with Reated Work For a given sequence of ambiguous words, the origina definition of the Lesk agorithm (Lesk, 1986), and more recent improvements based on simuated anneaing (Cowie et a., 1992), seek to identify the combination of senses that maximizes the overap among their dictionary definitions. Tests performed with this agorithm on the SENSEVAL-2 data set resuted in a disambiguation accuracy of 39.5%. This precision is exceeded by the Lesk agorithm variation used in the experiments reported in this paper, which measures the overap between sense definitions and the current context, for a precision of 48.7% on the same data set (see Tabe 1). In the SENSEVAL-2 evauations, the best performing fuy unsupervised agorithm 5 was deveoped by (Litkowski, 2001), who combines anaysis of mutiword units and contextua cues based on coocations and content words from dictionary definitions and exampes, for an overa precision and reca of 45.1%. More recenty, (McCarthy et a., 2004) reports one of the best resuts on the SENSEVAL-2 data set, using an agorithm that automaticay derives the most frequent sense for a word using distributiona simiarities earned from a arge raw corpus, for a disambiguation precision of 53.0% and a reca of 49.0%. Another reated ine of work consists of the disambiguation agorithms based on exica chains (Morris and Hirst, 1991), and the more recent improvements reported in (Gaey and McKeown, 2003) where threads of meaning are identified throughout a text. Lexica chains however ony take into account connections between concepts identified in a static way, without considering the importance of the concepts that participate in a reation, which is recursivey determined in our agorithm. Moreover, the construction of exica chains requires structured dictionaries such as WordNet, with expicity defined semantic reations between word senses, whereas our agorithm can aso work with simpe unstructured dictionaries that provide ony word sense definitions. (Gaey and McKeown, 2003) evauated their agorithm on the nouns from a subset of SEMCOR, reporting 62.09% disambiguation precision. The performance of our agorithm on the same subset of SEMCOR nouns was measured at 64.2% 6. 
Finay, another disambiguation method reying on graph agorithms that expoit the 5 Agorithms that integrate the most frequent sense in Word- Net are not considered here, since this represents a supervised knowedge source (WordNet sense frequencies are derived from a sense-annotated corpus). 6 Note that the resuts are not directy comparabe, since (Gaey and McKeown, 2003) used the WordNet sense order to break the ties, whereas we assume that such sense order frequency is not avaiabe, and thus we break the ties through random choice.

8 structure of semantic networks was proposed in (Mihacea et a., 2004), with a disambiguation accuracy of 50.9% measured on a the words in the SENSEVAL-2 data set. Athough it reies excusivey on dictionary definitions, the graph-based sequence data abeing agorithm proposed in this paper, with its overa performance of 54.2%, exceeds significanty the accuracy of a these previousy proposed unsupervised word sense disambiguation methods, proving the benefits of taking into account abe dependencies when annotating sequence data. An additiona interesting benefit of the agorithm is that it provides a ranking over word senses, and thus the seection of two or more most probabe senses for each word is aso possibe. 4 Concusions We proposed a graphica agorithm for sequence data abeing that reies on random waks on graphs encoding abe dependencies. Through the abe graphs it buids for a given sequence of words, the agorithm expoits reations between word abes, and impements a concept of recommendation. A abe recommends other reated abes, and the strength of the recommendation is recursivey computed based on the importance of the abes making the recommendation. In this way, the agorithm simutaneousy annotates a the words in an input sequence, by identifying the most probabe (most recommended) set of abes. The agorithm was iustrated and tested on an unsupervised word sense disambiguation probem, targeting the annotation of a words in unrestricted texts. Through experiments performed on standard senseannotated data sets, the graph-based sequence data abeing agorithm was shown to significanty outperform the accuracy achieved through individua data abeing, resuting in a statisticay significant error rate reduction of 10.7%. The disambiguation method was aso shown to exceed the performance of previousy proposed unsupervised word sense disambiguation agorithms. Moreover, comparative resuts obtained under various experimenta settings have shown that the agorithm is robust to changes in cassification granuarity and context size. Acknowedgments This work was partiay supported by a Nationa Science Foundation grant IIS References S. Brin and L. Page The anatomy of a arge-scae hypertextua Web search engine. Computer Networks and ISDN Systems, 30(1 7). A. Budanitsky and G. Hirst Semantic distance in wordnet: An experimenta, appication-oriented evauation of five measures. In Proceedings of the NAACL Workshop on WordNet and Other Lexica Resources, Pittsburgh. J. Cowie, L. Guthrie, and J. Guthrie Lexica disambiguation using simuated anneaing. In Proceedings of the 5th Internationa Conference on Computationa Linguistics (COLING 1992). D. Cutting, J. Kupiec, J. Pedersen, and P. Sibun A practica part-of-speech tagger. In Proceedings of the Third Conference on Appied Natura Language Processing ANLP-92. M. Gaey and K. McKeown Improving word sense disambiguation in exica chaining. In Proceedings of the 18th Internationa Joint Conference on Artificia Inteigence (IJCAI 2003), Acapuco, Mexico, August. G. Grimmett and D. Stirzaker Probabiity and Random Processes. Oxford University Press. M.E. Lesk Automatic sense disambiguation using machine readabe dictionaries: How to te a pine cone from an ice cream cone. In Proceedings of the SIGDOC Conference 1986, Toronto. K. Litkowski Use of machine readabe dictionaries in word sense disambiguation for Senseva-2. In Proceedings of ACL/SIGLEX Senseva-2, Tououse, France. D. McCarthy, R. Koeing, J. Weeds, and J. Carro Using automaticay acquired predominant senses for word sense disambiguation. 
In Proceedings of ACL/SIGLEX Senseva-3, Barceona, Spain. R. Mihacea, P. Tarau, and E. Figa PageRank on semantic networks, with appication to word sense disambiguation. In Proceedings of the 20st Internationa Conference on Computationa Linguistics (COLING 2004). G. Mier, C. Leacock, T. Randee, and R. Bunker A semantic concordance. In Proceedings of the 3rd DARPA Workshop on Human Language Technoogy, Painsboro, New Jersey. G. Mier Wordnet: A exica database. Communication of the ACM, 38(11): J. Morris and G. Hirst Lexica cohesion, the thesaurus, and the structure of text. Computationa Linguistics, 17(1): M. Pamer, C. Febaum, S. Cotton, L. Defs, and H.T. Dang Engish tasks: a-words and verb exica sampe. In Proceedings of ACL/SIGLEX Senseva-2, Tououse, France. B. Snyder and M. Pamer The Engish awords task. In Proceedings of ACL/SIGLEX Senseva-3, Barceona, Spain. F. Vasiescu, P. Langais, and G. Lapame Evauating variants of the Lesk approach for disambiguating words. In Proceedings of the Conference of Language Resources and Evauations (LREC 2004).

Word Sense Disambiguation

Word Sense Disambiguation Word Sense Disambiguation D. De Cao R. Basili Corso di Web Mining e Retrieval a.a. 2008-9 May 21, 2009 Excerpt of the R. Mihalcea and T. Pedersen AAAI 2005 Tutorial, at: http://www.d.umn.edu/ tpederse/tutorials/advances-in-wsd-aaai-2005.ppt

More information

Using Voluntary work to get ahead in the job market

Using Voluntary work to get ahead in the job market Vo_1 Vounteering Using Vountary work to get ahead in the job market Job Detais data: {documents}httpwwwopeneduopenearnocw_cmid4715_2014-08-21_14-34-17_ht2.xm user: ht2 tempate: ve_pdf job name: httpwwwopeneduopenearnocw_cmid4715_2014-08-

More information

Precision Decisions for the Timings Chart

Precision Decisions for the Timings Chart PPENDIX 1 Precision Decisions for the Timings hart Data-Driven Decisions for Performance-Based Measures within ssions Deb Brown, MS, BB Stanisaus ounty Office of Education Morningside Teachers cademy Performance-based

More information

Making and marking progress on the DCSF Languages Ladder

Making and marking progress on the DCSF Languages Ladder Making and marking progress on the DCSF Languages Ladder Primary anguages taster pack Year 3 summer term Asset Languages and CILT have been asked by the DCSF to prepare support materias to hep teachers

More information

Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2

Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2 Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2 Ted Pedersen Department of Computer Science University of Minnesota Duluth, MN, 55812 USA tpederse@d.umn.edu

More information

Vocabulary Usage and Intelligibility in Learner Language

Vocabulary Usage and Intelligibility in Learner Language Vocabulary Usage and Intelligibility in Learner Language Emi Izumi, 1 Kiyotaka Uchimoto 1 and Hitoshi Isahara 1 1. Introduction In verbal communication, the primary purpose of which is to convey and understand

More information

Ensemble Technique Utilization for Indonesian Dependency Parser

Ensemble Technique Utilization for Indonesian Dependency Parser Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id

More information

Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data

Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se

More information

Combining a Chinese Thesaurus with a Chinese Dictionary

Combining a Chinese Thesaurus with a Chinese Dictionary Combining a Chinese Thesaurus with a Chinese Dictionary Ji Donghong Kent Ridge Digital Labs 21 Heng Mui Keng Terrace Singapore, 119613 dhji @krdl.org.sg Gong Junping Department of Computer Science Ohio

More information

A Comparative Evaluation of Word Sense Disambiguation Algorithms for German

A Comparative Evaluation of Word Sense Disambiguation Algorithms for German A Comparative Evaluation of Word Sense Disambiguation Algorithms for German Verena Henrich, Erhard Hinrichs University of Tübingen, Department of Linguistics Wilhelmstr. 19, 72074 Tübingen, Germany {verena.henrich,erhard.hinrichs}@uni-tuebingen.de

More information

arxiv: v1 [cs.cl] 2 Apr 2017

arxiv: v1 [cs.cl] 2 Apr 2017 Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings Junki Matsuo and Mamoru Komachi Graduate School of System Design, Tokyo Metropolitan University, Japan matsuo-junki@ed.tmu.ac.jp,

More information

Python Machine Learning

Python Machine Learning Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled

More information

Lecture 1: Machine Learning Basics

Lecture 1: Machine Learning Basics 1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3

More information

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz

More information

Linking Task: Identifying authors and book titles in verbose queries

Linking Task: Identifying authors and book titles in verbose queries Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,

More information

Leveraging Sentiment to Compute Word Similarity

Leveraging Sentiment to Compute Word Similarity Leveraging Sentiment to Compute Word Similarity Balamurali A.R., Subhabrata Mukherjee, Akshat Malu and Pushpak Bhattacharyya Dept. of Computer Science and Engineering, IIT Bombay 6th International Global

More information

On document relevance and lexical cohesion between query terms

On document relevance and lexical cohesion between query terms Information Processing and Management 42 (2006) 1230 1247 www.elsevier.com/locate/infoproman On document relevance and lexical cohesion between query terms Olga Vechtomova a, *, Murat Karamuftuoglu b,

More information

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,

More information

Chinese Language Parsing with Maximum-Entropy-Inspired Parser

Chinese Language Parsing with Maximum-Entropy-Inspired Parser Chinese Language Parsing with Maximum-Entropy-Inspired Parser Heng Lian Brown University Abstract The Chinese language has many special characteristics that make parsing difficult. The performance of state-of-the-art

More information

! # %& ( ) ( + ) ( &, % &. / 0!!1 2/.&, 3 ( & 2/ &,

! # %& ( ) ( + ) ( &, % &. / 0!!1 2/.&, 3 ( & 2/ &, ! # %& ( ) ( + ) ( &, % &. / 0!!1 2/.&, 3 ( & 2/ &, 4 The Interaction of Knowledge Sources in Word Sense Disambiguation Mark Stevenson Yorick Wilks University of Shef eld University of Shef eld Word sense

More information

Semi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.

Semi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17. Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link

More information

Chunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.

Chunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence. NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and

More information

A Bayesian Learning Approach to Concept-Based Document Classification

A Bayesian Learning Approach to Concept-Based Document Classification Databases and Information Systems Group (AG5) Max-Planck-Institute for Computer Science Saarbrücken, Germany A Bayesian Learning Approach to Concept-Based Document Classification by Georgiana Ifrim Supervisors

More information

CS Machine Learning

CS Machine Learning CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing

More information

Rule Learning With Negation: Issues Regarding Effectiveness

Rule Learning With Negation: Issues Regarding Effectiveness Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United

More information

A Neural Network GUI Tested on Text-To-Phoneme Mapping

A Neural Network GUI Tested on Text-To-Phoneme Mapping A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis

More information

Using dialogue context to improve parsing performance in dialogue systems

Using dialogue context to improve parsing performance in dialogue systems Using dialogue context to improve parsing performance in dialogue systems Ivan Meza-Ruiz and Oliver Lemon School of Informatics, Edinburgh University 2 Buccleuch Place, Edinburgh I.V.Meza-Ruiz@sms.ed.ac.uk,

More information

Probabilistic Latent Semantic Analysis

Probabilistic Latent Semantic Analysis Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview

More information

The Smart/Empire TIPSTER IR System

The Smart/Empire TIPSTER IR System The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of

More information

Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model

Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Xinying Song, Xiaodong He, Jianfeng Gao, Li Deng Microsoft Research, One Microsoft Way, Redmond, WA 98052, U.S.A.

More information

Word Segmentation of Off-line Handwritten Documents

Word Segmentation of Off-line Handwritten Documents Word Segmentation of Off-line Handwritten Documents Chen Huang and Sargur N. Srihari {chuang5, srihari}@cedar.buffalo.edu Center of Excellence for Document Analysis and Recognition (CEDAR), Department

More information

Training and evaluation of POS taggers on the French MULTITAG corpus

Training and evaluation of POS taggers on the French MULTITAG corpus Training and evaluation of POS taggers on the French MULTITAG corpus A. Allauzen, H. Bonneau-Maynard LIMSI/CNRS; Univ Paris-Sud, Orsay, F-91405 {allauzen,maynard}@limsi.fr Abstract The explicit introduction

More information

Assignment 1: Predicting Amazon Review Ratings

Assignment 1: Predicting Amazon Review Ratings Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for

More information

Netpix: A Method of Feature Selection Leading. to Accurate Sentiment-Based Classification Models

Netpix: A Method of Feature Selection Leading. to Accurate Sentiment-Based Classification Models Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models 1 Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models James B.

More information

Disambiguation of Thai Personal Name from Online News Articles

Disambiguation of Thai Personal Name from Online News Articles Disambiguation of Thai Personal Name from Online News Articles Phaisarn Sutheebanjard Graduate School of Information Technology Siam University Bangkok, Thailand mr.phaisarn@gmail.com Abstract Since online

More information

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za

More information

Variations of the Similarity Function of TextRank for Automated Summarization

Variations of the Similarity Function of TextRank for Automated Summarization Variations of the Similarity Function of TextRank for Automated Summarization Federico Barrios 1, Federico López 1, Luis Argerich 1, Rosita Wachenchauzer 12 1 Facultad de Ingeniería, Universidad de Buenos

More information

The Internet as a Normative Corpus: Grammar Checking with a Search Engine

The Internet as a Normative Corpus: Grammar Checking with a Search Engine The Internet as a Normative Corpus: Grammar Checking with a Search Engine Jonas Sjöbergh KTH Nada SE-100 44 Stockholm, Sweden jsh@nada.kth.se Abstract In this paper some methods using the Internet as a

More information

stateorvalue to each variable in a given set. We use p(x = xjy = y) (or p(xjy) as a shorthand) to denote the probability that X = x given Y = y. We al

stateorvalue to each variable in a given set. We use p(x = xjy = y) (or p(xjy) as a shorthand) to denote the probability that X = x given Y = y. We al Dependency Networks for Collaborative Filtering and Data Visualization David Heckerman, David Maxwell Chickering, Christopher Meek, Robert Rounthwaite, Carl Kadie Microsoft Research Redmond WA 98052-6399

More information

University of Groningen. Systemen, planning, netwerken Bosman, Aart

University of Groningen. Systemen, planning, netwerken Bosman, Aart University of Groningen Systemen, planning, netwerken Bosman, Aart IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document

More information

Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures

Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures Ulrike Baldewein (ulrike@coli.uni-sb.de) Computational Psycholinguistics, Saarland University D-66041 Saarbrücken,

More information

Bridging Lexical Gaps between Queries and Questions on Large Online Q&A Collections with Compact Translation Models

Bridging Lexical Gaps between Queries and Questions on Large Online Q&A Collections with Compact Translation Models Bridging Lexical Gaps between Queries and Questions on Large Online Q&A Collections with Compact Translation Models Jung-Tae Lee and Sang-Bum Kim and Young-In Song and Hae-Chang Rim Dept. of Computer &

More information

A Semantic Similarity Measure Based on Lexico-Syntactic Patterns

A Semantic Similarity Measure Based on Lexico-Syntactic Patterns A Semantic Similarity Measure Based on Lexico-Syntactic Patterns Alexander Panchenko, Olga Morozova and Hubert Naets Center for Natural Language Processing (CENTAL) Université catholique de Louvain Belgium

More information

METHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS

METHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS METHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS Ruslan Mitkov (R.Mitkov@wlv.ac.uk) University of Wolverhampton ViktorPekar (v.pekar@wlv.ac.uk) University of Wolverhampton Dimitar

More information

A study of speaker adaptation for DNN-based speech synthesis

A study of speaker adaptation for DNN-based speech synthesis A study of speaker adaptation for DNN-based speech synthesis Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King The Centre for Speech Technology Research (CSTR) University of Edinburgh,

More information

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Ch 2 Test Remediation Work Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) High temperatures in a certain

More information

Using Web Searches on Important Words to Create Background Sets for LSI Classification

Using Web Searches on Important Words to Create Background Sets for LSI Classification Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract

More information

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should

More information

Lecture 10: Reinforcement Learning

Lecture 10: Reinforcement Learning Lecture 1: Reinforcement Learning Cognitive Systems II - Machine Learning SS 25 Part III: Learning Programs and Strategies Q Learning, Dynamic Programming Lecture 1: Reinforcement Learning p. Motivation

More information

AQUA: An Ontology-Driven Question Answering System

AQUA: An Ontology-Driven Question Answering System AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.

More information

Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks

Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Devendra Singh Chaplot, Eunhee Rhim, and Jihie Kim Samsung Electronics Co., Ltd. Seoul, South Korea {dev.chaplot,eunhee.rhim,jihie.kim}@samsung.com

More information

(Sub)Gradient Descent

(Sub)Gradient Descent (Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include

More information

The stages of event extraction

The stages of event extraction The stages of event extraction David Ahn Intelligent Systems Lab Amsterdam University of Amsterdam ahn@science.uva.nl Abstract Event detection and recognition is a complex task consisting of multiple sub-tasks

More information
