A Novel Two-stage Framework for Extracting Opinionated Sentences from News Articles


Pujari Rajkumar 1, Swara Desai 2, Niloy Ganguly 1 and Pawan Goyal 1
1 Dept. of Computer Science and Engineering, Indian Institute of Technology Kharagpur, India
2 Yahoo! India
1 rajkumarsaikorian@gmail.com, {niloy,pawang}@cse.iitkgp.ernet.in
2 swara@yahoo-inc.com

Abstract

This paper presents a novel two-stage framework to extract opinionated sentences from a given news article. In the first stage, a Naïve Bayes classifier uses local features to assign a score to each sentence; the score signifies the probability that the sentence is opinionated. In the second stage, we use this prior within the HITS (Hyperlink-Induced Topic Search) schema to exploit the global structure of the article and the relations between the sentences. In the HITS schema, the opinionated sentences are treated as Hubs and the facts around these opinions are treated as the Authorities. The algorithm is implemented and evaluated against a set of manually marked data. We show that using HITS significantly improves the precision over the baseline Naïve Bayes classifier. We also argue that the proposed method actually discovers the underlying structure of the article, thus extracting various opinions, grouped with supporting facts as well as other supporting opinions from the article.

1 Introduction

With advertising-based revenues becoming the main source of revenue, finding novel ways to increase focussed user engagement has become an important research topic. A typical problem faced by web publishing houses like Yahoo! is understanding the nature of the comments posted by readers of $10^5$ articles posted at any moment on its website. A lot of users engage in discussions in the comments section of the articles. Each user has a different perspective and thus comments in that genre; this often results in a situation where the discussions in the comment section wander far away from the article's topic.
In order to assist users to discuss relevant points in the comments section, a possible methodology is to generate questions from the article's content that seek users' opinions about various opinions conveyed in the article (Rokhlenko and Szpektor, 2013). It would also direct the users into thinking about a spectrum of various points that the article covers and encourage users to share their unique, personal, daily-life experience in events relevant to the article. This would provide a broader viewpoint for readers, and the resulting perspective questions would cater to users with rich user-generated content, which in turn can increase user engagement on the article pages. Generating such questions manually for a huge volume of articles is very difficult. However, if one could identify the main opinionated sentences within the article, it would be much easier for an editor to generate questions around these. Otherwise, the sentences themselves may also serve as the points for discussion by the users.

Hence, in this paper we discuss a two-stage algorithm which picks opinionated sentences from the articles. The algorithm assumes an underlying structure for an article, that is, each opinionated sentence is supported by a few factual statements that justify the opinion. We use the HITS schema to exploit this underlying structure and pick opinionated sentences from the article.

The main contributions of this paper are as follows. First, we present a novel two-stage framework for extracting opinionated sentences from a news article. Secondly, we propose a new evaluation metric that takes into account the fact that the amount of polarity (and thus, the number of opinionated sentences) within documents can vary a lot; we should therefore stress the ratio of opinionated sentences in the top sentences, relative to the ratio of opinionated sentences in the article.
Finally, discussions on how the proposed algorithm captures the underlying structure of the opinions and surrounding facts in a news article reveal that the algorithm does much more than just extracting opinionated sentences.

This paper is organised as follows. Section 2 discusses related work in this field. In Section 3, we discuss our two-stage model in further detail. Section 4 discusses the experimental framework and the results. Further discussions on the underlying assumption behind using HITS, along with error analysis, are carried out in Section 5. Conclusions and future work are detailed in Section 6.

Proceedings of TextGraphs-9: the Workshop on Graph-based Methods for Natural Language Processing, pages 25-33, October 29, 2014, Doha, Qatar. © 2014 Association for Computational Linguistics

2 Related Work

Opinion mining has drawn a lot of attention in recent years. Research works have focused on mining

opinions from various information sources such as blogs (Conrad and Schilder, 2007; Harb et al., 2008), product reviews (Hu and Liu, 2004; Qadir, 2009; Dave et al., 2003), news articles (Kim and Hovy, 2006; Hu and Liu, 2006) etc. Various aspects in opinion mining have been explored over the years (Ku et al., 2006). One important dimension is to identify the opinion holders as well as opinion targets. (Lu, 2010) used a dependency parser to identify the opinion holders and targets in Chinese news text. (Choi et al., 2005) used Conditional Random Fields to identify the sources of opinions from the sentences. (Kobayashi et al., 2005) proposed a learning-based anaphora resolution technique to extract the opinion tuple <Subject, Attribute, Value>. Opinion summarization has been another important aspect (Kim et al., 2013).

A lot of research work has been done for opinion mining from product reviews, where most of the text is opinion-rich. Opinion mining from news articles, however, poses its own challenges because, in contrast with product reviews, not all parts of news articles present opinions (Balahur et al., 2013), and thus finding opinionated sentences itself remains a major obstacle. Our work mainly focuses on classifying a sentence in a news article as opinionated or factual. There have been works on sentiment classification (Wiebe and Riloff, 2005), but the task of finding opinionated sentences is different from finding sentiments, because sentiments mainly convey the emotions and not the opinions. There has been research on finding opinionated sentences from various information sources. Some of these works utilize a dictionary-based (Fei et al., 2012) or regular pattern based (Brun, 2012) approach to identify aspects in the sentences. (Kim and Hovy, 2006) utilize the presence of a single strong valence word as well as the total valence score of all words in a sentence to identify opinion-bearing sentences. (Zhai et al., 2011) work on finding evaluative sentences in online discussions.
They exploit the inter-relationship of aspects, evaluation words and emotion words to reinforce each other.

Thus, while ours is not the first attempt at opinion extraction from news articles, to the best of our knowledge, none of the previous works has exploited the global structure of a news article to classify a sentence as opinionated/factual. Though summarization algorithms (Erkan and Radev, 2004; Goyal et al., 2013) utilize the similarity between sentences in an article to find the important sentences, our formulation is different in that we conceptualize two different kinds of nodes in a document, as opposed to the summarization algorithms, which treat all the sentences equally.

In the next section, we describe the proposed two-stage algorithm in detail.

3 Our Approach

Figure 1 gives a flowchart of the proposed two-stage method for extracting opinionated sentences from news articles. First, each news article is pre-processed to get the dependency parse as well as the TF-IDF vector corresponding to each of the sentences present in the article. Then, various features are extracted from these sentences, which are used as input to the Naïve Bayes classifier, as will be described in Section 3.1. The Naïve Bayes classifier, which corresponds to the first stage of our method, assigns a probability score to each sentence as being an opinionated sentence. In the second stage, the entire article is viewed as a complete and directed graph with edges from every sentence to all other sentences, each edge having a weight suitably computed. The iterative HITS algorithm is applied to the sentence graph, with opinionated sentences conceptualized as hubs and factual sentences conceptualized as authorities. The two stages of our approach are detailed below.

3.1 Naïve Bayes Classifier

The Naïve Bayes classifier assigns the probability for each sentence being opinionated. The classifier is trained on 70 news articles from the politics domain, sentences of which were marked by a group of annotators as being opinionated or factual.
Each sentence was marked by two annotators. The inter-annotator agreement using Cohen's kappa coefficient was found to be [...]. The features utilized for the classifier are detailed in Table 1. These features were adapted from those reported in (Qadir, 2009; Yu and Hatzivassiloglou, 2003). A list of positive and negative polar words, further expanded using WordNet synsets, was taken from (Kim and Hovy, 2005). The Stanford dependency parser (De Marneffe et al., 2006) was utilized to compute the dependencies for each sentence within the news article. After the features are extracted from the sentences, we used the Weka implementation of Naïve Bayes to train the classifier.

Table 1: Features List for the Naïve Bayes Classifier
1. Count of positive polar words
2. Count of negative polar words
3. Polarity of the root verb of the sentence
4. Presence of acomp, xcomp and advmod dependencies in the sentence

3.2 HITS

The Naïve Bayes classifier as discussed in Section 3.1 utilizes only the local features within a sentence. Thus, the probability that a sentence is opinionated remains

independent of its context as well as the document structure. The main motivation behind formulating this problem in the HITS schema is to utilize the hidden link structures among sentences. HITS stands for "Hyperlink-Induced Topic Search". Originally, this algorithm was developed to rank Web pages, with the particular insight that some of the webpages (Hubs) serve as catalogs of information that could lead users directly to the other pages, which actually contain the information (Authorities).

Figure 1: Flow Chart of Various Stages in Our Approach

The intuition behind applying HITS for the task of opinion extraction came from the following assumption about the underlying structure of an article. A news article pertains to a specific theme, and with that theme in mind, the author presents certain opinions. These opinions are justified with the facts present in the article itself. We conceptualize the opinionated sentences as Hubs and the associated facts for an opinionated sentence as Authorities for this Hub.

To describe the formulation of HITS parameters, let us give the notation. Let us denote a document $D$ using a set of sentences $\{S_1, S_2, \ldots, S_i, \ldots, S_n\}$, where $n$ corresponds to the number of sentences in the document $D$. We construct the sentence graph where nodes in the graph correspond to the sentences in the document. Let $H_i$ and $A_i$ denote the hub and authority scores for sentence $S_i$. In HITS, the edges always flow from a Hub to an Authority. In the original HITS algorithm, each edge is given the same weight. However, it has been reported that using weights in the HITS update improves the performance significantly (Li et al., 2002). In our formulation, since each node has a non-zero probability of acting as a hub as well as an authority, we have outgoing as well as incoming edges for every node. Therefore, the weights are assigned keeping in mind the proximity between sentences as well as the probability (of being opinionated/factual) assigned by the classifier.
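For reference, the original unweighted HITS update mentioned above can be sketched as follows. This is only an illustration of the classic algorithm on a toy link graph, not the weighted variant used in this paper; the function name `classic_hits`, the fixed iteration count and the L2 normalisation step are choices made for this sketch.

```python
import math


def classic_hits(adjacency, iterations=50):
    """Unweighted HITS; adjacency[i][j] == 1 means an edge from node i to node j."""
    n = len(adjacency)
    hubs = [1.0] * n
    auths = [1.0] * n
    for _ in range(iterations):
        # Authority of j: sum of the hub scores of the nodes pointing to j.
        auths = [sum(adjacency[i][j] * hubs[i] for i in range(n)) for j in range(n)]
        # Hub of i: sum of the authority scores of the nodes that i points to.
        hubs = [sum(adjacency[i][j] * auths[j] for j in range(n)) for i in range(n)]
        # L2-normalise both vectors so the scores stay bounded.
        a_norm = math.sqrt(sum(a * a for a in auths)) or 1.0
        h_norm = math.sqrt(sum(h * h for h in hubs)) or 1.0
        auths = [a / a_norm for a in auths]
        hubs = [h / h_norm for h in hubs]
    return hubs, auths


# Toy graph: node 0 links to nodes 1 and 2; node 3 links to node 1.
graph = [
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
hubs, auths = classic_hits(graph)
```

In this toy graph, node 0 ends up with the highest hub score because it points to both authorities, and node 1 with the highest authority score because both hubs point to it.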
The following criteria were used for deciding the weight function.

An edge in the HITS graph goes from a hub (source node) to an authority (target node). So, the edge weight from a source node to a target node should be higher if the source node has a high hub score.

A fact corresponding to an opinionated sentence should be discussing the same topic. So, the edge weight should be higher if the sentences are more similar.

It is more probable that the facts around an opinion appear closer to that opinionated sentence in the article. So, the edge weight from a source to a target node decreases as the distance between the two sentences increases.

Let $W$ be the weight matrix such that $W_{ij}$ denotes the weight for the edge from the sentence $S_i$ to the sentence $S_j$. Based on the criteria outlined above, we formulate that the weight $W_{ij}$ should be such that

$$W_{ij} \propto H_i, \qquad W_{ij} \propto Sim_{ij}, \qquad W_{ij} \propto \frac{1}{dist_{ij}}$$

where we use cosine similarity between the sentence vectors to compute $Sim_{ij}$. $dist_{ij}$ is simply the number

of sentences separating the source and target node. Various combinations of these factors were tried and will be discussed in Section 4. While factors like sentence similarity and distance are symmetric, having the weight function depend on the hub score makes it asymmetric, consistent with the basic idea of HITS. Thus, an edge from the sentence $S_i$ to $S_j$ is given a high weight if $S_i$ has a high probability score of being opinionated (i.e., acting as hub) as obtained from the classifier.

Now, for applying the HITS algorithm iteratively, the Hub and Authority scores for each sentence are initialized using the probability scores assigned by the classifier. That is, if $P_i(Opinion)$ denotes the probability that $S_i$ is an opinionated sentence as per the Naïve Bayes classifier, $H_i^{(0)}$ is initialized to $P_i(Opinion)$ and $A_i^{(0)}$ is initialized to $1 - P_i(Opinion)$. The iterative HITS is then applied as follows:

$$H_i^{(k)} = \sum_j W_{ij} A_j^{(k-1)} \quad (1)$$

$$A_i^{(k)} = \sum_j W_{ji} H_j^{(k-1)} \quad (2)$$

where $H_i^{(k)}$ denotes the hub score for the $i$-th sentence during the $k$-th iteration of HITS. The iteration is stopped once the mean squared error between the Hub and Authority values at two different iterations is less than a threshold $\epsilon$. After the HITS iteration is over, the five sentences having the highest Hub scores are returned by the system.

4 Experimental Framework and Results

The experiment was conducted with 90 news articles in the politics domain from the Yahoo! website. The sentences in the articles were marked as opinionated or factual by a group of annotators. In the training set, 1393 out of 3142 sentences were found to be opinionated. In the test set, 347 out of 830 sentences were marked as opinionated. Out of these 90 articles, 70 articles were used for training the Naïve Bayes classifier as well as for tuning various parameters. The remaining 20 articles were used for testing. The evaluation was done in an Information Retrieval setting.
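As an illustration of the iterative update in Equations 1 and 2, a minimal Python sketch is given below. It is not the implementation used in the experiments. Several details are assumptions made only for this sketch: the edge weight is taken as the plain product of the three factors listed in Section 3.2, the distance is taken as the difference in sentence positions $|i - j|$, the scores are L2-normalised after every step to keep them bounded, and the values of `eps` and `max_iter` as well as the toy priors and term vectors are made up.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length sentence vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def edge_weights(priors, vectors):
    """Assumed product form: W[i][j] = P_i * Sim_ij / |i - j|, with no self-loops."""
    n = len(priors)
    return [
        [0.0 if i == j else priors[i] * cosine(vectors[i], vectors[j]) / abs(i - j)
         for j in range(n)]
        for i in range(n)
    ]


def l2_normalise(scores):
    """Scale scores to unit length (an added assumption, to keep them bounded)."""
    norm = math.sqrt(sum(s * s for s in scores))
    return [s / norm for s in scores] if norm else list(scores)


def weighted_hits(priors, weights, eps=1e-6, max_iter=1000):
    """Iterate Equations 1 and 2 until the mean squared change drops below eps."""
    n = len(priors)
    hubs = list(priors)                  # H_i(0) = P_i(Opinion)
    auths = [1.0 - p for p in priors]    # A_i(0) = 1 - P_i(Opinion)
    for _ in range(max_iter):
        # Both updates read the scores of the previous iteration.
        new_hubs = l2_normalise(
            [sum(weights[i][j] * auths[j] for j in range(n)) for i in range(n)])
        new_auths = l2_normalise(
            [sum(weights[j][i] * hubs[j] for j in range(n)) for i in range(n)])
        mse = (sum((x - y) ** 2 for x, y in zip(new_hubs, hubs))
               + sum((x - y) ** 2 for x, y in zip(new_auths, auths))) / (2 * n)
        hubs, auths = new_hubs, new_auths
        if mse < eps:
            break
    return hubs, auths


# Toy article: classifier priors and made-up term vectors for four sentences.
priors = [0.9, 0.2, 0.1, 0.3]
vectors = [[1, 1, 0], [1, 1, 0], [1, 0, 0], [0, 0, 1]]
hubs, auths = weighted_hits(priors, edge_weights(priors, vectors))
# Indices of the (at most) five sentences with the highest hub scores.
top_hubs = sorted(range(len(hubs)), key=lambda i: hubs[i], reverse=True)[:5]
```

In this toy example, sentence 0 has a high prior and is textually close to its neighbour, so it is ranked first among the hubs, while sentence 1 becomes its strongest authority. Sentence 3 shares no terms with the others and receives a hub score of zero.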
In this setting, the system returns the sentences in decreasing order of their score (or probability in the case of Naïve Bayes) as being opinionated. We then utilize the human judgements (provided by the annotators) to compute precision at various points. Let $op(\cdot)$ be a binary function for a given rank such that $op(r) = 1$ if the sentence returned as rank $r$ is opinionated as per the human judgements. A P@k precision is calculated as follows:

$$P@k = \frac{\sum_{r=1}^{k} op(r)}{k} \quad (3)$$

While the precision at various points indicates how reliable the results returned by the system are, it does not take into account the fact that some of the documents are opinion-rich and some are not. For the opinion-rich documents, a high P@k value might be similar to picking sentences randomly, whereas for documents with very few opinions, even a lower P@k value might be useful. We, therefore, devise another evaluation metric M@k that indicates the ratio of opinionated sentences at any point, normalized with respect to the ratio of opinionated sentences in the article. Correspondingly, an M@k value is calculated as

$$M@k = \frac{P@k}{Ratio_{op}} \quad (4)$$

where $Ratio_{op}$ denotes the fraction of opinionated sentences in the whole article. Thus

$$Ratio_{op} = \frac{\text{Number of opinionated sentences}}{\text{Number of sentences}} \quad (5)$$

The parameters that we needed to fix for the HITS algorithm were the weight function $W_{ij}$ and the threshold $\epsilon$ at which we stop the iteration. We varied $\epsilon$ from [...] to 0.1, multiplying it by 10 in each step. The results were not sensitive to the value of $\epsilon$ and we used $\epsilon$ = [...]. For fixing the weight function, we tried out various combinations using the criteria outlined in Section 3.2. Various weight functions and the corresponding P@5 and M@5 scores are shown in Table 2. Firstly, we varied $k$ in $Sim_{ij}^k$ and found that the square of the similarity function gives better results. Then, keeping it constant, we varied $l$ in $H_i^l$ and found the best results for $l = 3$. Then, keeping both of these constant, we varied $\alpha$ in $(\alpha + \frac{1}{dist_{ij}})$. We found the best results for $\alpha = 1.0$.
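The P@k and M@k measures of Equations 3 to 5 can be sketched in a few lines of Python. The function names and the toy ranking are hypothetical. The sketch assumes that the ranked list covers every sentence of the article, so that $Ratio_{op}$ can be computed from the same list, and that the article contains at least one opinionated sentence.

```python
def precision_at_k(ranked_labels, k):
    """P@k: fraction of the top-k returned sentences judged opinionated (label 1)."""
    return sum(ranked_labels[:k]) / k


def m_at_k(ranked_labels, k):
    """M@k: P@k divided by the fraction of opinionated sentences in the article."""
    ratio_op = sum(ranked_labels) / len(ranked_labels)
    return precision_at_k(ranked_labels, k) / ratio_op


# Toy ranking of a 10-sentence article: 1 = opinionated, 0 = factual,
# listed in the order in which the system returned the sentences.
ranked = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
p5 = precision_at_k(ranked, 5)
m5 = m_at_k(ranked, 5)
```

Here 4 of the 10 sentences are opinionated, so $Ratio_{op} = 0.4$; three of the top five are opinionated, giving P@5 = 3/5 and M@5 = 0.6/0.4 = 1.5. An M@k above 1 means the top of the ranking is richer in opinions than the article as a whole.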
With $\alpha = 1.0$, we tried to vary $l$ again but it only reduced the final score. Therefore, we fixed the weight function to be

$$W_{ij} = \left(H_i^{(0)}\right)^3 \, Sim_{ij}^2 \left(1 + \frac{1}{dist_{ij}}\right) \quad (6)$$

Note that $H_i^{(0)}$ in Equation 6 corresponds to the probability assigned by the classifier that the sentence $S_i$ is opinionated.

We use the classifier results as the baseline for the comparisons. The second-stage HITS algorithm is then applied and we compare the performance with respect to the classifier. Table 3 shows the comparison results for various precision scores for the classifier and the HITS algorithm. In a practical situation, an editor requires quick identification of 3-5 opinionated sentences from the article, which she can then use to formulate questions. We thus report P@k and M@k values for k = 3 and k = 5.

From the results shown in Table 3, it is clear that applying the second-stage HITS over the Naïve Bayes classifier improves the performance by a large degree, both in terms of P@k and M@k. For instance, the first-stage NB classifier gives a P@5 of 0.52 and a P@3 of [...]. Using the classifier outputs during the second-stage HITS algorithm improves the

performance by 21.2% to 0.63 in the case of P@5. For P@3, the improvements were much more significant and a 35.8% improvement was obtained over the NB classifier. M@5 and M@3 scores also improve by 17.7% and 30.8% respectively. Strikingly, while the classifier gave nearly the same scores for P@k and M@k for k = 3 and k = 5, HITS gave much better results for k = 3 than k = 5. In particular, the P@3 and M@3 scores obtained by HITS were very encouraging, indicating that the proposed approach helps in pushing the opinionated sentences to the top. This clearly shows the advantage of using the global structure of the document in contrast with the features extracted from the sentence itself, ignoring the context.

Table 2: Average P@5 and M@5 scores: Performance comparison between various functions for $W_{ij}$

Function                                          P@5     M@5
$Sim_{ij}$                                        [...]   [...]
$Sim_{ij}^2$                                      [...]   [...]
$Sim_{ij}^3$                                      [...]   [...]
$Sim_{ij}^2 H_i$                                  [...]   [...]
$Sim_{ij}^2 H_i^2$                                [...]   [...]
$Sim_{ij}^2 H_i^3$                                [...]   [...]
$Sim_{ij}^2 H_i^4$                                [...]   [...]
$Sim_{ij} H_i$                                    [...]   [...]
$Sim_{ij}^2 H_i^3 ([...])$                        [...]   [...]
$Sim_{ij}^2 H_i^3 ([...])$                        [...]   [...]
$Sim_{ij}^2 H_i^3 ([...])$                        [...]   [...]
$Sim_{ij}^2 H_i^3 ([...])$                        [...]   [...]
$Sim_{ij}^2 H_i^3 (1 + \frac{1}{dist_{ij}})$      [...]   [...]
$Sim_{ij}^2 H_i^3 ([...])$                        [...]   [...]
$Sim_{ij}^2 H_i^2 (1 + \frac{1}{dist_{ij}})$      [...]   [...]

Table 3: Average P@5, M@5, P@3 and M@3 scores: Performance comparison between the NB classifier and HITS

System           P@5      M@5      P@3      M@3
NB Classifier    0.52     [...]    [...]    [...]
HITS             0.63     [...]    0.72     1.5
Imp. (%)         +21.2    +17.7    +35.8    +30.8

Figures 2 and 3 show the P@5, M@5, P@3 and M@3 scores for individual documents, numbered from 1 to 20 on the X-axis. The articles are sorted as per the ratio of P@5 (and M@5) obtained using HITS and the NB classifier. The Y-axis shows the corresponding scores. Two different lines are used to represent the results as returned by the classifier and the HITS algorithm. A dashed line denotes the scores obtained by HITS while a continuous line denotes the scores obtained by the NB classifier. A detailed analysis of these figures can help us draw the following conclusions:

For 40% of the articles (numbered 13 to 20) HITS improves over the baseline NB classifier.

For 40% of the articles (numbered 5 to 12) the results provided by HITS were the same as those of the baseline.

For 20% of the articles (numbered 1 to 4) HITS gives a performance lower than that of the baseline.
Thus, for 80% of the documents, the second stage performs at least as well as the first stage. This indicates that the second-stage HITS is quite robust.

M@5 results are much more robust for HITS, with 75% of the documents having an M@5 score > 1. An M@k score > 1 indicates that the ratio of opinionated sentences in the top k sentences picked up by the algorithm is higher than the overall ratio in the article.

For 45% of the articles (numbered 6, 9-11 and 15-20), HITS was able to achieve a P@3 = 1.0. Thus, for these 9 articles, the top 3 sentences picked up by the algorithm were all marked as opinionated.

The graphs also indicate a high correlation between the results obtained by the NB classifier and HITS. We used Pearson's correlation to find the correlation strength. For the P@5 values, the correlation was found to be [...] and for the M@5 values, the correlation was obtained as [...].

In the next section, we will first attempt to further analyze the basic assumption behind using HITS, by looking at some actual Hub-Authority structures captured by the algorithm. We will also take some cases of failure and perform error analysis.

5 Discussion

The first point that we wanted to verify was whether HITS is really capturing the underlying structure of the document. That is, do the sentences identified as authorities for a given hub really correspond to the facts supporting the particular opinion expressed by the hub sentence?

Figure 4 gives two examples of the Hub-Authority structure, as captured by the HITS algorithm, for two different articles. For each of these examples, we show the sentence identified as Hub in the center along with the top four sentences identified as Authorities for that hub. We also give the annotations as to whether the sentences were marked as opinionated or factual by the annotators. In both of these examples, the hubs were actually marked as opinionated by the annotators.
Additionally, we find that all four sentences identified as authorities to the hub are very relevant to the opinion expressed by the hub. In the first example, the top 3 authority sentences are marked as factual by the annotator. Although the fourth sentence is marked as opinionated, it can be seen that this sentence presents a supporting opinion for the hub sentence. While studying the second example, we found that while the first authority does not present an important fact, the fourth authority surely does. Both of these

were marked as factual by the annotators. In this particular example, although the second and third authority sentences were annotated as opinionated, these can be seen as supporting the opinion expressed by the hub sentence. This example also gives us an interesting idea to improve diversification in the final results. That is, once an opinionated sentence is identified by the algorithm, the hub score of all its authorities can be reduced in proportion to the edge weight. This will reduce the chances of the supporting opinions being returned by the system at a later stage as a main opinion.

Figure 2: Comparison Results for 20 Test articles between the Classifier and HITS: P@5 and M@5. (a) Comparison of P@5 values (b) Comparison of M@5 values

Figure 3: Comparison Results for 20 Test articles between the Classifier and HITS: P@3 and M@3. (a) Comparison of P@3 values (b) Comparison of M@3 values

Figure 4: Example from two different test articles capturing the Hub-Authority Structure. (a) Hub-Authority Structure: Example 1 (b) Hub-Authority Structure: Example 2

[2] whats-wrong-meritocracy-rug html

We then attempted to test our tool on a recently published article, "What's Wrong with a Meritocracy Rug?" [2]. The tool could pick up a very

important opinion in the article, "Most people tend to think that the most qualified person is someone who looks just like them, only younger.", which was ranked 2nd by the system. The supporting facts and opinions for this sentence, as discovered by the algorithm, were also quite relevant. For instance, the top two authorities corresponding to this sentence hub were:

1. "And that appreciation, we learned painfully, can easily be tinged with all kinds of gendered elements without the person who is making the decisions even realizing it."
2. "And many of the traits we value, and how we value them, also end up being laden with gender overtones."

5.1 Error Analysis

We then tried to analyze certain cases of failure. Firstly, we wanted to understand why HITS was not performing as well as the classifier for 3 articles (Figures 2 and 3). The analysis revealed that the supporting sentences for the opinionated sentences extracted by the classifier were not very similar on the textual level. Thus a low cosine similarity score resulted in lower edge weights, and thereby a lower hub score after applying HITS. For one of the articles, the sentence picked up by HITS was wrongly annotated as a factual sentence.

Then, we looked at one case of failure due to the error introduced by the classifier prior probabilities. For instance, the sentence "The civil war between establishment and tea party Republicans intensified this week when House Speaker John Boehner slammed outside conservative groups for ridiculous pushback against the bipartisan budget agreement which cleared his chamber Thursday." was classified as an opinionated sentence, whereas it is a factual sentence. Looking closely, we found that the sentence contains three polar words (marked in bold), as well as an advmod dependency between the pair (slammed, when). Thus the sentence got a high initial prior from the classifier. As a result, the outgoing edges from this node got a higher $H_i^3$ factor.
Some of the authorities identified for this sentence were:

"For Democrats, the tea party is the gift that keeps on giving."

"Tea party sympathetic organizations, Boehner later said, are pushing our members in places where they don't want to be."

which had words similar to the original sentence, thus having a higher $Sim_{ij}$ factor as well. We found that these sentences were also very close within the article. Thus, a high hub prior along with a high outgoing weight gave rise to this sentence having a high hub score after the HITS iterations.

5.2 Online Interface

To facilitate easy usage and understanding of the system by others, a web interface has been built for the system [3]. The webpage allows users either to input a new article as text and get the top opinionated sentences, or to view the output analysis of the system over the manually marked test data consisting of 20 articles. Words in green are positive polar words, and words in red are negative polar words. Words marked in violet are the root verbs of the sentences. The colored graph shows the top ranked opinionated sentences in a yellow box along with the top supporting factual sentences for that particular opinionated sentence in purple boxes. Snapshots from the online interface are provided in Figures 5 and 6.

6 Conclusions and Future Work

In this paper, we presented a novel two-stage framework for extracting the opinionated sentences in news articles. The problem of identifying top opinionated sentences from news articles is very challenging, especially because the opinions are not as explicit in a news article as in a discussion forum. This was also evident from the inter-annotator agreement; the kappa coefficient was found to be [...]. The experiments conducted over 90 news articles (70 for training and 20 for testing) clearly indicate that the proposed two-stage method almost always improves the performance of the baseline classifier-based approach. Specifically, the improvements are much higher for P@3 and M@3 scores (35.8% and 30.8% over the NB classifier).
An M@3 score of 1.5 and a P@3 score of 0.72 indicate that the proposed method was able to push the opinionated sentences to the top. On average, 2 out of the top 3 sentences returned by the system were actually opinionated. This is very much desired in a practical scenario, where an editor requires quick identification of 3-5 opinionated sentences, which she can then use to formulate questions.

The examples discussed in Section 5 bring out another important aspect of the proposed algorithm. In addition to the main objective of extracting the opinionated sentences within the article, the proposed method actually discovers the underlying structure of the article and would certainly be useful to present various opinions, grouped with supporting facts as well as supporting opinions in the article.

[3] available at resgrp/cnerg/temp2/final.php

While the initial results are encouraging, there is scope for improvement. We saw that the results obtained via HITS were highly correlated with the Naïve Bayes classifier results, which were used in assigning a weight to the document graph. One direction for future work would be to experiment with other features to improve the precision of the classifier. Additionally, in the current evaluation, we are not evaluating the degree of diversity of the opinions returned by the system. The Hub-Authority

structure of the second example gives us an interesting idea to improve diversification and we would like to implement that in future.

In the future, we would also like to apply this work to track an event over time, based on the opinionated sentences present in the articles. When an event occurs, articles start out with more factual sentences. Over time, opinions start surfacing on the event, and as the event matures, opinions predominate over the facts in the articles. For example, a set of articles on a plane crash would start out as factual, and would offer expert opinions over time. This work can be used to plot the maturity of the media coverage by keeping track of facts vs. opinions on any event, and this can be used by organizations to provide a timeline for the event. We would also like to experiment with this model on different media like microblogs.

Figure 5: Screenshot from the Web Interface

Figure 6: Hub-Authority Structure as output on the Web Interface

References

Alexandra Balahur, Ralf Steinberger, Mijail Kabadjov, Vanni Zavarella, Erik Van Der Goot, Matina Halkia, Bruno Pouliquen, and Jenya Belyaeva. 2013. Sentiment analysis in the news. arXiv preprint.

Caroline Brun. 2012. Learning opinionated patterns for contextual opinion detection. In COLING (Posters).

Yejin Choi, Claire Cardie, Ellen Riloff, and Siddharth Patwardhan. 2005. Identifying sources of opinions with conditional random fields and extraction

patterns. In Proceedings of the conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics.

Jack G Conrad and Frank Schilder. 2007. Opinion mining in legal blogs. In Proceedings of the 11th international conference on Artificial intelligence and law. ACM.

Kushal Dave, Steve Lawrence, and David M Pennock. 2003. Mining the peanut gallery: Opinion extraction and semantic classification of product reviews. In Proceedings of the 12th international conference on World Wide Web. ACM.

Marie-Catherine De Marneffe, Bill MacCartney, Christopher D Manning, et al. 2006. Generating typed dependency parses from phrase structure parses. In Proceedings of LREC, volume 6.

Günes Erkan and Dragomir R Radev. 2004. LexRank: Graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res. (JAIR), 22(1).

Geli Fei, Bing Liu, Meichun Hsu, Malu Castellanos, and Riddhiman Ghosh. 2012. A dictionary-based approach to identifying aspects implied by adjectives for opinion mining. In Proceedings of COLING 2012 (Posters).

Pawan Goyal, Laxmidhar Behera, and Thomas Martin McGinnity. 2013. A context-based word indexing model for document summarization. Knowledge and Data Engineering, IEEE Transactions on, 25(8).

Ali Harb, Michel Plantié, Gerard Dray, Mathieu Roche, François Trousset, and Pascal Poncelet. 2008. Web opinion mining: How to extract opinions from blogs? In Proceedings of the 5th international conference on Soft computing as transdisciplinary science and technology. ACM.

Minqing Hu and Bing Liu. 2004. Mining opinion features in customer reviews. In Proceedings of Nineteenth National Conference on Artificial Intelligence (AAAI).

Minqing Hu and Bing Liu. 2006. Opinion extraction and summarization on the web. In AAAI, volume 7.

Soo-Min Kim and Eduard Hovy. 2005. Automatic detection of opinion bearing words and sentences. In Proceedings of IJCNLP, volume 5.

Soo-Min Kim and Eduard Hovy. 2006. Extracting opinions, opinion holders, and topics expressed in online news media text.
In Proceeings of the Workshop on Sentiment an Subjectivity in Text, pages 1 8. Association for Computational Linguistics. Hyun Duk Kim, Malu Castellanos, Meichun Hsu, ChengXiang Zhai, Umeshwar Dayal, an Rihiman Ghosh Compact explanatory opinion summarization. In Proceeings of the 22n ACM international conference on Conference on information & knowlege management, pages ACM. Nozomi Kobayashi, Ryu Iia, Kentaro Inui, an Yuji Matsumoto Opinion extraction using a learning-base anaphora resolution technique. In The Secon International Joint Conference on Natural Language Processing (IJCNLP), Companion Volume to the Proceeing of Conference incluing Posters/Demos an Tutorial Abstracts. Lun-Wei Ku, Yu-Ting Liang, an Hsin-Hsi Chen Opinion extraction, summarization an tracking in news an blog corpora. In AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs, volume Longzhuang Li, Yi Shang, an Wei Zhang Improvement of hits-base algorithms on web ocuments. In Proceeings of the 11th international conference on Worl Wie Web, pages ACM. Bin Lu Ientifying opinion holers an targets with epenency parser in chinese news texts. In Proceeings of Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL. Ashequl Qair Detecting opinion sentences specific to prouct features in customer reviews using type epenency relations. In Proceeings of the Workshop on Events in Emerging Text Types, eetts 09, pages Oleg Rokhlenko an Ian Szpektor Generating synthetic comparable questions for news articles. In ACL, pages Janyce Wiebe an Ellen Riloff Creating subjective an objective sentence classifiers from unannotate texts. In Computational Linguistics an Intelligent Text Processing, pages Springer. Hong Yu an Vasileios Hatzivassiloglou Towars answering opinion questions: Separating facts from opinions an ientifying the polarity of opinion sentences. 
In Proceeings of the 2003 Conference on Empirical Methos in Natural Language Processing, EMNLP 03, pages Zhongwu Zhai, Bing Liu, Lei Zhang, Hua Xu, an Peifa Jia Ientifying evaluative sentences in online iscussions. In Proceeings of the Twenty-Fifth AAAI Conference on Artificial Intelligence. 33
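The event-tracking idea from the future-work discussion (following facts vs. opinions in the coverage of an event over time) could be sketched roughly as follows. This is only an illustrative sketch, not an implementation from this work: the function name `opinion_share_by_day`, the 0.5 threshold, and the input layout (a publication date paired with per-sentence opinion scores in [0, 1]) are all assumptions, and the scores in the usage example are invented toy values rather than results.

```python
from collections import defaultdict
from datetime import date


def opinion_share_by_day(articles, threshold=0.5):
    """Return {day: fraction of that day's sentences scored as opinionated}.

    `articles` is an iterable of (publication_date, sentence_scores) pairs.
    Each score is a hypothetical per-sentence opinion score in [0, 1];
    `threshold` is an assumed cut-off, not a value taken from the paper.
    """
    opinionated = defaultdict(int)
    total = defaultdict(int)
    for day, scores in articles:
        for score in scores:
            total[day] += 1
            if score >= threshold:
                opinionated[day] += 1
    # Days without any sentences never enter `total`, so no division by zero.
    return {day: opinionated[day] / total[day] for day in sorted(total)}


# Toy coverage of one event over three days (invented scores).
coverage = [
    (date(2000, 1, 1), [0.1, 0.2, 0.7, 0.3]),
    (date(2000, 1, 2), [0.6, 0.4, 0.8, 0.2]),
    (date(2000, 1, 3), [0.9, 0.7, 0.6, 0.1]),
]
timeline = opinion_share_by_day(coverage)
for day, share in timeline.items():
    print(day.isoformat(), share)
```

In this toy input the opinionated share rises from one day to the next, which is the kind of maturity curve the discussion describes. Any real use would need the actual sentence scores and a threshold chosen on annotated data.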


More information

Reducing Features to Improve Bug Prediction

Reducing Features to Improve Bug Prediction Reducing Features to Improve Bug Prediction Shivkumar Shivaji, E. James Whitehead, Jr., Ram Akella University of California Santa Cruz {shiv,ejw,ram}@soe.ucsc.edu Sunghun Kim Hong Kong University of Science

More information

Parsing of part-of-speech tagged Assamese Texts

Parsing of part-of-speech tagged Assamese Texts IJCSI International Journal of Computer Science Issues, Vol. 6, No. 1, 2009 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814 28 Parsing of part-of-speech tagged Assamese Texts Mirzanur Rahman 1, Sufal

More information

A Neural Network GUI Tested on Text-To-Phoneme Mapping

A Neural Network GUI Tested on Text-To-Phoneme Mapping A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis

More information

Chapter 10 APPLYING TOPIC MODELING TO FORENSIC DATA. 1. Introduction. Alta de Waal, Jacobus Venter and Etienne Barnard

Chapter 10 APPLYING TOPIC MODELING TO FORENSIC DATA. 1. Introduction. Alta de Waal, Jacobus Venter and Etienne Barnard Chapter 10 APPLYING TOPIC MODELING TO FORENSIC DATA Alta de Waal, Jacobus Venter and Etienne Barnard Abstract Most actionable evidence is identified during the analysis phase of digital forensic investigations.

More information

Writing a Basic Assessment Report. CUNY Office of Undergraduate Studies

Writing a Basic Assessment Report. CUNY Office of Undergraduate Studies Writing a Basic Assessment Report What is a Basic Assessment Report? A basic assessment report is useful when assessing selected Common Core SLOs across a set of single courses A basic assessment report

More information

Digital Fabrication and Aunt Sarah: Enabling Quadratic Explorations via Technology. Michael L. Connell University of Houston - Downtown

Digital Fabrication and Aunt Sarah: Enabling Quadratic Explorations via Technology. Michael L. Connell University of Houston - Downtown Digital Fabrication and Aunt Sarah: Enabling Quadratic Explorations via Technology Michael L. Connell University of Houston - Downtown Sergei Abramovich State University of New York at Potsdam Introduction

More information

Automating the E-learning Personalization

Automating the E-learning Personalization Automating the E-learning Personalization Fathi Essalmi 1, Leila Jemni Ben Ayed 1, Mohamed Jemni 1, Kinshuk 2, and Sabine Graf 2 1 The Research Laboratory of Technologies of Information and Communication

More information

Metadiscourse in Knowledge Building: A question about written or verbal metadiscourse

Metadiscourse in Knowledge Building: A question about written or verbal metadiscourse Metadiscourse in Knowledge Building: A question about written or verbal metadiscourse Rolf K. Baltzersen Paper submitted to the Knowledge Building Summer Institute 2013 in Puebla, Mexico Author: Rolf K.

More information

Visual CP Representation of Knowledge

Visual CP Representation of Knowledge Visual CP Representation of Knowledge Heather D. Pfeiffer and Roger T. Hartley Department of Computer Science New Mexico State University Las Cruces, NM 88003-8001, USA email: hdp@cs.nmsu.edu and rth@cs.nmsu.edu

More information

Language Independent Passage Retrieval for Question Answering

Language Independent Passage Retrieval for Question Answering Language Independent Passage Retrieval for Question Answering José Manuel Gómez-Soriano 1, Manuel Montes-y-Gómez 2, Emilio Sanchis-Arnal 1, Luis Villaseñor-Pineda 2, Paolo Rosso 1 1 Polytechnic University

More information