More Accurate Question Answering on Freebase


More Accurate Question Answering on Freebase

Hannah Bast, Elmar Haussmann
Department of Computer Science, University of Freiburg
Freiburg, Germany
{bast,

ABSTRACT

Real-world factoid or list questions often have a simple structure, yet are hard to match to facts in a given knowledge base due to high representational and linguistic variability. For example, to answer "who is the ceo of apple" on Freebase requires a match to an abstract "leadership" entity with three relations "role", "organization" and "person", and two other entities "apple inc" and "managing director". Recent years have seen a surge of research activity on learning-based solutions for this problem. We further advance the state of the art by adopting learning-to-rank methodology and by fully addressing the inherent entity recognition problem, which was neglected in recent works.

We evaluate our system, called Aqqu, on two standard benchmarks, Free917 and WebQuestions, improving the previous best result for each benchmark considerably. These two benchmarks exhibit quite different challenges, and many of the existing approaches were evaluated (and work well) only for one of them. We also consider efficiency aspects and take care that all questions can be answered interactively (that is, within a second). Materials for full reproducibility are available on our website: http://ad.informatik.uni-freiburg.de/publications.

1. INTRODUCTION

Knowledge bases like Freebase have reached an impressive coverage of general knowledge. The data is stored in a clean and structured manner, and can be queried unambiguously via structured languages like SPARQL. However, given the enormous amount of information (2.9 billion triples for Freebase), mapping a search desire to the right query can be an extremely hard task even for an expert user. For example, consider the (seemingly) simple question "who is the ceo of apple". The answer is indeed contained in Freebase, and the corresponding SPARQL query(1) is:

1 For the sake of readability, prefixes are omitted from the entity and relation names.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.
CIKM'15, October 19-23, 2015, Melbourne, Australia.
(c) 2015 ACM. ISBN /15/10 ...$ DOI: http://dx.doi.org/ /

  select ?name where {
    Managing Director  job title.people with this title  ?0 .
    ?0  employment tenure.company  Apple Inc .
    ?0  employment tenure.person  ?name
  }

It would clearly be preferable if we could just ask the question in natural language, and the machine automatically computes the corresponding SPARQL query. This is the problem we consider in this paper.

We focus on structurally simple questions, like the one above. They involve k entities (typically two or three; in the example above: ceo and apple and the result entity), which are linked via a single k-ary relation in the knowledge base. For languages like SPARQL, k-ary relations for k > 2 can be represented by a special entity (one for each k-tuple in the relation) and k binary relations (in the example above: the three binary relations in the where clause, all connected to the ?0 entity).

The challenge for these questions is to find the matching entities and relations in the given knowledge base. The entity-matching problem is hard, because the question may use a variant of the name used in the knowledge base (synonymy), and the knowledge base may contain many entities with the same name (polysemy). For example, there are 218 entities with the name apple in Freebase, but the right match for the question is actually Apple Inc. The relation-matching problem suffers from the same difficulties, which are even more severe for k-ary relations with k > 2.
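The mediator encoding of a k-ary relation can be illustrated with a small sketch. This is our own minimal example, not the paper's code: the entity and relation names are simplified placeholders rather than exact Freebase identifiers, and the person named is merely illustrative.

```python
# One ternary "leadership" fact, stored as a mediator entity plus one
# binary triple per argument (names are illustrative placeholders).
triples = [
    ("leadership#0", "role", "Managing Director"),
    ("leadership#0", "organization", "Apple Inc"),
    ("leadership#0", "person", "Tim Cook"),
]

def answer(triples, role, organization):
    """Resolve a ceo-style question by joining on the mediator entity."""
    mediators = {s for s, r, o in triples if r == "role" and o == role} & \
                {s for s, r, o in triples if r == "organization" and o == organization}
    return [o for s, r, o in triples if s in mediators and r == "person"]
```

The join over the mediator variable is exactly what the `?0` variable does in the SPARQL query above.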
As a further complication, questions like the above do not contain any word that matches the relations from the sought-for query.(2) Note how these problems exacerbate for very large knowledge bases. If we restrict to lexical matches, we will often miss the correct query. If we allow weaker matches, the number of possibilities becomes very large. This will become clearer in Section 4.

1.1 Contributions

We consider the following as our main contributions:

- A new end-to-end system that automatically translates a given natural-language question to the matching SPARQL query on a given knowledge base. Several previous systems factor out part of the problem, for example, by assuming the right entities for the query to be given by an oracle. See Section 3 for an overview of our system.

- An evaluation of our system on two standard benchmarks, Free917 and WebQuestions, where it outperforms all previous approaches significantly. These two benchmarks exhibit quite different challenges, and many of the existing approaches were evaluated (and work well) only for one of them. See Section 2 for an overview of the existing approaches, and Section 5 for the details of our evaluation.

- Integration of entity recognition in a learning-based approach. Previous learning-based approaches treated this sub-problem in a simplistic manner, or even factored it out by assuming the right entities to be given as part of the problem.

- Using learning-to-rank techniques to learn pair-wise comparison of query candidates. Previous approaches often use parser-inspired log-linear models for ranking.

- We also consider efficiency aspects and take care that all questions can be answered interactively, that is, within one second. Many of the previous systems do not consider this aspect, and take at least several seconds and longer to answer a single query. Again, see Section 5 for some details.

We make the code of our system publicly available under http://ad.informatik.uni-freiburg.de/publications. In particular, this allows reproducing our results. The website also provides various additional useful materials; in particular, a list of mistakes and inconsistencies in the Free917 and WebQuestions benchmarks.

Throughout this paper, we focus on Freebase as the currently largest general-purpose knowledge base. However, there is nothing in our approach specific to Freebase. It works for any knowledge base with entities and (possibly k-ary) relations between them.

2 This is typical when the verb "to be" is used in the question.

2. RELATED WORK

Much recent work on natural-language queries on knowledge bases has focused on two recent benchmarks, both based on Freebase: Free917 and WebQuestions. Section 2.1 gives an overview over this body of work, introducing the two benchmarks on the way. In Section 5, we compare our new method against all methods from this section. Section 2.2 briefly discusses work using other benchmarks.
2.1 Work on Free917 and WebQuestions

We consider the works in chronological order, briefly highlighting the relative innovations over previous works and the corresponding gain in result quality. A more technical description of each of the methods is provided in Section 5.3.

In [7], the Free917 benchmark was first introduced. The benchmark consists of 917 questions along with the correct(3) knowledge-base query. All queries have exactly one (possibly k-ary) relation. The basic approach of [7] is to extend an existing semantic parser with correspondences between natural-language phrases and relation names in the knowledge base. The correspondences are learned using weak supervision techniques and from the training portion of the benchmark (70% = 641 questions).

In [15], query candidates are derived by transforming an underspecified logical form of a CCG [21] parse. This form is grounded to Freebase using a set of collapsing and expansion operators that preserve the type of the expression. This has the advantage that it leverages grammatical structure in the question and can adjust to knowledge-base mismatches, and the disadvantage that it relies on well-formed questions. A linear model is learned to score derivations, which are built using a dynamic-programming-based parser.

3 Actually, a small portion of the queries are incorrect, but this is not a deliberate feature of the benchmark.

In [2], the WebQuestions (WQ) benchmark was introduced. This benchmark is much larger (5,810 questions) but only provides the result set for each question, not the knowledge-base query. This allows gathering more training data more easily (the results were obtained via crowdsourcing). The WQ questions are also more realistic (they were obtained via the Google Suggest API) and language-wise more diverse than the Free917 questions, and hence also harder (e.g. "who runs china in 2011" asking for the former Chinese Premier). The basic approach of [2] is to generate query candidates by recursively generating logical forms.
The generation is guided by a mapping of phrases to knowledge-base predicates and a small set of composition rules. Candidate scores are learned with a log-linear model.

In a follow-up work [3], the process from [2] is turned on its head by generating a natural-language question from each query candidate. Scores are then learned (again with a log-linear model) based on the similarity between the question representing the query candidate and the original question. This allows leveraging text-similarity information (paraphrases) from large text corpora (unrelated to the queried knowledge base).

In [25], the authors go another step further by not even generating query candidates. Instead, their approach tries to identify the central entity of the question, and then iterates over each entity connected (via a single relation) to that central entity in the knowledge base. It is then decided (via a learned model) separately for each such entity whether it becomes part of the result set. In principle, this allows correct answers even when no single relation from the knowledge base matches the question (e.g., asking for a brother of someone, when the knowledge base only knows about siblings). On the downside, this adds a lot of additional features to the learning process (the attributes of the result entities). Quality-wise, the approach does not improve over [2] and [3].

In [19], the authors go yet a step further by not even using the training data. Instead, weak supervision is used to generate learning examples from natural-language sentences. The parsing step itself is conceptualized as a graph-matching problem between the graph of a CCG parse and graphs grounded in Freebase entities and relations. However, their approach was evaluated only on small (and topically narrow) subsets of the two benchmarks.

In [4], the authors try to solve the problem without any natural-language processing (not even POS-tagging). They match the results from [3] but do not improve on them.
2.2 Other benchmarks

Another recent notable effort in open-domain question answering is the QALD (Question Answering over Linked Data) series of evaluation campaigns; see [22] for the latest report. So far, five benchmarks have been issued, one per year. The challenges behind these benchmarks are somewhat different than those behind the Free917 and WebQuestions benchmarks from Section 2.1:

- The biggest and most diverse knowledge base used is DBpedia, which is more than an order of magnitude smaller than Freebase (about 4M vs. about 40M entities).

- A significant fraction of the questions involves more than one relation or non-trivial comparatives. For example, "what are the capitals of all countries that the himalayas run through" or "which actor was cast in the most movies".

- The training sets are relatively small for QALD 1-3. This is mainly due to the fact, discussed in Section 2.1 above, that the ground truth provides not just the correct result sets but also the corresponding SPARQL queries, which requires expensive human expert work. The benchmarks thus give relatively little opportunity for supervised learning. Indeed, most of the participating systems are unsupervised. It is one of the insights from our evaluation in Section 5 that supervised learning is key for results of the quality we achieve.

- QALD 3 and 4 contain multi-lingual versions of the datasets and questions. For QALD 5, the dataset is a combination of RDF data and free text.

For these reasons, and because there is such a substantial body of very recent work on Free917 and WebQuestions with a series of better and better results, we did not include QALD in our evaluation. We consider it a very worthwhile endeavor for future work, though, to extend our approach to the QALD benchmarks.

3. SYSTEM OVERVIEW

We first describe our overall process of answering a natural-language question from a knowledge base (KB). In the next sections we describe each of the steps in detail. Assume we are trying to answer the following question (from the WebQuestions benchmark): "what character does ellen play in finding nemo?"

Entity identification. We begin by identifying entities from the KB that are mentioned in the question. In our example, "ellen" refers to the TV host Ellen DeGeneres and "finding nemo" refers to the movie Finding Nemo. However, like for the example in the introduction, this is not obvious: "ellen" could also refer to the actor Ellen Page and "finding nemo" to the video game with the same name (besides others). Instead of fixing a decision on which entities are mentioned, we delay this decision and jointly disambiguate the mentioned entities via the next steps.
Hence, the result of this step is a set of (possibly overlapping) entity mentions with attached confidence scores.

Template matching. Next, we match a set of query templates to the question. Figure 1 shows our templates. Each template consists of entity and relation placeholders. A matched template corresponds to a query candidate which can be executed against the KB to obtain an answer. Our simplest template consists of a single entity and an answer relation (template 1 in Figure 1). One of the query candidates for our example is generated by matching the entity for the TV host Ellen DeGeneres and the relation "parents":(4)

  <Ellen DeGeneres> <parents> <T>

This has the (wrong) interpretation of asking for her parents. A slightly more complex template contains two relations connected to the entity via a mediator object (template 2 in Figure 1). In our example, this matches a query candidate connecting Ellen Page to abstract film performance objects, via a "performance" relation, and from there to all the films she acted in via a "film" relation:

  <Ellen Page> <performance> <M>
  <M> <film> <T>

This asks for all films Ellen Page acted in. Yet another template combines two entities via relations and a mediator entity (m in template 3 in Figure 1). In our example, Ellen DeGeneres and Finding Nemo are connected via two relations and a film-performance mediator:

  <Ellen DeGeneres> <performance> <M>
  <M> <film> <Finding Nemo>
  <M> <character> <T>

We find this connection using an efficient inverted index (see Section 4.2) and continue matching from the mediator. In particular, we create query candidates asking for the character (Dory) and performance type (Voice) of Ellen DeGeneres in Finding Nemo. The final result of this step is a set of all the matched query candidates.

4 We use SPARQL-like triple (subject, predicate, object) notation, where uppercase characters indicate variables.

Relation matching. The query candidates still miss the fundamental information about which relations were actually mentioned and asked for in the question.
We distinguish three ways of matching relations of the query candidate to words in the question: 1) via the name or description of the relation in the KB, 2) via words learned for each relation using distant supervision, 3) via supervised learning on a training set. Each match has a confidence score attached. In our example, a word learned for the relations "performance" and "film" connecting an actor to the film she acted in is "play". This matches in the query candidates asking for all films of Ellen Page and for the performance type or character of Ellen DeGeneres in Finding Nemo. Furthermore, the word "character" matches the relation with the same name, whereas the relation "performance type" doesn't match. Continuing this way, all relations in all query candidates are enriched with information about what words were matched in which way.

Ranking. We now have a set of query candidates, where each candidate is enriched with information about which of its entities and relations match which parts of the question how well. It remains to rank the candidates in order to find the best matching candidate. Note that performing ranking at this final step has the strong benefit of jointly disambiguating entities and relations. A candidate can have a weak match for an entity, but a strong match for a relation, and vice versa. By deciding this at the final stage we can identify these combinations as correct, even when one of the matches seems unlikely when considered separately. Intuitively, for our example, the candidate covering most words of the question is best. Matching "ellen" to Ellen Page no longer allows matching Finding Nemo because these aren't actually related in the KB. On the other hand, asking for the performance type of Ellen DeGeneres in Finding Nemo doesn't match the word "character". This leaves us with the correct interpretation of asking for her character in the movie.

4. SYSTEM DETAILS

In this section, we describe the details of our system, called Aqqu. Aqqu works by generating query candidates for each question.

Figure 1: Query templates and example candidates with corresponding questions. A query template can consist of entity placeholders e, relation placeholders r, an intermediate object m and the answer node t.

  #1  e1 --r1--> t
      Example: Scrabble --inventor--> t  ("who invented scrabble?")
  #2  e1 --r1--> m --r2--> t
      Example: Henry Ford --employment--> m --company--> t  ("what company did henry ford work for?")
  #3  e1 --r1--> m --r2--> e2,  m --r3--> t
      Example: Ellen DeGeneres --performance--> m --film--> Finding Nemo,  m --character--> t  ("what character does ellen play in finding nemo?")

These query candidates are then ranked using a learned model. The top-ranked query is then returned (or "no answer" in case the set of candidates was empty). The previous section explained the process by an example. The following subsections describe the candidate generation and ranking in detail.

4.1 Entity matching

The goal of the entity-matching phase is to identify all entities from the knowledge base that match a part of the question. The match can be literal, or via an alias of the entity name.

POS-tagging. We POS-tag the question using the Stanford tagger [17]. For entity matching (this subsection), we make use of the tags NN (noun) and NNP (proper noun). For relation matching (Section 4.3), we also make use of the tags VB (verb) and JJ (adjective).

Subsequence generation. We generate the set S of all subsequences of words from the question, with the following two restrictions. First, a subsequence consisting of a single word must be tagged NN. Second, a subsequence must not split a sequence of words tagged NNP; that is, when it starts (ends) with a word tagged NNP, it must not be preceded (succeeded) by a word tagged NNP.

Find matching entities. For each s in S, we compute the list of all entities from the knowledge base that have s as their name or alias. We use a map from phrases (the aliases) to lists of entities (the entities with the respective alias) obtained from the CrossWikis dataset [20]. CrossWikis was built by mining the anchor text of links to Wikipedia entities (articles) from various large web-crawls. CrossWikis covers around 4 million entities from Wikipedia.
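The subsequence-generation step above can be sketched as follows. This is our own illustration, assuming parallel token/tag lists from the POS tagger; as a simplification, we accept any noun tag starting with NN for single-word subsequences.

```python
def subsequences(tokens, tags):
    """Enumerate word subsequences that may mention an entity, with
    the two restrictions from the text: a single-word subsequence
    must be a noun, and a subsequence must not split an NNP run."""
    n = len(tokens)
    result = []
    for i in range(n):
        for j in range(i + 1, n + 1):
            if j - i == 1 and not tags[i].startswith("NN"):
                continue  # single words must be nouns (simplification: NN*)
            if tags[i] == "NNP" and i > 0 and tags[i - 1] == "NNP":
                continue  # would split an NNP run on the left
            if tags[j - 1] == "NNP" and j < n and tags[j] == "NNP":
                continue  # would split an NNP run on the right
            result.append(" ".join(tokens[i:j]))
    return result
```

For "finding nemo movie" tagged NNP NNP NN, this keeps "finding nemo" but discards "finding" and "nemo", which would split the proper-noun run.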
Almost all of these entities also exist in Freebase, together with a link to the respective Wikipedia entity. For the remaining Freebase entities, we only consider the literal name match. Overall, we are able to recognize around 44 million entities with about 60 million aliases. We have also experimented with the aliases provided by Freebase, but they tend to be much more noisy (wrong aliases) and less complete (important aliases missing).

Scores for the entity matches. We compute a score for each match (s, e) computed in the previous step, where s is a subsequence of words from the question and e is an entity from Freebase with alias s. Consider a fixed alias s. CrossWikis also provides us with a probability distribution p_cross(e|s) over the Wikipedia entities e with alias s. Let e' be a Freebase entity that is not contained in CrossWikis. Let e_max be the CrossWikis entity with the highest p_cross(e|s). That is, e_max is the most likely Wikipedia entity for alias s. Let p_free(e'|s) = p_cross(e_max|s) * pop(e') / pop(e_max), where pop is the (alias-independent) popularity score of an entity, as described in the next subsection. Intuitively, p_free(e'|s) estimates the probability that e' has alias s via its popularity relative to the most likely Wikipedia entity for s. We merge p_cross(e|s) and p_free(e'|s) into one probability distribution by simply normalizing the probabilities to sum 1.

Popularity scores for each entity. For each entity, we also compute a (match-independent) popularity score. We simply take the number of times the entity is mentioned in the ClueWeb12 dataset [9], according to the annotations provided by Google [13]. The popularity scores are used for the entity match scores above. They also yield two features used in ranking each candidate; see Section 4.5.

4.2 Candidate generation

Based on the entity matches, we compute a set of query candidates as follows. We generate the query candidates in three (disjoint) subsets, one for each of the three templates shown in Figure 1. Each template stands for a query with a particular kind of structure.
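The entity-score merging described in Section 4.1 can be sketched as follows. Function and argument names are ours; `pop` stands for the ClueWeb-based popularity scores.

```python
def entity_scores(cross_probs, extra_entities, pop):
    """Merge CrossWikis probabilities with popularity-based estimates
    for a fixed alias s.

    cross_probs: {entity: p_cross(e|s)} for entities CrossWikis knows.
    extra_entities: Freebase entities with alias s not in CrossWikis.
    pop: popularity score per entity (e.g. ClueWeb12 mention counts).
    Returns one normalized distribution over all candidates.
    """
    e_max = max(cross_probs, key=cross_probs.get)
    scores = dict(cross_probs)
    for e in extra_entities:
        # p_free(e|s) = p_cross(e_max|s) * pop(e) / pop(e_max)
        scores[e] = cross_probs[e_max] * pop[e] / pop[e_max]
    total = sum(scores.values())
    return {e: p / total for e, p in scores.items()}
```

The normalization at the end implements the final merging step: one probability distribution over CrossWikis and Freebase-only candidates.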
These three templates cover almost all of the questions in the Free917 and WebQuestions benchmarks. Let E be the set of all entities matched to a subsequence of the question, as described in the previous section.

Template 1. For each e in E, find all relations r such that there is some triple (e, r, t) in the knowledge base. We obtain these via a single SPARQL query for each e.

Template 2. For each e in E, find all r1, r2, m such that there are two triples (e, r1, m) and (m, r2, t) in the knowledge base, where r1 and r2 are relations and m is a mediator entity. We obtain these as follows. For each e, we use a single SPARQL query to obtain all matching r1. For each e, r1, we then use another SPARQL query to obtain all matching r2. Note that m remains a variable in the query candidate.

Template 3. For all pairs of entities e1, e2 in E such that the two subsequences matched in the question do not overlap, find all r1, r2, r3 such that there are three triples (e1, r1, m), (m, r2, e2), and (m, r3, t) in the knowledge base, where r1, r2, r3 are relations and m is a mediator entity. We obtain these as follows. For each entity e, we precompute the list of all (r, m) such that m is a mediator entity and the triple (e, r, m) exists in the knowledge base. The list is sorted by the ids of the mediator entities. For given e1, e2 as above, we then intersect the lists for e1 and e2. For each mediator m in the intersection, we then obtain all r3 via a simple SPARQL query. In the query candidate, m remains a variable.

4.3 Relation matching

Let C be the set of query candidates computed in the previous subsection. For each query candidate c in C, let RW_c be the set of lemmatized(5) words from the relations from c (there can be one, two, or three relations, depending on the template from which c was generated). We compute how well the words from RW_c match the subset QW of lemmatized words from the question that are not already matched by the entities from c. We consider four kinds of matches, described in the following: literal, derivation, synonym, context. For each of these four kinds of matches, we compute a non-negative score (which is zero if there is no match at all). It can happen that all four of these scores are zero. In the basic version of our system, we keep such candidates; in a variant, we prune them; see Section 4.7.

Literal matches. This score is simply the number of pairs (w, q), where w in RW_c and q in QW and w = q. Almost all questions have no repeated words; in that case, this score is just the number of relation words that occur in the question (and are not already matched by an entity).

Derivation matches. This score is the number of pairs (w, q), where w in RW_c and q in QW and w is derivationally related to q. Here we also consider the POS-tag of w in the question. We precompute a map from POS-tagged words to derivations using WordNet [11]. We extract derivation links for verbs and nouns (e.g. "produce.vb" - "producer.nn" and vice versa). We also extract attribute links between adjectives and their describing attribute (e.g., "high.jj" - "height.nn"). We extend these links with synonyms of the noun in WordNet (e.g. "high.jj" - "elevation.nn").

Synonym matches. For each w in RW_c and q in QW, add s to this score if w is a synonym of q with similarity s. We compute the similarity between two words by computing the cosine similarity between the associated word vectors. We use 300-dimensional word vectors that were computed with Google's word2vec(6) on a news text corpus of size around 100 billion words.
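The synonym-match score between two such word vectors is plain cosine similarity; a minimal sketch:

```python
import math

def synonym_score(vec_w, vec_q):
    """Cosine similarity between the word vectors of a relation word w
    and a question word q (the per-pair synonym-match score)."""
    dot = sum(a * b for a, b in zip(vec_w, vec_q))
    norm = math.sqrt(sum(a * a for a in vec_w)) * \
           math.sqrt(sum(b * b for b in vec_q))
    return dot / norm if norm else 0.0
```

In practice the 300-dimensional word2vec vectors would be looked up from the pretrained model; the lists here just stand in for them.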
We consider only synonyms where the score is at least 0.4. This threshold is based on observation, but chosen very liberally: many word pairs with a score above that threshold are not what humans would call real synonyms, but almost all such real synonyms have a score above that threshold.

Context matches. For this score, we precompute weighted indicator words for each relation from our knowledge base. These are words which are not necessarily synonyms of words in the relation name, but are used in text to express that relation; see below for an example. The score is then the sum of the weights of all words in QW that are indicators for one of the relations from the query candidate. For templates 2 and 3, we consider r1.r2 as one relation.

We learn indicator words using distant supervision [18] as follows. First, we identify entity mentions in Wikipedia using Wiki markup and a set of simple heuristics for coreference resolution, as described in [1]. We also identify dates and values using SUTime [8]. For the 23 million sentences that contain at least two entities (including dates or values), we compute a dependency parse using [17]. For each pair e1, e2 of entities occurring in a sentence, we look up all relations r in the knowledge base that connect them. We also treat relations r1.r2 that connect the entities via a mediator as a single binary relation r. If the shortest path between e1 and e2 in the dependency parse has length at most four, we consider all words along that path as indicator words for r. We also experimented with considering all words in the sentence, or words along longer paths, but these gave considerably worse results. We find about 4.7 million sentences that match at least one relation this way. For example, we can thus learn that "born" is an indicator word for the relation "place of birth" from the following sentence (assuming that our knowledge base contains the respective fact): "Andy Warhol was born on August 6, 1928 in Pittsburgh."

5 For example, "founded" → "found" and "was" → "be".
6 https://code.google.com/p/word2vec/
Note that from the same sentence, we can also learn that "born" is an indicator word for the relation "date of birth". To distinguish between the two, we need some kind of answer-type matching; this is described in Section 4.4.

We compute the weights for the indicator words in the following IR-style fashion. Consider each relation as a document consisting of the words extracted for that relation. Then compute tf.idf scores for all the words in these (relation) documents in the usual way. For each relation, then only consider the top-1000 words and sum up their tf.idf scores. The weight for each word in a (relation) document is then its tf.idf score divided by this sum. This could also be interpreted as a probability distribution p(w|r) over words w given a relation r.

4.4 Answer type matching

For each candidate, we perform a simple but effective binary check based on the relation leading to the answer (r1, r2, r3 for templates 1, 2 and 3, respectively). We precompute a list of target types for each relation r by counting the types of objects o in all triples (·, r, o), keeping only the top ten percent of most frequent types. For questions starting with "who", we check whether the computed target types contain the type person, character, or organization. For questions starting with "where", we check whether the relation leads to a location or an event. For questions starting with "when" or "since when", we check whether the type is a date; for all other questions, the check for target objects of type date is negative.

As our evaluation and error analysis show, these simple heuristics work reasonably well for the Free917 and WebQuestions benchmarks. The reason is that our entity and relation matching already provide ample information for discriminating between candidates. However, as explained in Section 4.3, a question word like "born" alone does not permit discrimination between the two relations "place of birth" and "date of birth". It is exactly those cases that can be easily discriminated with the simple answer-type check from above. We leave elaborate answer-type detection (which has been addressed by many QA systems) to future work.
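These heuristics can be sketched as follows. The type names are illustrative labels, not exact Freebase type identifiers.

```python
def answer_type_ok(question, target_types):
    """Binary answer-type check: does the answer relation's set of
    frequent target types fit the question word?"""
    q = question.lower()
    types = set(target_types)
    if q.startswith("who"):
        return bool(types & {"person", "character", "organization"})
    if q.startswith("where"):
        return bool(types & {"location", "event"})
    if q.startswith("when") or q.startswith("since when"):
        return "date" in types
    # all other questions: a date-typed answer fails the check
    return "date" not in types
```

This is exactly the check that separates "place of birth" from "date of birth" when the only matched question word is "born".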
4.5 Candidate features

The previous subsections have shown two things. First, how we generate query candidates for a given question. Second, how we compute various scores for each candidate that measure how well the entities and relations from the candidate match which parts of the question. In this subsection, we show how we generate a feature vector from each candidate. Most of these features are based on the scores just mentioned. Another important feature, described below, serves to learn the correspondence between n-grams from the question and relations from query candidates.

Table 1 provides an overview over all our features. In the description below, we refer to the features by their ID (first column in the table). In Section 4.6, we show how we rank candidates based on these feature vectors.

Table 1: Features used by our ranking approaches. Top/middle/bottom: features for entity matches / features for relation matches / combined or other features.

  ID     Description
  1      number of entities in the query candidate
  2      number of entities that matched exactly with their name, or with a high probability (> 0.8)
  3      number of tokens of all entities that matched literally as per the previous feature
  4-5    average (4) and sum (5) of entity match probabilities
  6-7    average (6) and sum (7) of entity match popularities
  8      number of relations in the matched template
  9      number of relations that were matched literally via their name
  10-13  number of tokens that matched a relation of kind: literal (10), derivation (11), synonym (12), context (13)
  14     sum of synonym match scores
  15     sum of relation context match scores
  16     number of times the answer relation (r1, r2, r3 for templates 1, 2 and 3, respectively) occurs in the KB
  17     a value between 0 and 1 indicating how well the relation matches according to n-gram features (Section 4.5)
  18     sum of features 3 and 10; that is, the number of tokens matching a relation or entity literally
  19     number of tokens that match an entity or relation divided by the total number of tokens in the question
  20-22  whether the result size is 0 (feature 20), 1-20 (feature 21), or larger than 20 (feature 22); all binary
  23     binary result of the answer-type check (Section 4.4)

Entity/relation matching features. Features 1-7 are based on the results from the entity matching described in Section 4.1.
Features 8-16 are based on the results from the relation-matching step described in Section 4.3. Features 18 and 19 quantify the number of words in the question covered by entity or relation matches (feature 18 = literally, feature 19 = in any way). Features 20-22 quantify the result size. This is important, because some candidates produce huge result sets or empty result sets, which are both rare. Feature 23 is the binary output of the simple answer-type check from Section 4.4.

N-gram relation matching feature. This feature considers correspondences between words (unigrams) or two-word phrases (bigrams) in the question and the relation in the query candidate. For example, in the WebQuestions benchmark, the question "who is..." almost always asks for the profession of a person. Such a correspondence cannot be learned by any of the mechanisms described in Section 4.3.

We learn this feature as follows. For each query candidate, we generate all unigrams and bigrams of the lemmatized words of the question. The matched entities (Section 4.1) are replaced with a special word "entity". For each n-gram, we then create an indicator feature by appending the n-gram to the relation names of the candidate. For example, for template 2 from Figure 1, one of the features would be "employment.company+work" for the unigram "work" and the relations "employment.company". We then train an L2-regularized logistic regression classifier with all correct candidates as positive examples and all others as negative examples. The value of feature 17 is simply the probability output by this classifier.

This feature will be part of a subsequent step to learn a ranking that uses the same training data. To provide realistic feature values (that aren't overfit) we proceed as follows. Split the training data into six folds. In turn, leave out one fold and train the n-gram feature classifier on the remaining folds. Then, for each example in the left-out fold, compute the n-gram feature value. Use this computed value as part of the training data for subsequent learning.
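The out-of-fold scheme for feature 17 can be sketched as follows. Here `train_fn` stands in for training the L2-regularized logistic regression, and the round-robin fold assignment is one of several reasonable choices; both are our assumptions, not details from the paper.

```python
def out_of_fold_values(examples, labels, train_fn, n_folds=6):
    """Compute the n-gram feature without overfitting: split the
    training data into folds, train the n-gram classifier on all but
    one fold, and score the held-out fold with that classifier.

    train_fn(X, y) must return a function mapping one example to a
    probability (a stand-in for the logistic regression model)."""
    n = len(examples)
    folds = [list(range(i, n, n_folds)) for i in range(n_folds)]
    values = [None] * n
    for held_out in folds:
        held = set(held_out)
        X = [examples[i] for i in range(n) if i not in held]
        y = [labels[i] for i in range(n) if i not in held]
        model = train_fn(X, y)
        for i in held_out:
            values[i] = model(examples[i])
    return values
```

Each training example thus receives a feature value from a model that never saw it, which is what makes feature 17 realistic for the subsequent ranking step.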
4.6 Ranking

For each question, we finally rank the query candidates using the feature vectors described in the previous subsection. The top-ranked query candidate is then used to provide the answer. We say "no answer" only when the set of candidates is empty; this is discussed in Section 4.7 below.(7)

We have experimented with state-of-the-art techniques for the learning-to-rank approach from IR [14] [16], including: RankSVM [14], RankBoost [12], LambdaRank [6] and AdaRank [23]. These only lead to moderate results and were outperformed by our approaches described below. We presume that this is because our ranking problem is degenerate. In particular, each query is only associated with a single relevant answer. This is different from a typical IR scenario where a query usually has several answers, sometimes with varying degrees of relevance.

We investigate two variants to obtain a ranking: pointwise ranking and pairwise ranking. These approaches are inspired by the learning-to-rank approaches from IR.

7 Both benchmarks contain a considerable number of questions starting with "how many...", asking for a count. We simply replace "how many" by "what" in these questions, and count the size of the result set (unless the answer already is a count).

Pointwise ranking
In the pointwise ranking approach we compute a score for each candidate. Candidates are sorted by this score to infer a ranking. The score is computed by a classifier learned on the candidate features (see Section 4.5) and training data. We create training data by using the correct candidate of each question as a positive example and all other candidates as negative examples. A drawback of the pointwise approach is that the model compares question-independent examples. That is, correct (incorrect) query candidates of questions of different type and difficulty are in the same correct (incorrect) class, when in practice it is not necessary to compare or discriminate between them.

Pairwise ranking
In the pairwise ranking approach, we transform the ranking problem into a binary classification problem. The idea is to learn a classifier that can predict, for a given pair of candidates, whether one should be ranked before the other. To infer a ranking, we sort the list of candidates using the learned preference relation. This works very well in practice, although our learning does not guarantee that the learned relation is transitive or anti-symmetric. We have experimented with two alternatives to sorting. Simply computing the maximum turned out to perform badly. This makes sense, because the maximum has to survive a larger number of comparisons. Following [10], we have also sorted the candidates by their number of won comparisons against all other candidates. The results were identical to those for sorting, but this method requires Θ(n²) comparisons for n candidates. To train the classifiers we create training examples in the following way. For a question with n query candidates, randomly select n/2, but at least 200, candidates (or n if n/2 < 200). This is to guarantee that we have enough training examples for questions with few candidates and to avoid putting too much emphasis on questions that have more than 200 candidates.8 Then, for each randomly selected candidate r_i and the correct candidate c, where r_i ≠ c, create a positive example pair (c, r_i) and a negative example pair (r_i, c).
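A sketch of the pairwise machinery (function names are ours; the sampling rule is read as min(n, max(n/2, 200)); the pair (a, b) is represented by concatenating the feature difference with both feature vectors; `prefers` stands in for the trained pair classifier):

```python
import random
from functools import cmp_to_key

def pairwise_examples(candidates, correct, seed=0):
    """Training pairs for one question: sample min(n, max(n // 2, 200))
    candidates (our reading of the sampling rule), then pair each sampled
    r != correct with the correct candidate c, once in each order."""
    rng = random.Random(seed)
    n = len(candidates)
    sampled = rng.sample(candidates, min(n, max(n // 2, 200)))
    pairs = []
    for r in sampled:
        if r != correct:
            pairs.append(((correct, r), 1))  # (c, r_i): positive
            pairs.append(((r, correct), 0))  # (r_i, c): negative
    return pairs

def phi_pair(fa, fb):
    """Pair representation: phi(a) - phi(b), phi(a), phi(b), concatenated."""
    return [x - y for x, y in zip(fa, fb)] + list(fa) + list(fb)

def rank(candidates, phi, prefers):
    """Sort candidates by the learned preference relation.  `prefers`
    stands in for the trained pair classifier: it takes phi_pair(a, b)
    and returns True iff a should rank before b (the learned relation is
    not guaranteed to be transitive, as noted in the text)."""
    def cmp(a, b):
        return -1 if prefers(phi_pair(phi[a], phi[b])) else 1
    return sorted(candidates, key=cmp_to_key(cmp))
```

With a toy single-feature preference that ranks larger values first, `rank(["b", "c", "a"], {"a": [3.0], "b": [1.0], "c": [2.0]}, lambda v: v[0] > 0)` orders the candidates accordingly.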
The feature representation for a pair (a, b) is a tuple of the individual feature vectors and their difference: φ_pair(a, b) = (φ(a) − φ(b), φ(a), φ(b)), where φ is a function extracting the features in Table 1. Both ranking approaches, pointwise and pairwise, require a classifier. Here, we consider two different options.

Linear A logistic regression classifier. In initial experiments, other linear models, such as linear SVMs, have shown similar performance. Logistic regression is also known to output well calibrated probabilities and performs well in high-dimensional feature spaces. We train the model using L-BFGS-B [26]. To avoid over-fitting we apply L2-regularization, choosing the regularization strength using 6-fold cross-validation on the training set.

Random forest We learn a forest of decision trees [5]. Random forests are able to learn non-linear decision boundaries, require few hyperparameters, are simple to train, and are known to perform very well on a variety of tasks.

8 Our system generates around 200 candidates on average for a random question, but the exact value had little effect on performance in our evaluation.

4.7 Candidate pruning
Some questions may have no answer in the knowledge base. Our system, as described so far, returns no answer only when the set of query candidates is empty. However, as also described, this would rarely happen, since there are matching entities for every question, and we do not require that the relations match any of the words in the question.9 We consider two variants of our system to deal with this problem: (1) omitting the n-gram feature, and using hard pruning; and (2) keeping the n-gram feature, and using a pruning classifier. Note that a nice side-effect of pruning is that it speeds up the ranking process because it needs to consider fewer candidates.

Without n-grams, with hard pruning When omitting the n-gram feature, there is no reason to keep candidates with the wrong answer type or where features 9-15 are all zero. The natural approach is then to prune such candidates before we do the ranking; this is what we call hard pruning.
Hard pruning naturally leads to empty candidate sets for some queries. Indeed, on the Free917 benchmark, 10 questions have no answer, and our hard pruning yields an empty candidate set for 7 of them.

With n-grams, with a pruning classifier When keeping the n-gram feature, hard pruning as just described would be counterproductive. As explained in Section 4.5, the answers for the who is... questions from the WebQuestions benchmark are professions. They would be eliminated when hard pruning by answer type. Also, the profession relation matches no words from these questions. They would hence also be eliminated when hard pruning if features 9-15 are all zero. The goal of the pruning classifier is to weed out only the obviously bad candidates. For example, candidates that do not match the answer type, have bad relation matches, and a weak n-gram feature. We train the pruning classifier in the same way as the pointwise classifiers (see above) with the features from Table 1 using logistic regression. To optimize the classifier for recall we adjust example weights so that positive candidates have twice the weight of negative candidates. Before the ranking step, we apply the classifier to each candidate and only keep candidates classified positively.

5. EVALUATION
We perform an extensive evaluation of our system. In Section 5.1, we provide more details on our two benchmarks. In Section 5.2, we describe the evaluation measures used. In Section 5.3, we describe the systems we evaluate and compare to. In Section 5.4, we provide our main results, followed by a detailed analysis in Section 5.5.

5.1 Data
We use all of Freebase as our knowledge base (2.9 billion facts on 44 million entities). Note that our approach is not tailored to Freebase and could easily be adapted to another knowledge base, e.g., WikiData.10

Datasets We evaluate our system on two established benchmarks: Free917 and WebQuestions. Each benchmark consists of a set of questions and their answers from Freebase.

9 In that case, features 9-15 are all zero; however, the n-gram features could still be positive.
10 http://
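Returning to the pruning classifier of Section 4.7: the recall-oriented example weighting and the filtering step can be sketched as follows (the classifier itself is the pointwise logistic regression described earlier; the function names and the 0.5 threshold are our assumptions):

```python
def example_weights(labels):
    """Positive candidates get twice the weight of negative ones, biasing
    the trained classifier towards recall."""
    return [2.0 if y == 1 else 1.0 for y in labels]

def prune(candidates, positive_probs, threshold=0.5):
    """Keep only the candidates the pruning classifier labels positive;
    positive_probs are the classifier's probabilities per candidate."""
    return [c for c, p in zip(candidates, positive_probs) if p >= threshold]
```

Only candidates surviving `prune` are passed on to the ranking step, which also speeds up ranking.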

The benchmarks differ substantially in the types of questions and their complexity. Free917 contains 917 manually generated natural language questions [7]. The questions cover a wide range of domains (81 in total). Two examples are what fuel does an internal combustion engine use and how many floors does the white house have. The most common domains, film and business, only make up 6% of the questions [7]. All questions are grammatical and tend to be tailored to Freebase. The dataset provides a translation of each question into a SPARQL-equivalent form. We execute the SPARQL queries to obtain a gold answer for each question. [7] also provide an entity lexicon: a mapping from exact text to the mentioned entity for all entities appearing in the questions. This lexicon consists of 1014 different entities. It was used for identifying entities by all systems reporting results on the dataset so far. We only make use of this lexicon where explicitly stated. To report results, we use the original split of the questions by [7] into 70% (641) questions to train and 30% (276) questions to test.

WebQuestions consists of 5,810 questions that were selected by crawling the Google suggest API [2]. Contrary to Free917, questions are not necessarily grammatical and are more colloquial. For example: where did jackie kennedy go to college and what is spoken in czech republic. Due to how they were selected, the questions are biased towards topics that are frequently asked from Google. According to [19], the people domain alone makes up about 7% of questions. Furthermore, the structure of questions tends to be simpler. Most questions only require a single entity with an answer relation [2]. Answers to the questions were obtained by using crowdsourcing. This introduces additional noise; in particular, for some questions only a subset of the correct answer is provided as gold answer. We use the original train-test split of the questions by [2] into 70% (3,778 questions) to train and 30% (2,032 questions) to test.
5.2 Evaluation measures
Given a benchmark and a system, denote the questions by q_1...q_n, the gold answers by g_1...g_n, and the answers from the system by a_1...a_n. Note that an answer can consist of a single value (in particular, a date or a literal) or a list of values. We consider the following two evaluation measures.

Accuracy The fraction of queries answered with the exact gold answer:

  accuracy = (1/n) · Σ_{i=1..n} I(g_i = a_i)

where I(e) is an indicator function returning one if expression e is true and zero else. This is reasonable on Free917, which provides perfect gold answers.

Average F1 The average F1 across all questions:

  average F1 = (1/n) · Σ_{i=1..n} F1(g_i, a_i)

where the function F1 computes F1 in the regular way. This accounts for partially correct results, which is reasonable for WebQuestions, where gold answers are sometimes incomplete.

In our evaluation we focus on accuracy for Free917 and average F1 for WebQuestions. These are the most reported and most intuitive measures for these datasets. We also performed the evaluation with other measures that were used in previous work, e.g., variants of F1 as defined in [15] and [25]. These provided no new insights and strongly correlated with the measures above.

              Free917                 WebQuestions
Method        Accuracy+   Accuracy   Average F1
Cai+Yates     59 %        -          -
Jacana        -           -          35.4 %
Sempre        62 %        52 %       35.7 %
Kwiat. et al  68 %        -          -
Bordes et al  -           -          39.2 %
ParaSempre    68.5 %      46 %       39.9 %
Aqqu          76.4 %      65.9 %     49.4 %

Table 2: Results on the Free917 (276 questions) and WebQuestions (2032 questions) test set. For the results in the second column (Accuracy+) a manually crafted entity lexicon was used.

5.3 Systems evaluated
We evaluate and compare the following systems. See Section 2 for a brief description of the systems from previous work. If we (re-)produced results, we explicitly state so. Otherwise, we report existing results.

Cai+Yates The semantic parser developed by [7].
Kwiat. et al The semantic parser by [15].
Sempre The semantic parser by [2]. We produced results for Free917 without an entity lexicon using the provided code.11
ParaSempre The semantic parser suggested by [3].
We used the code provided by the authors11 to produce results on Free917 without an entity lexicon.
GraphParser The semantic parser developed by [19]. We report results obtained from the code provided by the authors.12 The results from their code deviate slightly from the results reported in their paper.
Jacana The information extraction based approach by [25]. We report updated results from [24].
Bordes et al The embedding-based model by [4].
Aqqu Our system, as described in Section 4. We want to stress that we use the exact same system on both benchmarks. As shown in Section 5.5 below, results can be further improved by adapting the feature set to the benchmark. However, we consider this overfitting.

Note that all of the systems above, except Sempre and ParaSempre, were only evaluated on one of the two benchmarks.

5.4 Main results
Table 2 shows the results on the test sets for Free917 and WebQuestions for all the systems from Section 5.3. GraphParser is discussed separately below, because it was evaluated only on a subset of questions.

On Free917, Aqqu improves in accuracy over the best previous systems by 8% with an entity lexicon, and by 14% without entity lexicon. Performance drops considerably for all systems when not using an entity lexicon. This shows that addressing entity recognition is an integral part of the problem that cannot be ignored. Overall, we achieve an oracle accuracy (percentage of questions where at least one produced query candidate is correct) of 89.1% and 85.5%, with and without entity lexicon respectively. This indicates that there is still room for improvement for better matching and ranking.

On WebQuestions our system improves the state of the art by almost 10% in average F1. No system uses an entity lexicon. Note that the WebQuestions benchmark is much harder and contains a considerable amount of imperfect or wrong answers. Out of a random sample of 55 questions we found 9 questions that had a wrong answer, and 10 further questions that had only a partially correct answer. This suggests that the upper bound for average F1 is roughly around 80%. Our oracle average F1 is at 68.5%. [2] and [3] report 48% and 63% respectively. Hence, we successfully identify most of the entities and relations. However, there is still much room for improvement in ranking and matching.

GraphParser was evaluated only on a subset of Freebase relations. The authors provide a train-test split of questions for WebQuestions. Note that we didn't restrict our system to the specific relations and that GraphParser requires an entity lexicon also on WebQuestions. Our system (without an entity lexicon) scores an average F1 of 66.1% compared to 40.5% reported for GraphParser.

11 http://github.com/percyliang/sempre
12 http://github.com/sivareddyg/graph-parser

              Top-2    Top-3    Top-5    Top-10
Free917                77.2 %   79.3 %   83.7 %
WebQuestions  67.1 %   72.7 %   77.5 %   82.3 %

Table 3: Top-k results on Free917 (top) and WebQuestions (bottom). Percentage of questions with the best answer in the top-k candidates.

                  Free917             WebQuestions
Method            Acc+      Acc       Avg F1
Aqqu-point-lin    73.6 %    63.4 %    46.9 %
Aqqu-point-tree   74.3 %    63.0 %    47.9 %
Aqqu-pair-lin     76.4 %    65.2 %    48.3 %
Aqqu-pair-tree    76.4 %    65.9 %    49.4 %

Table 4: Results for different ranking variants on the test sets for Free917 and WebQuestions. For the results in the second column (Acc+) a manually crafted entity lexicon was used.
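The evaluation measures from Section 5.2 and the top-k/oracle numbers above can be computed directly (a sketch; treating answer lists as order-independent sets, and the boolean-flags input of the last helper, are our assumptions):

```python
def accuracy(gold, answers):
    """Fraction of questions answered with exactly the gold answer."""
    return sum(set(g) == set(a) for g, a in zip(gold, answers)) / len(gold)

def average_f1(gold, answers):
    """Mean F1 over questions; credits partially correct answer lists."""
    total = 0.0
    for g, a in zip(gold, answers):
        g, a = set(g), set(a)
        if not g or not a:
            total += float(g == a)  # empty vs. empty counts as correct
            continue
        p = len(g & a) / len(a)  # precision
        r = len(g & a) / len(g)  # recall
        total += 2 * p * r / (p + r) if p + r > 0 else 0.0
    return total / len(gold)

def topk_hit_rate(ranked_correctness, k):
    """Fraction of questions whose correct candidate is among the top-k.
    ranked_correctness: per question, booleans in ranked order, True where
    the candidate matches the gold answer.  With k at least the maximum
    list length this is the oracle accuracy."""
    return (sum(any(flags[:k]) for flags in ranked_correctness)
            / len(ranked_correctness))
```

For instance, a system that answers one of two questions exactly and the other only partially scores accuracy 0.5 but a higher average F1.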
The selected subset of relations, and thus of questions, seems to be considerably easier to answer for our system.

5.5 Detailed analysis
Top-k results Table 3 shows the top-k results on the two datasets. A large majority of questions can be answered from the top two or three candidates. By providing these interpretations and results (in addition to the top-ranked candidate) to a user, many questions can be answered correctly. Note that on WebQuestions some questions only have an imperfect gold answer with an F1 score smaller than one. Therefore, the percentage of best answers in the top candidates can be slightly larger than the resulting average F1.

Ranking variants As described in Section 4.6, we consider two possible ranking methods: pointwise (point) and pairwise (pair), each with two different ranking classifiers: logistic regression (lin) and random forests (tree). This gives a total of four different combinations. Table 4 shows results of all four ranking variants with the full features of Table 1. On both benchmarks, pairwise ranking is more effective than pointwise ranking. This is consistent with our intuition that learning a pairwise comparator is better (see Section 4.6). Furthermore, random forests are slightly more effective than a weighted linear combination. We therefore use pairwise ranking with random forests as the standard choice.

Feature analysis To gain insight into which features are helpful we evaluate our system with different combinations. Table 5 shows the results.

                           Free917   WebQuestions
best previous              52.0 %    39.9 %
best now                   69.2 %    49.4 %
no n-grams, all other      69.2 %    39.6 %
no n-grams, no lit-match   65.2 %    39.6 %
no n-grams, no synonyms    61.6 %    28.2 %
n-grams, all other         65.9 %    49.4 %
n-grams, no pruning        64.4 %    49.3 %
n-grams, no synonyms       62.0 %    48.0 %
n-grams, nothing else      18.1 %    43.8 %

Table 5: Feature analysis for Free917 and WebQuestions. No synonyms disables the synonym features and no lit-match disables features 2, 3, 9, 10 and 18. When not using the n-gram feature a different type of candidate pruning is performed (see text).
Note that, as described in Section 4.7, without the n-gram feature hard pruning is applied. The following main observations can be made.

The n-gram feature is extremely helpful on WebQuestions but slightly detrimental on Free917. The WebQuestions benchmark contains many questions that are hard to answer without this kind of supervision, e.g., where is reggie bush from? (asking for the place of birth) or what to do downtown san francisco? (asking for tourist attractions). Our system is able to successfully learn important features for these from the training set. On the other hand, the small Free917 benchmark covers a wide range of domains and relations with only few repetitions. N-gram features aren't helpful on this dataset, which is shown by the low performance when only using the n-gram feature (18.1%). Note that the ranking and learning problem is inherently more difficult when the number of possible candidates increases. This is the case when not using hard pruning, which goes along with using the n-gram feature (see Section 4.7). This disadvantage cannot be fully compensated by the weak n-gram feature and leaner pruning and, as a result, the score drops by about 3% for Free917. Still, we consider it more important to have a single approach that performs well on different kinds of datasets than to optimize for a single dataset.

Literal features provide a small benefit for Free917 but no benefit on WebQuestions. This is an artefact of the way Free917 was built. Free917 questions are tailored to Freebase, often using words from the relation name as part of the question. Synonym features are important for both datasets. They give a huge benefit on WebQuestions without the n-gram feature, but only a small benefit on top of it. Finally, the pruning classifier used with the n-gram feature helps on Free917 because it allows the system to return no answer for some questions that have no answer in the knowledge base. The difference on WebQuestions (which always has an answer in the knowledge base) is not significant, and shows that the pruning classifier doesn't negatively affect performance.

Manual error analysis We manually inspected the errors our system makes. Many errors are due to mistakes in the benchmarks (partially or completely wrong gold answers) and inconsistencies in the knowledge base (different relations with contradicting answers on the same piece of information). We provide a list on our website, see the link in Section 1.1. On that website, we also provide a list of errors due to our system. There is no single large class of errors worth pointing out, though.

Efficiency We also evaluated the performance of our system. The average response time for a question is 644 ms for Free917 and 900 ms for WebQuestions.13 None of the other systems from Section 5.3 comes with an efficiency evaluation. For systems that provide code and for which we reproduced results, runtimes are (at least) several seconds per query. Training our system on the large WebQuestions benchmark takes about 90 minutes in total.

6. CONCLUSION
We have presented Aqqu, a new end-to-end system that automatically translates a given natural-language question to the matching SPARQL query on a knowledge base. The system integrates entity recognition and utilizes distant supervision and learning-to-rank techniques. We showed that our system outperforms previous state-of-the-art systems on two very different benchmarks by 8% and more. Aqqu answers questions interactively, that is, within one second. For around 80% of the queries, the correct answer is among the top-5 candidates.
This suggests that a more interactive approach, which asks for the user's feedback on critical decisions (e.g., between two relations), could achieve a significantly further improved accuracy.

13 Answer times are averaged over three runs on a server with Intel E5649 CPUs, 90GB of RAM and warm SPARQL caches.

7. REFERENCES
[1] H. Bast, F. Bäurle, B. Buchhold, and E. Haussmann. Broccoli: Semantic full-text search at your fingertips. CoRR.
[2] J. Berant, A. Chou, R. Frostig, and P. Liang. Semantic Parsing on Freebase from Question-Answer Pairs. In EMNLP.
[3] J. Berant and P. Liang. Semantic Parsing via Paraphrasing. In ACL.
[4] A. Bordes, S. Chopra, and J. Weston. Question Answering with Subgraph Embeddings. CoRR.
[5] L. Breiman. Random forests. Machine Learning, 45(1):5-32.
[6] C. J. C. Burges, R. Ragno, and Q. V. Le. Learning to rank with nonsmooth cost functions. In NIPS.
[7] Q. Cai and A. Yates. Large-scale Semantic Parsing via Schema Matching and Lexicon Extension. In ACL.
[8] A. X. Chang and C. D. Manning. SUTime: A library for recognizing and normalizing time expressions. In LREC.
[9] ClueWeb, The Lemur Project.
[10] W. W. Cohen, R. E. Schapire, and Y. Singer. Learning to order things. JAIR, 10.
[11] C. Fellbaum. WordNet. Wiley Online Library.
[12] Y. Freund, R. D. Iyer, R. E. Schapire, and Y. Singer. An efficient boosting algorithm for combining preferences. JMLR, 4.
[13] E. Gabrilovich, M. Ringgaard, and A. Subramanya. FACC1: Freebase annotation of ClueWeb corpora, Version 1.
[14] T. Joachims. Optimizing search engines using clickthrough data. In KDD.
[15] T. Kwiatkowski, E. Choi, Y. Artzi, and L. S. Zettlemoyer. Scaling Semantic Parsers with On-the-Fly Ontology Matching. In EMNLP.
[16] T. Liu. Learning to rank for information retrieval. Foundations and Trends in Information Retrieval, 3(3).
[17] C. D. Manning, M. Surdeanu, J. Bauer, J. R. Finkel, S. Bethard, and D. McClosky. The Stanford CoreNLP natural language processing toolkit. In ACL, pages 55-60.
[18] M.
Mintz, S. Bills, R. Snow, and D. Jurafsky. Distant supervision for relation extraction without labeled data. In ACL.
[19] S. Reddy, M. Lapata, and M. Steedman. Large-scale Semantic Parsing without Question-Answer Pairs. TACL, 2.
[20] V. I. Spitkovsky and A. X. Chang. A Cross-Lingual Dictionary for English Wikipedia Concepts. In LREC.
[21] M. Steedman. The syntactic process, volume 35. MIT Press.
[22] C. Unger, C. Forascu, V. Lopez, A. N. Ngomo, E. Cabrio, P. Cimiano, and S. Walter. Question answering over linked data (QALD-4). In CLEF 2014.
[23] J. Xu and H. Li. AdaRank: a boosting algorithm for information retrieval. In SIGIR.
[24] X. Yao, J. Berant, and B. V. Durme. Freebase QA: Information Extraction or Semantic Parsing? In ACL, Workshop on Semantic Parsing.
[25] X. Yao and B. V. Durme. Information Extraction over Structured Data: Question Answering with Freebase. In ACL.
[26] C. Zhu, R. H. Byrd, P. Lu, and J. Nocedal. Algorithm 778: L-BFGS-B: Fortran subroutines for large-scale bound-constrained optimization. ACM Trans. Math. Softw., 23(4), 1997.


RANKING AND UNRANKING LEFT SZILARD LANGUAGES. Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A ER E P S I M S N S ER E P S I M TA S UN A I S I T VER RANKING AND UNRANKING LEFT SZILARD LANGUAGES Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1997-2 UNIVERSITY OF TAMPERE DEPARTMENT OF

More information

Active Learning. Yingyu Liang Computer Sciences 760 Fall

Active Learning. Yingyu Liang Computer Sciences 760 Fall Active Learning Yingyu Liang Computer Sciences 760 Fall 2017 http://pages.cs.wisc.edu/~yliang/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed by Mark Craven,

More information

Prediction of Maximal Projection for Semantic Role Labeling

Prediction of Maximal Projection for Semantic Role Labeling Prediction of Maximal Projection for Semantic Role Labeling Weiwei Sun, Zhifang Sui Institute of Computational Linguistics Peking University Beijing, 100871, China {ws, szf}@pku.edu.cn Haifeng Wang Toshiba

More information

CONCEPT MAPS AS A DEVICE FOR LEARNING DATABASE CONCEPTS

CONCEPT MAPS AS A DEVICE FOR LEARNING DATABASE CONCEPTS CONCEPT MAPS AS A DEVICE FOR LEARNING DATABASE CONCEPTS Pirjo Moen Department of Computer Science P.O. Box 68 FI-00014 University of Helsinki pirjo.moen@cs.helsinki.fi http://www.cs.helsinki.fi/pirjo.moen

More information

Using dialogue context to improve parsing performance in dialogue systems

Using dialogue context to improve parsing performance in dialogue systems Using dialogue context to improve parsing performance in dialogue systems Ivan Meza-Ruiz and Oliver Lemon School of Informatics, Edinburgh University 2 Buccleuch Place, Edinburgh I.V.Meza-Ruiz@sms.ed.ac.uk,

More information

Detecting English-French Cognates Using Orthographic Edit Distance

Detecting English-French Cognates Using Orthographic Edit Distance Detecting English-French Cognates Using Orthographic Edit Distance Qiongkai Xu 1,2, Albert Chen 1, Chang i 1 1 The Australian National University, College of Engineering and Computer Science 2 National

More information

Extracting Verb Expressions Implying Negative Opinions

Extracting Verb Expressions Implying Negative Opinions Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence Extracting Verb Expressions Implying Negative Opinions Huayi Li, Arjun Mukherjee, Jianfeng Si, Bing Liu Department of Computer

More information

Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model

Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Xinying Song, Xiaodong He, Jianfeng Gao, Li Deng Microsoft Research, One Microsoft Way, Redmond, WA 98052, U.S.A.

More information

New Features & Functionality in Q Release Version 3.1 January 2016

New Features & Functionality in Q Release Version 3.1 January 2016 in Q Release Version 3.1 January 2016 Contents Release Highlights 2 New Features & Functionality 3 Multiple Applications 3 Analysis 3 Student Pulse 3 Attendance 4 Class Attendance 4 Student Attendance

More information

Outline. Web as Corpus. Using Web Data for Linguistic Purposes. Ines Rehbein. NCLT, Dublin City University. nclt

Outline. Web as Corpus. Using Web Data for Linguistic Purposes. Ines Rehbein. NCLT, Dublin City University. nclt Outline Using Web Data for Linguistic Purposes NCLT, Dublin City University Outline Outline 1 Corpora as linguistic tools 2 Limitations of web data Strategies to enhance web data 3 Corpora as linguistic

More information

Term Weighting based on Document Revision History

Term Weighting based on Document Revision History Term Weighting based on Document Revision History Sérgio Nunes, Cristina Ribeiro, and Gabriel David INESC Porto, DEI, Faculdade de Engenharia, Universidade do Porto. Rua Dr. Roberto Frias, s/n. 4200-465

More information

The Role of the Head in the Interpretation of English Deverbal Compounds

The Role of the Head in the Interpretation of English Deverbal Compounds The Role of the Head in the Interpretation of English Deverbal Compounds Gianina Iordăchioaia i, Lonneke van der Plas ii, Glorianna Jagfeld i (Universität Stuttgart i, University of Malta ii ) Wen wurmt

More information

Short Text Understanding Through Lexical-Semantic Analysis

Short Text Understanding Through Lexical-Semantic Analysis Short Text Understanding Through Lexical-Semantic Analysis Wen Hua #1, Zhongyuan Wang 2, Haixun Wang 3, Kai Zheng #4, Xiaofang Zhou #5 School of Information, Renmin University of China, Beijing, China

More information

Online Updating of Word Representations for Part-of-Speech Tagging

Online Updating of Word Representations for Part-of-Speech Tagging Online Updating of Word Representations for Part-of-Speech Tagging Wenpeng Yin LMU Munich wenpeng@cis.lmu.de Tobias Schnabel Cornell University tbs49@cornell.edu Hinrich Schütze LMU Munich inquiries@cislmu.org

More information

Natural Language Processing. George Konidaris

Natural Language Processing. George Konidaris Natural Language Processing George Konidaris gdk@cs.brown.edu Fall 2017 Natural Language Processing Understanding spoken/written sentences in a natural language. Major area of research in AI. Why? Humans

More information

AQUA: An Ontology-Driven Question Answering System

AQUA: An Ontology-Driven Question Answering System AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.

More information

MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY

MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY Chen, Hsin-Hsi Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan E-mail: hh_chen@csie.ntu.edu.tw Abstract

More information

NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches

NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches NCU IISR English-Korean and English-Chinese Named Entity Transliteration Using Different Grapheme Segmentation Approaches Yu-Chun Wang Chun-Kai Wu Richard Tzong-Han Tsai Department of Computer Science

More information

Session 2B From understanding perspectives to informing public policy the potential and challenges for Q findings to inform survey design

Session 2B From understanding perspectives to informing public policy the potential and challenges for Q findings to inform survey design Session 2B From understanding perspectives to informing public policy the potential and challenges for Q findings to inform survey design Paper #3 Five Q-to-survey approaches: did they work? Job van Exel

More information

Objectives. Chapter 2: The Representation of Knowledge. Expert Systems: Principles and Programming, Fourth Edition

Objectives. Chapter 2: The Representation of Knowledge. Expert Systems: Principles and Programming, Fourth Edition Chapter 2: The Representation of Knowledge Expert Systems: Principles and Programming, Fourth Edition Objectives Introduce the study of logic Learn the difference between formal logic and informal logic

More information

Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge

Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge Innov High Educ (2009) 34:93 103 DOI 10.1007/s10755-009-9095-2 Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge Phyllis Blumberg Published online: 3 February

More information

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases

2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz

More information

Lecture 1: Machine Learning Basics

Lecture 1: Machine Learning Basics 1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3

More information

Aspectual Classes of Verb Phrases

Aspectual Classes of Verb Phrases Aspectual Classes of Verb Phrases Current understanding of verb meanings (from Predicate Logic): verbs combine with their arguments to yield the truth conditions of a sentence. With such an understanding

More information

Language Independent Passage Retrieval for Question Answering

Language Independent Passage Retrieval for Question Answering Language Independent Passage Retrieval for Question Answering José Manuel Gómez-Soriano 1, Manuel Montes-y-Gómez 2, Emilio Sanchis-Arnal 1, Luis Villaseñor-Pineda 2, Paolo Rosso 1 1 Polytechnic University

More information

Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2

Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2 Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2 Ted Pedersen Department of Computer Science University of Minnesota Duluth, MN, 55812 USA tpederse@d.umn.edu

More information

(Sub)Gradient Descent

(Sub)Gradient Descent (Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include

More information

Assignment 1: Predicting Amazon Review Ratings

Assignment 1: Predicting Amazon Review Ratings Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for

More information

Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures

Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures Modeling Attachment Decisions with a Probabilistic Parser: The Case of Head Final Structures Ulrike Baldewein (ulrike@coli.uni-sb.de) Computational Psycholinguistics, Saarland University D-66041 Saarbrücken,

More information

Speech Recognition at ICSI: Broadcast News and beyond

Speech Recognition at ICSI: Broadcast News and beyond Speech Recognition at ICSI: Broadcast News and beyond Dan Ellis International Computer Science Institute, Berkeley CA Outline 1 2 3 The DARPA Broadcast News task Aspects of ICSI

More information

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering

More information

On-the-Fly Customization of Automated Essay Scoring

On-the-Fly Customization of Automated Essay Scoring Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,

More information

An Interactive Intelligent Language Tutor Over The Internet

An Interactive Intelligent Language Tutor Over The Internet An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This

More information

Chinese Language Parsing with Maximum-Entropy-Inspired Parser

Chinese Language Parsing with Maximum-Entropy-Inspired Parser Chinese Language Parsing with Maximum-Entropy-Inspired Parser Heng Lian Brown University Abstract The Chinese language has many special characteristics that make parsing difficult. The performance of state-of-the-art

More information

Constructing Parallel Corpus from Movie Subtitles

Constructing Parallel Corpus from Movie Subtitles Constructing Parallel Corpus from Movie Subtitles Han Xiao 1 and Xiaojie Wang 2 1 School of Information Engineering, Beijing University of Post and Telecommunications artex.xh@gmail.com 2 CISTR, Beijing

More information

Welcome to the Purdue OWL. Where do I begin? General Strategies. Personalizing Proofreading

Welcome to the Purdue OWL. Where do I begin? General Strategies. Personalizing Proofreading Welcome to the Purdue OWL This page is brought to you by the OWL at Purdue (http://owl.english.purdue.edu/). When printing this page, you must include the entire legal notice at bottom. Where do I begin?

More information

*Net Perceptions, Inc West 78th Street Suite 300 Minneapolis, MN

*Net Perceptions, Inc West 78th Street Suite 300 Minneapolis, MN From: AAAI Technical Report WS-98-08. Compilation copyright 1998, AAAI (www.aaai.org). All rights reserved. Recommender Systems: A GroupLens Perspective Joseph A. Konstan *t, John Riedl *t, AI Borchers,

More information

Copyright Corwin 2015

Copyright Corwin 2015 2 Defining Essential Learnings How do I find clarity in a sea of standards? For students truly to be able to take responsibility for their learning, both teacher and students need to be very clear about

More information

IN THIS UNIT YOU LEARN HOW TO: SPEAKING 1 Work in pairs. Discuss the questions. 2 Work with a new partner. Discuss the questions.

IN THIS UNIT YOU LEARN HOW TO: SPEAKING 1 Work in pairs. Discuss the questions. 2 Work with a new partner. Discuss the questions. 6 1 IN THIS UNIT YOU LEARN HOW TO: ask and answer common questions about jobs talk about what you re doing at work at the moment talk about arrangements and appointments recognise and use collocations

More information

Writing a composition

Writing a composition A good composition has three elements: Writing a composition an introduction: A topic sentence which contains the main idea of the paragraph. a body : Supporting sentences that develop the main idea. a

More information

University of Waterloo School of Accountancy. AFM 102: Introductory Management Accounting. Fall Term 2004: Section 4

University of Waterloo School of Accountancy. AFM 102: Introductory Management Accounting. Fall Term 2004: Section 4 University of Waterloo School of Accountancy AFM 102: Introductory Management Accounting Fall Term 2004: Section 4 Instructor: Alan Webb Office: HH 289A / BFG 2120 B (after October 1) Phone: 888-4567 ext.

More information

WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT

WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT PRACTICAL APPLICATIONS OF RANDOM SAMPLING IN ediscovery By Matthew Verga, J.D. INTRODUCTION Anyone who spends ample time working

More information

Learning a Cross-Lingual Semantic Representation of Relations Expressed in Text

Learning a Cross-Lingual Semantic Representation of Relations Expressed in Text Learning a Cross-Lingual Semantic Representation of Relations Expressed in Text Achim Rettinger, Artem Schumilin, Steffen Thoma, and Basil Ell Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany

More information

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,

have to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words, A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994

More information

The Ups and Downs of Preposition Error Detection in ESL Writing

The Ups and Downs of Preposition Error Detection in ESL Writing The Ups and Downs of Preposition Error Detection in ESL Writing Joel R. Tetreault Educational Testing Service 660 Rosedale Road Princeton, NJ, USA JTetreault@ets.org Martin Chodorow Hunter College of CUNY

More information

Mining Topic-level Opinion Influence in Microblog

Mining Topic-level Opinion Influence in Microblog Mining Topic-level Opinion Influence in Microblog Daifeng Li Dept. of Computer Science and Technology Tsinghua University ldf3824@yahoo.com.cn Jie Tang Dept. of Computer Science and Technology Tsinghua

More information

Life and career planning

Life and career planning Paper 30-1 PAPER 30 Life and career planning Bob Dick (1983) Life and career planning: a workbook exercise. Brisbane: Department of Psychology, University of Queensland. A workbook for class use. Introduction

More information

Notes on The Sciences of the Artificial Adapted from a shorter document written for course (Deciding What to Design) 1

Notes on The Sciences of the Artificial Adapted from a shorter document written for course (Deciding What to Design) 1 Notes on The Sciences of the Artificial Adapted from a shorter document written for course 17-652 (Deciding What to Design) 1 Ali Almossawi December 29, 2005 1 Introduction The Sciences of the Artificial

More information

Exploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data

Exploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data Exploiting Phrasal Lexica and Additional Morpho-syntactic Language Resources for Statistical Machine Translation with Scarce Training Data Maja Popović and Hermann Ney Lehrstuhl für Informatik VI, Computer

More information

Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews

Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews Kang Liu, Liheng Xu and Jun Zhao National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy

More information

Proof Theory for Syntacticians

Proof Theory for Syntacticians Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax

More information

PIRLS. International Achievement in the Processes of Reading Comprehension Results from PIRLS 2001 in 35 Countries

PIRLS. International Achievement in the Processes of Reading Comprehension Results from PIRLS 2001 in 35 Countries Ina V.S. Mullis Michael O. Martin Eugenio J. Gonzalez PIRLS International Achievement in the Processes of Reading Comprehension Results from PIRLS 2001 in 35 Countries International Study Center International

More information

University of Alberta. Large-Scale Semi-Supervised Learning for Natural Language Processing. Shane Bergsma

University of Alberta. Large-Scale Semi-Supervised Learning for Natural Language Processing. Shane Bergsma University of Alberta Large-Scale Semi-Supervised Learning for Natural Language Processing by Shane Bergsma A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of

More information

Handling Sparsity for Verb Noun MWE Token Classification

Handling Sparsity for Verb Noun MWE Token Classification Handling Sparsity for Verb Noun MWE Token Classification Mona T. Diab Center for Computational Learning Systems Columbia University mdiab@ccls.columbia.edu Madhav Krishna Computer Science Department Columbia

More information

The Good Judgment Project: A large scale test of different methods of combining expert predictions

The Good Judgment Project: A large scale test of different methods of combining expert predictions The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania

More information

How to Judge the Quality of an Objective Classroom Test

How to Judge the Quality of an Objective Classroom Test How to Judge the Quality of an Objective Classroom Test Technical Bulletin #6 Evaluation and Examination Service The University of Iowa (319) 335-0356 HOW TO JUDGE THE QUALITY OF AN OBJECTIVE CLASSROOM

More information

In Workflow. Viewing: Last edit: 10/27/15 1:51 pm. Approval Path. Date Submi ed: 10/09/15 2:47 pm. 6. Coordinator Curriculum Management

In Workflow. Viewing: Last edit: 10/27/15 1:51 pm. Approval Path. Date Submi ed: 10/09/15 2:47 pm. 6. Coordinator Curriculum Management 1 of 5 11/19/2015 8:10 AM Date Submi ed: 10/09/15 2:47 pm Viewing: Last edit: 10/27/15 1:51 pm Changes proposed by: GODWINH In Workflow 1. BUSI Editor 2. BUSI Chair 3. BU Associate Dean 4. Biggio Center

More information

A Study of Video Effects on English Listening Comprehension

A Study of Video Effects on English Listening Comprehension Studies in Literature and Language Vol. 8, No. 2, 2014, pp. 53-58 DOI:10.3968/4348 ISSN 1923-1555[Print] ISSN 1923-1563[Online] www.cscanada.net www.cscanada.org Study of Video Effects on English Listening

More information

LEARNING A SEMANTIC PARSER FROM SPOKEN UTTERANCES. Judith Gaspers and Philipp Cimiano

LEARNING A SEMANTIC PARSER FROM SPOKEN UTTERANCES. Judith Gaspers and Philipp Cimiano LEARNING A SEMANTIC PARSER FROM SPOKEN UTTERANCES Judith Gaspers and Philipp Cimiano Semantic Computing Group, CITEC, Bielefeld University {jgaspers cimiano}@cit-ec.uni-bielefeld.de ABSTRACT Semantic parsers

More information

The Strong Minimalist Thesis and Bounded Optimality

The Strong Minimalist Thesis and Bounded Optimality The Strong Minimalist Thesis and Bounded Optimality DRAFT-IN-PROGRESS; SEND COMMENTS TO RICKL@UMICH.EDU Richard L. Lewis Department of Psychology University of Michigan 27 March 2010 1 Purpose of this

More information

Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models

Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models Richard Johansson and Alessandro Moschitti DISI, University of Trento Via Sommarive 14, 38123 Trento (TN),

More information

21st CENTURY SKILLS IN 21-MINUTE LESSONS. Using Technology, Information, and Media

21st CENTURY SKILLS IN 21-MINUTE LESSONS. Using Technology, Information, and Media 21st CENTURY SKILLS IN 21-MINUTE LESSONS Using Technology, Information, and Media T Copyright 2011 by Saddleback Educational Publishing. All rights reserved. No part of this book may be reproduced in any

More information

Learning to Rank with Selection Bias in Personal Search

Learning to Rank with Selection Bias in Personal Search Learning to Rank with Selection Bias in Personal Search Xuanhui Wang, Michael Bendersky, Donald Metzler, Marc Najork Google Inc. Mountain View, CA 94043 {xuanhui, bemike, metzler, najork}@google.com ABSTRACT

More information

Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks

Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks Rajarshi Das Manzil Zaheer Siva Reddy and Andrew McCallum College of Information and Computer Sciences, University

More information

Foundations of Knowledge Representation in Cyc

Foundations of Knowledge Representation in Cyc Foundations of Knowledge Representation in Cyc Why use logic? CycL Syntax Collections and Individuals (#$isa and #$genls) Microtheories This is an introduction to the foundations of knowledge representation

More information

Derivational and Inflectional Morphemes in Pak-Pak Language

Derivational and Inflectional Morphemes in Pak-Pak Language Derivational and Inflectional Morphemes in Pak-Pak Language Agustina Situmorang and Tima Mariany Arifin ABSTRACT The objectives of this study are to find out the derivational and inflectional morphemes

More information