Resolving Dependency Ambiguity of Subordinate Clauses using Support Vector Machines
Sang-Soo Kim, Seong-Bae Park, and Sang-Jo Lee

Abstract

In this paper, we propose a method of resolving dependency ambiguities of Korean subordinate clauses based on Support Vector Machines (SVMs). Dependency analysis of clauses is well known to be one of the most difficult tasks in parsing sentences, especially in Korean. To address this problem, we assume that the dependency relation of Korean subordinate clauses is the dependency relation among the verb phrases, that is, the verbs and endings, in the clauses. As a result, the problem is represented as a binary classification task. To apply SVMs to this problem, we selected two kinds of features: static and dynamic features. The experimental results on the STEP2000 corpus show that our system achieves an accuracy of 73.5%.

Keywords: Dependency analysis, subordinate clauses, binary classification, support vector machines.

I. INTRODUCTION

In Korean, the dependency analysis of clauses is known as one of the most difficult tasks in parsing sentences because of the characteristics of the language: (i) it has a partially free word order, (ii) the omission of components is common, (iii) it is a head-final language, and (iv) the spacing unit is a composite of one or more words. What makes clause dependency analysis especially difficult is the third factor. The endings (Eomi) can be freely combined with a verb, and they carry the semantic relationship with other verbs.

The steps of parsing Korean sentences are as follows. First, the input sentence is analyzed into morphemes, and then the part-of-speech (POS) of each morpheme is determined. Finally, the syntactic relation is analyzed using the results of the previous steps. Due to the characteristics of Korean, dependency grammar rather than phrase-structure grammar is generally used in parsing Korean [1]. This process is not much different from that of other languages.
However, since each word used in a sentence becomes a processing unit, the complexity of parsing grows too large, especially with long sentences, which results in severe ambiguities in parsing Korean. Recently, in order to solve this problem, enlarging the processing unit has gained much interest from researchers of Korean language processing. Many research results have been reported on Korean text chunking [2, 3], and they give relatively stable results. In addition, some researchers have studied how to find the boundaries of a clause [4]. When the clause boundaries are known, intra-clause parsing is a simple task compared to inter-clause parsing, because Korean is head-final.

This research was supported in part by MIC & IITA through IT Leading R&D Support Project, and by grant No. R from the Basic Research Program of the Korea Science & Engineering Foundation. The authors are with the Department of Computer Engineering, Kyungpook National University, Daegu, Korea (corresponding author e-mail: sskim@sejong.knu.ac.kr).

However, the relation between clauses is not determined by the information given by text chunking and clause boundaries. Due to the freedom of word ordering in Korean sentences, it is extremely difficult to determine the relation between clauses by looking only at the neighboring words. That is, it has been believed that the surface form of a clause is not sufficient for analyzing the relation among clauses. As a result, many previous works on parsing Korean have focused on how to use semantic information of verb phrases in determining the clause relation [1]. However, building semantic knowledge for the task is very expensive and time-consuming.

In this paper, we propose a novel method of analyzing the dependency relation of Korean subordinate clauses without external knowledge. For this task, we observed that the most important component in determining the dependencies is the base verb phrase, composed of a verb and a few endings, rather than the complement and supplement components within a clause.
Therefore, in order to solve the problem, we assume that the dependency relation of Korean subordinate clauses is the dependency relation of base verb phrases. In addition, we formulate the dependency analysis of Korean clauses as a binary classification task. As a classifier for this task, we adopt a support vector machine (SVM), which is widely regarded as one of the best classifiers for many kinds of real-world classification problems.

The rest of this paper is organized as follows. Section 2 surveys the previous work on clause recognition and the analysis of inter-clause relations, and Section 3 introduces how a support vector machine, which is adopted as the base learner for the task, works. Section 4 describes the proposed method for clausal dependency analysis using support vector machines. Section 5 explains the corpus used in the experiments and presents the experimental results. Finally, Section 6 draws conclusions and suggests some future work.
II. RELATED WORK

Studies related to analyzing the dependency relation of subordinate clauses fall into two areas: clause identification and dependency structure analysis. Dependency structure analysis is the task of recognizing the embeddedness of clauses, while clause identification is the task of finding the starting and ending points of clauses. In 2001, there was a competition for the latter task at the Conference on Computational Natural Language Learning (CoNLL). The two best methods were a boosting tree [5] and a hidden Markov model [6]. However, unlike in Western languages, in Korean the dependency relation is not easily determined even if the clause boundaries are identified.

Most previous work on dependency analysis of sentences has focused on words rather than clauses. That is, instead of finding the dependency relation among clauses, the relation among verb phrases within a clause has been the core of the research. Uchimoto et al. used a maximum entropy model and various kinds of features to identify the dependency structure of sentences [7]. They reported experimental results on the relationship between feature types and dependency analysis. Kudo and Matsumoto formulated the analysis of dependency structure as a binary classification task, and adopted support vector machines as a classifier [8]. The features used in training the support vector machines are grammatical features such as lexicons and part-of-speech tags, and some functional features such as functional words and inflection information. Gao and Suzuki solved the problem of analyzing the dependency relation by training a language model through unsupervised learning [9]. Utsuro et al. classified text chunks into several types according to the functional words of the final word in a sentence. With the classified type, they determined the dependency relation among chunks [10].

In Korean language processing, most research on syntactic analysis has focused on the Josa and Eomi, and their dependency relation. As a result, most works are based on hand-crafted rules [1].
In particular, research on subordinate clauses has addressed the recognition of simple sentences and the restoration of the omitted components in them. The first effort to use a machine learning algorithm in handling clauses was made for clause boundary detection. Lee et al. extracted n-gram information from a sentence, and then recognized the boundary of a clause using the information [11]. However, their work was limited to the detection of clauses, and did not suggest any method for analyzing their dependency. The main reason why machine learning algorithms are rarely used in handling Korean clauses is that there is no standard large-scale dataset for the task. Recently, substantial funding from the Korean government for building a large-scale tree-tagged corpus has made it possible to transform the corpus into data for clause detection and dependency analysis.

III. SUPPORT VECTOR MACHINES

The Support Vector Machine (SVM), proposed by Vapnik, is a machine learning algorithm that is well known as a highly successful binary classifier and has been applied to many classification tasks. In the field of natural language processing, it has been successfully applied to text categorization, spam-mail filtering, and chunk identification, and it is reported to achieve high performance without falling into over-fitting even with a large number of features [12, 13].

Assume that the training data, each with either a positive or a negative class, are given as follows:

$$(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l), \quad x_i \in R^n, \; y_i \in \{+1, -1\}$$

where $x_i$ is the feature vector of the i-th training datum in an n-dimensional space, and $y_i$ is its class label. In the basic SVM framework, the hyperplane is defined as follows:

$$(w \cdot x) + b = 0, \quad w \in R^n, \; b \in R.$$

According to this definition, there can be an infinite number of hyperplanes that separate the training data into two classes correctly.

Fig. 1 The margin of a hyperplane

Among such hyperplanes, we define the optimal hyperplane as the one with the largest margin between the two classes. Fig. 1 illustrates the notion of the margin.
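The solid line, the hyperplane, correctly divides the training data into two classes without misclassification. The two dashed lines, which are parallel to the hyperplane, mark the distance between the hyperplane and the closest instances. The distance between the two parallel dashed lines, d, is called the margin. Thus, assuming that the nearest distance is 1, that is, $|(w \cdot x) + b| = 1$ for the closest instances, the margin can be rewritten as:

$$d = \frac{|(w \cdot x^{+}) + b|}{\|w\|} + \frac{|(w \cdot x^{-}) + b|}{\|w\|} = \frac{2}{\|w\|}$$

where $x^{+}$ and $x^{-}$ denote the closest positive and negative instances. Therefore, the SVM generates a hyperplane that maximizes the margin by minimizing $\|w\|$ under the constraints:

$$y_i [(w \cdot x_i) + b] \ge 1, \quad i = 1, \ldots, l$$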
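The margin and the constraints above can be checked numerically. The following is a minimal sketch, not part of the original method: the toy data points and the values of `w` and `b` are hypothetical, chosen only so that the closest instances lie exactly at distance 1 in the canonical scaling.

```python
import math

def dot(w, x):
    """Inner product (w . x) of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))

def margin(w):
    """Margin d = 2 / ||w|| of a hyperplane in canonical form."""
    return 2.0 / math.sqrt(dot(w, w))

def satisfies_constraints(w, b, data):
    """Check y_i * ((w . x_i) + b) >= 1 for every training pair (x_i, y_i)."""
    return all(y * (dot(w, x) + b) >= 1 for x, y in data)

# Hypothetical toy data: one positive and one negative instance.
data = [((2.0, 2.0), +1), ((0.0, 0.0), -1)]
w, b = (0.5, 0.5), -1.0  # hypothetical hyperplane separating the two points

print(satisfies_constraints(w, b, data))
print(margin(w))
```

Under these assumed values both instances sit on the dashed lines, so the margin equals the Euclidean distance between the two points.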
SVMs have an advantage over conventional machine learning algorithms such as neural networks or decision trees: they show high generalization performance independent of the dimension of the feature vectors. Conventional machine learning algorithms usually require careful feature selection, which is often optimized heuristically, to avoid over-fitting. By introducing a kernel function, SVMs can also carry out their learning with all combinations of the given features without increasing computational complexity.

Fig. 2 An example of a dependency relation between clauses

IV. ANALYZING DEPENDENCY RELATION OF SUBORDINATE CLAUSES

A. The Probability Model and Generating Training Data

Let a sequence of clauses $\{c_1, c_2, \ldots, c_m\}$ be denoted by C, and the sequence of dependency patterns $\{Dep(1), Dep(2), \ldots, Dep(m-1)\}$ be denoted by D, where $Dep(i) = j$ implies that the clause $c_i$ modifies the clause $c_j$. In Korean, under this framework, the dependency relation has to satisfy some constraints. Each clause except the rightmost one has exactly one dependency relation; that is, a clause modifies only one clause. Dependency analysis is then defined as the problem of searching for the dependency pattern D that maximizes the conditional probability $P(D \mid C)$. That is,

$$D_{best} = \arg\max_{D} P(D \mid C)$$

If we assume that the dependency probabilities are mutually independent, $P(D \mid C)$ can be rewritten as:

$$P(D \mid C) = \prod_{i=1}^{m-1} P(Dep(i) = j \mid f_{ij}), \quad f_{ij} = \{f_1, \ldots, f_n\} \in R^n$$

where $f_{ij}$ is an n-dimensional feature vector that represents the relation between clauses $c_i$ and $c_j$. In order to use SVMs in analyzing the clausal dependency, we generate positive and negative examples. We adopt a simple and effective method for this purpose.
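The generation of positive and negative examples can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: clauses are indexed from 0, candidate heads are restricted to clauses on the right (since Korean is head-final), and the function name `make_training_pairs` and the `heads` list are hypothetical. Feature extraction for each pair is omitted.

```python
def make_training_pairs(heads):
    """Build labeled clause pairs from gold dependencies.

    heads[i] is the index j of the clause that clause i modifies (Dep(i) = j).
    The rightmost clause modifies nothing, so it has no entry in heads.
    Returns (i, j, label) triples with label +1 for a true dependency
    and -1 for any other candidate pair to the right of clause i.
    """
    m = len(heads) + 1  # total number of clauses in the sentence
    pairs = []
    for i, head in enumerate(heads):
        for j in range(i + 1, m):  # assumption: heads lie to the right
            label = +1 if j == head else -1
            pairs.append((i, j, label))
    return pairs

# Hypothetical three-clause sentence: clauses 0 and 1 both modify clause 2.
pairs = make_training_pairs([2, 2])
print(pairs)
```

In practice each `(i, j)` pair would be replaced by its feature vector before being passed to the classifier, and pairs whose left clause is not a candidate dependent could be filtered out.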
The training examples are defined as:

$$\bigcup_{1 \le i \le m-1,\; i+1 \le j \le m} (f_{ij}, y_{ij}) = \{(f_{12}, y_{12}), (f_{13}, y_{13}), \ldots, (f_{m-1\,m}, y_{m-1\,m})\}$$

$$f_{ij} = \{f_1, \ldots, f_n\} \in R^n, \quad y_{ij} \in \{\mathrm{Dep}(+1), \mathrm{Not\text{-}Dep}(-1)\}$$

TABLE I FEATURES USED FOR ANALYZING DEPENDENCY RELATION
Static features
  Lexicon information
    Left clause: a word of verb, POS tag of verb, a word of endings, POS tag of endings
    Right clause: a word of verb, POS tag of verb, a word of endings, POS tag of endings
  Position information
    Distance between left and right clauses
    Position index of left and right clauses
Dynamic features
  A syntactic relation between clauses

According to the above equation, we generate pairs of two clauses in the training data. A pair of clauses that are in a dependency relation is taken as a positive example, and two clauses that appear in the same sentence but are not in a dependency relation are taken as a negative example. Fig. 2 shows an example of dependency relation extraction between clauses. In this example, clauses 1, 2, and 3 mean "the shower room was moved to the gymnasium", "the office room was closed", and "a rest room was made in that place" (the original place of the shower room and the office room), respectively. In this case, we can generate one positive example (Case 2 in Fig. 2) and one negative example (Case 1 in Fig. 2).

B. Feature Selection for Analyzing Dependency Relation

In Korean, clauses are divided into three types: one that modifies another clause (called a conjunctive clause), one that modifies a noun phrase (called a prenominal clause), and one that marks the end of a sentence (called a final ending clause). Among these clause types, prenominal and conjunctive clauses form dependency relations. We focus on the dependency relation in which a conjunctive clause depends on another clause, because a prenominal clause forms a simple dependency relation: it modifies the next noun phrase that appears. The conjunctive clause makes the dependency relation very complex and, thus, difficult to recognize. The relation is determined not by simple syntactic information such as verb type and position in the sentence but by the
context of the sentence and the inflection of endings. In the previous section, we assumed that the dependency relation of Korean subordinate clauses is the dependency relation of the verb phrases, that is, the verbs and endings, in the clauses. According to this assumption, we select two kinds of features: static and dynamic features. The feature set is shown in Table I.

We define the lexicon and position information appearing in a sentence as the static information. The lexicon information consists of the words and POS tags of the verb and endings in the pair of left and right clauses. The positional information is the distance between the clauses, and the position index is the location of the clauses in a sentence. We expect that these static features weakly represent the semantic information between clauses. Fig. 3 shows the static features for Fig. 2.

No | Left clause (lexicon) | Right clause (lexicon) | Position information (distance, position index)
1 | 옮기/pvg 고/ecc | 하/px ㄴ/etm | 1, 1
2 | 옮기/pvg 고/ecc | 하/xsv 었/ep 다/ef | 2, 2
Fig. 3 The example static features

The dynamic features are the syntactic information in a sentence. Therefore, we build a simple CKY chart parser so that it captures the syntactic information in the sentence. Table II shows the rule set for the chart parser. CC denotes a conjunctive clause and PC denotes a prenominal clause.

TABLE II THE RULES USED FOR CHART PARSING
Rule 1: CC → CC CC
Rule 2: CC → PC CC
Rule 3: PC → PC PC
Rule 4: PC → CC PC

With dynamic features we can apply the syntactic relation of clauses to training the support vector machines. The syntactic relation states whether a clause is composed of just one simple clause or of more than one clause. Fig. 4 shows an example of analyzing the dependency relation using the dynamic feature. The static features of Case 1 and Case 2 are the same, but the dynamic features are different. The dynamic feature of Case 1 is PC CC, determined by Rule 2, but that of Case 2 is CC.

Fig. 4 An example of determining dependency relation using dynamic features

Table III shows the whole features for Fig. 3.
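To illustrate how the four chart-parsing rules combine clause types, the sketch below fills a CKY chart over a sequence of CC/PC labels. It is only an assumption-laden illustration: the paper does not specify how the dynamic feature is read off the chart, so the function `cky` and the example input are hypothetical and show rule application only.

```python
# Binary rules from the rule table, stored as (left, right) -> parent.
RULES = {
    ("CC", "CC"): "CC",  # Rule 1
    ("PC", "CC"): "CC",  # Rule 2
    ("PC", "PC"): "PC",  # Rule 3
    ("CC", "PC"): "PC",  # Rule 4
}

def cky(clause_types):
    """Fill a CKY chart; chart[(i, j)] holds the categories spanning clauses i..j-1."""
    n = len(clause_types)
    chart = {(i, i + 1): {t} for i, t in enumerate(clause_types)}
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            cell = set()
            for mid in range(start + 1, end):  # every split point of the span
                for left in chart[(start, mid)]:
                    for right in chart[(mid, end)]:
                        parent = RULES.get((left, right))
                        if parent is not None:
                            cell.add(parent)
            chart[(start, end)] = cell
    return chart

# Hypothetical clause-type sequence: conjunctive, prenominal, conjunctive.
chart = cky(["CC", "PC", "CC"])
print(chart[(0, 2)], chart[(0, 3)])
```

With these rules a span always takes the category of its rightmost clause, which is consistent with the head-final property noted earlier.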
TABLE III THE WHOLE FEATURES
No | Lexicon information (static) | Position information (static) | Dynamic features
1 | 옮기/pvg 고/ecc 하/px ㄴ/etm | 1, 1 | PC CC
2 | 옮기/pvg 고/ecc 하/px ㄴ/etm | 1, 1 | CC
3 | 옮기/pvg 고/ecc 하/xsv 었/ep 다/ef | 2, 2 | PC

V. EXPERIMENTS

For the evaluation of the proposed method, a dataset for dependency analysis of clauses in Korean was prepared. This dataset is derived from the parsed corpus that is a product of the STEP2000 project supported by the Korean government. The corpus consists of 6,934 sentences with 26,876 clauses. The corpus is divided into two parts: a training set (90%) and a test set (10%). Table IV shows simple statistics on the corpus.

TABLE IV COUNTS ON THE DATASET
Information | Training Set | Test Set
No. of all sentences | 6, |
No. of all clauses | 24,226 | 2,650
No. of prenominal and final ending clauses | 5,457 | ,666
No. of conjunctive clauses | 8, |
Fig. 5 shows an example of a dependency relation in the subordinate clause dataset. For the format of this dataset, we follow that of the CoNLL-2001 shared task and additionally add the dependency relation of clauses to it. Each instance in the training and test data consists of six columns. The first column contains the lexicon, and the second presents a part-of-speech tag. The third column contains the chunk tag. The verb phrases in these columns are used as static features. The fourth and fifth columns mark the beginning, S, and the ending, E, of clauses. The sixth column gives the relation index of clauses.

We apply SVM Light [14] as the support vector machine, and experiment on three cases. The first case uses only the words and POS tags of clauses, and the second uses all of the static features. The last case uses both static and dynamic features. The evaluation measure is defined as:

$$\text{Accuracy} = \frac{\text{number of correctly recognized dependency relations of clauses}}{\text{total number of dependency relations}} \times 100$$

When a clause forms candidate dependency pairs with more than one clause, we select the pair that has the largest margin. Table V shows the experimental results. The baseline is the model that determines the governor of a clause to be the nearest one.

TABLE V THE EXPERIMENTAL RESULTS
Features | Accuracy (%)
Baseline |
Case 1: Only words and POS tags of clauses |
Case 2: All of static features |
Case 3: All of static and dynamic features |

In Case 1, when only the words and POS tags of clauses are used, the accuracy is just 68.59%. That is, the proposed model improves 6.09% over the baseline. This implies that the verb and endings carry weak evidence of the dependency relation. The second case, which uses all of the static features, shows 68.59% accuracy. This means that the positional information in the static features affects the dependency relation. In the last case, the results with both static and dynamic features are far better than those without dynamic features. That is, the model with dynamic features outperforms that with static features only.
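The evaluation procedure described above (largest-margin selection, the nearest-clause baseline, and the accuracy measure) can be sketched as follows. This is a hedged illustration: the function names and the toy scores are hypothetical, and treating "largest margin" as the largest classifier decision value is an assumption.

```python
def select_heads(scores):
    """For each clause i, pick the candidate head j with the largest decision value.

    scores[i] maps each candidate head j to the decision value for pair (i, j).
    """
    return {i: max(cands, key=cands.get) for i, cands in scores.items()}

def nearest_baseline(num_clauses):
    """Baseline: the governor of clause i is the nearest following clause i + 1."""
    return {i: i + 1 for i in range(num_clauses - 1)}

def accuracy(predicted, gold):
    """Percentage of gold dependency relations that are predicted correctly."""
    correct = sum(1 for i, j in gold.items() if predicted.get(i) == j)
    return 100.0 * correct / len(gold)

# Hypothetical three-clause sentence: clauses 0 and 1 both modify clause 2.
gold = {0: 2, 1: 2}
scores = {0: {1: -0.4, 2: 0.9}, 1: {2: 0.3}}  # made-up decision values

model_acc = accuracy(select_heads(scores), gold)
baseline_acc = accuracy(nearest_baseline(3), gold)
print(model_acc, baseline_acc)
```

On this toy input the baseline attaches clause 0 to clause 1 and so gets only one of the two relations right, which shows why a long-distance dependency is where the classifier can help.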
The performance of our approach is slightly lower than that of other studies that analyze the dependency relation in Japanese and European languages. This seems to be because our approach selects only the relations between clauses, without the relations between words and phrases. It is easier to analyze the relations of words and phrases than to analyze the relations of clauses.

VI. CONCLUSION

We have proposed a method for analyzing the dependency relation of Korean subordinate clauses based on Support Vector Machines (SVMs). In order to solve this problem, we assume that the dependency relation of Korean subordinate clauses is the dependency relation of the verb phrases, that is, the verbs and endings, in the clauses. We formulate this problem as a binary classification task. We selected two kinds of features, static and dynamic, for applying SVMs to this problem. The static features are the word, the POS tag, and the positional information, while the dynamic features include the syntactic information of the clauses. For extracting the dynamic information, we build a simple CKY chart parser with simple rules. The experimental results on the STEP2000 corpus show that our system achieves an accuracy of 73.5%.

No | Lexicon | POS | Chunk | S | E | Relation | Gloss
1 | 샤워실 | ncn | B-NP | S | X | 0 | shower room
2 | 을 | jco | I-NP | X | X | 0 | POST
3 | 체육관 | ncn | B-NP | X | X | 0 | gymnasium
4 | 으로 | jca | I-NP | X | X | 0 | POST
5 | 옮기 | pvg | B-VP | X | X | 0 | move
6 | 고 | ecc | I-NP | X | E | 1 | ENDING
7 | 사무실 | ncn | B-NP | S | X | 0 | office room
8 | 을 | jco | I-NP | X | X | 0 | POST
9 | 폐쇄 | ncpa | B-VP | X | X | 0 | closed
10 | 하 | xsv | I-VP | X | X | 0 | ENDING
11 | ㄴ | etm | I-VP | X | E | 2 | ENDING
12 | 그곳 | npd | B-NP | X | X | 0 | that place
13 | 에 | jca | I-NP | X | X | 0 | ENDING
14 | 휴게실 | ncn | B-NP | X | X | 0 | rest room
15 | 을 | jco | I-NP | X | X | 0 | POST
16 | 만들 | pvg | B-VP | X | X | 0 | make
17 | 었 | ep | I-VP | X | X | 0 | ENDING
18 | 다 | ef | I-VP | X | X | 0 | ENDING
19 | . | sf | O | X | E | - |
Fig. 5 An example of dependency relation in the subordinate clause dataset

REFERENCES

[1] K.-J. Seo, A Korean language parser using syntactic dependency relations between word-phrases, M.S. Thesis, KAIST, 1993.
[2] S.-B. Park and B.-T. Zhang, Text Chunking by Combining Hand-Crafted Rules and Memory-Based Learning, In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics.
[3] H.-P. Shin, Maximally Efficient Syntactic Parsing with Minimal Resources, In Proceedings of the Conference on Hangul and Korean Language Information Processing, 1999. (In Korean)
[4] H.-J. Lee, S.-B. Park, S.-J. Lee, and S.-Y. Park, Clause Boundary Recognition Using Support Vector Machines, In Proceedings of the 9th Pacific Rim International Conference on Artificial Intelligence.
[5] X. Carreras and L. Marquez, Boosting Trees for Clause Splitting, In Proceedings of the 5th Conference on Computational Natural Language Learning, 2001.
[6] A. Molina and F. Pla, Clause Detection using HMM, In Proceedings of the 5th Conference on Computational Natural Language Learning, 2001.
[7] K. Uchimoto, S. Sekine, and H. Isahara, Japanese Dependency Structure Analysis Based on Maximum Entropy Models, In Proceedings of the 9th Conference of the European Chapter of the Association for Computational Linguistics, 1999.
[8] T. Kudo and Y. Matsumoto, Japanese Dependency Structure Analysis Based on Support Vector Machines, In Proceedings of the Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora.
[9] J. Gao and H. Suzuki, Unsupervised Learning of Dependency Structure for Language Modeling, In Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics.
[10] T. Utsuro, S. Nishiokayama, M. Fujio, and Y. Matsumoto, Analyzing Dependencies of Japanese Subordinate Clauses based on Statistics of Scope Embedding Preference, In Proceedings of the 1st Conference on North American Chapter of the Association for Computational Linguistics.
[11] H.-J. Lee, S.-B. Park, S.-J. Lee, and S.-Y. Park, Clause Boundary Recognition Using Support Vector Machines, In Proceedings of the 9th Pacific Rim International Conference on Artificial Intelligence.
[12] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press.
[13] T. Joachims, Text Categorization with Support Vector Machines: Learning with Many Relevant Features, In Proceedings of the European Conference on Machine Learning, 1998.
[14] T. Joachims, Making Large-Scale SVM Learning Practical, LS8, Universitaet Dortmund.
More informationDisambiguation of Thai Personal Name from Online News Articles
Disambiguation of Thai Personal Name from Online News Articles Phaisarn Sutheebanjard Graduate School of Information Technology Siam University Bangkok, Thailand mr.phaisarn@gmail.com Abstract Since online
More informationThe presence of interpretable but ungrammatical sentences corresponds to mismatches between interpretive and productive parsing.
Lecture 4: OT Syntax Sources: Kager 1999, Section 8; Legendre et al. 1998; Grimshaw 1997; Barbosa et al. 1998, Introduction; Bresnan 1998; Fanselow et al. 1999; Gibson & Broihier 1998. OT is not a theory
More informationLoughton School s curriculum evening. 28 th February 2017
Loughton School s curriculum evening 28 th February 2017 Aims of this session Share our approach to teaching writing, reading, SPaG and maths. Share resources, ideas and strategies to support children's
More informationPredicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks
Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Devendra Singh Chaplot, Eunhee Rhim, and Jihie Kim Samsung Electronics Co., Ltd. Seoul, South Korea {dev.chaplot,eunhee.rhim,jihie.kim}@samsung.com
More informationAssessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2
Assessing System Agreement and Instance Difficulty in the Lexical Sample Tasks of SENSEVAL-2 Ted Pedersen Department of Computer Science University of Minnesota Duluth, MN, 55812 USA tpederse@d.umn.edu
More informationLearning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models
Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za
More informationHuman Emotion Recognition From Speech
RESEARCH ARTICLE OPEN ACCESS Human Emotion Recognition From Speech Miss. Aparna P. Wanare*, Prof. Shankar N. Dandare *(Department of Electronics & Telecommunication Engineering, Sant Gadge Baba Amravati
More informationOnline Updating of Word Representations for Part-of-Speech Tagging
Online Updating of Word Representations for Part-of-Speech Tagging Wenpeng Yin LMU Munich wenpeng@cis.lmu.de Tobias Schnabel Cornell University tbs49@cornell.edu Hinrich Schütze LMU Munich inquiries@cislmu.org
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview
More informationMatching Similarity for Keyword-Based Clustering
Matching Similarity for Keyword-Based Clustering Mohammad Rezaei and Pasi Fränti University of Eastern Finland {rezaei,franti}@cs.uef.fi Abstract. Semantic clustering of objects such as documents, web
More informationCalibration of Confidence Measures in Speech Recognition
Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE
More informationChunk Parsing for Base Noun Phrases using Regular Expressions. Let s first let the variable s0 be the sentence tree of the first sentence.
NLP Lab Session Week 8 October 15, 2014 Noun Phrase Chunking and WordNet in NLTK Getting Started In this lab session, we will work together through a series of small examples using the IDLE window and
More informationTarget Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data
Target Language Preposition Selection an Experiment with Transformation-Based Learning and Aligned Bilingual Data Ebba Gustavii Department of Linguistics and Philology, Uppsala University, Sweden ebbag@stp.ling.uu.se
More informationProcedia - Social and Behavioral Sciences 141 ( 2014 ) WCLTA Using Corpus Linguistics in the Development of Writing
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 141 ( 2014 ) 124 128 WCLTA 2013 Using Corpus Linguistics in the Development of Writing Blanka Frydrychova
More informationTowards a MWE-driven A* parsing with LTAGs [WG2,WG3]
Towards a MWE-driven A* parsing with LTAGs [WG2,WG3] Jakub Waszczuk, Agata Savary To cite this version: Jakub Waszczuk, Agata Savary. Towards a MWE-driven A* parsing with LTAGs [WG2,WG3]. PARSEME 6th general
More informationDistant Supervised Relation Extraction with Wikipedia and Freebase
Distant Supervised Relation Extraction with Wikipedia and Freebase Marcel Ackermann TU Darmstadt ackermann@tk.informatik.tu-darmstadt.de Abstract In this paper we discuss a new approach to extract relational
More informationBANGLA TO ENGLISH TEXT CONVERSION USING OPENNLP TOOLS
Daffodil International University Institutional Repository DIU Journal of Science and Technology Volume 8, Issue 1, January 2013 2013-01 BANGLA TO ENGLISH TEXT CONVERSION USING OPENNLP TOOLS Uddin, Sk.
More informationThe Internet as a Normative Corpus: Grammar Checking with a Search Engine
The Internet as a Normative Corpus: Grammar Checking with a Search Engine Jonas Sjöbergh KTH Nada SE-100 44 Stockholm, Sweden jsh@nada.kth.se Abstract In this paper some methods using the Internet as a
More informationProduct Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments
Product Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments Vijayshri Ramkrishna Ingale PG Student, Department of Computer Engineering JSPM s Imperial College of Engineering &
More informationA Vector Space Approach for Aspect-Based Sentiment Analysis
A Vector Space Approach for Aspect-Based Sentiment Analysis by Abdulaziz Alghunaim B.S., Massachusetts Institute of Technology (2015) Submitted to the Department of Electrical Engineering and Computer
More informationExperiments with a Higher-Order Projective Dependency Parser
Experiments with a Higher-Order Projective Dependency Parser Xavier Carreras Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) 32 Vassar St., Cambridge,
More informationAn Interactive Intelligent Language Tutor Over The Internet
An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This
More informationLTAG-spinal and the Treebank
LTAG-spinal and the Treebank a new resource for incremental, dependency and semantic parsing Libin Shen (lshen@bbn.com) BBN Technologies, 10 Moulton Street, Cambridge, MA 02138, USA Lucas Champollion (champoll@ling.upenn.edu)
More informationWord Segmentation of Off-line Handwritten Documents
Word Segmentation of Off-line Handwritten Documents Chen Huang and Sargur N. Srihari {chuang5, srihari}@cedar.buffalo.edu Center of Excellence for Document Analysis and Recognition (CEDAR), Department
More informationCS 598 Natural Language Processing
CS 598 Natural Language Processing Natural language is everywhere Natural language is everywhere Natural language is everywhere Natural language is everywhere!"#$%&'&()*+,-./012 34*5665756638/9:;< =>?@ABCDEFGHIJ5KL@
More informationChinese Language Parsing with Maximum-Entropy-Inspired Parser
Chinese Language Parsing with Maximum-Entropy-Inspired Parser Heng Lian Brown University Abstract The Chinese language has many special characteristics that make parsing difficult. The performance of state-of-the-art
More informationGCSE. Mathematics A. Mark Scheme for January General Certificate of Secondary Education Unit A503/01: Mathematics C (Foundation Tier)
GCSE Mathematics A General Certificate of Secondary Education Unit A503/0: Mathematics C (Foundation Tier) Mark Scheme for January 203 Oxford Cambridge and RSA Examinations OCR (Oxford Cambridge and RSA)
More informationUNIVERSITY OF OSLO Department of Informatics. Dialog Act Recognition using Dependency Features. Master s thesis. Sindre Wetjen
UNIVERSITY OF OSLO Department of Informatics Dialog Act Recognition using Dependency Features Master s thesis Sindre Wetjen November 15, 2013 Acknowledgments First I want to thank my supervisors Lilja
More informationSINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,
More informationCross Language Information Retrieval
Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................
More informationA Graph Based Authorship Identification Approach
A Graph Based Authorship Identification Approach Notebook for PAN at CLEF 2015 Helena Gómez-Adorno 1, Grigori Sidorov 1, David Pinto 2, and Ilia Markov 1 1 Center for Computing Research, Instituto Politécnico
More informationMultilingual Sentiment and Subjectivity Analysis
Multilingual Sentiment and Subjectivity Analysis Carmen Banea and Rada Mihalcea Department of Computer Science University of North Texas rada@cs.unt.edu, carmen.banea@gmail.com Janyce Wiebe Department
More informationBuilding a Semantic Role Labelling System for Vietnamese
Building a emantic Role Labelling ystem for Vietnamese Thai-Hoang Pham FPT University hoangpt@fpt.edu.vn Xuan-Khoai Pham FPT University khoaipxse02933@fpt.edu.vn Phuong Le-Hong Hanoi University of cience
More informationLecture 1: Machine Learning Basics
1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3
More informationA General Class of Noncontext Free Grammars Generating Context Free Languages
INFORMATION AND CONTROL 43, 187-194 (1979) A General Class of Noncontext Free Grammars Generating Context Free Languages SARWAN K. AGGARWAL Boeing Wichita Company, Wichita, Kansas 67210 AND JAMES A. HEINEN
More informationExtracting and Ranking Product Features in Opinion Documents
Extracting and Ranking Product Features in Opinion Documents Lei Zhang Department of Computer Science University of Illinois at Chicago 851 S. Morgan Street Chicago, IL 60607 lzhang3@cs.uic.edu Bing Liu
More informationChapter 2 Rule Learning in a Nutshell
Chapter 2 Rule Learning in a Nutshell This chapter gives a brief overview of inductive rule learning and may therefore serve as a guide through the rest of the book. Later chapters will expand upon the
More informationMachine Learning from Garden Path Sentences: The Application of Computational Linguistics
Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationNatural Language Processing. George Konidaris
Natural Language Processing George Konidaris gdk@cs.brown.edu Fall 2017 Natural Language Processing Understanding spoken/written sentences in a natural language. Major area of research in AI. Why? Humans
More informationINPE São José dos Campos
INPE-5479 PRE/1778 MONLINEAR ASPECTS OF DATA INTEGRATION FOR LAND COVER CLASSIFICATION IN A NEDRAL NETWORK ENVIRONNENT Maria Suelena S. Barros Valter Rodrigues INPE São José dos Campos 1993 SECRETARIA
More informationA Named Entity Recognition Method using Rules Acquired from Unlabeled Data
A Named Entity Recognition Method using Rules Acquired from Unlabeled Data Tomoya Iwakura Fujitsu Laboratories Ltd. 1-1, Kamikodanaka 4-chome, Nakahara-ku, Kawasaki 211-8588, Japan iwakura.tomoya@jp.fujitsu.com
More informationThe 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X
The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, 2013 10.12753/2066-026X-13-154 DATA MINING SOLUTIONS FOR DETERMINING STUDENT'S PROFILE Adela BÂRA,
More informationParallel Evaluation in Stratal OT * Adam Baker University of Arizona
Parallel Evaluation in Stratal OT * Adam Baker University of Arizona tabaker@u.arizona.edu 1.0. Introduction The model of Stratal OT presented by Kiparsky (forthcoming), has not and will not prove uncontroversial
More informationGCSE Mathematics B (Linear) Mark Scheme for November Component J567/04: Mathematics Paper 4 (Higher) General Certificate of Secondary Education
GCSE Mathematics B (Linear) Component J567/04: Mathematics Paper 4 (Higher) General Certificate of Secondary Education Mark Scheme for November 2014 Oxford Cambridge and RSA Examinations OCR (Oxford Cambridge
More informationMETHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS
METHODS FOR EXTRACTING AND CLASSIFYING PAIRS OF COGNATES AND FALSE FRIENDS Ruslan Mitkov (R.Mitkov@wlv.ac.uk) University of Wolverhampton ViktorPekar (v.pekar@wlv.ac.uk) University of Wolverhampton Dimitar
More informationThe Discourse Anaphoric Properties of Connectives
The Discourse Anaphoric Properties of Connectives Cassandre Creswell, Kate Forbes, Eleni Miltsakaki, Rashmi Prasad, Aravind Joshi Λ, Bonnie Webber y Λ University of Pennsylvania 3401 Walnut Street Philadelphia,
More informationApproaches to control phenomena handout Obligatory control and morphological case: Icelandic and Basque
Approaches to control phenomena handout 6 5.4 Obligatory control and morphological case: Icelandic and Basque Icelandinc quirky case (displaying properties of both structural and inherent case: lexically
More informationDEVELOPMENT OF A MULTILINGUAL PARALLEL CORPUS AND A PART-OF-SPEECH TAGGER FOR AFRIKAANS
DEVELOPMENT OF A MULTILINGUAL PARALLEL CORPUS AND A PART-OF-SPEECH TAGGER FOR AFRIKAANS Julia Tmshkina Centre for Text Techitology, North-West University, 253 Potchefstroom, South Africa 2025770@puk.ac.za
More informationThe Karlsruhe Institute of Technology Translation Systems for the WMT 2011
The Karlsruhe Institute of Technology Translation Systems for the WMT 2011 Teresa Herrmann, Mohammed Mediani, Jan Niehues and Alex Waibel Karlsruhe Institute of Technology Karlsruhe, Germany firstname.lastname@kit.edu
More information2/15/13. POS Tagging Problem. Part-of-Speech Tagging. Example English Part-of-Speech Tagsets. More Details of the Problem. Typical Problem Cases
POS Tagging Problem Part-of-Speech Tagging L545 Spring 203 Given a sentence W Wn and a tagset of lexical categories, find the most likely tag T..Tn for each word in the sentence Example Secretariat/P is/vbz
More informationHeuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger
Page 1 of 35 Heuristic Sample Selection to Minimize Reference Standard Training Set for a Part-Of-Speech Tagger Kaihong Liu, MD, MS, Wendy Chapman, PhD, Rebecca Hwa, PhD, and Rebecca S. Crowley, MD, MS
More informationExtracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models
Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models Richard Johansson and Alessandro Moschitti DISI, University of Trento Via Sommarive 14, 38123 Trento (TN),
More informationSom and Optimality Theory
Som and Optimality Theory This article argues that the difference between English and Norwegian with respect to the presence of a complementizer in embedded subject questions is attributable to a larger
More informationContext Free Grammars. Many slides from Michael Collins
Context Free Grammars Many slides from Michael Collins Overview I An introduction to the parsing problem I Context free grammars I A brief(!) sketch of the syntax of English I Examples of ambiguous structures
More informationThe Smart/Empire TIPSTER IR System
The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of
More information