The effect of using a thesaurus in Arabic information retrieval system
ISSN (Online)

Mohammad Wedyan, Basim Alhadidi and Adnan Alrabea
Computer Science Department, Al-Balqa Applied University, Al-Salt, Jordan

Abstract

Automatic query expansion methods for text retrieval in English and other languages have been studied for a long time. In this research we study the retrieval effectiveness achieved when a successful automatic query expansion method, based on an automatic thesaurus, is applied to Arabic text retrieval. Our experiments, which used a sample of abstracts of Arabic documents, show that the automatic query expansion method resulted in a notable improvement in Arabic text retrieval. The study showed that the use of a thesaurus improved the information retrieval system by 10% to 20%. The study also shows that the greater the number of documents used to build the thesaurus, the more accurate the thesaurus is.

Keywords: Arabic retrieval, thesaurus, stop words, indexing, information retrieval system.

1. Introduction

Arabic is the language that holds the miracle of the Holy Quran, and it met all the requirements of Arabic and Islamic civilization at the peak of its flourishing. Arabic books on medicine and science were the main reference books in the West and in most of its important universities [1]. Internationally, Arabic gained full acceptance and recognition and became an accredited language in UN institutions alongside the five languages previously used [1].

Arabic has many properties. First, the Arabic language consists of 28 letters, 16 of which have one, two or three dots. Second, writing runs from right to left. Third, there are varying ways of writing: text may be completely mashkool (all signs of tashkeel are used), partially mashkool, or not mashkool. Fourth, letters change their shape according to where they occur in the word. Fifth, the language is dual, with formal and informal forms. Sixth, the grammar is flexible: words may be arranged in many different ways [2].

Experimental results show that spelling normalization and stemming can significantly improve Arabic monolingual retrieval.
Character tri-grams from stems improved retrieval modestly on the test corpus, but the improvement is not statistically significant [3]. This study therefore examines the effect of using a thesaurus on the information retrieval system (IRS) and compares the system that uses an automatic thesaurus with the traditional system.

2. Evaluating information retrieval systems

Any retrieval system is usually evaluated according to its efficiency and effectiveness. Efficiency has two aspects, time and space. Time is the speed of matching the queries in use against the document descriptions. Space is the disk space that the system needs. Effectiveness is determined by the ability of the system to return documents relevant to the user query. A perfect system returns all the documents that are relevant to the query and never returns an irrelevant one. The difficulty lies in the determination of relevance, because judging the relevance of documents is a subjective process [4]. A person's decision depends on many factors, experience for example. A professional in a certain field may see the general information retrieved from a system as irrelevant, while an amateur (beginner) sees it as fully relevant. This may increase the variation in relevance judgments. In research, the determination of relevance is usually treated as an objective process [4]. We assume here that the evaluation process is objective and agreed on in advance. The criteria used to evaluate the performance of a system, and the most commonly used measurements of retrieval performance, are precision and recall [4].
Precision measures the ability of the system to retrieve only the documents that are relevant to a query [4]:

\[ \text{Precision} = \frac{\text{number of relevant documents retrieved}}{\text{number of documents retrieved}} \]

Recall measures the ability of the system to retrieve all documents that are relevant to a query [4]:

\[ \text{Recall} = \frac{\text{number of relevant documents retrieved}}{\text{number of relevant documents in the collection}} \]

3. Indexing

Indexing is defined as the process of choosing a term or a number of terms that can represent what the document contains. These terms are called index terms [3].
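The precision and recall measures of Section 2 can be illustrated with a small sketch. The function name `precision_recall` and the document identifiers are hypothetical and chosen only for this example; returning 0.0 for an empty result set or an empty relevant set is an assumption, since the paper does not discuss that case.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: iterable of document ids returned by the system.
    relevant:  iterable of document ids judged relevant in the collection.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant documents retrieved

    # Assumption: an empty denominator yields 0.0 instead of raising an error.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: 4 documents retrieved, 3 of them relevant,
# 6 relevant documents in the whole collection.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d1", "d2", "d3", "d7", "d8", "d9"]
precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case three of the four retrieved documents are relevant, and three of the six relevant documents are found, so the two measures differ even though the numerator is the same.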
Indexing can be performed either manually (manual indexing) or by computer software (automatic indexing) [4]. Manual indexing has some weaknesses. The person who performs the indexing must have complete knowledge of what the document contains and what it is about. The result may vary because indexers differ in experience. This increases cost [5]. This research uses automatic indexing, so it is our focus.

3-2 Automatic Indexing

The first step in indexing is lexical analysis, the process of converting the text into a group of separate words. Each word is called a token; a token is a group of letters. Lexical analysis is also the first step in query analysis [6]. Lexical analysis may also yield idioms that can be used as index terms, so that a suitable index term is assigned to reach the suitable document [6]. Then comes the process of separating out unnecessary words, called stop words, such as (قد) and (هذا); they are repeated in all documents and texts. The importance of this step is discussed later in this study.

3-3 Eliminating Stop Words

Stop words are words that are repeated in every document, so they are too weak to discriminate: we cannot distinguish the content of a text by relying on them [5]. Eliminating them has other benefits: it shortens the indexing structure [7], makes processing faster, and does not harm information retrieval or the recall of the system [6]. It also avoids burdening the system with unnecessary information [8]. It is not always clear which words can be considered stop words and which cannot. Traditional methods treat words that are repeated many times as stop words, but some words are repeated in a certain document and are nevertheless important index terms. The situation changes when the subject is more specialized, for example a collection specialized in databases.
In that case, frequently repeated words such as "computer", "language" and "engines" are useless as index terms [6]. The other way is to keep the stop words in a list. Each token that results from lexical analysis is looked up separately in the list; if it is in the list, it is ignored and not processed further [6]. Arabic is very rich in lexical tokens, which means that stop words are available in large quantities [8]. Swaie lists several characteristics of stop words in his book. First, they have no meaning when used separately. Second, they appear many times in a text. Third, they are necessary for the construction of the language. Fourth, they are mostly adjectives. Fifth, they are general words that are not particular to a certain field. Sixth, no researcher asks about such words. Seventh, they never form a full sentence when used alone.

4. Thesaurus

A thesaurus is an efficient tool in an IRS, especially in modern systems, both in indexing and in searching, where it helps extend queries by using more suitable tokens [4]. Constructing thesauruses greatly benefits an IRS: it strengthens precision and the control of terms in document indexing and retrieval, supports the use of the best terms, and helps the user reformulate his queries if necessary [6]. Put simply, a thesaurus consists of a list of the important words in a certain subject, in which each word is connected with other words in the list [7]. Most thesauruses in use have been built manually, relying on experts in certain fields or on experts in document description.
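The list-based stop-word elimination of Section 3-3 (tokenize, look each token up in a list, ignore it if found) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `STOP_WORDS`, `tokenize` and `remove_stop_words` are hypothetical, the list holds only the two stop words quoted in the paper, and tokens are taken to be runs of word characters, which does not handle mashkool (vocalized) text because the tashkeel marks would split a word.

```python
import re

# Hypothetical, deliberately tiny list holding only the two stop words
# quoted in the paper; a real Arabic stop-word list is far longer.
STOP_WORDS = {"قد", "هذا"}


def tokenize(text):
    """Lexical analysis: split the text into tokens.

    Assumption: a token is a run of Unicode word characters. Tashkeel
    marks are not word characters, so vocalized text needs normalizing first.
    """
    return re.findall(r"\w+", text)


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop-word list."""
    return [token for token in tokens if token not in stop_words]


# Hypothetical unvocalized sentence: "this [is an] information retrieval system".
sentence = "هذا نظام استرجاع المعلومات"
index_candidates = remove_stop_words(tokenize(sentence))
print(index_candidates)
```

Keeping the list in a set makes each lookup cheap, which matters because the check runs once per token during both indexing and query analysis.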
Building thesauruses manually costs a great deal of time and money, and the result may be subjective, because the person who builds the thesaurus may make his own choices, which may affect its construction. We therefore need automatic thesaurus construction, which saves time, effort and cost and makes the results more objective and easier to modify in the future [4]. Taking this into consideration, we build an automatic thesaurus, which has many benefits over a manual one [7]. It supports a standard vocabulary for indexing and searching. It helps the user put down suitable expressions in queries. It supports different hierarchies, as it allows broadening or narrowing the query according to the user's needs.

4.1 Automatic Thesaurus Construction

In vector space models, documents are represented by vectors as below:

\[ D_j = (w_{1,j}, w_{2,j}, w_{3,j}, \ldots, w_{t,j}) \]

where t is the total number of index terms, w_{i,j} is the weight of term i in document j, and D_j is the vector for document j. The weights can be computed by this equation [7]:

\[ w_{i,j} = f_{i,j} \times \log \frac{N}{n_i} \]

where N is the number of documents in the system, n_i is the number of documents in which term i appears, and f_{i,j} is the normalized frequency, computed by
\[ f_{i,j} = \frac{\mathrm{freq}_{i,j}}{\max_{l}\,\mathrm{freq}_{l,j}} \]

where freq_{i,j} is the number of times term i appears in the text of document j, and the maximum is computed over all terms l that are mentioned in the text of document d_j [7]. The vectors of a group of m documents can be represented as follows:

\[
\begin{array}{c|cccc}
 & T_1 & T_2 & \cdots & T_t \\ \hline
D_1 & w_{1,1} & w_{2,1} & \cdots & w_{t,1} \\
D_2 & w_{1,2} & w_{2,2} & \cdots & w_{t,2} \\
D_3 & w_{1,3} & w_{2,3} & \cdots & w_{t,3} \\
\vdots & \vdots & \vdots & & \vdots \\
D_m & w_{1,m} & w_{2,m} & \cdots & w_{t,m}
\end{array}
\]

Figure (1) Documents vectors

Then comes the step of calculating the similarity between index terms, using any of the similarity equations. For this step the same weights are arranged with one row per term:

\[
\begin{array}{c|cccc}
 & D_1 & D_2 & \cdots & D_m \\ \hline
T_1 & w_{1,1} & w_{1,2} & \cdots & w_{1,m} \\
T_2 & w_{2,1} & w_{2,2} & \cdots & w_{2,m} \\
\vdots & \vdots & \vdots & & \vdots \\
T_t & w_{t,1} & w_{t,2} & \cdots & w_{t,m}
\end{array}
\]

The cosine similarity between terms T_j and T_k, where j and k index terms and d indexes documents, is

\[ S_{j,k} = \frac{\sum_{d=1}^{m} w_{j,d}\, w_{k,d}}{\sqrt{\sum_{d=1}^{m} w_{j,d}^{2}}\;\sqrt{\sum_{d=1}^{m} w_{k,d}^{2}}} \]

Figure (2) Computing the term-term similarity

Applying this equation to every pair of index terms produces a matrix like the following:

\[
\begin{array}{c|cccc}
 & T_1 & T_2 & \cdots & T_t \\ \hline
T_1 & S_{1,1} & S_{1,2} & \cdots & S_{1,t} \\
T_2 & S_{2,1} & S_{2,2} & \cdots & S_{2,t} \\
T_3 & S_{3,1} & S_{3,2} & \cdots & S_{3,t} \\
\vdots & \vdots & \vdots & & \vdots \\
T_t & S_{t,1} & S_{t,2} & \cdots & S_{t,t}
\end{array}
\]

Figure (3) The term-term similarity

S_{j,k} represents the similarity between term T_j and term T_k. The similarity matrix is symmetric, because the similarity between (Tx) and (Ty) equals the similarity between (Ty) and (Tx).

5. Related Work

Arabic efforts in developing thesauruses have been very few, but the theoretical work, even though very limited, supported and opened new paths for building an Arabic thesaurus. The first trials in this field were translations of foreign thesauruses. Examples are the list of Arabic Idioms prepared by the Industrial Development Center for the Arab World in 1970, and the Islamic thesaurus, which was built manually [9].

Several studies have addressed IRS and thesaurus building. Abu Salem (1992), for example, studied IR in the Arabic language. His study was based on 120 documents he received from the Saudi Arabia National Computer Conference and on 32 queries. In his research, he studied indexing by using full words and by using the roots only. He found that using the roots is superior to other ways.
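To make the construction of Section 4.1 concrete, the following sketch computes the weights w_{i,j} = f_{i,j} × log(N / n_i), builds the term-term cosine similarity matrix, and uses it to expand a query. It is an illustration under stated assumptions rather than the authors' code: the function names, the toy English documents and the similarity threshold are all hypothetical, the natural logarithm is assumed because the paper does not state the base, and the way related terms are added to the query is an assumption, since the paper does not spell out the expansion rule.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """docs: list of token lists. Returns (terms, W), where W[i][j] is the
    weight of term i in document j: f_ij * log(N / n_i)."""
    N = len(docs)
    counts = [Counter(doc) for doc in docs]
    terms = sorted(set().union(*docs))
    # n_i: number of documents in which term i appears
    doc_freq = {t: sum(1 for c in counts if t in c) for t in terms}
    W = []
    for t in terms:
        row = []
        for c in counts:
            max_freq = max(c.values()) if c else 1
            f = c[t] / max_freq  # normalized frequency f_ij
            row.append(f * math.log(N / doc_freq[t]))  # natural log assumed
        W.append(row)
    return terms, W


def cosine(u, v):
    """Cosine similarity of two weight vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def term_similarity(W):
    """Term-term matrix S, with S[j][k] the cosine of term rows j and k."""
    return [[cosine(u, v) for v in W] for u in W]


def expand_query(query_terms, terms, S, threshold=0.5):
    """Hypothetical expansion rule: add every term whose similarity to a
    query term reaches the threshold (the threshold is an assumption)."""
    index = {t: i for i, t in enumerate(terms)}
    expanded = list(query_terms)
    for q in query_terms:
        if q not in index:
            continue
        for k, t in enumerate(terms):
            if t not in expanded and S[index[q]][k] >= threshold:
                expanded.append(t)
    return expanded


# Hypothetical toy collection of three tokenized documents.
docs = [
    ["retrieval", "thesaurus", "query"],
    ["retrieval", "thesaurus"],
    ["query", "index"],
]
terms, W = tfidf_weights(docs)
S = term_similarity(W)
print(expand_query(["retrieval"], terms, S, threshold=0.9))
```

In this toy collection "retrieval" and "thesaurus" occur in exactly the same documents, so their similarity is close to 1 and the query is expanded with "thesaurus". A term that occurs in every document gets weight 0 everywhere and therefore similarity 0 with all terms, which matches the earlier point that ubiquitous words do not discriminate.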
He also built a manual thesaurus, using the relations between expressions, to test the possibility of supporting an IRS through this thesaurus. He found that the thesaurus makes IR much better. Another example is the General Thesaurus presented by the UN Aid Program, the Program of Authorization in the Arabic World (2003). It starts from synonyms, which help the researcher choose the expressions he has to look for. This thesaurus also includes the relations of origin and branches and those of contextualization between expressions. This helps in broadening the search: if the search has no
matches when a certain expression is used, the researcher can use either broader terms or narrower ones. Synonyms are the first step in this thesaurus.

Kanaan and Wedyan (2006) based their study on 242 documents they received from the Saudi Arabia National Computer Conference and on 24 queries. In their research, they studied indexing by using full words and by using the roots. They found that using the roots is superior to other ways. They also built an automatic thesaurus, using the relations between expressions, to test the possibility of supporting an IRS through a thesaurus. They found that the thesaurus improves IR by between 1% and 10%.

6. Conclusions

This study aims at reinforcing IRS for Arabic. The study was based on 500 documents and 35 queries. The documents were given to a group of students who have certain links with those subjects, to determine the documents relevant to each query. Based on the students' judgments, the results were analyzed using the criteria of precision and recall and the smoothing algorithm that was used by Abu Salem (1992) and by Kanaan (1997). Average recall precision was calculated.

Table (1) Average recall precision without the thesaurus and with the thesaurus, and the improvement (%), showing how much better the results were when the thesaurus was used.

Figure (4) A comparison between the values of average recall precision when full words were used with and without the thesaurus, showing how much better the results were when full words were used with the thesaurus.

The previous chart shows the effect of the thesaurus on the effectiveness of the system that depends on whole words, measured by the criterion of average recall precision. When the thesaurus was used, the results were better. This agrees with what Hani Abu Salem (1992) and Kanaan (2006) concluded when they said that the use of a thesaurus in Arabic makes the Arabic IRS more effective when full words are used.
When we increase the number of documents used to build the thesaurus, the result is better. Kanaan and Wedyan (2006) used 242 documents to build their thesaurus; in this study we used the same equations but built our thesaurus from 500 documents. This study may be extended to other similarity equations, such as Jaccard and Dice, or applied to a very large number of documents. User feedback can also be used to feed the system in order to obtain a high-precision thesaurus.

References

[1] Khatib, Ahmed Shafiq, 1997, Terminological specifications and applications in the Arabic language, fifteenth cultural season of the Arabic Language Academy of Jordan, Amman, Jordan, pp (Arabic).
[2] Ali, Nabil, 1988, Arabic and computer, localization, Cairo. (Arabic)
[3] Xu, J., Fraser, A., and Weischedel, R., 2002, Empirical studies in strategies for Arabic retrieval, Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval, Tampere, Finland, ACM, pp
[4] Lassi, M., 2002, Automatic Thesaurus Construction, University College of Boras,
[5] Salton, G., and McGill, M., 1983, Introduction to Modern Information Retrieval, McGraw-Hill, New York.
[6] Frakes, W., and Baeza-Yates, R., 1992, Information Retrieval: Data Structures & Algorithms, P T R Prentice Hall, New Jersey.
[7] Baeza-Yates, R., and Ribeiro-Neto, B., 1999, Modern Information Retrieval, Addison-Wesley, New York.
[8] Soaa, Ali Suleiman, 1994, Information retrieval in the Arabic language, King Fahd National Library. (Arabic)
[9] Abdul-Jabbar, Abdul Rahman, 1993, The use of a system consultant in building thesauruses, scientific record of the Symposium on the use of Arabic in Information Technology organized by the King Abdul Aziz Public Library, Riyadh, Saudi Arabia. (Arabic)
[10] Abu Salem, H., 1992, A Microcomputer Based Arabic Bibliographic Information Retrieval System with Relational Thesauri, Ph.D. Thesis, University of Illinois, Chicago, USA.
[11] Kanaan, G., 1997, Comparing Automatic Statistical and Syntactic Phrase Indexing for Arabic Information Retrieval, Ph.D. Thesis, University of Illinois, Chicago, USA.
[12] Kanaan, G., and Wedyan, M., 2006, Constructing an Automatic Thesaurus to Enhance Arabic Information Retrieval System, The 2nd Jordanian International Conference on Computer Science and Engineering, JICCSE, Salt, Jordan.
More information& Jenna Bush. New Children s Book Authors. Award Winner. Volume XIII, No. 9 New York City May 2008 THE EDUCATION U.S.
Awrd Wier Volume XIII, No. 9 New York City My 2008 For Prets, ductors & Studets www.ductioupdte.com New Childre s Book Authors U.S. POSTAG PAI TH UCATION UPAT PRSORT STANAR First Ldy Lur Bush & Je Bush
More informationIntegrating Semantic Knowledge into Text Similarity and Information Retrieval
Integrating Semantic Knowledge into Text Similarity and Information Retrieval Christof Müller, Iryna Gurevych Max Mühlhäuser Ubiquitous Knowledge Processing Lab Telecooperation Darmstadt University of
More informationAn Interactive Intelligent Language Tutor Over The Internet
An Interactive Intelligent Language Tutor Over The Internet Trude Heift Linguistics Department and Language Learning Centre Simon Fraser University, B.C. Canada V5A1S6 E-mail: heift@sfu.ca Abstract: This
More informationPractice Examination IREB
IREB Examination Requirements Engineering Advanced Level Elicitation and Consolidation Practice Examination Questionnaire: Set_EN_2013_Public_1.2 Syllabus: Version 1.0 Passed Failed Total number of points
More informationMatching Meaning for Cross-Language Information Retrieval
Matching Meaning for Cross-Language Information Retrieval Jianqiang Wang Department of Library and Information Studies University at Buffalo, the State University of New York Buffalo, NY 14260, U.S.A.
More informationThe Impact of Morphological Awareness on Iranian University Students Listening Comprehension Ability
International Journal of Applied Linguistics & English Literature ISSN 2200-3592 (Print), ISSN 2200-3452 (Online) Vol. 2 No. 3; May 2013 Copyright Australian International Academic Centre, Australia The
More informationHLTCOE at TREC 2013: Temporal Summarization
HLTCOE at TREC 2013: Temporal Summarization Tan Xu University of Maryland College Park Paul McNamee Johns Hopkins University HLTCOE Douglas W. Oard University of Maryland College Park Abstract Our team
More informationOPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS
OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,
More informationThe Implementation of Interactive Multimedia Learning Materials in Teaching Listening Skills
English Language Teaching; Vol. 8, No. 12; 2015 ISSN 1916-4742 E-ISSN 1916-4750 Published by Canadian Center of Science and Education The Implementation of Interactive Multimedia Learning Materials in
More informationMaximizing Learning Through Course Alignment and Experience with Different Types of Knowledge
Innov High Educ (2009) 34:93 103 DOI 10.1007/s10755-009-9095-2 Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge Phyllis Blumberg Published online: 3 February
More informationA New Computing Book Series From ACM
A New Computing Book Series From ACM ACM BOOKS &C M ACM BOOKS Published by ACM in conjunction with Morgan & Claypool Publishers, ACM Books is a new series of high quality, advanced level books for the
More informationBooks Effective Literacy Y5-8 Learning Through Talk Y4-8 Switch onto Spelling Spelling Under Scrutiny
By the End of Year 8 All Essential words lists 1-7 290 words Commonly Misspelt Words-55 working out more complex, irregular, and/or ambiguous words by using strategies such as inferring the unknown from
More informationSystem Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering
More informationFormulaic Language and Fluency: ESL Teaching Applications
Formulaic Language and Fluency: ESL Teaching Applications Formulaic Language Terminology Formulaic sequence One such item Formulaic language Non-count noun referring to these items Phraseology The study
More informationWord Stress and Intonation: Introduction
Word Stress and Intonation: Introduction WORD STRESS One or more syllables of a polysyllabic word have greater prominence than the others. Such syllables are said to be accented or stressed. Word stress
More informationDaily Common Core Ela Warm Ups
Daily Ela Warm Ups Free PDF ebook Download: Daily Ela Warm Ups Download or Read Online ebook daily common core ela warm ups in PDF Format From The Best User Guide Database Daily Applying The State Standards.
More informationHow to read a Paper ISMLL. Dr. Josif Grabocka, Carlotta Schatten
How to read a Paper ISMLL Dr. Josif Grabocka, Carlotta Schatten Hildesheim, April 2017 1 / 30 Outline How to read a paper Finding additional material Hildesheim, April 2017 2 / 30 How to read a paper How
More informationThe Smart/Empire TIPSTER IR System
The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of
More informationMABEL ABRAHAM. 710 Uris Hall Broadway mabelabraham.com New York, New York Updated January 2017 EMPLOYMENT
MABEL ABRAHAM Columbia Business School mabel.abraham@columbia.edu 710 Uris Hall 212-854-7788 3022 Broadway mabelabraham.com New York, New York 10027 Updated January 2017 EMPLOYMENT 2015 Columbia University,
More information4th Grade Science Test Ecosystems
4th Grade Science Free PDF ebook Download: 4th Grade Science Download or Read Online ebook 4th grade science test ecosystems in PDF Format From The Best User Guide Database 4th Grade--LIFE SCIENCE. Unit
More informationOn document relevance and lexical cohesion between query terms
Information Processing and Management 42 (2006) 1230 1247 www.elsevier.com/locate/infoproman On document relevance and lexical cohesion between query terms Olga Vechtomova a, *, Murat Karamuftuoglu b,
More informationMULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY
MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY Chen, Hsin-Hsi Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan E-mail: hh_chen@csie.ntu.edu.tw Abstract
More informationRule Learning With Negation: Issues Regarding Effectiveness
Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United
More informationAnalyzing Linguistically Appropriate IEP Goals in Dual Language Programs
Analyzing Linguistically Appropriate IEP Goals in Dual Language Programs 2016 Dual Language Conference: Making Connections Between Policy and Practice March 19, 2016 Framingham, MA Session Description
More informationA Pumpkin Grows. Written by Linda D. Bullock and illustrated by Debby Fisher
GUIDED READING REPORT A Pumpkin Grows Written by Linda D. Bullock and illustrated by Debby Fisher KEY IDEA This nonfiction text traces the stages a pumpkin goes through as it grows from a seed to become
More informationLongest Common Subsequence: A Method for Automatic Evaluation of Handwritten Essays
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 6, Ver. IV (Nov Dec. 2015), PP 01-07 www.iosrjournals.org Longest Common Subsequence: A Method for
More informationData Fusion Models in WSNs: Comparison and Analysis
Proceedings of 2014 Zone 1 Conference of the American Society for Engineering Education (ASEE Zone 1) Data Fusion s in WSNs: Comparison and Analysis Marwah M Almasri, and Khaled M Elleithy, Senior Member,
More informationBook Review: Build Lean: Transforming construction using Lean Thinking by Adrian Terry & Stuart Smith
Howell, Greg (2011) Book Review: Build Lean: Transforming construction using Lean Thinking by Adrian Terry & Stuart Smith. Lean Construction Journal 2011 pp 3-8 Book Review: Build Lean: Transforming construction
More informationDeveloping True/False Test Sheet Generating System with Diagnosing Basic Cognitive Ability
Developing True/False Test Sheet Generating System with Diagnosing Basic Cognitive Ability Shih-Bin Chen Dept. of Information and Computer Engineering, Chung-Yuan Christian University Chung-Li, Taiwan
More informationWeb as Corpus. Corpus Linguistics. Web as Corpus 1 / 1. Corpus Linguistics. Web as Corpus. web.pl 3 / 1. Sketch Engine. Corpus Linguistics
(L615) Markus Dickinson Department of Linguistics, Indiana University Spring 2013 The web provides new opportunities for gathering data Viable source of disposable corpora, built ad hoc for specific purposes
More informationLoughton School s curriculum evening. 28 th February 2017
Loughton School s curriculum evening 28 th February 2017 Aims of this session Share our approach to teaching writing, reading, SPaG and maths. Share resources, ideas and strategies to support children's
More informationDyslexia and Dyscalculia Screeners Digital. Guidance and Information for Teachers
Dyslexia and Dyscalculia Screeners Digital Guidance and Information for Teachers Digital Tests from GL Assessment For fully comprehensive information about using digital tests from GL Assessment, please
More informationLiterature and the Language Arts Experiencing Literature
Correlation of Literature and the Language Arts Experiencing Literature Grade 9 2 nd edition to the Nebraska Reading/Writing Standards EMC/Paradigm Publishing 875 Montreal Way St. Paul, Minnesota 55102
More informationScienceDirect. A Framework for Clustering Cardiac Patient s Records Using Unsupervised Learning Techniques
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 98 (2016 ) 368 373 The 6th International Conference on Current and Future Trends of Information and Communication Technologies
More informationChildren need activities which are
59 PROFILE INTRODUCTION Children need activities which are exciting and stimulate their curiosity; they need to be involved in meaningful situations that emphasize interaction through the use of English
More informationMulti-Lingual Text Leveling
Multi-Lingual Text Leveling Salim Roukos, Jerome Quin, and Todd Ward IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 {roukos,jlquinn,tward}@us.ibm.com Abstract. Determining the language proficiency
More information