Text Analytics Using Latent Semantic Analysis
|
|
- Alexandrina Glenn
- 6 years ago
- Views:
Transcription
1 Text Analytics Using Latent Semantic Analysis John Martin Small Bear Technologies, Inc.
2 Overview Text Analytics Need for automated methods What is LSA How LSA works Applications of LSA Misconceptions Conclusion - Q&A 2011 Small Bear Technologies, Inc. 2
3 What is Text Analytics? A set of linguistic, statistical, and machine learning techniques that model and structure the information content of textual sources for business intelligence, exploratory data analysis, research, or investigation (Wikipedia) Text Analytics Text Mining 2011 Small Bear Technologies, Inc. 3
4 Text Analytics Derive meaning from (textual) data sources Structured data Fixed format Known attributes Unstructured data Natural language 2011 Small Bear Technologies, Inc. 4
5 Unstructured Text News feeds Call center logs traffic Surveys Social network postings Publishing Observational data 2011 Small Bear Technologies, Inc. 5
6 Automated Methods Required Volume of information Speed of change/production Complexity Need impartial/consistent analysis 2011 Small Bear Technologies, Inc. 6
7 The Goal Maintaining and increasing the value of information Collecting information is not the problem gaining understanding is the issue Too much information is just as useless at too little information 2011 Small Bear Technologies, Inc. 7
8 Some Common Methods Lexical matching Statistical evaluation Vector space models Rule based systems Parts of speech analysis 2011 Small Bear Technologies, Inc. 8
9 The Problem Failure to capture meaning and provide insight Methods not universally applicable Language or domain dependent Need for specialized prior knowledge of data Require human interaction Tagging, Keyword identification, Categorizations Not practical for large data sets 2011 Small Bear Technologies, Inc. 9
10 The Cost Too much information is just as useless at too little information Failure to understand the information we have leads to: Lost opportunities Unsatisfied customers Inability to fulfill mission Financial repercussions 2011 Small Bear Technologies, Inc. 10
11 What is LSA? Latent Semantic Analysis Latent Semantic Indexing The distinction is really one of application, as the same mathematics and computation are employed for both. LSA may be considered to refer to a broad collection of application while LSI is more closely associated with information retrieval Small Bear Technologies, Inc. 11
12 Latent Semantic Analysis Theory of meaning[8] Creates a mapping of meaning acquired from the text itself Computational model Can perform many of the cognitive tasks that humans do essentially as well as humans [7] 2011 Small Bear Technologies, Inc. 12
13 How LSA Works LSA processing constructs a mapping of meaning in a semantic space The mapping gives the meaning of words and documents not vice versa 2011 Small Bear Technologies, Inc. 13
14 Compositionality Constraint The meaning of a document is the sum of the meaning of its words The meaning of a word is defined by the documents in which it appears (and does not appear) 2011 Small Bear Technologies, Inc. 14
15 LSA Space Construction LSA models a document as a simple linear equation A collection of documents (corpus) is a large set of simultaneous equations 2011 Small Bear Technologies, Inc. 15
16 Processing a Corpus Divide text corpus into units (documents) Typically paragraphs of text Raw matrix is constructed from units One row for each word type One column for each unit (document) Cells contain the number of times a particular word appears in a particular document Weighting functions may be applied [5] 2011 Small Bear Technologies, Inc. 16
17 Sparse Matrix The weighted term by document matrix represents a large set of simultaneous equations The term by document matrix is sparse Typically less than 1% of the values are nonzero [2] 2011 Small Bear Technologies, Inc. 17
18 Solving Simultaneous Equations The system of simultaneous equations is solved for the meaning of each word type and document Sparse matrix Singular Value Decomposition Lanczos algorithm is typically used Only solve for a reduced number of dimensions Produces vectors representing the meaning of each term and document 2011 Small Bear Technologies, Inc. 18
19 Singular Value Decomposition[10] The rows of matrix U are the vectors for the word types Columns of U are the eigenvectors defining the axes for word type space A=U Σ V T 2011 Small Bear Technologies, Inc. 19
20 Singular Value Decomposition[10] The rows of matrix V are the text unit (document) vectors Columns of V are eigenvectors defining the axes for document space A=U Σ V T 2011 Small Bear Technologies, Inc. 20
21 Dimensional Reduction Typically solve for dimensions [10] Dimensional reduction allows comparison of all terms and all documents with each other In the sparse matrix comparison was not possible Dimensions are orthogonal 2011 Small Bear Technologies, Inc. 21
22 Dimensional Reduction With enough variables every object is different With too few variables every object is the same [a r f j] [a r c g] Consider a geographic map 2011 Small Bear Technologies, Inc. 22
23 Semantic Space Vectors represent the meaning of a document (or term) Items similar in meaning are near each other in the semantic space 2011 Small Bear Technologies, Inc. 23
24 Computational Issues Nontrivial computation Large sparse symmetric eigenproblem Scalability concerns [11] Size of document set Speed of processing Accuracy issues Finite arithmetic introduces significant error 2011 Small Bear Technologies, Inc. 24
25 Operations Retrieval Clustering Comparison Interpretation Completion 2011 Small Bear Technologies, Inc. 25
26 Applications of LSA Library Illustration Retrieval Content analysis Evaluation of fit into an existing collection Comparison of multiple collections Indexing of multilingual collections 2011 Small Bear Technologies, Inc. 26
27 Applications of LSA Repairing/cleaning data Education Grading Summarizing Non-textual applications Bio-informatics Personality profiles/compatibility analysis 2011 Small Bear Technologies, Inc. 27
28 Misconceptions and Misunderstandings Driven by term co-occurrence Word order issues Data collection size Content and Meaning 2011 Small Bear Technologies, Inc. 28
29 Co-occurrence LSA starts with a kind of co-occurrence Appearing in the same document does not make words similar Similarity is determined by the effect of the word meaning on the system of equations 2011 Small Bear Technologies, Inc. 29
30 Word Order Word order effects almost entirely within single sentences Research indicates only around 10% of meaning is word order dependent (for English) [ order syntax? Much. Ignoring word Missed by is how ][8] 2011 Small Bear Technologies, Inc. 30
31 Data Collection Size Beware of small data collections Generally - Use at least 100,000 documents 2011 Small Bear Technologies, Inc. 31
32 Content LSA builds its notion of meaning from the content of the data collection 2011 Small Bear Technologies, Inc. 32
33 The Problem Revisited Failure to capture meaning and provide insight Methods not universally applicable Need for specialized prior knowledge of data Require human interaction Not practical for large data sets 2011 Small Bear Technologies, Inc. 33
34 Conclusion LSA offers powerful capabilities for gaining insight and understanding the contents of a data collection LSA provides analysis techniques not available with other Text Analytic methods Small Bear Technologies provides the core technology, tools, and support for performing Latent Semantic Analysis 2011 Small Bear Technologies, Inc. 34
35 Suggested Reading Handbook of Latent Semantic Analysis; Landauer, T., McNamara D., Dennis, S., Kintsch, W., Eds.; Lawrence Erlbaum Associates, Inc.: Mahwah, New Jersey, Indexing by Latent Semantic Analysis Deerwester, S.; Dumais, S.; Furnas, G.; Landauer, T.; Harshman, R., Journal of the American Society for Information Sciences 1990, 41, Improving the retrieval of information from external sources Dumais, S., Behavior Research Methods, Instruments, & Computers 1991, 23, An Introduction to Latent Semantic Analysis Landauer, T.; Foltz, P.; Laham, D., Discourse Processes 1998, 25, A solution to Plato's problem: The Latent Semantic Analysis Theory of acquisition, induction, and representation of Knowledge Landauer, T.; Dumais, S., Psychological Review 1997, 104, Small Bear Technologies, Inc. 35
36 References [1] Berry, M.W., Large Sparse Singular Value Computations. In International Journal of Supercomputer Applications, 1992, Vol. 6, pp [2] Berry, M.W., & Browne, M., Understanding Search Engines: Mathematical Modeling and Text Retrieval (2 nd ed.) SIAM, Philadelphia, [3] Berry, M.W., & Martin, D., Principle Component Analysis for Information Retrieval Applications. In Statistics: A series of textbooks and monographs: Handbook of Parallel Computing and Statistics, Chapman & Hall/CRC, Boca Raton, 2005, pp [4] Deerwester, S., Dumais, S, Furnas, G., Landauer, T., & Harshman, R, Indexing by Latent Semantic Analysis. In Journal of the American Society of Information Sciences, 1990, Vol. 41, pp [5] Dumais, S., Improving the Retrieval of Information from External Sources. In Behavior Research Methods, Instruments, and Computers, 1991, Vol. 23, pp Small Bear Technologies, Inc. 36
37 References (cont.) [6] Grimes, S., Text Analytics 2009: User Perspectives on Solutions and Providers. Alta Plana Research: [7] Landauer, T.K., On the Computational Basis of Cognition: Arguments from LSA. In The Psychology of Learning and Motivation. B.H. Ross (Ed.), Academic Press, New York, 2002; pp [8] Landauer, T. K., LSA as a Theory of Meaning. In The Handbook of Latent Semantic Analysis, Landauer, McNamara, Dennis, & Kintsch (Eds.), Lawrence Erlbaum Associates, Inc., Mahwah, New Jersey, 2007; pp [9] Landauer, T.K., & Dumais, S., A Solution to Plato's Problem: The Latent Semantic Analysis Theory of Acquisition, Induction, and Representation of Knowledge. In Psychological Review, 1997, Vol. 104, pp [10] Martin, D., Berry, M., Mathematical Foundations Behind Latent Semantic Analysis, In The Handbook of Latent Semantic Analysis, Landauer, McNamara, Dennis, & Kintsch (Eds.), Lawrence Erlbaum Associates, Inc., Mahwah, New Jersey, 2007; pp Small Bear Technologies, Inc. 37
38 References (cont.) [11] Martin, D., Martin, J., Berry, M., Browne, M., Out-of-Core SVD performance for document indexing. In Applied Numerical Mathematics, 2007, Vol. 14, No. 10.
39 John Martin Small Bear Technologies, Inc.
Probabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview
More informationCROSS-LANGUAGE INFORMATION RETRIEVAL USING PARAFAC2
1 CROSS-LANGUAGE INFORMATION RETRIEVAL USING PARAFAC2 Peter A. Chew, Brett W. Bader, Ahmed Abdelali Proceedings of the 13 th SIGKDD, 2007 Tiago Luís Outline 2 Cross-Language IR (CLIR) Latent Semantic Analysis
More informationUsing Web Searches on Important Words to Create Background Sets for LSI Classification
Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract
More informationLatent Semantic Analysis
Latent Semantic Analysis Adapted from: www.ics.uci.edu/~lopes/teaching/inf141w10/.../lsa_intro_ai_seminar.ppt (from Melanie Martin) and http://videolectures.net/slsfs05_hofmann_lsvm/ (from Thomas Hoffman)
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationAutomatic Essay Assessment
Assessment in Education, Vol. 10, No. 3, November 2003 Automatic Essay Assessment THOMAS K. LANDAUER University of Colorado and Knowledge Analysis Technologies, USA DARRELL LAHAM Knowledge Analysis Technologies,
More informationOPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS
OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,
More informationUnsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model
Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Xinying Song, Xiaodong He, Jianfeng Gao, Li Deng Microsoft Research, One Microsoft Way, Redmond, WA 98052, U.S.A.
More informationA Comparison of Two Text Representations for Sentiment Analysis
010 International Conference on Computer Application and System Modeling (ICCASM 010) A Comparison of Two Text Representations for Sentiment Analysis Jianxiong Wang School of Computer Science & Educational
More informationOn-the-Fly Customization of Automated Essay Scoring
Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,
More informationSINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,
More informationAssignment 1: Predicting Amazon Review Ratings
Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for
More informationChapter 10 APPLYING TOPIC MODELING TO FORENSIC DATA. 1. Introduction. Alta de Waal, Jacobus Venter and Etienne Barnard
Chapter 10 APPLYING TOPIC MODELING TO FORENSIC DATA Alta de Waal, Jacobus Venter and Etienne Barnard Abstract Most actionable evidence is identified during the analysis phase of digital forensic investigations.
More informationMatching Similarity for Keyword-Based Clustering
Matching Similarity for Keyword-Based Clustering Mohammad Rezaei and Pasi Fränti University of Eastern Finland {rezaei,franti}@cs.uef.fi Abstract. Semantic clustering of objects such as documents, web
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More informationKnowledge-Free Induction of Inflectional Morphologies
Knowledge-Free Induction of Inflectional Morphologies Patrick SCHONE Daniel JURAFSKY University of Colorado at Boulder University of Colorado at Boulder Boulder, Colorado 80309 Boulder, Colorado 80309
More informationAssessing Entailer with a Corpus of Natural Language From an Intelligent Tutoring System
Assessing Entailer with a Corpus of Natural Language From an Intelligent Tutoring System Philip M. McCarthy, Vasile Rus, Scott A. Crossley, Sarah C. Bigham, Arthur C. Graesser, & Danielle S. McNamara Institute
More informationLearning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models
Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za
More informationComment-based Multi-View Clustering of Web 2.0 Items
Comment-based Multi-View Clustering of Web 2.0 Items Xiangnan He 1 Min-Yen Kan 1 Peichu Xie 2 Xiao Chen 3 1 School of Computing, National University of Singapore 2 Department of Mathematics, National University
More informationLongest Common Subsequence: A Method for Automatic Evaluation of Handwritten Essays
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 6, Ver. IV (Nov Dec. 2015), PP 01-07 www.iosrjournals.org Longest Common Subsequence: A Method for
More informationMathematics. Mathematics
Mathematics Program Description Successful completion of this major will assure competence in mathematics through differential and integral calculus, providing an adequate background for employment in
More informationCLASSIFICATION OF TEXT DOCUMENTS USING INTEGER REPRESENTATION AND REGRESSION: AN INTEGRATED APPROACH
ISSN: 0976-3104 Danti and Bhushan. ARTICLE OPEN ACCESS CLASSIFICATION OF TEXT DOCUMENTS USING INTEGER REPRESENTATION AND REGRESSION: AN INTEGRATED APPROACH Ajit Danti 1 and SN Bharath Bhushan 2* 1 Department
More informationA Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval
A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval Yelong Shen Microsoft Research Redmond, WA, USA yeshen@microsoft.com Xiaodong He Jianfeng Gao Li Deng Microsoft Research
More informationSemi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.
Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link
More informationEvidence for Reliability, Validity and Learning Effectiveness
PEARSON EDUCATION Evidence for Reliability, Validity and Learning Effectiveness Introduction Pearson Knowledge Technologies has conducted a large number and wide variety of reliability and validity studies
More informationStatewide Framework Document for:
Statewide Framework Document for: 270301 Standards may be added to this document prior to submission, but may not be removed from the framework to meet state credit equivalency requirements. Performance
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationLearning Methods for Fuzzy Systems
Learning Methods for Fuzzy Systems Rudolf Kruse and Andreas Nürnberger Department of Computer Science, University of Magdeburg Universitätsplatz, D-396 Magdeburg, Germany Phone : +49.39.67.876, Fax : +49.39.67.8
More informationApplications of memory-based natural language processing
Applications of memory-based natural language processing Antal van den Bosch and Roser Morante ILK Research Group Tilburg University Prague, June 24, 2007 Current ILK members Principal investigator: Antal
More informationSystem Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering
More informationThe Method of Immersion the Problem of Comparing Technical Objects in an Expert Shell in the Class of Artificial Intelligence Algorithms
IOP Conference Series: Materials Science and Engineering PAPER OPEN ACCESS The Method of Immersion the Problem of Comparing Technical Objects in an Expert Shell in the Class of Artificial Intelligence
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationA DISTRIBUTIONAL STRUCTURED SEMANTIC SPACE FOR QUERYING RDF GRAPH DATA
International Journal of Semantic Computing Vol. 5, No. 4 (2011) 433 462 c World Scientific Publishing Company DOI: 10.1142/S1793351X1100133X A DISTRIBUTIONAL STRUCTURED SEMANTIC SPACE FOR QUERYING RDF
More informationDeveloping True/False Test Sheet Generating System with Diagnosing Basic Cognitive Ability
Developing True/False Test Sheet Generating System with Diagnosing Basic Cognitive Ability Shih-Bin Chen Dept. of Information and Computer Engineering, Chung-Yuan Christian University Chung-Li, Taiwan
More informationMulti-Dimensional, Multi-Level, and Multi-Timepoint Item Response Modeling.
Multi-Dimensional, Multi-Level, and Multi-Timepoint Item Response Modeling. Bengt Muthén & Tihomir Asparouhov In van der Linden, W. J., Handbook of Item Response Theory. Volume One. Models, pp. 527-539.
More informationUniversity of Groningen. Systemen, planning, netwerken Bosman, Aart
University of Groningen Systemen, planning, netwerken Bosman, Aart IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document
More informationA PEDAGOGY OF TEACHING THE TEST
A PEDAGOGY OF TEACHING THE TEST Du Toit Erna, Department of Education, Sol Plaatje University, Kimberley & Du Toit Jacqueline, Student support Services, Wellness Centre, Central University of Technology,Welkom,
More informationTerm Weighting based on Document Revision History
Term Weighting based on Document Revision History Sérgio Nunes, Cristina Ribeiro, and Gabriel David INESC Porto, DEI, Faculdade de Engenharia, Universidade do Porto. Rua Dr. Roberto Frias, s/n. 4200-465
More informationAutomating the E-learning Personalization
Automating the E-learning Personalization Fathi Essalmi 1, Leila Jemni Ben Ayed 1, Mohamed Jemni 1, Kinshuk 2, and Sabine Graf 2 1 The Research Laboratory of Technologies of Information and Communication
More informationPython Machine Learning
Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled
More informationThe Smart/Empire TIPSTER IR System
The Smart/Empire TIPSTER IR System Chris Buckley, Janet Walz Sabir Research, Gaithersburg, MD chrisb,walz@sabir.com Claire Cardie, Scott Mardis, Mandar Mitra, David Pierce, Kiri Wagstaff Department of
More informationCross Language Information Retrieval
Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................
More informationGenerative models and adversarial training
Day 4 Lecture 1 Generative models and adversarial training Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University What is a generative model?
More informationSyntactic and Semantic Factors in Processing Difficulty: An Integrated Measure
Syntactic and Semantic Factors in Processing Difficulty: An Integrated Measure Jeff Mitchell, Mirella Lapata, Vera Demberg and Frank Keller University of Edinburgh Edinburgh, United Kingdom jeff.mitchell@ed.ac.uk,
More informationSCHEMA ACTIVATION IN MEMORY FOR PROSE 1. Michael A. R. Townsend State University of New York at Albany
Journal of Reading Behavior 1980, Vol. II, No. 1 SCHEMA ACTIVATION IN MEMORY FOR PROSE 1 Michael A. R. Townsend State University of New York at Albany Abstract. Forty-eight college students listened to
More information2 nd grade Task 5 Half and Half
2 nd grade Task 5 Half and Half Student Task Core Idea Number Properties Core Idea 4 Geometry and Measurement Draw and represent halves of geometric shapes. Describe how to know when a shape will show
More informationBENCHMARK TREND COMPARISON REPORT:
National Survey of Student Engagement (NSSE) BENCHMARK TREND COMPARISON REPORT: CARNEGIE PEER INSTITUTIONS, 2003-2011 PREPARED BY: ANGEL A. SANCHEZ, DIRECTOR KELLI PAYNE, ADMINISTRATIVE ANALYST/ SPECIALIST
More informationAnalyzing sentiments in tweets for Tesla Model 3 using SAS Enterprise Miner and SAS Sentiment Analysis Studio
SCSUG Student Symposium 2016 Analyzing sentiments in tweets for Tesla Model 3 using SAS Enterprise Miner and SAS Sentiment Analysis Studio Praneth Guggilla, Tejaswi Jha, Goutam Chakraborty, Oklahoma State
More information(Sub)Gradient Descent
(Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include
More informationLearning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for
Learning Optimal Dialogue Strategies: A Case Study of a Spoken Dialogue Agent for Email Marilyn A. Walker Jeanne C. Fromer Shrikanth Narayanan walker@research.att.com jeannie@ai.mit.edu shri@research.att.com
More informationRule Learning With Negation: Issues Regarding Effectiveness
Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United
More informationLarge-Scale Web Page Classification. Sathi T Marath. Submitted in partial fulfilment of the requirements. for the degree of Doctor of Philosophy
Large-Scale Web Page Classification by Sathi T Marath Submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy at Dalhousie University Halifax, Nova Scotia November 2010
More informationAs a high-quality international conference in the field
The New Automated IEEE INFOCOM Review Assignment System Baochun Li and Y. Thomas Hou Abstract In academic conferences, the structure of the review process has always been considered a critical aspect of
More informationA Statistical Approach to the Semantics of Verb-Particles
A Statistical Approach to the Semantics of Verb-Particles Colin Bannard School of Informatics University of Edinburgh 2 Buccleuch Place Edinburgh EH8 9LW, UK c.j.bannard@ed.ac.uk Timothy Baldwin CSLI Stanford
More informationMaximizing Learning Through Course Alignment and Experience with Different Types of Knowledge
Innov High Educ (2009) 34:93 103 DOI 10.1007/s10755-009-9095-2 Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge Phyllis Blumberg Published online: 3 February
More information*Net Perceptions, Inc West 78th Street Suite 300 Minneapolis, MN
From: AAAI Technical Report WS-98-08. Compilation copyright 1998, AAAI (www.aaai.org). All rights reserved. Recommender Systems: A GroupLens Perspective Joseph A. Konstan *t, John Riedl *t, AI Borchers,
More informationIT Students Workshop within Strategic Partnership of Leibniz University and Peter the Great St. Petersburg Polytechnic University
IT Students Workshop within Strategic Partnership of Leibniz University and Peter the Great St. Petersburg Polytechnic University 06.11.16 13.11.16 Hannover Our group from Peter the Great St. Petersburg
More informationKnowledge Elicitation Tool Classification. Janet E. Burge. Artificial Intelligence Research Group. Worcester Polytechnic Institute
Page 1 of 28 Knowledge Elicitation Tool Classification Janet E. Burge Artificial Intelligence Research Group Worcester Polytechnic Institute Knowledge Elicitation Methods * KE Methods by Interaction Type
More informationLecture 1: Basic Concepts of Machine Learning
Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010
More informationMathematics Scoring Guide for Sample Test 2005
Mathematics Scoring Guide for Sample Test 2005 Grade 4 Contents Strand and Performance Indicator Map with Answer Key...................... 2 Holistic Rubrics.......................................................
More informationThe CTQ Flowdown as a Conceptual Model of Project Objectives
The CTQ Flowdown as a Conceptual Model of Project Objectives HENK DE KONING AND JEROEN DE MAST INSTITUTE FOR BUSINESS AND INDUSTRIAL STATISTICS OF THE UNIVERSITY OF AMSTERDAM (IBIS UVA) 2007, ASQ The purpose
More informationIntroduction to HPSG. Introduction. Historical Overview. The HPSG architecture. Signature. Linguistic Objects. Descriptions.
to as a linguistic theory to to a member of the family of linguistic frameworks that are called generative grammars a grammar which is formalized to a high degree and thus makes exact predictions about
More informationPostprint.
http://www.diva-portal.org Postprint This is the accepted version of a paper presented at CLEF 2013 Conference and Labs of the Evaluation Forum Information Access Evaluation meets Multilinguality, Multimodality,
More informationThe Representation of Concrete and Abstract Concepts: Categorical vs. Associative Relationships. Jingyi Geng and Tatiana T. Schnur
RUNNING HEAD: CONCRETE AND ABSTRACT CONCEPTS The Representation of Concrete and Abstract Concepts: Categorical vs. Associative Relationships Jingyi Geng and Tatiana T. Schnur Department of Psychology,
More informationEpistemic Cognition. Petr Johanes. Fourth Annual ACM Conference on Learning at Scale
Epistemic Cognition Petr Johanes Fourth Annual ACM Conference on Learning at Scale 2017 04 20 Paper Structure Introduction The State of Epistemic Cognition Research Affordance #1 Additional Explanatory
More informationMeta-Cognitive Strategies
Meta-Cognitive Strategies Meta-cognitive Strategies Metacognition is commonly referred to as thinking about thinking. It includes monitoring one s performance, apportioning time and cognitive capacity
More informationConstraining X-Bar: Theta Theory
Constraining X-Bar: Theta Theory Carnie, 2013, chapter 8 Kofi K. Saah 1 Learning objectives Distinguish between thematic relation and theta role. Identify the thematic relations agent, theme, goal, source,
More informationME 4495 Computational Heat Transfer and Fluid Flow M,W 4:00 5:15 (Eng 177)
ME 4495 Computational Heat Transfer and Fluid Flow M,W 4:00 5:15 (Eng 177) Professor: Daniel N. Pope, Ph.D. E-mail: dpope@d.umn.edu Office: VKH 113 Phone: 726-6685 Office Hours:, Tues,, Fri 2:00-3:00 (or
More informationAustralian Journal of Basic and Applied Sciences
AENSI Journals Australian Journal of Basic and Applied Sciences ISSN:1991-8178 Journal home page: www.ajbasweb.com Feature Selection Technique Using Principal Component Analysis For Improving Fuzzy C-Mean
More informationConcepts and Properties in Word Spaces
Concepts and Properties in Word Spaces Marco Baroni 1 and Alessandro Lenci 2 1 University of Trento, CIMeC 2 University of Pisa, Department of Linguistics Abstract Properties play a central role in most
More informationThe role of word-word co-occurrence in word learning
The role of word-word co-occurrence in word learning Abdellah Fourtassi (a.fourtassi@ueuromed.org) The Euro-Mediterranean University of Fes FesShore Park, Fes, Morocco Emmanuel Dupoux (emmanuel.dupoux@gmail.com)
More informationarxiv: v1 [math.at] 10 Jan 2016
THE ALGEBRAIC ATIYAH-HIRZEBRUCH SPECTRAL SEQUENCE OF REAL PROJECTIVE SPECTRA arxiv:1601.02185v1 [math.at] 10 Jan 2016 GUOZHEN WANG AND ZHOULI XU Abstract. In this note, we use Curtis s algorithm and the
More informationThe Good Judgment Project: A large scale test of different methods of combining expert predictions
The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania
More informationWhat s in a Step? Toward General, Abstract Representations of Tutoring System Log Data
What s in a Step? Toward General, Abstract Representations of Tutoring System Log Data Kurt VanLehn 1, Kenneth R. Koedinger 2, Alida Skogsholm 2, Adaeze Nwaigwe 2, Robert G.M. Hausmann 1, Anders Weinstein
More informationCOPING WITH LANGUAGE DATA SPARSITY: SEMANTIC HEAD MAPPING OF COMPOUND WORDS
COPING WITH LANGUAGE DATA SPARSITY: SEMANTIC HEAD MAPPING OF COMPOUND WORDS Joris Pelemans 1, Kris Demuynck 2, Hugo Van hamme 1, Patrick Wambacq 1 1 Dept. ESAT, Katholieke Universiteit Leuven, Belgium
More informationMachine Learning from Garden Path Sentences: The Application of Computational Linguistics
Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,
More informationCircuit Simulators: A Revolutionary E-Learning Platform
Circuit Simulators: A Revolutionary E-Learning Platform Mahi Itagi Padre Conceicao College of Engineering, Verna, Goa, India. itagimahi@gmail.com Akhil Deshpande Gogte Institute of Technology, Udyambag,
More informationAN EXAMPLE OF THE GOMORY CUTTING PLANE ALGORITHM. max z = 3x 1 + 4x 2. 3x 1 x x x x N 2
AN EXAMPLE OF THE GOMORY CUTTING PLANE ALGORITHM Consider the integer programme subject to max z = 3x 1 + 4x 2 3x 1 x 2 12 3x 1 + 11x 2 66 The first linear programming relaxation is subject to x N 2 max
More informationEnglish Language and Applied Linguistics. Module Descriptions 2017/18
English Language and Applied Linguistics Module Descriptions 2017/18 Level I (i.e. 2 nd Yr.) Modules Please be aware that all modules are subject to availability. If you have any questions about the modules,
More informationPhysics 270: Experimental Physics
2017 edition Lab Manual Physics 270 3 Physics 270: Experimental Physics Lecture: Lab: Instructor: Office: Email: Tuesdays, 2 3:50 PM Thursdays, 2 4:50 PM Dr. Uttam Manna 313C Moulton Hall umanna@ilstu.edu
More informationParsing of part-of-speech tagged Assamese Texts
IJCSI International Journal of Computer Science Issues, Vol. 6, No. 1, 2009 ISSN (Online): 1694-0784 ISSN (Print): 1694-0814 28 Parsing of part-of-speech tagged Assamese Texts Mirzanur Rahman 1, Sufal
More informationEECS 700: Computer Modeling, Simulation, and Visualization Fall 2014
EECS 700: Computer Modeling, Simulation, and Visualization Fall 2014 Course Description The goals of this course are to: (1) formulate a mathematical model describing a physical phenomenon; (2) to discretize
More informationarxiv: v2 [cs.ir] 22 Aug 2016
Exploring Deep Space: Learning Personalized Ranking in a Semantic Space arxiv:1608.00276v2 [cs.ir] 22 Aug 2016 ABSTRACT Jeroen B. P. Vuurens The Hague University of Applied Science Delft University of
More informationAQUA: An Ontology-Driven Question Answering System
AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.
More informationBug triage in open source systems: a review
Int. J. Collaborative Enterprise, Vol. 4, No. 4, 2014 299 Bug triage in open source systems: a review V. Akila* and G. Zayaraz Department of Computer Science and Engineering, Pondicherry Engineering College,
More informationLanguage Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus
Language Acquisition Fall 2010/Winter 2011 Lexical Categories Afra Alishahi, Heiner Drenhaus Computational Linguistics and Phonetics Saarland University Children s Sensitivity to Lexical Categories Look,
More informationMontana Content Standards for Mathematics Grade 3. Montana Content Standards for Mathematical Practices and Mathematics Content Adopted November 2011
Montana Content Standards for Mathematics Grade 3 Montana Content Standards for Mathematical Practices and Mathematics Content Adopted November 2011 Contents Standards for Mathematical Practice: Grade
More informationEvaluating vector space models with canonical correlation analysis
Natural Language Engineering: page 1 of 38. c Cambridge University Press 211 doi:1.117/s1351324911271 1 Evaluating vector space models with canonical correlation analysis SAMI VIRPIOJA 1, MARI-SANNA PAUKKERI
More informationP. Belsis, C. Sgouropoulou, K. Sfikas, G. Pantziou, C. Skourlas, J. Varnas
Exploiting Distance Learning Methods and Multimediaenhanced instructional content to support IT Curricula in Greek Technological Educational Institutes P. Belsis, C. Sgouropoulou, K. Sfikas, G. Pantziou,
More informationWE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT
WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT PRACTICAL APPLICATIONS OF RANDOM SAMPLING IN ediscovery By Matthew Verga, J.D. INTRODUCTION Anyone who spends ample time working
More informationOn-Line Data Analytics
International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob
More informationMTH 215: Introduction to Linear Algebra
MTH 215: Introduction to Linear Algebra Fall 2017 University of Rhode Island, Department of Mathematics INSTRUCTOR: Jonathan A. Chávez Casillas E-MAIL: jchavezc@uri.edu LECTURE TIMES: Tuesday and Thursday,
More informationSecond Exam: Natural Language Parsing with Neural Networks
Second Exam: Natural Language Parsing with Neural Networks James Cross May 21, 2015 Abstract With the advent of deep learning, there has been a recent resurgence of interest in the use of artificial neural
More informationProcedia - Social and Behavioral Sciences 143 ( 2014 ) CY-ICER Teacher intervention in the process of L2 writing acquisition
Available online at www.sciencedirect.com ScienceDirect Procedia - Social and Behavioral Sciences 143 ( 2014 ) 238 242 CY-ICER 2014 Teacher intervention in the process of L2 writing acquisition Blanka
More informationWHEN THERE IS A mismatch between the acoustic
808 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Optimization of Temporal Filters for Constructing Robust Features in Speech Recognition Jeih-Weih Hung, Member,
More informationMulti-Lingual Text Leveling
Multi-Lingual Text Leveling Salim Roukos, Jerome Quin, and Todd Ward IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 {roukos,jlquinn,tward}@us.ibm.com Abstract. Determining the language proficiency
More informationA Semantic Similarity Measure Based on Lexico-Syntactic Patterns
A Semantic Similarity Measure Based on Lexico-Syntactic Patterns Alexander Panchenko, Olga Morozova and Hubert Naets Center for Natural Language Processing (CENTAL) Université catholique de Louvain Belgium
More informationCompositional Semantics
Compositional Semantics CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu Words, bag of words Sequences Trees Meaning Representing Meaning An important goal of NLP/AI: convert natural language
More informationRadius STEM Readiness TM
Curriculum Guide Radius STEM Readiness TM While today s teens are surrounded by technology, we face a stark and imminent shortage of graduates pursuing careers in Science, Technology, Engineering, and
More informationAttributed Social Network Embedding
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, MAY 2017 1 Attributed Social Network Embedding arxiv:1705.04969v1 [cs.si] 14 May 2017 Lizi Liao, Xiangnan He, Hanwang Zhang, and Tat-Seng Chua Abstract Embedding
More information