Introduction to statistical learning

1 Introduction to statistical learning 1. Introduction V. Lefieux June /42

2 Table of contents 2/42

4 Data everywhere 4/42

5 Data everywhere Before: structured data, generated by companies and organizations, with regular but infrequent updates (e.g. monthly). Now: unstructured data, generated by users, in real time. 5/42

6 Some data generated by companies and organizations 6/42

7 Some data generated by users 7/42

8 Some networks 8/42

9 And now health data 9/42

10 3 V? 10/42

11 4 V? 11/42

12 5 V? 12/42

13 The new oil? Clive Humby, /42

14 A landscape 14/42

15 Gartner hype cycle /42

16 Table of contents 16/42

17 A process: collecting, organizing (cleaning and storing), analyzing, visualizing large sets of data. An objective: discover useful information to improve business decisions. 17/42

18 A new idea? Four major influences act on data analysis today: The formal theories of statistics. Accelerating developments in computers and display devices. The challenge, in many fields, of more and ever larger bodies of data. The emphasis on quantification in an ever wider variety of disciplines. 18/42

19 Not so new! "Data analysis and statistics: an expository overview", J. W. Tukey and M. B. Wilk, 1966: Four major influences act on data analysis today: The formal theories of statistics. Accelerating developments in computers and display devices. The challenge, in many fields, of more and ever larger bodies of data. The emphasis on quantification in an ever wider variety of disciplines. 19/42

20 Spam filter 20/42

21 Web search 21/42

22 Recommendations 22/42

23 Marketing 23/42

24 Customer relationship management (CRM) Hotel chain uses big data to increase bookings. Pizza chain earns more dough in bad weather. Music distributor applies big data for demand planning. Financial services company scores new clients. Retailer creates pregnancy detection model. 24/42

25 Smart grids And smart cities. 25/42

26 Genomics 26/42

27 Table of contents 27/42

28 The data scientist 28/42

29 Data scientist skills 29/42

30 Superhero skills? 30/42

31 Some definitions: Data science Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured. It continues data analysis fields such as statistics, machine learning, data mining, and predictive analytics, and is similar to Knowledge Discovery in Databases (KDD). 31/42

32 Table of contents 32/42

33 Some definitions: Machine learning Machine learning is a field of computer science that often uses statistical techniques to give computers the ability to learn (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed. 33/42

34 Some definitions: Statistical learning theory Statistical learning theory is a framework for machine learning drawing from the fields of statistics and functional analysis. Statistical learning theory deals with the problem of finding a predictive function based on data. Statistical learning theory has led to successful applications in fields such as computer vision, speech recognition, bioinformatics and baseball. 34/42
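
One common way to formalise "finding a predictive function based on data" is empirical risk minimisation. The notation below is an assumption made for illustration and does not come from the slides: a loss function L, a class of candidate functions F, and a sample (x_i, y_i), i = 1, ..., n.

```latex
\hat{f} = \arg\min_{f \in \mathcal{F}} \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr),
\qquad
R(f) = \mathbb{E}\bigl[L(Y, f(X))\bigr]
```

Here R(f) is the expected risk one would like to minimise, and the empirical average over the sample is its computable stand-in.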

35 Statistical learning vs Machine learning Machine learning, from Artificial Intelligence: large scale applications, prediction accuracy. Statistical learning, from Statistics: interpretability, precision, uncertainty, inference. For some statisticians: statistical learning is a mathematical formalisation of machine learning. 35/42

36 Some concepts: online/offline learning Online learning (real-time): under time constraints. Some examples: Personalized advertising. Personalized healthcare. Navigation & transit tools. Autonomous cars. Load curve forecasts. Weather forecasts. Offline learning (batch). 36/42
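
The difference between offline (batch) and online learning can be sketched with the simplest possible estimator, a mean. This is only an illustration: the `OnlineMean` class and the `loads` readings are hypothetical and not taken from the slides.

```python
def batch_mean(values):
    """Offline (batch) estimate: needs the full data set at once."""
    return sum(values) / len(values)


class OnlineMean:
    """Online estimate: updated one observation at a time, in constant memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def update(self, x):
        self.n += 1
        # Incremental update: new_mean = old_mean + (x - old_mean) / n
        self.mean += (x - self.mean) / self.n
        return self.mean


loads = [3.0, 5.0, 4.0, 8.0]  # hypothetical load-curve readings
online = OnlineMean()
for x in loads:
    online.update(x)  # each new reading refines the estimate immediately

# Both approaches agree once all the data has been seen.
assert abs(online.mean - batch_mean(loads)) < 1e-12
```

The online version never stores past observations, which is what makes it usable under the time constraints mentioned above.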

37 Some concepts: supervised/unsupervised learning Supervised learning: Infer (predict) a function/relationship from labeled training data (e.g. classification, regression). Unsupervised learning: Find structure in unlabeled data (e.g. clustering). Even if it is more subjective than supervised learning, it can be useful as a pre-processing step for supervised learning. 37/42
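
As a minimal sketch of the distinction, the code below fits a nearest-centroid classifier (supervised, uses labels) and a two-cluster k-means (unsupervised, ignores labels) on the same one-dimensional toy data. The data, function names and the choice of these two methods are assumptions made for illustration.

```python
def nearest_centroid_fit(xs, labels):
    """Supervised: use the labels to compute one centroid per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def nearest_centroid_predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda y: abs(x - centroids[y]))


def two_means(xs, n_iter=10):
    """Unsupervised: 1-D k-means with k=2; no labels are used."""
    c0, c1 = min(xs), max(xs)  # simple deterministic initialisation
    for _ in range(n_iter):
        g0 = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        g1 = [x for x in xs if abs(x - c0) > abs(x - c1)]
        c0, c1 = sum(g0) / len(g0), sum(g1) / len(g1)
    return c0, c1


xs = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # hypothetical measurements
labels = ["low", "low", "low", "high", "high", "high"]

centroids = nearest_centroid_fit(xs, labels)
prediction = nearest_centroid_predict(centroids, 1.5)
clusters = two_means(xs)
```

On well-separated data like this, the cluster centres found without labels land close to the class centroids, which is one reason clustering can serve as a pre-processing step.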

38 Supervised learning There are many different paradigms, including: Parametric statistics (linear or non-linear). Non-parametric statistics (local estimation methods, e.g. kernel smoothing methods, k-nearest neighbors). Tree-based methods. Support Vector Machines. Deep learning. 38/42
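
Among the local estimation methods listed above, k-nearest neighbors is short enough to sketch in full. The version below does regression in one dimension. The function name `knn_regress` and the training data are hypothetical.

```python
def knn_regress(train_x, train_y, x, k=3):
    """Predict y at x as the mean response of the k closest training points."""
    # Rank training indices by distance to the query point.
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    neighbours = order[:k]
    return sum(train_y[i] for i in neighbours) / k


# Hypothetical training set sampled from y = x^2.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]

pred_k1 = knn_regress(train_x, train_y, 2.1, k=1)  # copies the closest point
pred_k3 = knn_regress(train_x, train_y, 2.1, k=3)  # averages three neighbours
```

No global functional form is assumed: the prediction depends only on the training points near the query, and `k` controls how much smoothing is applied.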

39 Some key points Trade-off between prediction accuracy and interpretability. Avoid over-fitting. Parsimonious model vs (full) black box: less is more. 39/42
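
Over-fitting can be illustrated with a toy k-nearest-neighbors regression, comparing the error on the training set with the error on held-out points. Everything here is an assumption made for illustration: the data follow y = x, and the training responses carry an artificial alternating +1 / -1 perturbation standing in for noise.

```python
def knn_predict(train_x, train_y, x, k):
    """Mean response of the k training points closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return sum(train_y[i] for i in order[:k]) / k


def mse(train_x, train_y, xs, ys, k):
    """Mean squared error of the k-NN predictor on the points (xs, ys)."""
    errors = [(knn_predict(train_x, train_y, x, k) - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)


# Training set: true signal y = x plus an alternating perturbation.
train_x = [float(i) for i in range(10)]
train_y = [x + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(train_x)]

# Held-out set: unperturbed targets at new locations.
test_x = [i + 0.4 for i in range(9)]
test_y = list(test_x)

results = {}
for k in (1, 2):
    train_err = mse(train_x, train_y, train_x, train_y, k)
    test_err = mse(train_x, train_y, test_x, test_y, k)
    results[k] = (train_err, test_err)
    print(k, train_err, test_err)
```

With k = 1 the model reproduces the training set exactly, perturbation included, and in this toy setting it does worse on the held-out points than k = 2, which averages the perturbation away. A perfect fit on the training data is therefore not evidence of a good predictor.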

40 Table of contents 40/42

41 Outline Introduction. Unsupervised learning: PCA & clustering. Supervised learning: Cross validation & bootstrap. Reminders on linear regression & logistic regression. Tree based methods. Support Vector Machines. 41/42

42 Software tools 42/42


More information

WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT

WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT PRACTICAL APPLICATIONS OF RANDOM SAMPLING IN ediscovery By Matthew Verga, J.D. INTRODUCTION Anyone who spends ample time working

More information

Word Segmentation of Off-line Handwritten Documents

Word Segmentation of Off-line Handwritten Documents Word Segmentation of Off-line Handwritten Documents Chen Huang and Sargur N. Srihari {chuang5, srihari}@cedar.buffalo.edu Center of Excellence for Document Analysis and Recognition (CEDAR), Department

More information

Department of Statistics. STAT399 Statistical Consulting. Semester 2, Unit Outline. Unit Convener: Dr Ayse Bilgin

Department of Statistics. STAT399 Statistical Consulting. Semester 2, Unit Outline. Unit Convener: Dr Ayse Bilgin Department of Statistics STAT399 Statistical Consulting Semester 2, 2012 Unit Outline Unit Convener: Dr Ayse Bilgin John Tukey: An approximate answer to the right question is worth a great deal more than

More information

Computational Data Analysis Techniques In Economics And Finance

Computational Data Analysis Techniques In Economics And Finance Computational Data Analysis Techniques In Economics And Finance If searched for a ebook Computational Data Analysis Techniques in Economics and Finance in pdf format, in that case you come on to correct

More information

A Comparison of Standard and Interval Association Rules

A Comparison of Standard and Interval Association Rules A Comparison of Standard and Association Rules Choh Man Teng cmteng@ai.uwf.edu Institute for Human and Machine Cognition University of West Florida 4 South Alcaniz Street, Pensacola FL 325, USA Abstract

More information

An Introduction to Simio for Beginners

An Introduction to Simio for Beginners An Introduction to Simio for Beginners C. Dennis Pegden, Ph.D. This white paper is intended to introduce Simio to a user new to simulation. It is intended for the manufacturing engineer, hospital quality

More information

Rule Learning with Negation: Issues Regarding Effectiveness

Rule Learning with Negation: Issues Regarding Effectiveness Rule Learning with Negation: Issues Regarding Effectiveness Stephanie Chua, Frans Coenen, and Grant Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX

More information

CHMB16H3 TECHNIQUES IN ANALYTICAL CHEMISTRY

CHMB16H3 TECHNIQUES IN ANALYTICAL CHEMISTRY CHMB16H3 TECHNIQUES IN ANALYTICAL CHEMISTRY FALL 2017 COURSE SYLLABUS Course Instructors Kagan Kerman (Theoretical), e-mail: kagan.kerman@utoronto.ca Office hours: Mondays 3-6 pm in EV502 (on the 5th floor

More information

Stacks Teacher notes. Activity description. Suitability. Time. AMP resources. Equipment. Key mathematical language. Key processes

Stacks Teacher notes. Activity description. Suitability. Time. AMP resources. Equipment. Key mathematical language. Key processes Stacks Teacher notes Activity description (Interactive not shown on this sheet.) Pupils start by exploring the patterns generated by moving counters between two stacks according to a fixed rule, doubling

More information

Handouts and Resources

Handouts and Resources Handouts and Resources 6 th Grade MELD Culminating Task: Compare the information presented in the articles and interview regarding the effect or purpose of activist groups in influencing society. You will

More information

Georgia Tech College of Management Project Management Leadership Program Eight Day Certificate Program: October 8-11 and November 12-15, 2007

Georgia Tech College of Management Project Management Leadership Program Eight Day Certificate Program: October 8-11 and November 12-15, 2007 Proven Methods for Project Planning, Scheduling and Control Managing Project Risk Project Managers as Agents of Change and Innovation Georgia Tech College of Management Project Management Leadership Program

More information

Probability estimates in a scenario tree

Probability estimates in a scenario tree 101 Chapter 11 Probability estimates in a scenario tree An expert is a person who has made all the mistakes that can be made in a very narrow field. Niels Bohr (1885 1962) Scenario trees require many numbers.

More information

FSL-BM: Fuzzy Supervised Learning with Binary Meta-Feature for Classification

FSL-BM: Fuzzy Supervised Learning with Binary Meta-Feature for Classification FSL-BM: Fuzzy Supervised Learning with Binary Meta-Feature for Classification arxiv:1709.09268v2 [cs.lg] 15 Nov 2017 Kamran Kowsari, Nima Bari, Roman Vichr and Farhad A. Goodarzi Department of Computer

More information

Corrective Feedback and Persistent Learning for Information Extraction

Corrective Feedback and Persistent Learning for Information Extraction Corrective Feedback and Persistent Learning for Information Extraction Aron Culotta a, Trausti Kristjansson b, Andrew McCallum a, Paul Viola c a Dept. of Computer Science, University of Massachusetts,

More information

November 17, 2017 ARIZONA STATE UNIVERSITY. ADDENDUM 3 RFP Digital Integrated Enrollment Support for Students

November 17, 2017 ARIZONA STATE UNIVERSITY. ADDENDUM 3 RFP Digital Integrated Enrollment Support for Students November 17, 2017 ARIZONA STATE UNIVERSITY ADDENDUM 3 RFP 331801 Digital Integrated Enrollment Support for Students Please note the following answers to questions that were asked prior to the deadline

More information

The Method of Immersion the Problem of Comparing Technical Objects in an Expert Shell in the Class of Artificial Intelligence Algorithms

The Method of Immersion the Problem of Comparing Technical Objects in an Expert Shell in the Class of Artificial Intelligence Algorithms IOP Conference Series: Materials Science and Engineering PAPER OPEN ACCESS The Method of Immersion the Problem of Comparing Technical Objects in an Expert Shell in the Class of Artificial Intelligence

More information

Affective Classification of Generic Audio Clips using Regression Models

Affective Classification of Generic Audio Clips using Regression Models Affective Classification of Generic Audio Clips using Regression Models Nikolaos Malandrakis 1, Shiva Sundaram, Alexandros Potamianos 3 1 Signal Analysis and Interpretation Laboratory (SAIL), USC, Los

More information

Len Lundstrum, Ph.D., FRM

Len Lundstrum, Ph.D., FRM , Ph.D., FRM Professor of Finance Department of Finance College of Business Office: 815 753-0317 Northern Illinois University Fax: 815 753-0504 Dekalb, IL 60115 llundstrum@niu.edu Education Indiana University

More information