# CSC 4510/9010: Applied Machine Learning Rule Inference

## Transcription

1 CSC 4510/9010: Applied Machine Learning Rule Inference Dr. Paula Matuszek (610) CSC Spring Paula Matuszek 1

2 Red Tape

- Going to use Blackboard for remaining submissions.
- Assignment 3 should be there. Let me know if you have a problem seeing it.

3 Grad Student Presentations

Please send me your topic when you have it, but at least a week ahead of time.

- Feb 10: Raja Harish Vempati
- Feb 17: Bharadwaj Vadlamannati
- Feb 24: Midterm
- Mar 3: Spring break
- Mar 10: Nikhil Dasari
- Mar 17: Sruthi Moola
- Mar 24: Gopi Krishna Chitluri
- Mar 31: Pradeep Musku
- April 7: Sai Koushik Haddunoori

4 Output: representing structural patterns

- Many different ways of representing patterns: decision trees, rules, instance-based, ...
- Also called "knowledge representation"
- Representation determines inference method
- Understanding the output is the key to understanding the underlying learning methods
- Different types of output for different learning problems (e.g. classification, regression, ...)

Data Mining: Practical Machine Learning Tools and Techniques (Chapter 3)

5 Nominal and numeric attributes

- Nominal:
  - Number of children usually equal to the number of values, so the attribute won't get tested more than once
  - Other possibility: division into two subsets
- Numeric:
  - Test whether value is greater or less than a constant, so the attribute may get tested several times
  - Other possibility: three-way split (or multi-way split)
    - Integer: less than, equal to, greater than
    - Real: below, within, above

6 Missing values

- Does absence of value have some significance?
  - Yes: "missing" is a separate value
  - No: "missing" must be treated in a special way
    - Solution A: assign instance to most popular branch
    - Solution B: split instance into pieces
      - Pieces receive weight according to fraction of training instances that go down each branch
      - Classifications from leaf nodes are combined using the weights that have percolated to them
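
Solution B can be sketched in a few lines of Python. This is a minimal illustration rather than the textbook's implementation: the function name `classify_with_missing`, the branch fractions, and the leaf class distributions are all assumed values chosen for the example.

```python
from collections import defaultdict

def classify_with_missing(branch_fractions, leaf_distributions):
    """Combine leaf class distributions for an instance whose tested
    attribute is missing. Each branch contributes in proportion to the
    fraction of training instances that went down it."""
    combined = defaultdict(float)
    for branch, weight in branch_fractions.items():
        for cls, prob in leaf_distributions[branch].items():
            combined[cls] += weight * prob
    return dict(combined)

# Assumed example: a three-way split on a nominal attribute.
branch_fractions = {"sunny": 5 / 14, "overcast": 4 / 14, "rainy": 5 / 14}
leaf_distributions = {
    "sunny": {"yes": 2 / 5, "no": 3 / 5},
    "overcast": {"yes": 1.0},
    "rainy": {"yes": 3 / 5, "no": 2 / 5},
}

scores = classify_with_missing(branch_fractions, leaf_distributions)
prediction = max(scores, key=scores.get)
print(scores, prediction)
```

The instance is never assigned to a single branch; the weights simply add up across leaves, and the class with the largest combined weight is predicted.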

7 Simplicity first

- Simple algorithms often work very well!
- There are many kinds of simple structure, e.g.:
  - One attribute does all the work
  - All attributes contribute equally and independently
  - A weighted linear combination might do
  - Instance-based: use a few prototypes
  - Use simple logical rules
- Success of method depends on the domain

Data Mining: Practical Machine Learning Tools and Techniques (Chapter 4)

8 Classification rules

- Popular alternative to decision trees
- Antecedent (pre-condition): a series of tests (just like the tests at the nodes of a decision tree)
  - Tests are usually logically ANDed together (but may also be general logical expressions)
- Consequent (conclusion): classes, set of classes, or probability distribution assigned by rule
- Individual rules are often logically ORed together
  - Conflicts arise if different conclusions apply

9 Model spaces

- Decision trees: partition the instance space into axis-parallel regions, labeled with class value
- Nearest-neighbor classifiers: partition the instance space into regions defined by the centroid instances (or cluster of k instances)
- Associative rules (feature values → class)
- (more to come!)

CSC 8520 Fall, Paula Matuszek. Some slides from M. DesJardins.

10 Rule Induction

- Given:
  - Features
  - Training examples
  - Output for training examples
- Generate automatically a set of rules or a decision tree which will allow you to judge new objects
- Basic approach:
  - Combinations of features become antecedents or links
  - Examples become consequents or nodes

11 Rule Induction Example

- Starting with 100 cases, 10 outcomes, 15 variables
- Form 100 rules, each with 15 antecedents and one consequent
- Collapse rules:
  - Cancellations: if we have C, A => B and not C, A => B, collapse to A => B
  - Drop terms: D, E => F and D, G => F, collapse to D => F
- Test rules and undo collapse if performance gets worse

12 Rose Diagnosis

| Case | Diagnosis | Yellow Leaves | Wilted Leaves | Brown Spots |
|---|---|---|---|---|
| 1 | Fungus | N | Y | Y |
| 2 | Bugs | N | Y | Y |
| 3 | Nutrition | Y | N | N |
| 4 | Fungus | N | N | Y |
| 5 | Fungus | Y | N | Y |
| 6 | Bugs | Y | Y | N |

- R1: If not yellow leaves and wilted leaves and brown spots then fungus.
- R6: If wilted leaves and yellow leaves and not brown spots then bugs.

13 Rose Diagnosis

- Cases 1 and 4 have opposite values for wilted leaves, so create new rule:
  - R7: If not yellow leaves and brown spots then fungus.
- KB is the rules. Learner is the system that collapses and tests rules. Critic is the test cases. Performer is rule-based inference.
- Problems:
  - Over-generalization
  - Irrelevance
  - Need data on all features for all training cases
  - Computationally painful
- Useful if you have enough good training cases.
- Output can be understood and modified by humans.
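
The cancellation step on the rose cases can be sketched as follows. This is a rough illustration, not the course's implementation: the rule encoding (a frozenset of feature/value pairs plus a consequent) and the helper names `to_rule` and `cancel` are assumptions, and the "test rules and undo collapse" step is left out.

```python
from itertools import combinations

# The six rose cases: (diagnosis, feature values).
cases = [
    ("fungus",    {"yellow": False, "wilted": True,  "brown": True}),
    ("bugs",      {"yellow": False, "wilted": True,  "brown": True}),
    ("nutrition", {"yellow": True,  "wilted": False, "brown": False}),
    ("fungus",    {"yellow": False, "wilted": False, "brown": True}),
    ("fungus",    {"yellow": True,  "wilted": False, "brown": True}),
    ("bugs",      {"yellow": True,  "wilted": True,  "brown": False}),
]

def to_rule(diagnosis, features):
    """One rule per case: all feature values ANDed together => diagnosis."""
    return (frozenset(features.items()), diagnosis)

def cancel(rule_a, rule_b):
    """Collapse two rules with the same consequent that differ only in
    the value of a single feature; return None if they do not qualify."""
    (ante_a, cons_a), (ante_b, cons_b) = rule_a, rule_b
    if cons_a != cons_b:
        return None
    diff = ante_a ^ ante_b  # literals that appear in exactly one rule
    if len(diff) == 2 and len({feature for feature, _ in diff}) == 1:
        return (ante_a & ante_b, cons_a)
    return None

rules = [to_rule(d, f) for d, f in cases]
collapsed = []
for a, b in combinations(rules, 2):
    merged = cancel(a, b)
    if merged is not None:
        collapsed.append(merged)

for antecedent, consequent in collapsed:
    print(sorted(antecedent), "=>", consequent)
```

The first collapsed rule corresponds to R7. Note that R7 also fires on case 2, which is labeled bugs, so a testing step against the cases would flag it as an over-generalization.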

14 Inferring rudimentary rules

- 1R: learns a 1-level decision tree
  - I.e., rules that all test one particular attribute
- Basic version:
  - One branch for each value
  - Each branch assigns most frequent class
  - Error rate: proportion of instances that don't belong to the majority class of their corresponding branch
  - Choose attribute with lowest error rate
  - (assumes nominal attributes)

15 Pseudo-code for 1R

    For each attribute,
        For each value of the attribute, make a rule as follows:
            count how often each class appears
            find the most frequent class
            make the rule assign that class to this attribute-value
        Calculate the error rate of the rules
    Choose the rules with the smallest error rate

Note: "missing" is treated as a separate attribute value

16 Evaluating the weather attributes

| Outlook | Temp | Humidity | Windy | Play |
|---|---|---|---|---|
| Sunny | Hot | High | False | No |
| Sunny | Hot | High | True | No |
| Overcast | Hot | High | False | Yes |
| Rainy | Mild | High | False | Yes |
| Rainy | Cool | Normal | False | Yes |
| Rainy | Cool | Normal | True | No |
| Overcast | Cool | Normal | True | Yes |
| Sunny | Mild | High | False | No |
| Sunny | Cool | Normal | False | Yes |
| Rainy | Mild | Normal | False | Yes |
| Sunny | Mild | Normal | True | Yes |
| Overcast | Mild | High | True | Yes |
| Overcast | Hot | Normal | False | Yes |
| Rainy | Mild | High | True | No |

| Attribute | Rules | Errors | Total errors |
|---|---|---|---|
| Outlook | Sunny → No | 2/5 | 4/14 |
| | Overcast → Yes | 0/4 | |
| | Rainy → Yes | 2/5 | |
| Temp | Hot → No* | 2/4 | 5/14 |
| | Mild → Yes | 2/6 | |
| | Cool → Yes | 1/4 | |
| Humidity | High → No | 3/7 | 4/14 |
| | Normal → Yes | 1/7 | |
| Windy | False → Yes | 2/8 | 5/14 |
| | True → No* | 3/6 | |

\* indicates a tie
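
The 1R pseudo-code and the weather table above can be reproduced with a short Python sketch. The function name `one_r` and the lowercase encoding of the values are assumptions made for this example; ties are broken by whichever class is seen first, which is one arbitrary choice among several.

```python
from collections import Counter, defaultdict

ATTRIBUTES = ["outlook", "temp", "humidity", "windy"]
# The 14 weather instances; the last field is the class (play).
WEATHER = [
    ("sunny", "hot", "high", False, "no"),
    ("sunny", "hot", "high", True, "no"),
    ("overcast", "hot", "high", False, "yes"),
    ("rainy", "mild", "high", False, "yes"),
    ("rainy", "cool", "normal", False, "yes"),
    ("rainy", "cool", "normal", True, "no"),
    ("overcast", "cool", "normal", True, "yes"),
    ("sunny", "mild", "high", False, "no"),
    ("sunny", "cool", "normal", False, "yes"),
    ("rainy", "mild", "normal", False, "yes"),
    ("sunny", "mild", "normal", True, "yes"),
    ("overcast", "mild", "high", True, "yes"),
    ("overcast", "hot", "normal", False, "yes"),
    ("rainy", "mild", "high", True, "no"),
]

def one_r(rows, attributes):
    """Return the attribute with the fewest errors, plus the rules and
    error count for every attribute."""
    results = {}
    for col, name in enumerate(attributes):
        # Count how often each class appears for each attribute value.
        counts = defaultdict(Counter)
        for row in rows:
            counts[row[col]][row[-1]] += 1
        # Each value predicts its most frequent class.
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        # Errors are the instances outside the majority class of their branch.
        errors = sum(sum(c.values()) - c[rules[value]] for value, c in counts.items())
        results[name] = (rules, errors)
    best = min(results, key=lambda name: results[name][1])
    return best, results

best, results = one_r(WEATHER, ATTRIBUTES)
for name, (rules, errors) in results.items():
    print(f"{name}: {rules} errors={errors}/{len(WEATHER)}")
print("chosen attribute:", best)
```

Outlook and humidity tie at 4 errors out of 14; `min` keeps the first one it meets, so this sketch picks outlook.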

17 Dealing with numeric attributes

- Discretize numeric attributes
- Divide each attribute's range into intervals
  - Sort instances according to attribute's values
  - Place breakpoints where class changes (majority class)
  - This minimizes the total error
- Example: temperature from weather data, class values in order of increasing temperature:

  Yes No Yes Yes Yes No No Yes Yes Yes No Yes Yes No

[Table: weather data with numeric Temperature and Humidity; columns Outlook, Temperature, Humidity, Windy, Play]

18 The problem of overfitting

- This procedure is very sensitive to noise
  - One instance with an incorrect class label will probably produce a separate interval
- Also: time stamp attribute will have zero errors
- Simple solution: enforce minimum number of instances in majority class per interval
- Example (with min = 3):

  Yes No Yes Yes Yes No No Yes Yes Yes No Yes Yes No

  Yes No Yes Yes Yes No No Yes Yes Yes No Yes Yes No
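
The minimum-count rule can be sketched as a greedy left-to-right pass. Two assumptions are worth flagging: the temperature values below come from the numeric weather data in the cited textbook and do not appear in the slide text above, and the function name `discretize` and the exact stopping rule (close an interval once its majority class has `min_majority` members, the next instance has a different class, and the next value differs) are one reading of the procedure.

```python
from collections import Counter

# Assumed temperature values (textbook's numeric weather data), sorted,
# paired with the class sequence from the slide.
temperatures = [64, 65, 68, 69, 70, 71, 72, 72, 75, 75, 80, 81, 83, 85]
classes = ["yes", "no", "yes", "yes", "yes", "no", "no",
           "yes", "yes", "yes", "no", "yes", "yes", "no"]

def discretize(values, labels, min_majority=3):
    """Split sorted instances into intervals whose majority class has at
    least min_majority members (the final interval may fall short)."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    partitions, current, counts = [], [], Counter()
    for i, (value, label) in enumerate(pairs):
        current.append((value, label))
        counts[label] += 1
        if i + 1 == len(pairs):
            break
        majority, majority_count = counts.most_common(1)[0]
        next_value, next_label = pairs[i + 1]
        # Close the interval only when the majority is large enough, the
        # class is about to change, and we are not between equal values.
        if (majority_count >= min_majority
                and next_label != majority
                and next_value != value):
            partitions.append(current)
            current, counts = [], Counter()
    partitions.append(current)
    return partitions

partitions = discretize(temperatures, classes)
# Breakpoints sit halfway between neighbouring intervals.
breakpoints = [(left[-1][0] + right[0][0]) / 2
               for left, right in zip(partitions, partitions[1:])]
for part in partitions:
    print(" ".join(label for _, label in part))
print("breakpoints:", breakpoints)
```

Adjacent intervals that end up with the same majority class could then be merged, which this sketch does not do.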

19 Discussion of 1R

- 1R was described in a paper by Holte (1993)
  - Contains an experimental evaluation on 16 datasets (using cross-validation so that results were representative of performance on future data)
  - Minimum number of instances was set to 6 after some experimentation
  - 1R's simple rules performed not much worse than much more complex decision trees
- Simplicity first pays off!

"Very Simple Classification Rules Perform Well on Most Commonly Used Datasets", Robert C. Holte, Computer Science Department, University of Ottawa

20 Discussion of 1R: Hyperpipes

- Another simple technique: build one rule for each class
  - Each rule is a conjunction of tests, one for each attribute
  - For numeric attributes: test checks whether instance's value is inside an interval
    - Interval given by minimum and maximum observed in training data
  - For nominal attributes: test checks whether value is one of a subset of attribute values
    - Subset given by all possible values observed in training data
  - Class with most matching tests is predicted
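
A minimal sketch of the Hyperpipes idea, assuming each instance is a tuple of attribute values. The function names (`train_hyperpipes`, `matches`, `predict`) and the toy data are hypothetical, and ties between classes are broken by training order, which is an arbitrary choice.

```python
def is_numeric(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)

def train_hyperpipes(rows, labels):
    """Build one rule per class: for each attribute, a (min, max) interval
    for numeric values or the set of observed values for nominal ones."""
    pipes = {}
    for row, label in zip(rows, labels):
        pipe = pipes.setdefault(label, [None] * len(row))
        for i, value in enumerate(row):
            if is_numeric(value):
                lo, hi = pipe[i] if pipe[i] is not None else (value, value)
                pipe[i] = (min(lo, value), max(hi, value))
            else:
                pipe[i] = (pipe[i] or set()) | {value}
    return pipes

def matches(test, value):
    """Interval test for numeric attributes, membership test for nominal."""
    if isinstance(test, tuple):
        return test[0] <= value <= test[1]
    return value in test

def predict(pipes, row):
    """Predict the class whose rule has the most matching tests."""
    return max(pipes, key=lambda cls: sum(matches(t, v) for t, v in zip(pipes[cls], row)))

# Hypothetical toy data: one nominal and one numeric attribute.
rows = [("red", 1.0), ("red", 2.0), ("blue", 8.0), ("green", 9.0)]
labels = ["a", "a", "b", "b"]
pipes = train_hyperpipes(rows, labels)
print(predict(pipes, ("red", 1.5)), predict(pipes, ("green", 8.5)))
```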

21 From trees to rules

- Easy: converting a tree into a set of rules
  - One rule for each leaf:
    - Antecedent contains a condition for every node on the path from the root to the leaf
    - Consequent is class assigned by the leaf
- Produces rules that are unambiguous
  - Doesn't matter in which order they are executed
- But: resulting rules are unnecessarily complex
  - Pruning to remove redundant tests/rules
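
The tree-to-rules conversion is a path enumeration, sketched below. The tree encoding (a tuple of attribute name and a dict of branches, with plain strings as leaves), the function name `tree_to_rules`, and the example tree are assumptions for illustration; no pruning is attempted.

```python
def tree_to_rules(tree, path=()):
    """Emit one rule per leaf: the antecedent lists every test on the
    path from the root, and the consequent is the leaf's class."""
    if not isinstance(tree, tuple):  # leaf: a class label
        return [(list(path), tree)]
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, path + ((attribute, value),)))
    return rules

# Hypothetical tree for the weather data.
tree = ("outlook", {
    "sunny": ("humidity", {"high": "no", "normal": "yes"}),
    "overcast": "yes",
    "rainy": ("windy", {True: "no", False: "yes"}),
})

rules = tree_to_rules(tree)
for antecedent, consequent in rules:
    tests = " and ".join(f"{a} = {v}" for a, v in antecedent)
    print(f"If {tests} then {consequent}")
```

Because every instance follows exactly one path, exactly one of these rules fires for it, which is why the order of execution does not matter.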

22 From rules to trees

- More difficult: transforming a rule set into a tree
  - Tree cannot easily express disjunction between rules
- Example: rules which test different attributes
  - If a and b then x
  - If c and d then x
- Symmetry needs to be broken
- Corresponding tree contains identical subtrees ("replicated subtree problem")

23 Rules Summary

- Multiple approaches, but the basic idea is the same: infer simple rules that make the decision based on logical combinations of attributes
- 1R is a good first test
- For simple domains the rules are easy to understand by humans
- Sensitive to noise, overfitting
- Not a good fit for complex domains, large number of attributes

24 Examples in Weka

- Section 4.1 in text


### On-Line Data Analytics

International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob

### NON-LINEAR DATA ANALYSIS ON KANSEI ENGINEERING AND DESIGN EVALUATION BY GENETIC ALGORITHM

Engineering Vol.6 No.4 pp.55-62 (2006) ORIGINAL ARTICLES NON-LINEAR DATA ANALYSIS ON KANSEI ENGINEERING AND DESIGN EVALUATION BY GENETIC ALGORITHM Toshio TSUCHIYA*, Yukihiro MATSUBARA** *Shimonoseki City

### Computer Security: A Machine Learning Approach

Computer Security: A Machine Learning Approach We analyze two learning algorithms, NBTree and VFI, for the task of detecting intrusions. SANDEEP V. SABNANI AND ANDREAS FUCHSBERGER Produced by the Information

### Probability and Statistics Curriculum Pacing Guide

Unit 1 Terms PS.SPMJ.3 PS.SPMJ.5 Plan and conduct a survey to answer a statistical question. Recognize how the plan addresses sampling technique, randomization, measurement of experimental error and methods

### Classification with Deep Belief Networks. HussamHebbo Jae Won Kim

Classification with Deep Belief Networks HussamHebbo Jae Won Kim Table of Contents Introduction... 3 Neural Networks... 3 Perceptron... 3 Backpropagation... 4 Deep Belief Networks (RBM, Sigmoid Belief

### arxiv: v3 [cs.lg] 9 Mar 2014

Learning Factored Representations in a Deep Mixture of Experts arxiv:1312.4314v3 [cs.lg] 9 Mar 2014 David Eigen 1,2 Marc Aurelio Ranzato 1 Ilya Sutskever 1 1 Google, Inc. 2 Dept. of Computer Science, Courant

### K Nearest Neighbor Edition to Guide Classification Tree Learning

K Nearest Neighbor Edition to Guide Classification Tree Learning J. M. Martínez-Otzeta, B. Sierra, E. Lazkano and A. Astigarraga Department of Computer Science and Artificial Intelligence University of

### Stay Alert!: Creating a Classifier to Predict Driver Alertness in Real-time

Stay Alert!: Creating a Classifier to Predict Driver Alertness in Real-time Aditya Sarkar, Julien Kawawa-Beaudan, Quentin Perrot Friday, December 11, 2014 1 Problem Definition Driving while drowsy inevitably

### Model Tracing A Diagnostic Technique in Intelligent Tutoring Systems

Model Tracing A Diagnostic Technique in Intelligent Tutoring Systems Ani Amižić, Slavomir Stankov, Marko Rosić Faculty of Natural Sciences, Mathematics and Education Nikole Tesle 12, 21000 Split, Croatia