# Assignment 6 (Sol.) Introduction to Machine Learning Prof. B. Ravindran


1. Assume that you are given a data set and a neural network model trained on the data set. You are asked to build a decision tree model with the sole purpose of understanding/interpreting the built neural network model. In such a scenario, which among the following measures would you concentrate most on optimising?

(a) Accuracy of the decision tree model on the given data set
(b) F1 measure of the decision tree model on the given data set
(c) Fidelity of the decision tree model, which is the fraction of instances on which the neural network and the decision tree give the same output
(d) Comprehensibility of the decision tree model, measured in terms of the size of the corresponding rule set

Sol. (c)

Here the aim is not the traditional one of modelling the data well; rather, it is to build a decision tree model that is as close to the existing neural network model as possible, so that we can use the decision tree to interpret the neural network model. Hence, we optimise the fidelity measure.

2. Which of the following properties are characteristic of decision trees?

(a) High bias
(b) High variance
(c) Lack of smoothness of prediction surfaces
(d) Unbounded parameter set

Sol. (b), (c) & (d)

Decision trees are generally unstable, since a small change in the data set can result in a very different set of splits. This is mainly due to the hierarchical nature of decision trees: a change in the split points at the initial stages affects all subsequent splits. The decision surfaces that result from decision tree learning are generated by recursive splitting of the feature space using axis-parallel hyperplanes, so they are clearly not smooth prediction surfaces like those produced by, say, neural networks. Decision trees make no assumptions about the distribution of the data. They are non-parametric methods in which the number of parameters depends solely on the data set on which training is carried out.

3. To control the size of the tree, we need to control the number of regions. One approach would be to split tree nodes only if the resultant decrease in the sum of squares error exceeds some threshold. For the described method, which among the following are true?

(a) It would, in general, help restrict the size of the trees

(b) It has the potential to affect the performance of the resultant regression/classification model
(c) It is computationally infeasible

Sol. (a) & (b)

While this approach may restrict the eventual number of regions produced, its main problem is that it is too restrictive and may result in poor performance. It is very common for splits at one level that are themselves not very good (i.e., they do not decrease the error significantly) to lead to very good splits (i.e., where the error is significantly reduced) further down the line. Think about the XOR problem.

4. Which among the following statements best describes our approach to learning decision trees?

(a) Identify the best partition of the input space and response per partition to minimise sum of squares error
(b) Identify the best approximation of the above by the greedy approach (to identifying the partitions)
(c) Identify the model which gives the best performance using the greedy approximation (option (b)) with the smallest partition scheme
(d) Identify the model which gives performance close to the best performance (option (a)) with the smallest partition scheme
(e) Identify the model which gives performance close to the best greedy approximation performance (option (b)) with the smallest partition scheme

Sol. (e)

As was discussed in class, we use a greedy approximation to identify the partitions and typically use pruning techniques, which result in a smaller tree with possibly some degradation in performance.

5. Having built a decision tree, we are using reduced error pruning to reduce the size of the tree. We select a node to collapse. For this particular node, the left branch has 3 training data points with outputs 5, 7, 9.6, and the right branch has 4 training data points with outputs 8.7, 9.8, 10.5, 11. What were the original responses for the data points along the two branches (left & right respectively), and what is the new response after collapsing the node?

(a) 10.8, 13.33,
(b) 10.8, 13.33,
(c) 7.2, 10, 8.8
(d) 7.2, 10, 8.6

Sol. (c)

Original responses (branch means):
Left: (5 + 7 + 9.6)/3 = 21.6/3 = 7.2
Right: (8.7 + 9.8 + 10.5 + 11)/4 = 40/4 = 10
New response after collapsing the node: (21.6 + 40)/7 = 61.6/7 = 8.8
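The arithmetic in the solution above can be checked with a few lines of Python (a minimal sketch of the collapse computation):

```python
# Responses are branch means; collapsing a node pools both branches.
left = [5, 7, 9.6]                 # outputs on the left branch
right = [8.7, 9.8, 10.5, 11]       # outputs on the right branch

left_mean = sum(left) / len(left)              # original left response
right_mean = sum(right) / len(right)           # original right response
merged_mean = sum(left + right) / (len(left) + len(right))  # after collapse

print(round(left_mean, 1), round(right_mean, 1), round(merged_mean, 1))
# → 7.2 10.0 8.8
```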

6. Given that we can select the same feature multiple times during the recursive partitioning of the input space, is it always possible to achieve 100% accuracy on the training data (given that we allow for trees to grow to their maximum size) when building decision trees?

(a) Yes
(b) No

Sol. (b)

Consider a pair of data points with identical input features but different class labels. Such points can be part of the training data but cannot both be classified correctly.

7. Suppose on performing reduced error pruning, we collapsed a node and observed an improvement in the prediction accuracy on the validation set. Which among the following statements are possible in light of the performance improvement observed?

(a) The collapsed node helped overcome the effect of one or more noise-affected data points in the training set
(b) The validation set had one or more noise-affected data points in the region corresponding to the collapsed node
(c) The validation set did not have any data points along at least one of the collapsed branches
(d) The validation set did have data points adversely affected by the collapsed node

Sol. (a), (b), (c) & (d)

The first option is the kind of error we normally expect pruning to help us overcome. However, a node collapse which ideally should result in an increase in the overall error of the model may actually show an improvement due to a number of factors. Perhaps the points which should have been misclassified due to the collapse are mislabelled in the validation set (option (b)). Such points may also be missing from the validation set (option (c)). Finally, even if the increased error due to the collapsed node is registered in the validation set, it may be masked by the absence of errors (existing in the training data) in other parts of the validation set (option (d)).

8. Consider the following data set:

| price | maintenance | capacity | airbag | profitable |
|-------|-------------|----------|--------|------------|
| low   | low         | 2        | no     | yes        |
| low   | med         | 4        | yes    | no         |
| low   | low         | 4        | no     | yes        |
| low   | high        | 4        | no     | no         |
| med   | med         | 4        | no     | no         |
| med   | med         | 4        | yes    | yes        |
| med   | high        | 2        | yes    | no         |
| med   | high        | 5        | no     | yes        |
| high  | med         | 4        | yes    | yes        |
| high  | high        | 2        | yes    | no         |
| high  | high        | 5        | yes    | yes        |

Considering profitable as the binary-valued attribute we are trying to predict, which of the attributes would you select as the root in a decision tree with multi-way splits using the cross-entropy impurity measure?

(a) price
(b) maintenance
(c) capacity
(d) airbag

Sol. (c)

Using the weighted cross-entropy of each split, where H(p) = -p log₂ p - (1-p) log₂ (1-p):

cross-entropy_price(D) = (4/11)·H(2/4) + (4/11)·H(2/4) + (3/11)·H(2/3) ≈ 0.978
cross-entropy_maintenance(D) = (2/11)·H(2/2) + (4/11)·H(2/4) + (5/11)·H(2/5) ≈ 0.805
cross-entropy_capacity(D) = (3/11)·H(1/3) + (6/11)·H(3/6) + (2/11)·H(2/2) ≈ 0.796
cross-entropy_airbag(D) = (5/11)·H(3/5) + (6/11)·H(3/6) ≈ 0.987

Capacity gives the lowest cross-entropy, so we select it as the root.

9. For the same data set, suppose we decide to construct a decision tree using binary splits and the Gini index impurity measure. Which among the following feature and split point combinations would be the best to use as the root node, assuming that we consider each of the input features to be unordered?

(a) price - {low, med} {high}
(b) maintenance - {high} {med, low}
(c) maintenance - {high, med} {low}
(d) capacity - {2} {4, 5}

Sol. (c)

gini_price({low,med} {high})(D) = (8/11)(2·(4/8)(4/8)) + (3/11)(2·(2/3)(1/3)) ≈ 0.485
gini_maintenance({high} {med,low})(D) = (5/11)(2·(2/5)(3/5)) + (6/11)(2·(4/6)(2/6)) ≈ 0.461
gini_maintenance({high,med} {low})(D) = (9/11)(2·(4/9)(5/9)) + (2/11)(2·(2/2)(0/2)) ≈ 0.404
gini_capacity({2} {4,5})(D) = (3/11)(2·(1/3)(2/3)) + (8/11)(2·(5/8)(3/8)) ≈ 0.462

The maintenance {high, med} {low} split gives the lowest Gini impurity.

10. Consider building a spam filter for distinguishing between genuine emails and unwanted spam emails. Assuming spam to be the positive class, which among the following would be more important to optimise?

(a) Precision
(b) Recall

Sol. (a)

If we optimise recall, we may be able to capture more spam emails, but in the process, we may also increase the number of genuine emails being predicted as spam. On the other hand, if we optimise precision, we may not be able to capture as many spam emails as in the previous approach, but the percentage of emails classified as spam that actually are spam will be high. Of the two possible trade-offs, we would prefer the latter (optimising precision), since in the former approach we risk more genuine emails being classified as spam, which is a costlier error to make in a spam-filtering application (as compared to missing a few spam emails, which the user can deal with manually).

Weka-based assignment questions

In this assignment, we will use the UCI Mushroom data set available here. We will be using the J48 decision tree algorithm, which can be found in Weka under classifiers/trees. We will consider the following to be the default parameter settings:
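As a sanity check on the impurity calculations in questions 8 and 9, the weighted cross-entropy and Gini values can be recomputed with a short script (a sketch, not part of the assignment; the question 8 table is hard-coded):

```python
from collections import Counter
from math import log2

# (price, maintenance, capacity, airbag, profitable), from question 8.
rows = [
    ("low", "low", 2, "no", "yes"),   ("low", "med", 4, "yes", "no"),
    ("low", "low", 4, "no", "yes"),   ("low", "high", 4, "no", "no"),
    ("med", "med", 4, "no", "no"),    ("med", "med", 4, "yes", "yes"),
    ("med", "high", 2, "yes", "no"),  ("med", "high", 5, "no", "yes"),
    ("high", "med", 4, "yes", "yes"), ("high", "high", 2, "yes", "no"),
    ("high", "high", 5, "yes", "yes"),
]

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_impurity(attr_index, impurity, groups=None):
    # Weighted impurity after splitting on attribute `attr_index`.
    # For binary splits, `groups` maps each attribute value to a side.
    parts = {}
    for r in rows:
        key = r[attr_index] if groups is None else groups[r[attr_index]]
        parts.setdefault(key, []).append(r[-1])
    n = len(rows)
    return sum(len(p) / n * impurity(p) for p in parts.values())

# Question 8: multi-way cross-entropy per attribute (capacity is lowest).
for i, name in enumerate(["price", "maintenance", "capacity", "airbag"]):
    print(name, round(split_impurity(i, entropy), 3))

# Question 9: weighted Gini of the candidate binary splits.
binary_splits = [
    (0, {"low": 0, "med": 0, "high": 1}, "price {low,med} {high}"),
    (1, {"high": 0, "med": 1, "low": 1}, "maintenance {high} {med,low}"),
    (1, {"high": 0, "med": 0, "low": 1}, "maintenance {high,med} {low}"),
    (2, {2: 0, 4: 1, 5: 1}, "capacity {2} {4,5}"),
]
for i, g, label in binary_splits:
    print(label, round(split_impurity(i, gini, g), 3))
```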

Note the following:

- The class for which the prediction model is to be learned is named class and is the first attribute in the data.
- We will use the default Cross-validation test option with Folds = 10.
- Once a decision tree model has been built, you can right-click on the corresponding entry in the Result list pane on the bottom left and select Visualize tree to see a visual representation of the learned tree.

11. How many levels does the unpruned tree contain, considering multi-way and binary splits respectively, with the other parameters remaining the same as above?

(a) 6, 8
(b) 6, 7
(c) 5, 7
(d) 5, 8

Sol. (b)

[Tree visualisation: multi-way split]

[Tree visualisation: binary split]

12. How many levels does the pruned tree (unpruned = false, reducederrorpruning = false) contain, considering multi-way and binary splits respectively?

(a) 6, 6
(b) 6, 7
(c) 5, 7
(d) 5, 6

Sol. (a)

[Tree visualisations: multi-way split and binary split]

13. Consider the effect of the parameter minnumobj. Try modifying the parameter value and observe the changes in the performance values. For example, considering multi-way and binary splits respectively, with the default parameters as discussed before, at what (maximum) value of the parameter do we start to see zero error?

(a) 11, 7
(b) 11, 6
(c) 10, 6
(d) 10, 7

Sol. (c)

This can be observed by trying out the listed values with the default parameters. The idea is for you to appreciate the significance of the minnumobj parameter.

14. Which among the following pairs of attributes seem to be the most important for this particular classification task?

(a) population, gill-spacing
(b) stalk-surface-below-ring, cap-color
(c) gill-spacing, ring-number
(d) odor, spore-print-color

Sol. (d)

From multiple models built using binary/multi-way splits, with and without pruning, we observe that the attributes odor and spore-print-color always appear at the higher levels of the trees, indicating their importance for the classification task.
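Weka's minnumobj is the minimum number of instances allowed per leaf. Its effect, as explored in question 13, can be illustrated outside Weka using scikit-learn's analogous min_samples_leaf parameter; the sketch below uses hypothetical toy data, not the mushroom data set:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy 1-D data: label flips at 0.5, with two deliberately noisy points.
X = [[i / 100.0] for i in range(100)]
y = [1 if x[0] > 0.5 else 0 for x in X]
y[3] = 1 - y[3]    # simulate label noise
y[57] = 1 - y[57]

trees = {m: DecisionTreeClassifier(min_samples_leaf=m).fit(X, y)
         for m in (1, 25)}

# min_samples_leaf=1 grows extra leaves to isolate the noisy points;
# a larger minimum forbids those splits, yielding a smaller tree.
for m, t in trees.items():
    print(m, "leaves:", t.get_n_leaves(),
          "train acc:", round(t.score(X, y), 2))
```

Raising the minimum leaf size trades training accuracy for a smaller, more stable tree, which is why sweeping the parameter in Weka changes both the tree and the cross-validation error.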


### Arrhythmia Classification for Heart Attack Prediction Michelle Jin

Arrhythmia Classification for Heart Attack Prediction Michelle Jin Introduction Proper classification of heart abnormalities can lead to significant improvements in predictions of heart failures. The variety

### (-: (-: SMILES :-) :-)

(-: (-: SMILES :-) :-) A Multi-purpose Learning System Vicent Estruch, Cèsar Ferri, José Hernández-Orallo, M.José Ramírez-Quintana {vestruch, cferri, jorallo, mramirez}@dsic.upv.es Dep. de Sistemes Informàtics

Link Learning with Wikipedia (Milne and Witten, 2008b) Dominikus Wetzel dwetzel@coli.uni-sb.de Department of Computational Linguistics Saarland University December 4, 2009 1 / 28 1 Semantic Relatedness

### Machine Learning with MATLAB Antti Löytynoja Application Engineer

Machine Learning with MATLAB Antti Löytynoja Application Engineer 2014 The MathWorks, Inc. 1 Goals Overview of machine learning Machine learning models & techniques available in MATLAB MATLAB as an interactive

### WEKA tutorial exercises

WEKA tutorial exercises These tutorial exercises introduce WEKA and ask you to try out several machine learning, visualization, and preprocessing methods using a wide variety of datasets: Learners: decision

### M3 - Machine Learning for Computer Vision

M3 - Machine Learning for Computer Vision Traffic Sign Detection and Recognition Adrià Ciurana Guim Perarnau Pau Riba Index Correctly crop dataset Bootstrap Dataset generation Extract features Normalization

### Cost-Sensitive Learning and the Class Imbalance Problem

To appear in Encyclopedia of Machine Learning. C. Sammut (Ed.). Springer. 2008 Cost-Sensitive Learning and the Class Imbalance Problem Charles X. Ling, Victor S. Sheng The University of Western Ontario,

### Evaluating the Effectiveness of Ensembles of Decision Trees in Disambiguating Senseval Lexical Samples

Evaluating the Effectiveness of Ensembles of Decision Trees in Disambiguating Senseval Lexical Samples Ted Pedersen Department of Computer Science University of Minnesota Duluth, MN, 55812 USA tpederse@d.umn.edu

### Childhood Obesity epidemic analysis using classification algorithms

Childhood Obesity epidemic analysis using classification algorithms Suguna. M M.Phil. Scholar Trichy, Tamilnadu, India suguna15.9@gmail.com Abstract Obesity is the one of the most serious public health

### Classification of Arrhythmia Using Machine Learning Techniques

Classification of Arrhythmia Using Machine Learning Techniques THARA SOMAN PATRICK O. BOBBIE School of Computing and Software Engineering Southern Polytechnic State University (SPSU) 1 S. Marietta Parkway,

### Prediction Of Student Performance Using Weka Tool

Prediction Of Student Performance Using Weka Tool Gurmeet Kaur 1, Williamjit Singh 2 1 Student of M.tech (CE), Punjabi university, Patiala 2 (Asst. Professor) Department of CE, Punjabi University, Patiala

### A study of the NIPS feature selection challenge

A study of the NIPS feature selection challenge Nicholas Johnson November 29, 2009 Abstract The 2003 Nips Feature extraction challenge was dominated by Bayesian approaches developed by the team of Radford

### A HYBRID CLASSIFICATION MODEL EMPLOYING GENETIC ALGORITHM AND ROOT GUIDED DECISION TREE FOR IMPROVED CATEGORIZATION OF DATA

A HYBRID CLASSIFICATION MODEL EMPLOYING GENETIC ALGORITHM AND ROOT GUIDED DECISION TREE FOR IMPROVED CATEGORIZATION OF DATA R. Geetha Ramani, Lakshmi Balasubramanian and Alaghu Meenal A. Department of

### Comprehensible Data Mining: Gaining Insight from Data

Comprehensible Data Mining: Gaining Insight from Data Michael J. Pazzani Information and Computer Science University of California, Irvine pazzani@ics.uci.edu http://www.ics.uci.edu/~pazzani Outline UC