IM S5028. Customer Analytics. Supervised vs unsupervised techniques. Data Mining techniques


Customer Analytics
Data Mining Techniques and applications to CRM: decision trees and neural networks

Data Mining techniques
- Data mining, or knowledge discovery, is the process of discovering valid, novel and useful patterns in large data sets.
- Many different data mining algorithms exist:
  - Statistics
  - Decision trees
  - Neural networks
  - Clustering algorithms
  - Rule induction algorithms
  - Rough sets
  - Genetic algorithms
- Decision trees and neural networks are widely used; they are part of most data mining tools.

Supervised vs unsupervised techniques
- Supervised learning techniques are guided by known outputs (decision trees, most neural network types).
- Unsupervised learning techniques use inputs only:
  - Clustering algorithms
  - Kohonen Feature Maps (a type of neural network)
- Unsupervised techniques use similarity measures.

Decision Trees
- Tree-shaped structures
- Can be converted to rules
- Decision trees are built by an iterative process of splitting the data into partitions.
- Many different algorithms exist; the most common are ID3, C4.5, CART and CHAID.
- Algorithms differ in the number of splits and in the diversity function: Gini index, entropy.

Decision trees example (adapted from Information Discovery Inc., 1996)
Develop a tree to predict profit.

Manufacturer  State  City         Product Color  Profit
Smith         CA     Los Angeles  Blue           High
Smith         AZ     Flagstaff    Green          Low
Adams         NY     NYC          Blue           High
Adams         AZ     Flagstaff    Red            Low
Johnson       NY     NYC          Green          Avg.
Johnson       CA     Los Angeles  Red            Avg.
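As a rough illustration of a diversity function, the sketch below computes the Gini index of the Profit column in the example table, and the size-weighted Gini index that remains after splitting on each attribute. The function names `gini` and `gini_after_split` are chosen here for illustration and do not come from the slides. The multiway split shown is only one of the splitting styles the listed algorithms use.

```python
from collections import Counter

# Example table from the slide.
COLUMNS = ("Manufacturer", "State", "City", "Product Color", "Profit")
TABLE = [
    ("Smith", "CA", "Los Angeles", "Blue", "High"),
    ("Smith", "AZ", "Flagstaff", "Green", "Low"),
    ("Adams", "NY", "NYC", "Blue", "High"),
    ("Adams", "AZ", "Flagstaff", "Red", "Low"),
    ("Johnson", "NY", "NYC", "Green", "Avg"),
    ("Johnson", "CA", "Los Angeles", "Red", "Avg"),
]
ROWS = [dict(zip(COLUMNS, record)) for record in TABLE]


def gini(labels):
    """Gini index of a list of class labels: 1 minus the sum of squared class shares."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def gini_after_split(rows, attribute, target="Profit"):
    """Size-weighted average Gini index of the partitions created by one attribute."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    return sum(len(group) / len(rows) * gini(group) for group in groups.values())


print("Before any split:", round(gini([row["Profit"] for row in ROWS]), 4))
for attribute in COLUMNS[:-1]:
    print(f"After splitting on {attribute}:", round(gini_after_split(ROWS, attribute), 4))
```

On this small table every attribute reduces the Gini index by the same amount, which is one way to see why more than one tree can be derived from the same data.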

First step: the tree algorithm splits the original table into three tables using the State attribute.

Table 1A: For State = AZ
Manufacturer  State  City       Product Color  Profit
Smith         AZ     Flagstaff  Green          Low
Adams         AZ     Flagstaff  Red            Low
This one is classified.

Table 1B: For State = CA
Manufacturer  State  City         Product Color  Profit
Smith         CA     Los Angeles  Blue           High
Johnson       CA     Los Angeles  Red            Avg.

Table 1C: For State = NY
Manufacturer  State  City  Product Color  Profit
Adams         NY     NYC   Blue           High
Johnson       NY     NYC   Green          Avg.

These two tables (1B and 1C) must be split further.

[Figure: a decision tree derived from this table]

Corresponding rules
The decision tree above can be translated into a set of rules as follows:
1. IF State = AZ THEN Profit = Low;
2. IF State = CA and Manufacturer = Smith THEN Profit = High;
3. IF State = CA and Manufacturer = Johnson THEN Profit = Avg;
4. IF State = NY and Manufacturer = Adams THEN Profit = High;
5. IF State = NY and Manufacturer = Johnson THEN Profit = Avg;
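The splitting and rule extraction described above can be sketched in a few lines. This is a minimal illustration, not the algorithm of any particular tool: the attribute order (State, then Manufacturer) is fixed by hand to match the slide, and the helper names `split_by` and `derive_rules` are made up for this example.

```python
# Only the columns used by the tree on the slide are kept.
ROWS = [
    {"Manufacturer": "Smith", "State": "CA", "Profit": "High"},
    {"Manufacturer": "Smith", "State": "AZ", "Profit": "Low"},
    {"Manufacturer": "Adams", "State": "NY", "Profit": "High"},
    {"Manufacturer": "Adams", "State": "AZ", "Profit": "Low"},
    {"Manufacturer": "Johnson", "State": "NY", "Profit": "Avg"},
    {"Manufacturer": "Johnson", "State": "CA", "Profit": "Avg"},
]


def split_by(rows, attribute):
    """Partition the rows into one table per distinct value of the attribute."""
    tables = {}
    for row in rows:
        tables.setdefault(row[attribute], []).append(row)
    return tables


def derive_rules(rows, attributes, conditions=()):
    """Split on the attributes in the given order until each table has a single Profit value."""
    profits = sorted({row["Profit"] for row in rows})
    if len(profits) == 1 or not attributes:
        # The table is classified (or no attribute is left): emit one rule for this path.
        premise = " and ".join(f"{name} = {value}" for name, value in conditions) or "TRUE"
        return [f"IF {premise} THEN Profit = {' or '.join(profits)}"]
    first, rest = attributes[0], attributes[1:]
    rules = []
    for value, table in sorted(split_by(rows, first).items()):
        rules += derive_rules(table, rest, conditions + ((first, value),))
    return rules


RULES = derive_rules(ROWS, ["State", "Manufacturer"])
for number, rule in enumerate(RULES, start=1):
    print(f"{number}. {rule}")
```

Calling `derive_rules(ROWS, ["Manufacturer", "State"])` instead gives a different but equally valid rule set for the same table.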

[Figure: Tree 2]

Exercise
Note: different trees are possible.
1. Derive a different tree using the manufacturer attribute first and then state.
2. Derive a tree starting with the colour attribute.
Represent the trees as rules. Compare the rules.

Classification And Regression Trees (CART) algorithm
- Based on binary recursive partitioning
- Looks at all possible splits for all variables and searches through them all
- Rank-orders each splitting rule on the basis of a quality-of-split criterion, measured by a diversity function (e.g. Gini rule or entropy)
- The criterion measures how well the splitting rule separates the classes contained in the parent node
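The CART search step above can be sketched as follows for the profit example table. This is a simplified illustration under stated assumptions, not the Salford Systems implementation. It considers only binary tests of the form "attribute == value" versus the rest, which happens to cover every binary partition here because each attribute has exactly three values. It scores each test with the size-weighted Gini index. The names `binary_splits` and `split_quality` are hypothetical.

```python
from collections import Counter

COLUMNS = ("Manufacturer", "State", "City", "Product Color", "Profit")
TABLE = [
    ("Smith", "CA", "Los Angeles", "Blue", "High"),
    ("Smith", "AZ", "Flagstaff", "Green", "Low"),
    ("Adams", "NY", "NYC", "Blue", "High"),
    ("Adams", "AZ", "Flagstaff", "Red", "Low"),
    ("Johnson", "NY", "NYC", "Green", "Avg"),
    ("Johnson", "CA", "Los Angeles", "Red", "Avg"),
]
ROWS = [dict(zip(COLUMNS, record)) for record in TABLE]
ATTRIBUTES = COLUMNS[:-1]


def gini(labels):
    """Gini index of a non-empty list of class labels."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def binary_splits(rows, attributes):
    """Yield every candidate test 'attribute == value' (versus all other values)."""
    for attribute in attributes:
        values = sorted({row[attribute] for row in rows})
        if len(values) < 2:
            continue  # a constant attribute cannot split the node
        for value in values:
            yield attribute, value


def split_quality(rows, attribute, value, target="Profit"):
    """Size-weighted Gini index of the two child nodes; lower means purer children."""
    left = [row[target] for row in rows if row[attribute] == value]
    right = [row[target] for row in rows if row[attribute] != value]
    return (len(left) * gini(left) + len(right) * gini(right)) / len(rows)


# Rank-order all candidate splits of the root node, best (lowest score) first.
RANKED = sorted(
    (split_quality(ROWS, attribute, value), attribute, value)
    for attribute, value in binary_splits(ROWS, ATTRIBUTES)
)
for score, attribute, value in RANKED:
    print(f"{attribute} == {value!r}: weighted Gini = {score:.4f}")
```

Several splits tie for first place on this table, so a real implementation would need a tie-breaking rule. The sort above simply falls back to alphabetical order of the attribute name.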

Entropy
- Entropy is a measure of disorder.
- If an object can be classified into n classes (C1, ..., Cn) and the probability of an object belonging to class Ci is p(Ci), the entropy of classification is

  \[ H = -\sum_{i=1}^{n} p(C_i) \log_2 p(C_i) \]

CART: Classification And Regression Trees
- Once a best split is found, the search is repeated for each child node, until further splitting is impossible or some other stopping criterion is met.
- The tree is then tested on a testing sample (prediction/classification rate, error rate).

Decision trees, applications
Decision trees can be used for:
- Prediction (e.g. churn prediction)
- Classification (e.g. into good and bad accounts)
- Exploration
- Segmentation
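Returning to the entropy measure defined above, the following sketch evaluates it for the Profit column of the earlier example table and for the three tables obtained by splitting on State. It assumes the usual base-2 logarithm, so entropy is measured in bits. The function name `entropy` and the variable names are chosen for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Entropy in bits of a list of class labels, with p(Ci) estimated by class frequency."""
    total = len(labels)
    return sum(
        -(count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


# Profit values of the full example table and of the three State tables.
ALL_PROFITS = ["High", "Low", "High", "Low", "Avg", "Avg"]
BY_STATE = {
    "AZ": ["Low", "Low"],
    "CA": ["High", "Avg"],
    "NY": ["High", "Avg"],
}

before = entropy(ALL_PROFITS)
after = sum(
    len(profits) / len(ALL_PROFITS) * entropy(profits)
    for profits in BY_STATE.values()
)

print(f"Entropy before the split: {before:.4f} bits")
for state, profits in BY_STATE.items():
    print(f"Entropy of the State = {state} table: {entropy(profits):.4f} bits")
print(f"Weighted entropy after the split: {after:.4f} bits")
print(f"Reduction in entropy: {before - after:.4f} bits")
```

A table whose rows all share one class (State = AZ) has entropy 0, which is why that table needs no further splitting.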

Decision trees can be used for segmentation
[Figure: segmentation tree. The root "All Customers" is split into "Young" and "Old"; a further split is on income ("Income high", "Income low"); the leaves are Segment1, Segment2 and Segment3.] (Zikmund et al 2003)

Decision trees can be used for churn modelling (adapted from Berson et al, 2000)

All customers: 50 churners, 50 non-churners
  New technology?
  - old: 20 churners, 0 non-churners
  - new: 30 churners, 50 non-churners
      Years as customer
      - <= 2.3 years: 25 churners, 10 non-churners
          Age
          - <= 45: 20 churners, 0 non-churners
          - > 45: 5 churners, 10 non-churners
      - > 2.3 years: 5 churners, 40 non-churners

Case study: data analysis for marketing
Groth resources <ftp://ftp.prenhall.com/pub/ptr/c++_programming.w-050/groth/>, case study 8
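The churn tree above can be read as a chain of nested tests. The sketch below encodes the leaf counts as they appear on the slide, predicts the majority class of the leaf a customer falls into, and reports that leaf's churn rate. The function names, the argument names and the string encoding of the technology attribute are assumptions made for illustration.

```python
def churn_leaf(technology, years_as_customer, age):
    """Return (churners, non_churners) for the leaf of the slide's tree a customer falls into."""
    if technology == "old":
        return (20, 0)
    if years_as_customer > 2.3:
        return (5, 40)
    if age <= 45:
        return (20, 0)
    return (5, 10)


def predict(technology, years_as_customer, age):
    """Predict the majority class of the leaf and report the leaf's churn rate."""
    churners, non_churners = churn_leaf(technology, years_as_customer, age)
    label = "churner" if churners > non_churners else "non-churner"
    return label, churners / (churners + non_churners)


# All four leaves of the tree, used to check how well it fits its own 100 customers.
LEAVES = [(20, 0), (5, 40), (20, 0), (5, 10)]
ACCURACY = sum(max(leaf) for leaf in LEAVES) / sum(sum(leaf) for leaf in LEAVES)

print(predict("old", 4.0, 52))
print(predict("new", 1.5, 30))
print(predict("new", 6.0, 30))
print(f"Share of the 100 customers classified correctly: {ACCURACY:.2f}")
```

The accuracy printed here is measured on the same customers the tree was built from. Testing on a separate sample, as the CART slide recommends, would normally give a lower figure.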

Data mining exercises: building models using decision trees
- Tutorial in building models using decision trees
- Students are to complete the decision tree tutorial from the Groth text, p. 127-147
- Tool: KnowledgeSEEKER; download and install (next week's tutorial) from <ftp://ftp.prenhall.com/pub/ptr/c++_programming.w-050/groth/>

References
- Groth R. (2000), Data Mining
Also:
- Rules are Much More than Decision Trees, Information Discovery Inc.
- www.thearling.com
- Salford Systems White Paper Series, An Overview of CART Methodology
- Berson A., Smith S., Thearling K. (2000), Building Data Mining Applications for CRM, McGraw-Hill