Machine Learning with MATLAB

Leuven Statistics Day 2014
Rachid Adarghal, Account Manager
Jean-Philippe Villaréal, Application Engineer
© 2014 The MathWorks, Inc.

Side note: Design of Experiments with MATLAB

What You Will Learn
  Get an overview of machine learning
  Machine learning models and techniques available in MATLAB
  MATLAB as an interactive environment
  Evaluate and choose the best algorithm

Machine Learning Characteristics
  Lots of data (many variables)
  System too complex to understand the governing equation

Domains of Application
  Handwriting recognition
  Autonomous vehicles
  DNA sequencing / genomics
  Cancer tumor classification
  Social network analysis
  Astronomical data analysis
  Market segmentation
  Organizing computer clusters for efficiency
  Spam / non-spam email classification
  Hearing headsets: optimizing the signal (cocktail party problem)
  Shazam / SoundHound fingerprinting

Challenges: Machine Learning
  Lots of data, with many variables (predictors)
  Data is too complex to know the governing equation
  Significant technical expertise required
  Black-box modelling
  No one-size-fits-all approach
    Requires an iterative approach: try multiple algorithms, see what works best
  Time-consuming to conduct the analysis
  Know-how required to debug your algorithm efficiently

MATLAB Solutions
  Strong environment for interactive exploration
  Algorithms and apps to get started
    Clustering, classification, regression
    Neural Network app, Curve Fitting app
  Easy to evaluate, iterate, and choose the best algorithm
  Parallel computing
  Deployment for data analytics workflows

Overview: Machine Learning
  Type of learning and categories of algorithms
  Unsupervised learning: group and interpret data based only on input data
    Clustering
    Recommender systems
  Supervised learning: develop predictive model based on both input and output data
    Classification
    Regression

Unsupervised Learning: Clustering
  k-means
  Partitional clustering
  Overlapping clustering
  Self-organizing maps
  Hierarchical clustering
  Fuzzy c-means
  Gaussian mixture
  Hidden Markov model
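As an illustration of two of the clustering techniques listed above, here is a minimal sketch of k-means and hierarchical clustering. It is not taken from the talk: the data is synthetic, the variable names are made up for this example, and it assumes the Statistics Toolbox functions `kmeans`, `linkage`, and `cluster` are available.

```matlab
% Synthetic 2-D data: two well-separated blobs (made-up data, not from the talk)
rng(0);                                    % reproducible random numbers
X = [randn(50,2); randn(50,2) + 6];        % 100 observations, 2 variables

% k-means: partition the observations into k = 2 clusters
k = 2;
[idx, C] = kmeans(X, k, 'Replicates', 5);  % idx: cluster index per row, C: centroids

% Hierarchical clustering on the same data for comparison
Z    = linkage(X, 'ward');                 % agglomerative cluster tree
idxH = cluster(Z, 'maxclust', k);          % cut the tree into k clusters
```

Both calls use only the input data `X`; no responses are supplied, which is what makes the problem unsupervised.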

Supervised Learning
  Regression: neural networks, decision trees, ensemble methods, non-linear regression (GLM, logistic), linear regression
  Classification: support vector machines, discriminant analysis, naive Bayes, nearest neighbor
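For the regression branch, a minimal linear regression sketch could look like the following. The data is synthetic and the variable names are hypothetical; it assumes `fitlm` from the Statistics Toolbox is available.

```matlab
% Made-up data: intercept 3, slope 2, unit-variance noise
rng(0);
x = (1:50)';
y = 3 + 2*x + randn(50,1);

% Fit a linear model y ~ 1 + x1 from known inputs and known responses
mdl    = fitlm(x, y);
coeffs = mdl.Coefficients.Estimate;   % [intercept; slope]

% Predict responses for new inputs
yNew = predict(mdl, [51; 52]);
```

The estimated slope should land close to the true value of 2 because the noise is small relative to the range of `x`.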

Supervised Learning - Workflow
  Data: import data, explore data, prepare data
  Select model
  Train the model: known data + known responses -> model
  Use for prediction: model + new data -> predicted responses
  Measure accuracy
  Speed up computations
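The workflow above can be sketched end to end in a few lines. This is an illustrative example rather than the presenters' code: it uses Fisher's iris data (shipped with the toolbox) as a stand-in data set, a decision tree as an arbitrary model choice, and assumes `cvpartition`, `fitctree`, and `confusionmat` are available.

```matlab
% Import data: Fisher's iris data ships with the toolbox (stand-in data set)
load fisheriris                      % meas: 150x4 predictors, species: 150x1 labels

% Prepare data: hold out 30% of the observations for testing
rng(0);                              % reproducible split
c = cvpartition(species, 'HoldOut', 0.3);
Xtrain = meas(training(c), :);  Ytrain = species(training(c));
Xtest  = meas(test(c), :);      Ytest  = species(test(c));

% Select and train a model: known data + known responses -> model
mdl = fitctree(Xtrain, Ytrain);

% Use for prediction: model + new data -> predicted responses
Ypred = predict(mdl, Xtest);

% Measure accuracy on the held-out data
accuracy = mean(strcmp(Ypred, Ytest));   % fraction of correct predictions
confMat  = confusionmat(Ytest, Ypred);   % rows: true class, columns: predicted
```

Swapping `fitctree` for another fitting function leaves the rest of the script unchanged, which is what makes it practical to try several algorithms.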

Example: Bank Marketing Campaign
  Goal: predict if a customer would subscribe to a bank term deposit based on different attributes
  Approach:
    Train a classifier using different models
    Measure accuracy and compare models
    Reduce model complexity
    Use classifier for prediction
  [Figure: Bank Marketing Campaign Misclassification Rate. Bar chart of the percentage (0 to 100) of "No" and "Yes" responses misclassified by each model: Neural Net, Logistic Regression, Discriminant Analysis, k-nearest Neighbors, Naive Bayes, Support VM, Decision Trees, TreeBagger, Reduced TB]
  Data set downloaded from UCI Machine Learning repository http://archive.ics.uci.edu/ml/datasets/bank+marketing
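The comparison described in this example can be sketched as follows. This is not the code behind the slide: it uses the `ionosphere` sample data set that ships with the toolbox instead of the bank marketing data, compares only three of the listed model types, and all variable names are hypothetical. It assumes `fitctree`, `fitcknn`, and `TreeBagger` are available.

```matlab
% Stand-in binary classification data (not the bank marketing data)
load ionosphere                      % X: 351x34 predictors, Y: labels 'b' or 'g'
rng(0);                              % reproducible split and bagging
c = cvpartition(Y, 'HoldOut', 0.3);
Xtr = X(training(c), :);  Ytr = Y(training(c));
Xte = X(test(c), :);      Yte = Y(test(c));

% Train several classifiers on the same training set
models = {fitctree(Xtr, Ytr), ...
          fitcknn(Xtr, Ytr, 'NumNeighbors', 5), ...
          TreeBagger(50, Xtr, Ytr, 'Method', 'classification')};
names  = {'Decision tree', 'k-nearest neighbors', 'TreeBagger'};

% Misclassification rate per true class, in percent
misclass = zeros(numel(models), 2);
for i = 1:numel(models)
    Ypred = predict(models{i}, Xte);
    cm = confusionmat(Yte, Ypred);   % rows: true class, columns: predicted
    misclass(i, :) = 100 * (1 - diag(cm) ./ sum(cm, 2))';
end

% Grouped bar chart, one group per model
bar(misclass);
set(gca, 'XTickLabel', names);
ylabel('Percentage misclassified');
legend('b misclassified', 'g misclassified');
```

Reporting the rate separately for each true class, as the chart on the slide does, matters when classes are imbalanced: a model can have a low overall error while still misclassifying most of the minority class.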

Summary: Bank Marketing Campaign
  Numerous predictive models with rich documentation
    Clustering, regression, classification
  Interactive tools to help discovery
    Histograms, bar charts, ROC curves
  Graphical apps
  Built-in parallel computing support
  Quick prototyping
    Focus on modeling, not programming
  [Figure: Bank Marketing Campaign Misclassification Rate. Percentage of "No" and "Yes" responses misclassified, per model]

Learn More: Machine Learning with MATLAB
  Visit our discovery page: www.mathworks.com/machine-learning

Deploying / Sharing Your Application
  Apps
  MATLAB Compiler: .exe, .dll, .lib, .CTF
    Builder NE, Builder JA, Builder EX
    Web
  MATLAB Coder: MEX, .exe, .lib, .dll

MathWorks Services
  Trainings: more than 30 course offerings
  Consulting services: enhance your team
  Technical support: ask questions
  An active community: MATLAB Central
    File Exchange, blogs, newsletters

Thank you for attending!