Machine Learning


Machine Learning with R @MatthewRenze @netcorebcn

John Jane Miko Lee

Job Postings for Machine Learning Source: Indeed.com

Average Salary by Job Type (USA) Source: Stack Overflow 2017

Overview 1. Introduction to ML 2. Introduction to R 3. Classification 4. Regression 5. Beyond the Basics

About Me: Data Science Consultant. Education: B.S. in Computer Science, B.A. in Philosophy. Community: Public Speaker, Pluralsight Author, Microsoft MVP, ASPInsider, Open-source Software.

How Does This Apply to Me? Make decisions using data; make predictions using data; make recommendations using data; find patterns of interest in data; find anomalies in data; write code that does all these things.

Introduction to Machine Learning

What is Machine Learning?

Artificial Intelligence Machine Learning Statistics

f(x): Data -> Function -> Prediction. Example: the data are pictures of a cat and a dog, the function answers "Is cat?", and the prediction is "Yes".

Find a question -> Prepare the data -> Train the model -> Evaluate the model -> Deploy the model -> Monitor the model (then back to finding a question)

Data is split into Training and Test sets. Training data -> ML Algorithm -> ML Model. Test data is used to check the ML Model. New Data -> ML Model -> Prediction.
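The training/test flow above can be sketched in a few lines. The talk's own demos use R; this is a minimal Python sketch under stated assumptions. The helper names (`train_test_split`, `majority_class_algorithm`), the 80/20 split, and the toy cat/dog rows are all made up for illustration, and the "algorithm" is deliberately trivial.

```python
import random
from collections import Counter

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (training, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def majority_class_algorithm(training):
    """Trivial stand-in for an ML algorithm: the returned model always
    predicts the most common label seen in the training data."""
    most_common = Counter(label for _, label in training).most_common(1)[0][0]
    return lambda features: most_common

# Hypothetical toy data: (features, label) pairs.
data = [((i,), "cat" if i % 3 else "dog") for i in range(10)]

training, test = train_test_split(data)      # Data -> Training / Test
model = majority_class_algorithm(training)   # Training -> ML Algorithm -> ML Model
accuracy = sum(model(x) == y for x, y in test) / len(test)  # check on Test
prediction = model((99,))                    # New Data -> ML Model -> Prediction
print(f"test accuracy: {accuracy:.2f}, prediction for new data: {prediction}")
```

The point of the sketch is the separation of roles: the model never sees the test rows while it is being trained, so the test accuracy is an estimate of how it behaves on new data.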

What Can Machine Learning Do?

f(x) -> 1.23

Source: Futurama

Introduction to R

What is R? Open source; language and environment; numerical and graphical analysis; cross-platform; active development; large user community; modular and extensible; 9000+ extensions.

FREE


Code Demo

Classification

f(x)

[Figure: scatter plot with axes Count of Spam Words and Correct Spelling Ratio]

Classification Algorithms: k-nearest Neighbor Classifier; Decision Tree Classifier; Naïve Bayes Classifier; Support Vector Machine; Neural Network Classifier.

Decision Tree Classifier: supervised learning; tree of decisions; information gain; simple and easy.

[Figure: decision tree with the questions "is sex male?", "is age > 9.5?", and "is family > 2.5?", and the leaves Survived and Died]

Titanic Passenger Manifest

Name                Gender  Age  Family  Survived
Elizabeth Allen     Female   29       0  Yes
Hudson Allison Jr.  Male      1       3  Yes
Helen Allison       Female    2       3  No
Hudson Allison Sr.  Male     30       3  No
Bessie Allison      Female   25       3  No
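The decision tree slides mention information gain without showing how it is computed. The following is a minimal Python sketch (the talk's demos use R) that uses the standard entropy-based definition and applies it to the five manifest rows above. The function names `entropy` and `information_gain` are illustrative, and five rows are far too few to say anything about the full Titanic data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(rows, question):
    """Entropy of all labels minus the size-weighted entropy of the two
    groups produced by a yes/no question. The label is the last field."""
    labels = [row[-1] for row in rows]
    yes = [row[-1] for row in rows if question(row)]
    no = [row[-1] for row in rows if not question(row)]
    remainder = sum(len(g) / len(rows) * entropy(g) for g in (yes, no) if g)
    return entropy(labels) - remainder

# (name, gender, age, family, survived), copied from the manifest slide.
manifest = [
    ("Elizabeth Allen", "Female", 29, 0, "Yes"),
    ("Hudson Allison Jr.", "Male", 1, 3, "Yes"),
    ("Helen Allison", "Female", 2, 3, "No"),
    ("Hudson Allison Sr.", "Male", 30, 3, "No"),
    ("Bessie Allison", "Female", 25, 3, "No"),
]

gains = {
    "is sex male?": information_gain(manifest, lambda r: r[1] == "Male"),
    "is age > 9.5?": information_gain(manifest, lambda r: r[2] > 9.5),
    "is family > 2.5?": information_gain(manifest, lambda r: r[3] > 2.5),
}
for question, gain in gains.items():
    print(f"{question:18s} gain = {gain:.3f} bits")
```

A tree-building algorithm would ask the question with the highest gain first and then repeat the process within each branch. On this tiny sample the family question scores highest (about 0.32 bits against about 0.02 for the other two), which need not match the ordering on the full data set.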


Neural Network Classifier: supervised learning; modeled on neurons in a brain; complex; not transparent. Source: Wikipedia

Real-World Examples: Should we approve this loan? Will this customer buy from us? Should we replace this part? Does this person have cancer?

Iris Data Set: Iris Setosa, Iris Versicolor, Iris Virginica. Photos by Radomił Binek, Danielle Langlois, and Frank Mayfield

Iris Data Set: Fisher's Iris Data

Species  Petal Length  Petal Width  Sepal Length  Sepal Width
setosa            1.1          0.1           4.3          3
setosa            1.4          0.2           4.4          2.9
setosa            1.3          0.2           4.4          3
setosa            1.3          0.2           4.4          3.2
setosa            1.3          0.3           4.5          2.3

Classification Demo Goal: Predict species based on petal and sepal measurements
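The demo code itself is not part of this transcript (and it is in R). As a rough idea of what predicting a species from measurements can look like, here is a minimal k-nearest-neighbor sketch in Python. The three setosa rows come from the Iris slide; the versicolor and virginica rows are illustrative values in the typical range for those species, not the talk's data, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter

# (petal length, petal width) -> species.
# Setosa rows are from the slide; the other rows are illustrative assumptions.
SAMPLE = [
    ((1.1, 0.1), "setosa"),
    ((1.4, 0.2), "setosa"),
    ((1.3, 0.2), "setosa"),
    ((4.7, 1.4), "versicolor"),
    ((4.5, 1.5), "versicolor"),
    ((4.9, 1.5), "versicolor"),
    ((6.0, 2.5), "virginica"),
    ((5.1, 1.9), "virginica"),
    ((5.9, 2.1), "virginica"),
]

def knn_predict(training, point, k=3):
    """Return the most common label among the k training points that are
    closest to `point` by Euclidean distance."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], point))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

for flower in [(1.5, 0.3), (4.6, 1.4), (6.1, 2.3)]:
    print(flower, "->", knn_predict(SAMPLE, flower))
```

k-nearest neighbor needs no training step beyond storing the data, which makes it a convenient first classifier to try; the choice of k = 3 here is arbitrary.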

Regression

f(x) -> 1.23

[Figure: scatter plot of Sale Price against Area]

Regression Algorithms: Linear Regression; Polynomial Regression; Lasso Regression; ElasticNet Regression; Neural Network Regression.

Simple Linear Regression: models a relationship; linear model; one explanatory variable; one outcome variable.

Simple Linear Regression: linear predictor function y = mx + b; the parameters m and b are estimated from the data; relies on assumptions.
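To make "parameters estimated" concrete, here is a minimal Python sketch of ordinary least squares for y = mx + b (the talk's demo is in R and is not reproduced here). It fits petal width against petal length using only the five setosa rows shown on the Iris slide; `fit_line` is an illustrative name, and five rows are only enough to show the arithmetic.

```python
def fit_line(xs, ys):
    """Least-squares estimates of slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    # The fitted line always passes through the point of means.
    b = mean_y - m * mean_x
    return m, b

# Petal length (explanatory) and petal width (outcome), from the Iris slide.
petal_length = [1.1, 1.4, 1.3, 1.3, 1.3]
petal_width = [0.1, 0.2, 0.2, 0.2, 0.3]

m, b = fit_line(petal_length, petal_width)
print(f"petal_width = {m:.3f} * petal_length + {b:.3f}")
print(f"predicted width at length 1.2: {m * 1.2 + b:.3f}")
```

On these five rows the slope works out to about 0.417 and the intercept to about -0.333. The assumptions the slide refers to (for example, that the relationship really is linear) are not checked by this code.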

Neural Network Regression: same as the neural network classifier, but the prediction is numeric rather than categorical. Source: Wikipedia

Real-World Examples: How much profit will we make? What will the price be tomorrow? How many will this person buy? How long until this part fails?

Regression Demo Goal: Predict petal width of Iris flowers

Beyond the Basics

This is just the tip of the iceberg!

Find a question -> Prepare the data -> Train the model -> Evaluate the model -> Deploy the model -> Monitor the model

Creating accurate and robust models is not easy

Cleaning and Transforming Data: data are messy; this is 80% of the work; R helps a lot; record all steps.

Goodness of Fit: Underfit; Good fit; Overfit.
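The three cases can be shown with a small experiment. This is a Python sketch under stated assumptions, not something taken from the talk: the data are synthetic (a line y = 2x with alternating +1/-1 noise for training and noise-free points for testing), and the three models are stand-ins chosen for clarity, namely a constant mean (underfit), a least-squares line (good fit), and a nearest-point memorizer (overfit).

```python
def mse(model, xs, ys):
    """Mean squared error of a model, where a model is a function x -> y."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def fit_mean(xs, ys):
    """Underfit: ignore x and always predict the average training y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_line(xs, ys):
    """Good fit for this data: least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return lambda x: m * x + b

def fit_memorizer(xs, ys):
    """Overfit: return the y of the closest training x, noise included."""
    def predict(x):
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
        return ys[nearest]
    return predict

# Synthetic data (an assumption for illustration only).
train_x = [0, 1, 2, 3, 4, 5, 6, 7]
noise = [1, -1, 1, -1, 1, -1, 1, -1]
train_y = [2 * x + e for x, e in zip(train_x, noise)]
test_x = [x + 0.2 for x in train_x]
test_y = [2 * x for x in test_x]

errors = {}
for name, fit in [("mean", fit_mean), ("line", fit_line),
                  ("memorizer", fit_memorizer)]:
    model = fit(train_x, train_y)
    errors[name] = (mse(model, train_x, train_y), mse(model, test_x, test_y))
    print(f"{name:10s} train MSE = {errors[name][0]:6.2f}  "
          f"test MSE = {errors[name][1]:6.2f}")
```

The memorizer reaches zero training error yet does worse than the line on the test points, because it has learned the noise; the mean does poorly on both. Comparing training error with test error is the usual way to tell these cases apart.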

Deep Learning

John Jane Miko Lee

f(x) -> 1.23

Source: YOLO: Real-Time Object Detection

Source: http://grail.cs.washington.edu/projects/audiotoobama/; Source: Nvidia

Source: Pouff - Grocery Trip; Source: Google DeepMind

Source: Boston Dynamics

Practical Demo Goal: Predict who will survive the Titanic

Conclusion

Where to Go Next
Pluralsight: https://www.pluralsight.com
DataCamp: https://www.datacamp.com
Coursera: https://www.coursera.org
TensorFlow: http://playground.tensorflow.org

www.pluralsight.com/authors/matthew-renze

www.matthewrenze.com

Feedback Very important to me! What did you like? What could I improve?

Conclusion 1. Introduction to ML 2. Introduction to R 3. Classification 4. Regression 5. Beyond the Basics

Are you prepared? Is your organization? Is our world prepared?

Contact Info
Matthew Renze, Data Science Consultant, Renze Consulting
Twitter: @matthewrenze
Email: info@matthewrenze.com
Website: www.matthewrenze.com

Thank You! :)