Introducing Machine Learning

What is Machine Learning?

Machine learning teaches computers to do what comes naturally to humans and animals: learn from experience. Machine learning algorithms use computational methods to learn information directly from data without relying on a predetermined equation as a model. The algorithms adaptively improve their performance as the number of samples available for learning increases.
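As a minimal sketch of learning directly from data, the Python snippet below fits a straight line to noisy samples by ordinary least squares and prints the estimated slope for growing sample counts. Everything specific here is an assumption made for illustration and is not from the original text: the straight-line model form, the data source (slope 2, intercept 1, unit Gaussian noise), and the function names. The learner is never told the coefficients; it recovers them from samples, and the estimate tends to settle closer to the true value as more samples become available.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def make_samples(n, rng):
    """Hypothetical data source: y = 2x + 1 plus Gaussian noise (sd 1)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def slope_estimates(sizes, seed=0):
    """Fit one line per sample size and return {size: estimated slope}."""
    rng = random.Random(seed)
    return {n: fit_line(*make_samples(n, rng))[0] for n in sizes}

if __name__ == "__main__":
    for n, slope in slope_estimates([10, 100, 10000]).items():
        print(f"{n:>6} samples: estimated slope = {slope:.3f} (true slope 2.0)")
```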

More Data, More Questions, Better Answers

Machine learning algorithms find natural patterns in data that generate insight and help you make better decisions and predictions. They are used every day to make critical decisions in medical diagnosis, stock trading, energy load forecasting, and more. Media sites rely on machine learning to sift through millions of options to give you song or movie recommendations. Retailers use it to gain insight into their customers' purchasing behavior.

Real World Applications

With the rise in big data, machine learning has become particularly important for solving problems in areas like these:

- Computational finance, for credit scoring and algorithmic trading
- Image processing and computer vision, for face recognition, motion detection, and object detection
- Computational biology, for tumor detection, drug discovery, and DNA sequencing
- Energy production, for price and load forecasting
- Automotive, aerospace, and manufacturing, for predictive maintenance
- Natural language processing

How Machine Learning Works

Machine learning uses two types of techniques: supervised learning, which trains a model on known input and output data so that it can predict future outputs, and unsupervised learning, which finds hidden patterns or intrinsic structures in input data.

[Figure: Machine learning techniques. Unsupervised learning groups and interprets data based only on input data (clustering). Supervised learning develops a predictive model based on both input and output data (classification, regression).]

Supervised Learning

The aim of supervised machine learning is to build a model that makes predictions based on evidence in the presence of uncertainty. A supervised learning algorithm takes a known set of input data and known responses to the data (output) and trains a model to generate reasonable predictions for the response to new data. Supervised learning uses classification and regression techniques to develop predictive models.

Using Supervised Learning to Predict Heart Attacks

Suppose clinicians want to predict whether someone will have a heart attack within a year. They have data on previous patients, including age, weight, height, and blood pressure. They know whether the previous patients had heart attacks within a year. So the problem is combining the existing data into a model that can predict whether a new person will have a heart attack within a year.

Classification techniques predict discrete responses, for example, whether an email is genuine or spam, or whether a tumor is cancerous or benign. Classification models classify input data into categories. Typical applications include medical imaging, speech recognition, and credit scoring.

Regression techniques predict continuous responses, for example, changes in temperature or fluctuations in power demand. Typical applications include electricity load forecasting and algorithmic trading.
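The difference between a discrete and a continuous response can be sketched with a single technique. The Python example below uses k-nearest neighbors, chosen here only for brevity, to classify by majority vote and to regress by averaging. All records, feature choices, and function names are made up for illustration; they are not clinical or utility data and do not come from the original text.

```python
from collections import Counter
from math import dist

def nearest(train_x, query, k):
    """Indices of the k training points closest to `query` (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return order[:k]

def knn_classify(train_x, train_y, query, k=3):
    """Classification: majority vote over the k nearest known labels."""
    votes = Counter(train_y[i] for i in nearest(train_x, query, k))
    return votes.most_common(1)[0][0]

def knn_regress(train_x, train_y, query, k=3):
    """Regression: mean of the k nearest known numeric responses."""
    idx = nearest(train_x, query, k)
    return sum(train_y[i] for i in idx) / len(idx)

# Made-up toy records: (age in years, systolic blood pressure in mmHg).
patients = [(35, 118), (42, 125), (50, 130), (61, 160), (68, 172), (74, 165)]
# Known discrete responses: did the patient have a heart attack within a year?
had_attack = [False, False, False, True, True, True]

# Made-up toy records: (outdoor temperature in deg C,) with power demand in MW.
temps = [(0,), (5,), (10,), (20,), (25,), (30,)]
demand = [90.0, 80.0, 70.0, 60.0, 75.0, 95.0]

if __name__ == "__main__":
    # Discrete prediction for a new (hypothetical) patient.
    print(knn_classify(patients, had_attack, (65, 168)))
    # Continuous prediction for a new (hypothetical) temperature reading.
    print(knn_regress(temps, demand, (22,)))
```

Both functions share the same neighbor search; only the final step differs, a vote for categories and a mean for numbers.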

Unsupervised Learning

Unsupervised learning finds hidden patterns or intrinsic structures in data. It is used to draw inferences from datasets consisting of input data without labeled responses.

Clustering is the most common unsupervised learning technique. It is used for exploratory data analysis to find hidden patterns or groupings in data. Applications for clustering include gene sequence analysis, market research, and object recognition.

[Figure: Clustering. Patterns in the data.]
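Clustering can be sketched with k-means, one common clustering method. The Python example below is a minimal version of Lloyd's algorithm under stated assumptions: the data points are made up, the first k points seed the clusters so that the run is deterministic, and the function name is hypothetical. No labels are supplied; the groups emerge from the input data alone.

```python
from math import dist

def kmeans(points, k, iterations=100):
    """Plain k-means (Lloyd's algorithm) with a deterministic start."""
    # Assumption for this sketch: the first k points seed the clusters.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:  # no centroid moved: converged
            break
        centroids = new_centroids
    return labels, centroids

# Made-up, unlabeled 2-D measurements that form two visible groups.
data = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]

if __name__ == "__main__":
    labels, centroids = kmeans(data, k=2)
    print(labels)
    print(centroids)
```

In practice the starting centroids are usually chosen at random and the algorithm is run several times, because k-means can settle on different groupings from different starts.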

How Do You Decide Which Algorithm to Use?

Choosing the right algorithm can seem overwhelming: there are dozens of supervised and unsupervised machine learning algorithms, and each takes a different approach to learning.

There is no best method or one size fits all. Finding the right algorithm is partly just trial and error; even highly experienced data scientists can't tell whether an algorithm will work without trying it out. But algorithm selection also depends on the size and type of data you're working with, the insights you want to get from the data, and how those insights will be used.

[Figure: Machine learning algorithms grouped by technique.]

Supervised learning, classification:
- Support Vector Machines
- Discriminant Analysis
- Naive Bayes
- Nearest Neighbor

Supervised learning, regression:
- Linear Regression, GLM
- SVR, GPR
- Ensemble Methods
- Decision Trees
- Neural Networks

Unsupervised learning, clustering:
- K-Means, K-Medoids, Fuzzy C-Means
- Hierarchical
- Gaussian Mixture
- Neural Networks
- Hidden Markov Model
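The trial-and-error part of algorithm selection can be sketched as a loop that scores each candidate on data it was not trained on and keeps the best one. The Python example below compares three settings of a nearest-neighbor classifier using leave-one-out cross-validation. The evaluation scheme, the made-up 1-D data, and the function names are assumptions chosen for illustration, not a procedure prescribed by the original text.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority label among the k training points nearest to `query` (1-D)."""
    neighbors = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def leave_one_out_accuracy(data, k):
    """Hold out each point in turn, train on the rest, score the prediction."""
    correct = 0
    for i, (x, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        correct += knn_predict(rest, x, k) == label
    return correct / len(data)

def pick_best(data, candidates):
    """Trial and error: evaluate every candidate and keep the top scorer."""
    scores = {k: leave_one_out_accuracy(data, k) for k in candidates}
    return max(scores, key=scores.get), scores

# Made-up 1-D toy data; the points at 5.0 and 11.25 sit inside the other class.
data = [(1.0, "a"), (2.5, "a"), (4.25, "a"), (5.0, "b"), (6.5, "a"),
        (10.0, "b"), (11.25, "a"), (12.0, "b"), (13.75, "b"), (16.0, "b")]

if __name__ == "__main__":
    best_k, scores = pick_best(data, candidates=[1, 3, 5])
    print(scores)   # {1: 0.4, 3: 0.6, 5: 0.8}
    print(best_k)   # 5
```

On this toy set the larger neighborhoods score higher because they smooth over the two stray points. A different dataset could rank the candidates differently, which is why each one has to be tried.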

When Should You Use Machine Learning?

Consider using machine learning when you have a complex task or problem involving a large amount of data and lots of variables, but no existing formula or equation. For example, machine learning is a good option if you need to handle situations like these:

- Hand-written rules and equations are too complex, as in face recognition and speech recognition.
- The rules of a task are constantly changing, as in fraud detection from transaction records.
- The nature of the data keeps changing, and the program needs to adapt, as in automated trading, energy demand forecasting, and predicting shopping trends.

Real World Examples: Creating Algorithms that Can Analyze Works of Art

Researchers at the Art and Artificial Intelligence Laboratory at Rutgers University wanted to see whether a computer algorithm could classify paintings by style, genre, and artist as easily as a human. They began by identifying visual features for classifying a painting's style. The algorithms they developed classified the styles of paintings in the database with 60% accuracy, outperforming typical non-expert humans.

The researchers hypothesized that visual features useful for style classification (a supervised learning problem) could also be used to determine artistic influences (an unsupervised problem). They used classification algorithms trained on Google images to identify specific objects. They tested the algorithms on more than 1,700 paintings from 66 different artists working over a span of 550 years. The algorithm readily identified connected works, including the influence of Diego Velazquez's "Portrait of Pope Innocent X" on Francis Bacon's "Study After Velazquez's Portrait of Pope Innocent X."

Real World Examples: Optimizing HVAC Energy Usage in Large Buildings

The heating, ventilation, and air-conditioning (HVAC) systems in office buildings, hospitals, and other large-scale commercial buildings are often inefficient because they do not take into account changing weather patterns, variable energy costs, or the building's thermal properties.

BuildingIQ's cloud-based software platform addresses this problem. The platform uses advanced algorithms and machine learning methods to continuously process gigabytes of information from power meters, thermometers, and HVAC pressure sensors, as well as weather and energy cost. In particular, machine learning is used to segment data and determine the relative contributions of gas, electric, steam, and solar power to heating and cooling processes. The BuildingIQ platform reduces HVAC energy consumption in large-scale commercial buildings by 10% to 25% during normal operation.

Real World Examples: Detecting Low-Speed Car Crashes

With more than 8 million members, the RAC is one of the UK's largest motoring organizations, providing roadside assistance, insurance, and other services to private and business motorists.

To enable rapid response to roadside incidents, reduce crashes, and mitigate insurance costs, the RAC developed an onboard crash sensing system that uses advanced machine learning algorithms to detect low-speed collisions and distinguish these events from more common driving events, such as driving over speed bumps or potholes. Independent tests showed the RAC system to be 92% accurate in detecting test crashes.

Learn More

Ready for a deeper dive? Explore these resources to learn more about machine learning methods, examples, and tools.

Watch
- Machine Learning Made Easy (34:34)
- Signal Processing and Machine Learning Techniques for Sensor Data Analytics (42:45)

Read
- Machine Learning Blog Posts: Social Network Analysis, Text Mining, Bayesian Reasoning, and more
- The Netflix Prize and Production Machine Learning Systems: An Insider Look
- Machine Learning Challenges: Choosing the Best Model and Avoiding Overfitting

Explore
- MATLAB Machine Learning Examples
- Machine Learning Solutions
- Classify Data with the Classification Learner App

© 2016 The MathWorks, Inc. MATLAB and Simulink are registered trademarks of The MathWorks, Inc. See mathworks.com/trademarks for a list of additional trademarks. Other product or brand names may be trademarks or registered trademarks of their respective holders. 92991v00