Intro to Deep Learning for Core ML

Intro to Deep Learning for Core ML
It's Difficult to Make Predictions. Especially About the Future.
@JulioBarros, Consultant, E-String.com
@JulioBarros http://e-string.com

Core ML
"With Core ML, you can integrate trained machine learning models into your app." -- Apple Documentation

What is a model?
An artifact created by training a machine learning algorithm. Basically a file with a bunch of numbers and some metadata.

What even is Machine Learning?

Artificial Intelligence
Artificial Intelligence (AI) - the study of "intelligent agents". Reasoning, knowledge representation, planning, robotics, etc.
- Artificial Narrow Intelligence (ANI)
- Artificial General Intelligence (AGI)
- Artificial Superintelligence (ASI)

Machine Learning
Machine Learning (ML) - Programs that learn from the data and make predictions.
- Tree Ensembles
- Support Vector Machines
- Generalized Linear Models
- Deep Neural Nets

Deep Learning
Deep Learning (DL) - ML/AI using artificial neural networks (ANNs)

Hype or Reality?

"It is a renaissance, it is a golden age," "Machine learning and AI is a horizontal enabling layer. It will empower and improve every business, every government organization, every philanthropy - basically there's no institution in the world that cannot be improved with machine learning." -- Bezos

Microsoft
Last year: "Our strategy is to build best-in-class platforms and productivity services for a mobile-first, cloud-first world." -- 2016 Form 10K
Now: "Our strategy is to build best-in-class platforms and productivity services for an intelligent cloud and an intelligent edge infused with artificial intelligence (AI)." -- 2017 Form 10K

Investments in AI
- Microsoft - MS Research AI Lab, CNTK
- Intel - Neon, Nervana
- Google - DeepMind, Google Brain, PAIR, TF
- Facebook - FAIR, PyTorch, Caffe2
- Amazon - GPU instances, MXNet
- Apple - Core ML, Siri, car, maps, AR, blog
- China - AI leadership by 2030
- Canada, ... and everyone else

Every industry can expect to be transformed by Artificial Intelligence

Healthcare
"Near or better than human level performance."

Performance
- ML models make mistakes
- Humans (experts) make mistakes
- Experts don't agree with other experts
- Experts don't agree with themselves
- ML can augment human performance

Applications
- Text, audio, image, video understanding
- User intent predictions
- Recommendations
- Games - asset generation, character control
- Manufacturing, maintenance and control
- Many many more

Not Hotdog

Intended use

Is this a hot dog?

Image Classification
Justin Johnson, Andrej Karpathy, Li Fei-Fei - Stanford

ImageNet
14,197,122 images in 21,841 (?) classes

Object Detection

Image Captioning

Dense Captioning

How do they do it?
Where do machine learning models come from?
Libraries for decision trees, ensembles, etc.
- scikit-learn
- XGBoost
- LibSVM

The New Shiny: Deep Learning
Core ML calls out
- Caffe
- Keras
Also: TensorFlow, Theano, MXNet, CNTK, PyTorch, Neon, Caffe2, ...

Third time is a charm
Dramatic improvements due to advancements in:
- Data
- Algorithms
- Hardware

Steps for your ML project
- Definition
- Prep
- Training
- Prediction (inference, scoring) / Production

Problem Definition
Types of business questions
- How much? - Regression
- What is it? - Classification
- What now? - Reinforcement learning
What is our measure of success? - Error function
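
To make "measure of success" concrete, here is a minimal sketch of two common error functions: mean squared error for a "How much?" (regression) question and binary cross-entropy for a "What is it?" (classification) question. The function names and sample numbers are illustrative assumptions, not part of the talk, and plain Python is used only to keep the sketch dependency-free.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average squared difference; a common error function for regression."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Average log loss; p_pred holds predicted probabilities of class 1."""
    total = 0.0
    for t, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never sees 0
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


# Hypothetical predictions, just to exercise the functions.
mse = mean_squared_error([5.0, 6.0, 7.0], [5.5, 6.0, 6.0])
bce = binary_cross_entropy([1, 0], [0.9, 0.2])
print(mse, bce)
```

Lower is better for both; which one fits depends on the business question being asked.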

Data Prep
- What data do I have or can get?
- Why do I think it is useful?
- What biases are in it?
- How does it need to be processed?

Don't underestimate the prep

Our Demo Data: Wine Quality Data

Types of Features
Numeric, in similar ranges
- numbers - scaled to ~ (-1, 1)
- categorical - "one-hot" encoded, vector embedding
- text - word2vec, GloVe, custom embedding
- dates - Unix time, day of week (DOW), month of year (MOY), etc.
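
The feature types above can be sketched in a few lines. This is a rough illustration, assuming min-max scaling for numbers, a fixed category list for one-hot encoding, and UTC for dates; the helper names and sample values are hypothetical. Text embeddings (word2vec, GloVe) need trained models and are left out.

```python
from datetime import datetime, timezone


def scale_to_unit_range(values):
    """Min-max scale a list of numbers to the interval [-1, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


def one_hot(value, categories):
    """Return a 0/1 vector with a single 1 at the position of `value`."""
    return [1 if c == value else 0 for c in categories]


def date_features(unix_time):
    """Split a Unix timestamp (UTC) into day-of-week and month-of-year."""
    d = datetime.fromtimestamp(unix_time, tz=timezone.utc)
    return {
        "unix_time": unix_time,
        "day_of_week": d.weekday(),  # Monday = 0
        "month_of_year": d.month,
    }


scaled = scale_to_unit_range([8.0, 10.0, 12.0])
encoded = one_hot("red", ["red", "white", "rose"])
dates = date_features(0)
print(scaled, encoded, dates)
```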

Types of data
- Labeled - supervised
- Unlabeled - unsupervised

Building a NN to Train

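Neurons: Biologically inspired
- 1943 McCulloch and Pitts
- 1957 Rosenblatt

    def neuron(x, w, b, A):
        # weighted sum of the inputs, plus a bias, passed through activation A
        o = 0
        for i in range(len(w)):
            o = o + x[i] * w[i]
        o = o + b
        return A(o)

Or, as a Swift-style one-liner:

    A(zip(x, w).map(*).reduce(0, +) + b)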

Activation Function
Introduces nonlinearity
- Historically: Step, Sigmoid, Tanh
- Commonly: Rectified Linear Unit (ReLU), Softmax
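
For reference, the activation functions named above are each only a line or two. This is a sketch in plain Python using their standard textbook definitions; the `neuron` helper and the sample inputs are assumptions added for illustration.

```python
import math


def step(z):
    return 1.0 if z >= 0 else 0.0


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    return math.tanh(z)


def relu(z):
    return max(0.0, z)


def softmax(zs):
    """Turn a list of scores into probabilities that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def neuron(x, w, b, activation):
    """Weighted sum of inputs plus bias, passed through an activation."""
    return activation(sum(xi * wi for xi, wi in zip(x, w)) + b)


out = neuron([1.0, 2.0], [0.5, -0.25], 0.1, relu)
probs = softmax([1.0, 2.0, 3.0])
print(out, probs)
```

Softmax differs from the others in that it acts on a whole layer of outputs at once, which is why it usually appears on the final layer of a classifier.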

Universal Approximation Theorem (1989)
... a feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions ...

Deep Neural Nets
A net with more than one hidden layer.

VGG16
138,357,544 parameters
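
Where does a number like 138,357,544 come from? A quick sketch that adds up weights and biases per layer. The layer shapes are not on the slide; they are assumed from the standard VGG16 configuration (thirteen 3x3 convolutional layers, three fully connected layers, 224x224x3 input, 1000 output classes).

```python
# (in_channels, out_channels) for the thirteen 3x3 convolutional layers
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (inputs, outputs) for the three fully connected layers.
# 7 * 7 * 512 is the flattened feature map after five 2x2 poolings of 224x224.
FC_LAYERS = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]


def conv_params(c_in, c_out, k=3):
    """k*k weights per input/output channel pair, plus one bias per output."""
    return k * k * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """One weight per input/output pair, plus one bias per output."""
    return n_in * n_out + n_out


conv_total = sum(conv_params(i, o) for i, o in CONV_LAYERS)
fc_total = sum(dense_params(i, o) for i, o in FC_LAYERS)
total = conv_total + fc_total
print(f"{total:,}")
```

Most of the parameters sit in the fully connected layers, not in the convolutions.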

GoogLeNet (Inception)

Training
0) Pick an architecture
1) Initialize weights randomly
2) Make prediction
3) Measure error (loss)
4) Adjust weights in the right direction
5) GOTO 2
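
The loop above, sketched for the smallest possible "architecture": a single linear unit y = w*x + b trained on toy data. The data, learning rate and epoch count are made-up assumptions; real projects would let a library such as Keras run this loop.

```python
import random


def train(xs, ys, lr=0.05, epochs=2000, seed=0):
    rng = random.Random(seed)
    # 0) architecture: one linear unit, y = w * x + b
    # 1) initialize weights randomly
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(xs)
    loss = None
    for _ in range(epochs):  # 5) GOTO 2
        # 2) make predictions
        preds = [w * x + b for x in xs]
        # 3) measure error (mean squared error)
        errors = [p - y for p, y in zip(preds, ys)]
        loss = sum(e * e for e in errors) / n
        # 4) adjust weights in the right direction (against the gradient)
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, loss


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b, loss = train(xs, ys)
print(w, b, loss)
```

After training, w and b should land very close to 2 and 1, the values used to generate the toy data.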

Gradient Descent
To know the right direction, calculate the gradient of the loss function with respect to each weight. Then move each weight a small step against its gradient.
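
A small sketch of the idea, assuming a toy two-weight loss with its minimum at (3, -1). The gradient is estimated numerically with central differences so that nothing has to be derived by hand; the loss function, step size and step count are illustrative assumptions.

```python
def loss(w):
    """Toy bowl-shaped loss; smallest at w = [3, -1]."""
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2


def numerical_gradient(f, w, h=1e-6):
    """Estimate d f / d w[i] for each weight with central differences."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        up[i] += h
        down = list(w)
        down[i] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad


def gradient_descent(f, w, lr=0.1, steps=200):
    for _ in range(steps):
        g = numerical_gradient(f, w)
        # step against the gradient: downhill on the loss surface
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


w_min = gradient_descent(loss, [0.0, 0.0])
print(w_min)
```

Numerical gradients are fine for two weights but far too slow for millions, which is where backpropagation comes in.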

Backpropagation
Use the chain rule.
f(x) = g(h(x))
f'(x) = g'(h(x)) h'(x)
1) Feed the signal forward through the network
2) Propagate the error back across the network.
Don't worry. The libraries do it for you.
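
To see the chain rule at work, here is a sketch for a single sigmoid neuron with a squared-error loss, so h is the weighted sum and g is the sigmoid. The input values are arbitrary assumptions. The analytic gradient is compared against a finite-difference estimate as a sanity check.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w, b):
    """1) Feed the signal forward: z = h(x), a = g(z)."""
    z = w * x + b
    a = sigmoid(z)
    return z, a


def backward(x, y, a):
    """2) Propagate the error back for loss = (a - y)**2."""
    dloss_da = 2.0 * (a - y)      # derivative of the loss w.r.t. the output
    da_dz = a * (1.0 - a)         # derivative of the sigmoid
    dloss_dz = dloss_da * da_dz   # chain rule
    return dloss_dz * x, dloss_dz  # dz/dw = x, dz/db = 1


def loss_at(x, y, w, b):
    return (forward(x, w, b)[1] - y) ** 2


x, y, w, b = 1.5, 1.0, 0.4, -0.2
z, a = forward(x, w, b)
grad_w, grad_b = backward(x, y, a)

# Finite-difference check of the same derivative.
h = 1e-6
numeric_grad_w = (loss_at(x, y, w + h, b) - loss_at(x, y, w - h, b)) / (2 * h)
print(grad_w, numeric_grad_w)
```

The two numbers should agree to several decimal places; libraries apply the same rule layer by layer.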

Millions of Knobs
Parameters

Or...

Our Demo Data: Wine Quality Data
11 features, 1 target column, 1599 samples

UCI Machine Learning Repository
https://archive.ics.uci.edu/ml/datasets/wine+quality
Source: Paulo Cortez, University of Minho, Guimarães, Portugal, http://www3.dsi.uminho.pt/~pcortez
A. Cerdeira, F. Almeida, T. Matos and J. Reis, Viticulture Commission of the Vinho Verde Region (CVRVV), Porto, Portugal @2009

A Simple Neural Net in Keras

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    # input layer: 11 features in, 16 units out
    model.add(Dense(16, input_dim=11, activation='relu'))
    # hidden layer
    model.add(Dense(8, activation='relu'))
    # output layer: one unit, no activation
    model.add(Dense(1))
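
How many knobs does this little net have? A back-of-the-envelope sketch, assuming each dense layer has one weight per input/output pair plus one bias per output. The layer sizes 11, 16, 8, 1 come from the model above; the helper name is hypothetical.

```python
LAYER_SIZES = [11, 16, 8, 1]  # inputs, first layer, hidden layer, output


def dense_param_count(n_in, n_out):
    """Weights (n_in * n_out) plus biases (n_out) of one dense layer."""
    return n_in * n_out + n_out


per_layer = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])
]
total = sum(per_layer)
print(per_layer, total)
```

That is a few hundred parameters, compared with the hundreds of millions in the big image models.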

Demo

Cool Demo Bro
But that's a long way from a cat riding a skateboard.

How Do We Work With Images
Well, images are just numbers/data. Though numbers close to each other are more related.

Convolutional Layers
Similar to correlations from signal processing or filters from Photoshop.
- A small NxN filter is slid over and convolved/correlated with the image.
- Learns to find features.
- Then lower level features are combined into higher level features.
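
A minimal sketch of sliding a small filter over an image, assuming no padding and a stride of 1. The filter here is a hand-picked vertical-edge detector and the image is a made-up 4x5 grid; in a real convolutional layer the filter values are learned during training.

```python
def correlate2d(image, kernel):
    """Slide `kernel` over `image` and sum the elementwise products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


# Dark on the left, bright on the right.
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Responds where brightness increases from left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = correlate2d(image, vertical_edge)
print(feature_map)
```

The output is zero over the flat dark region and positive wherever the filter window covers the edge.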

Types of ANN (layers)
1. Dense Neural Net (DNN) - fully connected
2. Convolutional Neural Net (CNN) - image/2d data
3. Recurrent Neural Net (RNN) - time series, sequential
4. Everything else - mostly innovative architectures and combinations

Want to add AI/ML to your projects?
Options
- API calls to third party service
- Use traditional ML models
- Fine tune existing model (transfer learning)
- Create your own custom DL model
- Some combination of all of these

Challenges with DL
- Needs lots of data. Labeled data is expensive.
- Lacks explainability
- Computational requirements - training and inference
- Performance limits unclear
- Best architecture unclear

Benefits of DL
- Handles much of the feature engineering
- Handles complex (nonlinear) problems
- Advancements coming quickly

Think carefully about
- Your business question
- How you'll measure success
- Gathering relevant data
- Compensating for biases
- Handling errors
- Managing changes in production
- Updating models (online learning)

Recommendations
- Do not be intimidated by the math.
- Start with Keras (w/ TensorFlow) or maybe PyTorch.
- Later choose language/framework as needs dictate.

Resources
- Andrew Ng's Coursera and Fast.AI courses
- Deep Learning Book - Goodfellow, Bengio and Courville
- Meetups
  - Portland-Data-Science-Group
  - Portland-Machine-Learning-Meetup
  - Portland-Deep-Learning (I run this meetup.)

Thank you!
Questions?
Julio@E-String.com
@JulioBarros

Programming Abstractions

Level                                   Python               iOS
Prediction                              Keras                Core ML
Training                                Keras
Computation Graph, Backprop, Autograd   Tensorflow, Caffe
Matrix Math                             CUDA, Eigen3         Metal, Accelerate