Intro to Deep Learning for Core ML

It's Difficult to Make Predictions. Especially About the Future.
@JulioBarros, Consultant, E-String.com
http://e-string.com

Core ML
"With Core ML, you can integrate trained machine learning models into your app." -- Apple Documentation

What is a model?
An artifact created by training a machine learning algorithm.
Basically a file with a bunch of numbers and some metadata.

What even is Machine Learning?

Artificial Intelligence
Artificial Intelligence (AI) - the study of "intelligent agents".
Reasoning, knowledge representation, planning, robotics, etc.
- Artificial Narrow Intelligence (ANI)
- Artificial General Intelligence (AGI)
- Artificial Superintelligence (ASI)

Machine Learning
Machine Learning (ML) - Programs that learn from the data and make predictions.
- Tree Ensembles
- Support Vector Machines
- Generalized Linear Models
- Deep Neural Nets

Deep Learning
Deep Learning (DL) - ML/AI using artificial neural networks (ANNs)

Hype or Reality?

"It is a renaissance, it is a golden age,"
"Machine learning and AI is a horizontal enabling layer. It will empower and improve every business, every government organization, every philanthropy - basically there's no institution in the world that cannot be improved with machine learning." -- Bezos

Microsoft
Last year: "Our strategy is to build best-in-class platforms and productivity services for a mobile-first, cloud-first world." -- 2016 Form 10K
Now: "Our strategy is to build best-in-class platforms and productivity services for an intelligent cloud and an intelligent edge infused with artificial intelligence (AI)." -- 2017 Form 10K

Investments in AI
- Microsoft - MS Research AI Lab, CNTK
- Intel - Neon, Nervana
- Google - DeepMind, Google Brain, PAIR, TF
- Facebook - FAIR, PyTorch, Caffe2
- Amazon - GPU instances, MXNet
- Apple - Core ML, Siri, car, maps, AR, blog
- China - AI leadership by 2030
- Canada,... and everyone else

Every industry can expect to be transformed by Artificial Intelligence

Healthcare
"Near or better than human level performance."

Performance
- ML models make mistakes
- Humans (experts) make mistakes
- Experts don't agree with other experts
- Experts don't agree with themselves
- ML can augment human performance

Applications
- Text, audio, image, video understanding
- User intent predictions
- Recommendations
- Games - asset generation, character control
- Manufacturing, maintenance and control
- Many many more

Not Hotdog

Intended use

Is this a hot dog?

Is this a hot dog?

Is this a hot dog?

Is this a hot dog?

Is this a hot dog?

Image Classification
Justin Johnson, Andrej Karpathy, Li Fei-Fei - Stanford

ImageNet
14,197,122 images in 21,841 (?) classes

Object Detection

Image Captioning

Dense Captioning

How do they do it?
Where do machine learning models come from?
Libraries for decision trees, ensembles, etc.
- scikit-learn
- XGBoost
- LibSVM

The New Shiny: Deep Learning
Core ML calls out
- Caffe
- Keras
Also: TensorFlow, Theano, MXNet, CNTK, PyTorch, Neon, Caffe2,...

Third time is a charm
Dramatic improvements due to advancements in:
- Data
- Algorithms
- Hardware

Steps for your ML project
- Definition
- Prep
- Training
- Prediction (inference, scoring) / Production

Problem Definition
Types of business questions
- How much? - Regression
- What is it? - Classification
- What now? - Reinforcement learning
What is our measure of success? - Error function
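As a rough illustration of an "error function", here is a minimal sketch of one common choice for each of the first two question types: mean squared error for regression and error rate for classification. The function names and the sample values are made up for this example; the slides do not say which error function the demo uses.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Regression error: average squared distance between target and prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def error_rate(labels_true, labels_pred):
    """Classification error: fraction of predictions that miss the true label."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    return float(np.mean(labels_true != labels_pred))

# "How much?" - made-up quality scores and predictions
regression_error = mean_squared_error([5.0, 6.0, 7.0], [5.0, 5.0, 9.0])

# "What is it?" - made-up labels and predictions
classification_error = error_rate(
    ["hot dog", "not hot dog", "hot dog", "hot dog"],
    ["hot dog", "hot dog", "hot dog", "not hot dog"],
)

print(regression_error, classification_error)
```

Lower is better for both, which is what lets training treat the error as something to minimize.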

Data Prep
- What data do I have or can get?
- Why do I think it is useful?
- What biases are in it?
- How does it need to be processed?

Don't underestimate the prep

Our Demo Data: Wine Quality Data

Types of Features
Numeric, in similar ranges
- numbers - scaled to ~ (-1, 1)
- categorical - one-hot encoded, vector embedding
- text - word2vec, GloVe, custom embedding
- dates - Unix time, day of week (DOW), month of year (MOY), etc.
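A minimal sketch of the first two bullets, assuming plain NumPy and simple min/max scaling. The helper names and the sample values are hypothetical; real projects often use a library such as scikit-learn for this.

```python
import numpy as np

def scale_to_unit_range(values):
    """Linearly rescale so the minimum maps to -1 and the maximum to 1."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return np.zeros_like(values)
    return 2.0 * (values - lo) / (hi - lo) - 1.0

def one_hot(labels):
    """Turn categorical labels into one row per sample, one column per category."""
    categories = sorted(set(labels))
    column = {category: i for i, category in enumerate(categories)}
    encoded = np.zeros((len(labels), len(categories)))
    for row, label in enumerate(labels):
        encoded[row, column[label]] = 1.0
    return categories, encoded

# Made-up numeric feature values
scaled = scale_to_unit_range([9.0, 10.0, 13.0])

# Made-up categorical feature values
categories, encoded = one_hot(["red", "white", "red"])

print(scaled)
print(categories)
print(encoded)
```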

Types of data
- Labeled - supervised
- Unlabeled - unsupervised

Building a NN to Train

Neurons: Biologically inspired
- 1943 McCulloch and Pitts
- 1957 Rosenblatt

    # weighted sum of the inputs, plus a bias, passed through activation A
    o = 0
    for i in range(len(w)):
        o = o + x[i] * w[i]
    o = o + b
    return A(o)

    A(zip(x, w).map(*).reduce(0, +) + b)
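The neuron on this slide can be sketched as runnable Python. This is one possible rendering, assuming NumPy and picking ReLU for the activation A; the slide leaves A unspecified, and the input values are made up.

```python
import numpy as np

def relu(z):
    """Assumed activation for this sketch: pass positives through, clamp negatives to 0."""
    return np.maximum(0.0, z)

def neuron(x, w, b, activation=relu):
    """Weighted sum of inputs plus bias, passed through the activation function."""
    return activation(np.dot(x, w) + b)

x = np.array([1.0, 2.0, 3.0])      # inputs
w = np.array([0.5, -1.0, 0.25])    # one weight per input
b = 1.0                            # bias

output = neuron(x, w, b)
print(output)
```

The dot product replaces the explicit loop: it multiplies each input by its weight and sums the results.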

Activation Function
Introduces nonlinearity
- Historically: Step, Sigmoid, Tanh
- Commonly: Rectified Linear Unit (ReLU), Softmax
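For reference, a small NumPy sketch of the activation functions named above, using their usual textbook definitions (tanh is available directly as `np.tanh`). The sample input is made up.

```python
import numpy as np

def step(z):
    """1 where the input is non-negative, 0 elsewhere."""
    return np.where(z >= 0, 1.0, 0.0)

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Rectified linear unit: max(0, z) element-wise."""
    return np.maximum(0.0, z)

def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = z - np.max(z)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / np.sum(exps)

z = np.array([-2.0, 0.0, 3.0])
print(step(z))
print(sigmoid(z))
print(np.tanh(z))
print(relu(z))
print(softmax(z))
```

Softmax is typically used on the output layer of a classifier, while ReLU is a common choice for hidden layers.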

Universal Approximation Theorem (1989)
... a feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions ...

Deep Neural Nets
A net with more than one hidden layer.

VGG16
138,357,544 parameters
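To see where a number like that comes from, here is a sketch that counts weights and biases layer by layer. The layer list is an assumption, not something shown on the slide: it is the standard VGG16 configuration (thirteen 3x3 convolutions, three fully connected layers, 224x224 RGB input, 1000 output classes).

```python
# (input channels, output channels) for the thirteen 3x3 convolutional layers
CONV_LAYERS = [
    (3, 64), (64, 64),
    (64, 128), (128, 128),
    (128, 256), (256, 256), (256, 256),
    (256, 512), (512, 512), (512, 512),
    (512, 512), (512, 512), (512, 512),
]

# (inputs, outputs) for the three fully connected layers.
# Five 2x2 poolings shrink 224x224 to 7x7, with 512 channels at that point.
DENSE_LAYERS = [(7 * 7 * 512, 4096), (4096, 4096), (4096, 1000)]

def conv_params(channels_in, channels_out, kernel=3):
    """One kernel x kernel filter per (input, output) channel pair, plus one bias per output."""
    return kernel * kernel * channels_in * channels_out + channels_out

def dense_params(n_in, n_out):
    """One weight per (input, output) pair, plus one bias per output."""
    return n_in * n_out + n_out

conv_total = sum(conv_params(i, o) for i, o in CONV_LAYERS)
dense_total = sum(dense_params(i, o) for i, o in DENSE_LAYERS)
total = conv_total + dense_total

print(f"{total:,}")
```

Most of the parameters sit in the fully connected layers, not in the convolutions.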

GoogLeNet (Inception)

Training
0) Pick an architecture
1) Initialize weights randomly
2) Make prediction
3) Measure error (loss)
4) Adjust weights in the right direction
5) GOTO 2

Gradient Descent
To know the right direction, calculate the gradient of the loss function with respect to each weight.
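The training steps and the gradient idea can be sketched on the smallest possible "network": one weight and one bias fitting a straight line. Everything here is an illustrative assumption (the synthetic data, the learning rate, the number of iterations, mean squared error as the loss), and the gradients are written out by hand instead of coming from a library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: the relationship we hope the model recovers is y = 3x + 0.5
x = np.linspace(-1.0, 1.0, 50)
y = 3.0 * x + 0.5

# 1) Initialize weights randomly
w, b = rng.normal(size=2)

learning_rate = 0.1
losses = []
for _ in range(500):
    # 2) Make prediction
    prediction = w * x + b
    # 3) Measure error (loss): mean squared error
    error = prediction - y
    losses.append(float(np.mean(error ** 2)))
    # Gradient of the loss with respect to each weight
    grad_w = float(np.mean(2.0 * error * x))
    grad_b = float(np.mean(2.0 * error))
    # 4) Adjust weights in the right direction (against the gradient)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
    # 5) GOTO 2

print(w, b, losses[0], losses[-1])
```

A real network has many more weights, but the loop has the same shape.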

Backpropagation
Use the chain rule.
f(x) = g(h(x))
f'(x) = g'(h(x)) h'(x)
1) Feed the signal forward through the network
2) Propagate the error back across the network.
Don't worry. The libraries do it for you.
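A tiny check of the chain rule, using made-up functions h(x) = 3x + 1 and g(u) = u^2 chosen only for illustration. The analytic derivative from the formula above is compared against a numerical estimate.

```python
def h(x):
    return 3.0 * x + 1.0

def h_prime(x):
    return 3.0

def g(u):
    return u ** 2

def g_prime(u):
    return 2.0 * u

def f(x):
    """Forward pass: compose the two functions."""
    return g(h(x))

def f_prime(x):
    """Chain rule: f'(x) = g'(h(x)) * h'(x)."""
    return g_prime(h(x)) * h_prime(x)

x0 = 2.0
analytic = f_prime(x0)

# Central-difference estimate of the same derivative
eps = 1e-6
numeric = (f(x0 + eps) - f(x0 - eps)) / (2.0 * eps)

print(analytic, numeric)
```

Backpropagation applies this same rule layer by layer, from the loss back to every weight.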

Millions of Knobs
Parameters

Or...

Our Demo Data: Wine Quality Data
11 features, 1 target column, 1599 samples

UCI Machine Learning Repository
https://archive.ics.uci.edu/ml/datasets/wine+quality
Source:
Paulo Cortez, University of Minho, Guimarães, Portugal, http://www3.dsi.uminho.pt/pcortez
A. Cerdeira, F. Almeida, T. Matos and J. Reis, Viticulture Commission of the Vinho Verde Region (CVRVV), Porto, Portugal
@2009

A Simple Neural Net in Keras

    from keras.models import Sequential
    from keras.layers import Dense

    model = Sequential()
    # input layer: 11 wine features feeding 16 units
    model.add(Dense(16, input_dim=11, activation='relu'))
    # hidden layer
    model.add(Dense(8, activation='relu'))
    # output layer: a single predicted number
    model.add(Dense(1))
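To make the 11-16-8-1 network from the slide concrete without Keras, here is a plain NumPy sketch of the same shape. It is not the Keras implementation: the weights are random and untrained, the batch is made-up data, and the helper names are hypothetical. It only shows how many parameters the three layers hold and what a forward pass computes.

```python
import numpy as np

LAYER_SIZES = [11, 16, 8, 1]  # inputs, first layer, hidden layer, output

rng = np.random.default_rng(0)
weights = [
    rng.normal(scale=0.1, size=(n_in, n_out))
    for n_in, n_out in zip(LAYER_SIZES[:-1], LAYER_SIZES[1:])
]
biases = [np.zeros(n_out) for n_out in LAYER_SIZES[1:]]

def forward(batch):
    """Run a batch of samples through the dense layers."""
    activations = batch
    last = len(weights) - 1
    for i, (w, b) in enumerate(zip(weights, biases)):
        activations = activations @ w + b
        if i < last:
            # ReLU on every layer except the output, matching the slide
            activations = np.maximum(0.0, activations)
    return activations

parameter_count = sum(w.size + b.size for w, b in zip(weights, biases))

batch = rng.normal(size=(4, 11))  # 4 made-up samples with 11 features each
predictions = forward(batch)

print(parameter_count)
print(predictions.shape)
```

That is 337 knobs for this demo network, compared with the hundreds of millions in the large image models.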

Demo

Cool Demo Bro
But that's a long way from a cat riding a skateboard.

How Do We Work With Images
Well, images are just numbers/data.
Though numbers close to each other are more related.

Convolutional Layers
Similar to correlations from signal processing or filters from Photoshop.
A small NxN filter is slid over and convolved/correlated with the image.
Learns to find features.
Then lower level features are combined into higher level features.
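A minimal sketch of sliding a small filter over an image, assuming NumPy, no padding, and a stride of 1. The tiny image and the hand-written vertical-edge filter are made up for illustration; in a convolutional layer the filter values are learned during training.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products at each position."""
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    k_rows, k_cols = kernel.shape
    out_rows = image.shape[0] - k_rows + 1
    out_cols = image.shape[1] - k_cols + 1
    output = np.zeros((out_rows, out_cols))
    for r in range(out_rows):
        for c in range(out_cols):
            patch = image[r:r + k_rows, c:c + k_cols]
            output[r, c] = np.sum(patch * kernel)
    return output

# 4x5 image: dark (0) on the left, bright (1) on the right
image = np.array([
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
])

# 3x3 filter that responds to dark-to-bright vertical edges
edge_filter = np.array([
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
])

response = correlate2d_valid(image, edge_filter)
print(response)
```

The response is zero over the flat dark region and positive wherever the filter window straddles the edge, which is the sense in which a filter "finds a feature".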

Types of ANN (layers)
1. Dense Neural Net (DNN) - fully connected
2. Convolutional Neural Net (CNN) - image/2d data
3. Recurrent Neural Net (RNN) - time series, sequential
4. Everything else - mostly innovative architectures and combinations

Want to add AI/ML to your projects?
Options
- API calls to third party service
- Use traditional ML models
- Fine tune existing model (transfer learning)
- Create your own custom DL model
- Some combination of all of these

Challenges with DL
- Needs lots of data. Labeled data is expensive.
- Lacks explainability
- Computational requirements - training and inference
- Performance limits unclear
- Best architecture unclear

Benefits of DL
- Handles much of the feature engineering
- Handles complex (non linear) problems
- Advancements coming quickly

Think carefully about
- Your business question
- How you'll measure success
- Gathering relevant data
- Compensating for biases
- Handling errors
- Managing changes in production
- Updating models (online learning)

Recommendations
- Do not be intimidated by the math.
- Start with Keras (w/TensorFlow) or maybe PyTorch.
- Later choose language/framework as needs dictate.

Resources
- Andrew Ng's Coursera and Fast.AI courses
- Deep Learning Book - Goodfellow, Bengio and Courville
- Meetups
  - Portland-Data-Science-Group
  - Portland-Machine-Learning-Meetup
  - Portland-Deep-Learning (I run this meetup.)

Thank you!
Questions?
Julio@E-String.com
@JulioBarros

Programming Abstractions

Level                                              Python                     iOS
Prediction                                         Keras                      Core ML
Training (Computation Graph, Backprop, Autograd)   Keras, TensorFlow, Caffe
Matrix Math                                        CUDA, Eigen3               Metal, Accelerate