430.457 Introduction to Intelligent Systems (Fall 2018), Lecture 8
Prof. Songhwai Oh, ECE, SNU
LINEAR REGRESSION
Univariate Linear Regression
Example
Gradient Descent
Batch gradient descent (steepest descent): updates the weights using the gradient of the loss computed over the entire data set.
Stochastic gradient descent: processes one data point at a time.
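The two variants above can be sketched for univariate linear regression. This is a minimal illustration, not the slide's exact formulation; the data, learning rates, and iteration counts are assumptions chosen for the example.

```python
import numpy as np

# Hypothetical univariate data: y ~ w0 + w1 * x (true weights 2.0 and 3.0).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 2.0 + 3.0 * x + rng.normal(0, 0.1, 100)

def batch_gradient_descent(x, y, alpha=0.1, iters=500):
    """Steepest descent: each update uses the gradient over the whole data set."""
    w0, w1 = 0.0, 0.0
    for _ in range(iters):
        err = (w0 + w1 * x) - y           # residuals for all points at once
        w0 -= alpha * err.mean()          # gradient w.r.t. the intercept
        w1 -= alpha * (err * x).mean()    # gradient w.r.t. the slope
    return w0, w1

def stochastic_gradient_descent(x, y, alpha=0.05, epochs=50):
    """SGD: update the weights one data point at a time."""
    w0, w1 = 0.0, 0.0
    for _ in range(epochs):
        for xi, yi in zip(x, y):
            err = (w0 + w1 * xi) - yi     # residual for a single point
            w0 -= alpha * err
            w1 -= alpha * err * xi
    return w0, w1
```

Batch descent takes smooth steps toward the minimum; SGD's per-point updates are noisier but each step is cheap, which is why it scales to large data sets.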
Multivariate Linear Regression
Closed Form Solution to Linear Regression
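The closed-form least-squares solution is w* = (X^T X)^(-1) X^T y (the normal equations). A short sketch on synthetic data (the data and true weights are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))            # design matrix, one row per example
X = np.hstack([np.ones((200, 1)), X])    # prepend a ones column for the bias term
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w + rng.normal(0, 0.01, 200)

# Normal equations: solve (X^T X) w = X^T y rather than forming the inverse,
# which is cheaper and numerically safer.
w = np.linalg.solve(X.T @ X, X.T @ y)
```

In practice `np.linalg.lstsq` is preferred when X^T X may be ill-conditioned.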
General Linear Model
h_i[n]: a nonlinear function of n
Example: Linear modeling of the SINC function
Model 1:
Model 2:
Example: Linear modeling of the SINC function
[Figure: data and the fits of Linear Model 1 and Linear Model 2 for N = 50, N = 100, and N = 1000.]
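The slide's two model forms are not spelled out in the text, so as an illustration of a general linear model (linear in the weights, nonlinear basis functions h_i), the sketch below fits sinc with two assumed polynomial bases of different richness. The degrees and sample size are my choices, not the lecture's.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100
x = rng.uniform(-5, 5, N)
y = np.sinc(x / np.pi)   # sin(x)/x; np.sinc(t) is sin(pi t)/(pi t)

def fit_poly(x, y, degree):
    """Least-squares fit with polynomial basis functions h_i(x) = x^i."""
    H = np.vander(x, degree + 1)                  # design matrix of basis values
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return w, H

w1, H1 = fit_poly(x, y, 3)   # assumed "Model 1": degree-3 polynomial
w2, H2 = fit_poly(x, y, 9)   # assumed "Model 2": degree-9 polynomial
mse1 = np.mean((H1 @ w1 - y) ** 2)
mse2 = np.mean((H2 @ w2 - y) ** 2)
```

The richer basis always achieves at most the training error of the smaller one, since its span contains the smaller model; whether it generalizes better depends on N, which is the point of the N = 50/100/1000 comparison.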
Regularization
L1 regularization
L2 regularization
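L2-regularized (ridge) regression keeps a closed form, w* = (X^T X + lambda I)^(-1) X^T y; L1 regularization (the lasso) has no closed form and is solved iteratively, so only the L2 case is sketched here, on made-up data:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 10))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 50)

def ridge(X, y, lam):
    """Closed-form ridge regression: (X^T X + lam * I) w = X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

w_small = ridge(X, y, 0.01)    # weak regularization
w_large = ridge(X, y, 100.0)   # strong regularization shrinks weights toward zero
```

Increasing lambda trades fit for smaller weights, which combats overfitting when the basis is rich relative to the data.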
LINEAR CLASSIFICATION
Linear Classifiers
Linearly Separable Case
Perceptron Learning Rule
Threshold function: a hard threshold on w . x.
Update rule: w_i <- w_i + alpha (y - h_w(x)) x_i, which converges if the problem is linearly separable.
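A minimal sketch of the perceptron learning rule on a tiny hand-made separable data set (the data and learning rate are illustrative assumptions):

```python
import numpy as np

def perceptron_train(X, y, epochs=100, alpha=1.0):
    """Perceptron rule for labels y in {0, 1}.

    h_w(x) = 1 if w . x >= 0 else 0 (hard threshold); on a mistake,
    w <- w + alpha * (y - h_w(x)) * x.  Converges when the data are
    linearly separable.
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        errors = 0
        for xi, yi in zip(X, y):
            pred = 1.0 if xi @ w >= 0 else 0.0
            if pred != yi:
                w += alpha * (yi - pred) * xi
                errors += 1
        if errors == 0:       # a full pass with no mistakes: converged
            break
    return w

# Toy separable data; the bias is folded in as a constant first feature.
X = np.array([[1, 2, 2], [1, 1, 3], [1, -1, -2], [1, -2, -1]], dtype=float)
y = np.array([1, 1, 0, 0], dtype=float)
w = perceptron_train(X, y)
```

On non-separable data the rule never settles, which is why the learning-curve slides contrast constant and decreasing learning rates.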
Learning Curve
[Figure: perceptron learning curves for a separable case and a non-separable case, with a constant learning rate and a decreasing learning rate.]
Logistic Regression
Logistic function: g(z) = 1 / (1 + e^{-z}), a soft thresholding alternative to the hard threshold.
Logistic regression update (via the chain rule): w_i <- w_i + alpha (y - h_w(x)) h_w(x) (1 - h_w(x)) x_i.
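The chain-rule update can be sketched on the same kind of toy data as the perceptron; the data and hyperparameters are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    """Logistic (soft-threshold) function g(z) = 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_train(X, y, alpha=1.0, epochs=500):
    """Per-example gradient update on the squared error; the chain rule
    through g gives w <- w + alpha * (y - h) * h * (1 - h) * x, h = g(w . x)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            h = sigmoid(xi @ w)
            w += alpha * (yi - h) * h * (1 - h) * xi
    return w

# Toy separable data with a constant bias feature.
X = np.array([[1, 2, 2], [1, 1, 3], [1, -1, -2], [1, -2, -1]], dtype=float)
y = np.array([1, 1, 0, 0], dtype=float)
w = logistic_train(X, y)
```

Because the update is continuous in w, logistic regression degrades gracefully on non-separable data, unlike the hard-threshold perceptron.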
[Figure: logistic regression learning curves for a separable case and a non-separable case, with a constant learning rate and a decreasing learning rate.]
ARTIFICIAL NEURAL NETWORKS (ANN)
Human brain: about 100 billion neurons and 100 to 500 trillion synapses.
Neural Network Structure
Perceptron: hard thresholding
Sigmoid perceptron: soft thresholding, e.g., the logistic function
Feed-forward network
Recurrent network
Single Layer Feed Forward Neural Networks
Perceptron learning rule
Logistic regression
[Figure: learning curves for the majority function (11 Boolean inputs) and for WillWait (the restaurant example).]
Multilayer Feed Forward Neural Networks
Units: input units, hidden units, output units.
An ANN with a single (sufficiently large) hidden layer can represent any continuous function.
Back Propagation
Output-layer weights w_{j,k}: from the j-th hidden unit to the k-th output a_k.
Hidden-layer weights w_{i,j}: from the i-th input to the j-th hidden unit; the output error reaches them back through the weights w_{j,k}.
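Back-propagation with one hidden layer of sigmoid units can be sketched in the slide's notation (w_{i,j} from input i to hidden unit j, w_{j,k} from hidden unit j to output a_k). The XOR data, network width, seed, and learning rate are assumptions for the example; without hidden bias units the network may not drive the XOR error to zero, but the updates do reduce it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR with a constant bias feature; XOR is not linearly separable,
# so a hidden layer is required.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W_ij = rng.normal(0, 1, (3, 4))   # w_{i,j}: input i -> hidden unit j
W_jk = rng.normal(0, 1, (4, 1))   # w_{j,k}: hidden unit j -> output a_k
alpha = 0.5

def forward(X):
    a_j = sigmoid(X @ W_ij)       # hidden activations
    a_k = sigmoid(a_j @ W_jk)     # output activations
    return a_j, a_k

mse_before = np.mean((y - forward(X)[1]) ** 2)
for _ in range(5000):
    a_j, a_k = forward(X)
    # Output deltas: error times the sigmoid derivative g'(z) = g(z)(1 - g(z)).
    delta_k = (y - a_k) * a_k * (1 - a_k)
    # Hidden deltas: propagate delta_k back through the weights w_{j,k}.
    delta_j = a_j * (1 - a_j) * (delta_k @ W_jk.T)
    W_jk += alpha * a_j.T @ delta_k
    W_ij += alpha * X.T @ delta_j
mse_after = np.mean((y - forward(X)[1]) ** 2)
```

The hidden-layer update mirrors the output-layer one: each delta_j is just the weighted sum of downstream deltas times the local derivative, which is exactly the chain rule applied layer by layer.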
DEEP LEARNING
ImageNet Large Scale Visual Recognition Challenge
Tasks:
- Decide whether a given image contains a particular type of object or not. For example, a contestant might decide that there are cars in this image but no tigers.
- Find a particular object and draw a box around it. For example, a contestant might decide that there is a screwdriver at a certain position with a width of 50 pixels and a height of 30 pixels.
1000 different categories; over 1 million images; training set: 456,567 images.

Year    Winning error rate
2010    28.2%
2011    25.8%
2012    16.4% (2nd place: 25.2%)
2013    11.2%
2014    6.7%
2015    3.57%
Human   about 5.1%

ImageNet Large Scale Visual Recognition Challenge. Russakovsky et al. arXiv preprint arXiv:1409.0575. URL: http://arxiv.org/abs/1409.0575v1
Deep Convolutional Neural Networks
SuperVision (2012): a deep convolutional neural network with 650,000 neurons, 5 convolutional layers, and over 60 million parameters.
Clarifai (2013)
GoogLeNet (2014): 22 layers
ResNet (2015): 152 layers