Introduction: Convolutional Neural Networks for Visual Recognition.

Introduction: Convolutional Neural Networks for Visual Recognition boris.ginzburg@intel.com 1

Acknowledgments. This presentation is heavily based on: http://cs.nyu.edu/~fergus/pmwiki/pmwiki.php http://deeplearning.net/reading-list/tutorials/ http://deeplearning.net/tutorial/lenet.html http://ufldl.stanford.edu/wiki/index.php/ufldl_tutorial and many others. 2

Agenda. 1. Course overview. 2. Introduction to Deep Learning: Classical Computer Vision vs. Deep Learning. 3. Introduction to Convolutional Networks: basic CNN architecture; large-scale image classification; how deep should Conv Nets be?; detection and other visual applications. 3

Course overview. 1. Introduction: intro to Deep Learning; Caffe: getting started; CNN: network topology, layer definitions. 2. CNN Training: backward propagation; optimization for Deep Learning (SGD: momentum, rate adaptation, Adagrad, SGD with line search, CGD); regularization (Dropout, Maxout). 4

Course overview. 3. Localization and Detection: OverFeat; R-CNN (Regions with CNN). 4. CPU / GPU performance optimization: CUDA; VTune, OpenMP, and BLAS/MKL. 5

Introduction to Deep Learning 6

Buzz 7

Deep Learning: from Research to Technology. Deep Learning: a breakthrough in visual and speech recognition. 8

Classical Computer Vision Pipeline 9

Classical Computer Vision Pipeline. CV experts: 1. Select / develop features: SURF, HoG, SIFT, RIFT, ... 2. Add Machine Learning on top for multi-class recognition and train the classifier. [Diagram: Feature Extraction (SIFT, HoG, ...) -> Detection, Classification, Recognition] Classical CV feature definition is domain-specific and time-consuming. 10

Deep Learning based Vision Pipeline. Deep Learning: build features automatically from training data; combine feature extraction and classification. DL experts: define the NN topology and train the NN. [Diagram: Deep NN -> Detection, Classification, Recognition] Deep Learning promise: train good features automatically, with the same method for different domains. 11

Computer Vision + Deep Learning + Machine Learning. We want to combine Deep Learning + CV + ML: combine pre-defined features with learned features; use the best ML methods for multi-class recognition. CV + DL + ML experts are needed to build the best in class. [Diagram: CV features (HoG, SIFT) and Deep NN -> ML (AdaBoost)] Combine the best of Computer Vision, Deep Learning, and Machine Learning. 12

Deep Learning Basics. Deep Learning is a set of machine learning algorithms based on multi-layer networks. [Figure: network with INPUTS, HIDDEN NODES, and OUTPUTS (CAT, DOG)] 13
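To make the INPUTS, HIDDEN NODES, and OUTPUTS picture concrete, here is a minimal sketch of a forward pass through one hidden layer. The layer sizes, the tanh non-linearity, the softmax output, the random weights, and all function names are assumptions chosen for illustration; the slide does not specify any of them, and the network is untrained.

```python
import math
import random

def softmax(scores):
    """Turn raw output scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer: weights[j][i] links input i to unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """INPUTS -> HIDDEN NODES (tanh) -> OUTPUTS (softmax)."""
    hidden = [math.tanh(a) for a in dense(inputs, w_hidden, b_hidden)]
    return softmax(dense(hidden, w_out, b_out))

# Hypothetical sizes and random, untrained weights, for illustration only.
random.seed(0)
n_inputs, n_hidden = 4, 3
labels = ["CAT", "DOG"]
w_hidden = [[random.uniform(-1, 1) for _ in range(n_inputs)]
            for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [[random.uniform(-1, 1) for _ in range(n_hidden)]
         for _ in range(len(labels))]
b_out = [0.0] * len(labels)

x = [0.2, -0.4, 0.9, 0.1]  # one made-up input vector
probs = forward(x, w_hidden, b_hidden, w_out, b_out)
prediction = labels[probs.index(max(probs))]
print(dict(zip(labels, probs)), "->", prediction)
```

Training would adjust the weights so that the probability of the correct label goes up; that step is not shown here.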

Deep Learning Basics. Deep Learning is a set of machine learning algorithms based on multi-layer networks. [Figure: training step, outputs CAT and DOG] 14

Deep Learning Basics. Deep Learning is a set of machine learning algorithms based on multi-layer networks. [Figure: outputs CAT and DOG] 15

Deep Learning Basics. Deep Learning is a set of machine learning algorithms based on multi-layer networks. [Figure: outputs CAT and DOG] 16

Deep Learning Taxonomy. Supervised: Convolutional NN (LeCun); Recurrent Neural Nets (Schmidhuber). Unsupervised: Deep Belief Nets / Stacked RBMs (Hinton); Stacked denoising autoencoders (Bengio); Sparse AutoEncoders (LeCun, A. Ng, ...). 17

Convolutional Networks 18

Convolutional NN. A Convolutional Neural Network is an extension of the traditional multi-layer perceptron, based on 3 ideas: 1. Local receptive fields. 2. Shared weights. 3. Spatial / temporal sub-sampling. See LeCun's paper (1998) on text recognition: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf 19
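The three ideas can be sketched in a few lines of plain Python. This is an illustrative sketch, not code from the presentation: the function names, the 6x6 toy image, the edge-detecting kernel, and the choice of averaging for sub-sampling are all assumptions. As in most CNN libraries, the "convolution" below does not flip the kernel (strictly speaking it is a cross-correlation).

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over the image (no padding, stride 1).

    Each output value depends only on a small kh x kw window of the
    input (local receptive field), and every window reuses the same
    kernel values (shared weights).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def subsample(feature_map, size=2):
    """Spatial sub-sampling: average each non-overlapping size x size block."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [[sum(feature_map[r * size + i][c * size + j]
                 for i in range(size) for j in range(size)) / (size * size)
             for c in range(out_w)]
            for r in range(out_h)]

# Toy 6x6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hypothetical 3x3 kernel that responds to vertical edges.
kernel = [[-1, 0, 1] for _ in range(3)]

features = conv2d_valid(image, kernel)   # 4x4 feature map
pooled = subsample(features, size=2)     # 2x2 after sub-sampling

# Weight sharing: 9 kernel weights produce all 16 outputs, whereas a
# fully connected layer from 36 inputs to 16 outputs would need 36 * 16.
shared_weights = len(kernel) * len(kernel[0])
dense_weights = (6 * 6) * (4 * 4)
print(features[0], pooled, shared_weights, dense_weights)
```

The feature map is large only where the window straddles the edge, and sub-sampling shrinks it while keeping that response.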

What is Convolutional NN? CNN is a multi-layer NN architecture: Convolutional + Non-Linear Layer; Sub-sampling Layer; Convolutional + Non-Linear Layer; Fully connected layers. The convolutional and sub-sampling layers perform supervised feature extraction; the fully connected layers perform classification. 20
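One way to read this layer stack is to follow how the feature-map size changes from layer to layer. The sketch below assumes LeNet-5-like sizes (a 32x32 single-channel input, 5x5 kernels, 6 and then 16 feature maps, 2x2 sub-sampling); these numbers and the helper names are assumptions for illustration and do not appear on the slide.

```python
def conv_size(size, kernel):
    """Output width/height of an unpadded convolution with stride 1."""
    return size - kernel + 1

def trace_shapes(input_size, layers):
    """Return the (maps, height, width) shape after each layer."""
    maps, size = 1, input_size
    shapes = []
    for kind, *params in layers:
        if kind == "conv":
            maps, kernel = params          # number of feature maps, kernel size
            size = conv_size(size, kernel)
        elif kind == "subsample":
            (factor,) = params             # sub-sampling factor
            size //= factor
        else:
            raise ValueError(f"unknown layer kind: {kind}")
        shapes.append((maps, size, size))
    return shapes

# Assumed LeNet-5-like feature extraction stage.
layers = [
    ("conv", 6, 5),       # convolution + non-linearity
    ("subsample", 2),     # sub-sampling
    ("conv", 16, 5),      # convolution + non-linearity
    ("subsample", 2),     # sub-sampling
]
shapes = trace_shapes(32, layers)
maps, height, width = shapes[-1]
n_features = maps * height * width   # inputs to the fully connected layers

for layer, shape in zip(layers, shapes):
    print(layer, "->", shape)
print("features passed to the classifier:", n_features)
```

The spatial size shrinks while the number of feature maps grows, and the final flattened vector is what the fully connected layers classify.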

What is Convolutional NN? [Figure labels: 2x2; Convolution + NL; Sub-sampling; Convolution + NL] 21

CNN story: 1996 - MNIST. LeNet-5 (1996): core of the NCR check reading system, used by US banks. 22

CNN story: 2012 - ILSVRC. ImageNet database: 14 million labeled images, 20K categories. 23

ILSVRC: Classification 24

Imagenet Classifications 2012 25

ILSVRC 2012: top rankers http://www.image-net.org/challenges/lsvrc/2012/results.html
N | Error-5 | Algorithm | Team | Authors
1 | 0.153 | Deep Conv. Neural Network | Univ. of Toronto | Krizhevsky et al
2 | 0.262 | Features + Fisher Vectors + Linear classifier | ISI | Gunji et al
3 | 0.270 | Features + FV + SVM | OXFORD_VGG | Simonyan et al
4 | 0.271 | SIFT + FV + PQ + SVM | XRCE/INRIA | Perronnin et al
5 | 0.300 | Color desc. + SVM | Univ. of Amsterdam | van de Sande et al
26
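The Error-5 column is the top-5 error rate: an image counts as correct if its true label is among the five highest-scoring predictions. Below is a minimal sketch of that metric; the function name and the toy scores are made up for illustration, and the example uses 4 classes with small k so the result can be checked by hand.

```python
def top_k_error(scores, true_labels, k=5):
    """Fraction of examples whose true label is NOT among the k best scores.

    scores[n][c] is the score the model gives class c for example n;
    true_labels[n] is the index of the correct class.
    """
    errors = 0
    for row, label in zip(scores, true_labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label not in ranked[:k]:
            errors += 1
    return errors / len(true_labels)

# Made-up scores for 3 images over 4 classes.
scores = [
    [0.10, 0.50, 0.30, 0.10],   # true class 2 is ranked second
    [0.70, 0.15, 0.10, 0.05],   # true class 3 is ranked last
    [0.25, 0.35, 0.30, 0.10],   # true class 0 is ranked third
]
true_labels = [2, 3, 0]

for k in (1, 2, 3):
    print(f"top-{k} error:", top_k_error(scores, true_labels, k=k))
```

Allowing more guesses can only lower the error, which is why top-5 error is always at most the top-1 error for the same model.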

Imagenet 2013: top rankers http://www.image-net.org/challenges/lsvrc/2013/results.php
N | Error-5 | Algorithm | Team | Authors
1 | 0.117 | Deep Convolutional Neural Network | Clarifai | Zeiler
2 | 0.129 | Deep Convolutional Neural Networks | Nat. Univ. Singapore | Min Lin
3 | 0.135 | Deep Convolutional Neural Networks | NYU | Zeiler, Fergus
4 | 0.135 | Deep Convolutional Neural Networks | - | Andrew Howard
5 | 0.137 | Deep Convolutional Neural Networks | Overfeat NYU | Pierre Sermanet et al
27

Imagenet Classifications 2013 28

Conv Net Topology: 5 convolutional layers; 3 fully connected layers + soft-max; 650K neurons, 60 million weights. 29
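The figure of about 60 million weights can be sanity-checked with simple arithmetic. The layer shapes below are not on the slide; they are assumed from the published description of the ILSVRC 2012 winning network (Krizhevsky et al.), including its two-GPU split, which is why some layers see only 48 or 192 input channels per kernel. Treat this as a rough check under those assumptions, not an authoritative specification.

```python
def conv_params(out_maps, in_maps, kernel):
    """Weights plus biases of a convolutional layer with square kernels."""
    return out_maps * in_maps * kernel * kernel + out_maps

def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Assumed layer shapes (name, parameter count).
layers = [
    ("conv1", conv_params(96, 3, 11)),
    ("conv2", conv_params(256, 48, 5)),
    ("conv3", conv_params(384, 256, 3)),
    ("conv4", conv_params(384, 192, 3)),
    ("conv5", conv_params(256, 192, 3)),
    ("fc6", fc_params(256 * 6 * 6, 4096)),
    ("fc7", fc_params(4096, 4096)),
    ("fc8", fc_params(4096, 1000)),   # 1000-way soft-max output
]

total = sum(count for _, count in layers)
fc_total = sum(count for name, count in layers if name.startswith("fc"))

for name, count in layers:
    print(f"{name}: {count:,}")
print(f"total: {total:,} parameters, {fc_total / total:.0%} in fully connected layers")
```

Under these assumptions the total comes out at roughly 61 million, and the three fully connected layers hold the large majority of the parameters, even though the five convolutional layers do most of the feature extraction.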

Why should a ConvNet be deep? Rob Fergus, NIPS 2013 30

Why should a ConvNet be deep? 31

Why should a ConvNet be deep? 32

Why should a ConvNet be deep? 33

Why should a ConvNet be deep? 34

Conv Nets: beyond Visual Classification 35

CNN applications. CNN is a big hammer. Plenty of low-hanging fruit. You just need the right nail! 36

Conv NN: Detection (Sermanet, CVPR 2014) 37

Conv NN: Scene parsing (Farabet, PAMI 2013) 38

CNN: indoor semantic labeling, RGBD (Farabet, 2013) 39

Conv NN: Action Detection (Taylor, ECCV 2010) 40

Conv NN: Image Processing (Eigen, ICCV 2010) 41

BACKUP BUZZ 42

A lot of buzz about Deep Learning. July 2012: started DL lab. Nov 2012: big improvement in speech and OCR (speech error rate reduced by 25%; OCR error rate reduced by 30%). 2013: launched 5 DL-based products, including voice search, Photo Wonder, and visual search. 43

A lot of buzz about Deep Learning. Microsoft on Deep Learning for Speech (go to 3:00-5:10). 44

A lot of buzz about Deep Learning. Why Google invests in Deep Learning. 45

A lot of buzz about Deep Learning. "NYU Deep Learning Professor LeCun Will Head Facebook's New Artificial Intelligence Lab", Dec 10, 2013. 46