Deep Learning for Computer Vision

David Willingham, Senior Application Engineer
david.willingham@mathworks.com.au
© 2016 The MathWorks, Inc.

Learning Game
Question: At what age does a person recognise:
Car or plane?
Car or SUV?
Toyota or Mazda?

What dog breeds are these?

Demo: Live Object Recognition with Webcam

Computer Vision Applications
Pedestrian and traffic sign detection
Landmark identification
Scene recognition
Medical diagnosis and drug discovery
Public safety / surveillance
Automotive
Robotics
and many more

Deep Learning investment is rising

What is Deep Learning?
Deep learning performs end-to-end learning by learning features, representations and tasks directly from images, text and sound.
Traditional machine learning: Manual feature extraction -> Classification (machine learning) -> Car / Truck / Bicycle
Deep learning approach: Convolutional Neural Network (CNN) with learned features; end-to-end learning (feature learning + classification) -> Car 95%, Truck 3%, Bicycle 2%

What is Feature Extraction?
Examples: Bag of Words, SURF, HOG
Image pixels -> Feature extraction
Representations are often invariant to changes in scale, rotation and illumination
More compact than storing pixel data
Feature selection is based on the nature of the problem
Features can be sparse or dense
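To make the idea of a hand-crafted feature concrete, here is a minimal sketch of a gradient-orientation histogram, the basic ingredient of HOG-style descriptors. It is not the HOG or SURF implementation referred to on the slide; the function name `orientation_histogram` and the 8-bin default are assumptions made for illustration, and the input is assumed to be a 2-D grayscale image given as a list of rows.

```python
import math

def orientation_histogram(image, bins=8):
    """Return a normalized histogram of gradient orientations.

    `image` is a list of rows of grayscale values (hypothetical input format).
    The result has `bins` entries no matter how large the image is.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the horizontal and vertical gradient.
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            # Unsigned orientation in [0, pi), mapped to a bin index.
            angle = math.atan2(gy, gx) % math.pi
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    # Normalizing removes the dependence on overall brightness scale.
    return [v / total for v in hist] if total else hist

# A 4x4 image with a vertical edge, and the same image twice as bright.
edge = [[0, 0, 1, 1] for _ in range(4)]
brighter = [[2 * v for v in row] for row in edge]
features = orientation_histogram(edge)
features_brighter = orientation_histogram(brighter)
```

The two feature vectors come out identical, which illustrates the slide's points that such representations can be invariant to illumination changes and are more compact than the raw pixels.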

Why is Deep Learning so Popular?
Results: Achieved substantially better results on the ImageNet large scale recognition challenge; 95%+ accuracy on the ImageNet 1000-class challenge
Year: Error rate
Pre-2012 (traditional computer vision and machine learning techniques): > 25%
2012 (deep learning): ~15%
2015 (deep learning): < 5%
Computing power: GPUs and advances in processor technologies have enabled us to train networks on massive sets of data.
Data: Availability of storage and access to large sets of labeled data, e.g. ImageNet, PASCAL VOC, Kaggle

Two Approaches for Deep Learning
1. Train a deep neural network from scratch
Lots of data -> Convolutional Neural Network (CNN), learned features -> Car 95%, Truck 3%, Bicycle 2%
2. Fine-tune a pre-trained model (transfer learning)
Medium amounts of data -> Pre-trained CNN, fine-tune network weights -> New task: Car, Truck

Two Deep Learning Approaches
Approach 1: Train a deep neural network from scratch
Convolutional Neural Network (CNN), learned features -> Car 95%, Truck 3%, Bicycle 2%
Recommended only when:
Training data: 1000s to millions of labeled images
Computation: Compute intensive (requires GPU)
Training time: Days to weeks for real problems
Model accuracy: High (can overfit to small datasets)

Two Deep Learning Approaches
Approach 2: Fine-tune a pre-trained model (transfer learning)
CNN trained on massive sets of data
Learned robust representations of images from a larger data set
Can be fine-tuned for use with new data or a new task with small to medium size datasets
New data -> Pre-trained CNN, fine-tune network weights -> New task: Car, Truck
Recommended when:
Training data: 100s to 1000s of labeled images (small)
Computation: Moderate computation (GPU optional)
Training time: Seconds to minutes
Model accuracy: Good, depends on the pre-trained CNN model
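The sketch below shows the simplest form of transfer learning: the pre-trained part is kept frozen and only a new final classification layer is trained on the new data. This is a narrower variant than the weight fine-tuning the slide describes, and it is not the MATLAB workflow from the talk. `pretrained_features` is a hypothetical stand-in for a pre-trained CNN, and the 2x2 "images", labels, learning rate and epoch count are made up for illustration.

```python
import math

def pretrained_features(image):
    """Hypothetical stand-in for a frozen pre-trained CNN (never updated here)."""
    flat = [p for row in image for p in row]
    mean = sum(flat) / len(flat)
    top = sum(image[0]) / len(image[0])
    bottom = sum(image[-1]) / len(image[-1])
    return [mean, top - bottom]

def train_new_head(features, labels, epochs=500, lr=0.5):
    """Train only a new 2-class logistic-regression layer on frozen features."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(w * v for w, v in zip(weights, x)) + bias
            p = 1.0 / (1.0 + math.exp(-z))
            grad = p - y  # gradient of the log loss with respect to z
            weights = [w - lr * grad * v for w, v in zip(weights, x)]
            bias -= lr * grad
    return weights, bias

def classify(image, weights, bias):
    x = pretrained_features(image)
    return 1 if sum(w * v for w, v in zip(weights, x)) + bias > 0 else 0

# Made-up training set: label 0 = dark images, label 1 = bright images.
train_images = [
    [[0.1, 0.0], [0.2, 0.1]], [[0.2, 0.1], [0.0, 0.1]], [[0.0, 0.2], [0.2, 0.0]],
    [[0.9, 1.0], [0.8, 0.9]], [[0.8, 0.9], [1.0, 0.9]], [[1.0, 0.8], [0.8, 1.0]],
]
train_labels = [0, 0, 0, 1, 1, 1]

train_feats = [pretrained_features(img) for img in train_images]
head_w, head_b = train_new_head(train_feats, train_labels)
```

Because only the small new layer is trained, a handful of labeled examples and very little computation are enough, which matches the "small data, seconds to minutes" profile listed above.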

Convolutional Neural Networks
Train deep neural networks on structured data (e.g. images, signals, text)
Implements feature learning: eliminates the need for hand-crafted features
Trained using GPUs for performance
Feature learning: Input -> Convolution + ReLU -> Pooling -> Convolution + ReLU -> Pooling
Classification: Flatten -> Fully connected -> Softmax -> car / truck / van / bicycle
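The layer sequence above can be sketched as a single forward pass in plain Python. This is an illustration of the data flow only: the kernels, fully connected weights and the 12x12 input are made-up values, nothing is trained, and the "convolution" is the unflipped sliding-window product that deep learning frameworks commonly use.

```python
import math

def conv2d(image, kernel):
    """Valid (no padding) sliding-window product of a 2-D image and kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)] for y in range(out_h)]

def relu(fmap):
    return [[max(0.0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Non-overlapping max pooling; leftover border rows/columns are dropped."""
    return [[max(fmap[y + i][x + j] for i in range(size) for j in range(size))
             for x in range(0, len(fmap[0]) - size + 1, size)]
            for y in range(0, len(fmap) - size + 1, size)]

def flatten(fmap):
    return [v for row in fmap for v in row]

def fully_connected(x, weights, biases):
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical, untrained parameters chosen only so the shapes line up.
CLASSES = ["car", "truck", "van", "bicycle"]
K1 = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
K2 = [[0.5, -0.5], [0.5, 0.5]]
FC_W = [[0.2, -0.1, 0.4, 0.1], [-0.3, 0.2, 0.1, 0.0],
        [0.1, 0.1, -0.2, 0.3], [0.0, -0.4, 0.2, 0.1]]
FC_B = [0.0, 0.1, -0.1, 0.0]

def forward(image):
    x = max_pool(relu(conv2d(image, K1)))   # 12x12 -> 10x10 -> 5x5
    x = max_pool(relu(conv2d(x, K2)))       # 5x5 -> 4x4 -> 2x2
    return softmax(fully_connected(flatten(x), FC_W, FC_B))

image = [[((x + y) % 4) / 3.0 for x in range(12)] for y in range(12)]
probabilities = dict(zip(CLASSES, forward(image)))
```

The softmax output is one probability per class, summing to 1, which is the form of the per-class percentages shown on the earlier slides.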

Challenges using Deep Learning for Computer Vision
Step: Challenge
Importing data: Managing large sets of labeled images
Preprocessing: Resizing, data augmentation
Choosing an architecture: Background in neural networks (deep learning)
Training and classification: Computation intensive task (requires GPU); iterative design

Demo: Classifying the CIFAR-10 dataset
Objective: Train a Convolutional Neural Network to classify the CIFAR-10 dataset
Data:
Input data: Thousands of images of 10 different classes
Response: AIRPLANE, AUTOMOBILE, BIRD, CAT, DEER, DOG, FROG, HORSE, SHIP, TRUCK
Approach:
Import the data
Define an architecture
Train and test the CNN
Data credit: Learning Multiple Layers of Features from Tiny Images, Alex Krizhevsky, 2009. https://www.cs.toronto.edu/~kriz/cifar.html
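As a rough illustration of the "import the data" step, the sketch below reads one batch file and unpacks a row into a 32x32 RGB image. It assumes the "CIFAR-10 python version" archive from the page credited above, where each batch is a pickled dictionary with `b"data"` (rows of 3072 bytes: 1024 red, then 1024 green, then 1024 blue, each in row-major order) and `b"labels"`. The demo in the talk used MATLAB, so this is not the presenter's code, and any file path passed to `load_batch` is up to the reader.

```python
import pickle

# Label indices 0-9 correspond to these class names, in this order.
CLASS_NAMES = ["airplane", "automobile", "bird", "cat", "deer",
               "dog", "frog", "horse", "ship", "truck"]

def load_batch(path):
    """Load one batch file from the python-version archive.

    Unpickling the archive's files requires NumPy to be installed,
    because the pixel data is stored as a NumPy array.
    """
    with open(path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    return batch[b"data"], batch[b"labels"]

def row_to_image(row):
    """Turn one 3072-value row into a 32x32 grid of (r, g, b) tuples."""
    if len(row) != 3072:
        raise ValueError("expected 3072 values (3 channels of 32x32 pixels)")
    red, green, blue = row[:1024], row[1024:2048], row[2048:]
    return [[(red[y * 32 + x], green[y * 32 + x], blue[y * 32 + x])
             for x in range(32)] for y in range(32)]

# Synthetic row for illustration: a uniform colour with one marked pixel.
example_row = [10] * 1024 + [20] * 1024 + [30] * 1024
example_row[33] = 99  # red channel, row 1, column 1
example_image = row_to_image(example_row)
```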

Addressing Challenges in Deep Learning for Computer Vision
Challenge: Solution
Managing large sets of labeled images: imageSet or imageDatastore to handle large sets of images
Resizing, data augmentation: imresize, imcrop, imadjust, imageInputLayer, etc.
Background in neural networks (deep learning): Intuitive interfaces, well-documented architectures and examples
Computation intensive task (requires GPU): Training supported on GPUs; no GPU expertise is required
Automate: Offload computations to a cluster and test multiple architectures
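For readers unfamiliar with what resizing and data augmentation do to an image, here is a minimal plain-Python sketch. The functions are illustrative analogues only, not the MATLAB functions named above: the names `resize_nearest`, `crop`, `flip_horizontal` and `augment` are hypothetical, and nearest-neighbour interpolation is assumed for simplicity.

```python
import random

def resize_nearest(image, new_h, new_w):
    """Resize a 2-D image (list of rows) with nearest-neighbour sampling."""
    h, w = len(image), len(image[0])
    return [[image[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

def crop(image, top, left, height, width):
    """Cut out a height x width window starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]

def flip_horizontal(image):
    return [row[::-1] for row in image]

def augment(image, crop_h, crop_w, rng):
    """Produce one randomly flipped and randomly cropped training variant."""
    if rng.random() < 0.5:
        image = flip_horizontal(image)
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return crop(image, top, left, crop_h, crop_w)

small = [[1, 2], [3, 4]]
resized = resize_nearest(small, 4, 4)
variant = augment(resized, 3, 3, random.Random(0))
```

Resizing brings every image to the fixed input size a network expects, while augmentation produces extra training variants from the same labeled image.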

Demo: Fine-tune a pre-trained model (transfer learning)
Pre-trained CNN (AlexNet, 1000 classes) + new data (Car, SUV) -> New task: 2-class classification

Key Takeaways
Consider deep learning when:
Accuracy of traditional classifiers is not sufficient
    ImageNet classification problem
You have a pre-trained network that can be fine-tuned
There are too many image categories (100s to 1000s or more)
    Face recognition
MATLAB for Deep Learning and Computer Vision

Further Resources
On our File Exchange: http://www.mathworks.com/matlabcentral/fileexchange/38310-deeplearning-toolbox