Semantic Segmentation

Tingwu Wang, Machine Learning Group, University of Toronto

Contents
1. What is semantic segmentation?
   1. What is segmentation in the first place?
   2. What is semantic segmentation?
   3. Why semantic segmentation?
2. Deep Learning in Segmentation
   1. Semantic Segmentation before Deep Learning
   2. Conditional Random Fields
   3. A Brief Review on Detection
   4. Fully Convolutional Network
3. Discussions and Demos
   1. Demos of CNN + CRF
   2. Segmentation from Natural Language Expression
   3. Make CRF Great Again?

What is semantic segmentation
1. What is segmentation in the first place?
   1. Input: images
   2. Output: regions, structures (line segments, curve segments, circles, etc.)
   3. Most of the time, we need to "process the image" using:
      1. filters
      2. gradient information
      3. color information
      4. etc.
That's not very human-like. What if we want to understand the image?
Arbelaez, Pablo, et al. [1]
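To make "gradient information" concrete, here is a minimal sketch of one classical processing step: compute a finite-difference gradient magnitude and threshold it to find region boundaries. The function names, the tiny image, and the threshold are illustrative assumptions, not taken from the slides or from [1].

```python
def gradient_magnitude(image):
    """Central-difference gradient magnitude of a 2-D grayscale image (list of rows)."""
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Clamp indices at the border so every pixel has two neighbours.
            left, right = image[y][max(x - 1, 0)], image[y][min(x + 1, w - 1)]
            up, down = image[max(y - 1, 0)][x], image[min(y + 1, h - 1)][x]
            gx, gy = (right - left) / 2.0, (down - up) / 2.0
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def edge_mask(image, threshold):
    """Mark pixels whose gradient magnitude exceeds the (hypothetical) threshold."""
    return [[1 if m > threshold else 0 for m in row]
            for row in gradient_magnitude(image)]


# Toy image: a dark region on the left, a bright region on the right.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [0, 0, 10, 10],
]
mask = edge_mask(image, threshold=2.0)
for row in mask:
    print(row)
```

The mask marks where the two regions meet, but it says nothing about what the regions are. That gap is what semantic segmentation tries to close.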

What is semantic segmentation
1. What is semantic segmentation?
   1. Idea: recognizing and understanding what is in the image at the pixel level.
      "Two men riding on a bike in front of a building on the road. And there is a car."
      Roozbeh Mottaghi, et al. [2]
   2. A lot more difficult (most traditional methods cannot tell different objects apart). No worries, even the best ML researchers find it very challenging.
   3. Output: regions with different (and a limited number of) classes
      1. COCO detection challenge: 80 classes
      2. PASCAL VOC challenge: 21 classes
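One way to picture the output is as a label map: a grid the same size as the image, holding one class id per pixel. The sketch below assumes a hypothetical three-class subset and a made-up 3 x 4 label map; it only shows the data structure, not how the labels are predicted.

```python
from collections import Counter

# Hypothetical class ids; real benchmarks define their own fixed lists.
CLASS_NAMES = {0: "background", 1: "person", 2: "bicycle"}

# One class id per pixel (made-up values for illustration).
label_map = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [2, 2, 2, 0],
]


def class_pixel_counts(label_map, class_names):
    """Count how many pixels were assigned to each class."""
    counts = Counter(label for row in label_map for label in row)
    return {class_names[c]: n for c, n in sorted(counts.items())}


counts = class_pixel_counts(label_map, CLASS_NAMES)
print(counts)
```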

What is semantic segmentation
1. Why semantic segmentation?
   1. robot vision and understanding
   2. autonomous driving (remember your assignment?)
   3. medical purposes (ISBI Challenge). OAJ del Toro, et al. [5]

Deep Learning in Semantic Segmentation
1. Semantic segmentation before deep learning
   1. relying on conditional random fields (CRFs)
   2. operating on pixels or superpixels
   3. incorporating local evidence in unary potentials
   4. modeling interactions between label assignments
J. Shotton, et al. [3]

Deep Learning in Semantic Segmentation
1. What is a conditional random field?
   1. a probabilistic framework for labeling and segmenting structured data
   2. no need to understand the math; just know the idea: it models the relationships between pixels, e.g.:
      1. nearby pixels are more likely to have the same label
      2. pixels with similar color are more likely to have the same label
      3. the pixels above "chair" pixels are more likely to be "person" than "plane"
      4. results are refined over several iterations
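The ideas above can be sketched on a tiny grid. Each pixel has a unary cost per label (local evidence). Each pair of neighbouring pixels pays a penalty for disagreeing, and the penalty is larger when their colors are similar. The labels are then refined iteratively. This sketch uses iterated conditional modes (ICM), a simple stand-in chosen for brevity, not the inference used in [3], [9] or [10]. All names, the `sigma` value, and the costs are assumptions.

```python
import math


def neighbours(y, x, h, w):
    """4-connected neighbours inside an h x w grid."""
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < h and 0 <= nx < w:
            yield ny, nx


def pairwise_weight(c1, c2, sigma=10.0):
    """Penalty for labelling two neighbours differently; highest for similar colors."""
    return math.exp(-((c1 - c2) ** 2) / (2.0 * sigma ** 2))


def local_cost(labels, unary, image, y, x, label):
    """Unary cost of `label` at (y, x) plus penalties for disagreeing neighbours."""
    h, w = len(image), len(image[0])
    cost = unary[y][x][label]
    for ny, nx in neighbours(y, x, h, w):
        if labels[ny][nx] != label:
            cost += pairwise_weight(image[y][x], image[ny][nx])
    return cost


def icm(unary, image, num_labels, iterations=5):
    """Start from the best unary label per pixel, then greedily refine."""
    h, w = len(image), len(image[0])
    labels = [[min(range(num_labels), key=lambda label: unary[y][x][label])
               for x in range(w)] for y in range(h)]
    for _ in range(iterations):
        changed = False
        for y in range(h):
            for x in range(w):
                best = min(range(num_labels),
                           key=lambda label: local_cost(labels, unary, image, y, x, label))
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:  # converged: no pixel wants to switch
            break
    return labels


# Uniformly colored 3 x 3 image; only the centre pixel weakly prefers label 1.
image = [[100] * 3 for _ in range(3)]
unary = [[[0.2, 0.8] for _ in range(3)] for _ in range(3)]
unary[1][1] = [0.6, 0.4]
refined = icm(unary, image, num_labels=2)
print(refined)
```

In this toy case the centre pixel starts with label 1 but is surrounded by same-colored neighbours labelled 0, so the pairwise term outweighs its weak local evidence and the refinement flips it.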

Deep Learning in Semantic Segmentation
1. A Brief Review on Classification
   0. Again, it is totally fine if you don't understand deep neural networks. Imagine them as a magic black box if you want :)
   1. Deep learning in classification
      1. input: the whole image
      2. output: the probability of each class (person, dog, cat, ...)
      3. not directly applicable to semantic segmentation
A. Krizhevsky, et al. [4]
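A small sketch of what "the probability of each class" means in practice: a classifier typically produces one raw score per class for the whole image, and a softmax turns those scores into probabilities. The scores below are made up, and the black box that would produce them is not shown.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class_names = ["person", "dog", "cat"]
scores = [2.0, 1.0, 0.1]  # hypothetical output of the network for one image
probs = softmax(scores)
prediction = class_names[probs.index(max(probs))]
print(prediction, [round(p, 3) for p in probs])
```

Note that there is one answer for the whole image. Segmentation needs one answer per pixel, which is why this setup does not carry over directly.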

Deep Learning in Semantic Segmentation
1. How to move from classification to semantic segmentation?
   1. Remember that traditionally we use superpixels (polygons)?
Brian Fulkerson, et al. [7]

Deep Learning in Semantic Segmentation
1. Transition to segmentation: early ideas
   1. generate superpixel proposals
   2. do classification on each superpixel
M. Mostajabi, et al. [6]
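The two steps can be sketched as follows: pool a feature over each superpixel, classify that feature, and paint the predicted class back onto every pixel of the superpixel. The mean-intensity feature and the threshold classifier are placeholders invented for this sketch; [6] uses learned features and a learned classifier.

```python
def mean_intensity_per_superpixel(image, superpixel_map):
    """Average pixel intensity inside each superpixel id."""
    sums, counts = {}, {}
    for img_row, sp_row in zip(image, superpixel_map):
        for value, sp in zip(img_row, sp_row):
            sums[sp] = sums.get(sp, 0.0) + value
            counts[sp] = counts.get(sp, 0) + 1
    return {sp: sums[sp] / counts[sp] for sp in sums}


def classify(feature):
    """Placeholder classifier (assumption): bright regions are sky."""
    return "sky" if feature > 128 else "ground"


def segment(image, superpixel_map):
    """Classify each superpixel once, then copy its label to all its pixels."""
    features = mean_intensity_per_superpixel(image, superpixel_map)
    labels = {sp: classify(f) for sp, f in features.items()}
    return [[labels[sp] for sp in row] for row in superpixel_map]


image = [
    [200, 210, 190],
    [50, 60, 205],
    [40, 45, 55],
]
# Superpixel ids, assumed to come from some proposal method.
superpixel_map = [
    [0, 0, 0],
    [1, 1, 0],
    [1, 1, 1],
]
segmentation = segment(image, superpixel_map)
for row in segmentation:
    print(row)
```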

Deep Learning in Semantic Segmentation
1. Fully Convolutional Networks for Semantic Segmentation
   1. forget about pixel/superpixel input
Long, J., et al. [8]
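A rough sketch of the core idea, under simplifying assumptions: the network classifies every location of a coarse feature map at once (here with a 1 x 1 convolution), and the coarse prediction is then upsampled back to image resolution. Nearest-neighbour upsampling of the labels is used only to keep the sketch short and is not the upsampling used in [8]. The feature values and weights are made up.

```python
def conv1x1(feature_map, weights, bias):
    """Apply the same linear classifier at every location.

    feature_map: H x W x C_in, weights: num_classes x C_in, bias: num_classes.
    Returns an H x W x num_classes map of class scores.
    """
    return [[[sum(w * f for w, f in zip(w_row, pixel)) + b
              for w_row, b in zip(weights, bias)]
             for pixel in row]
            for row in feature_map]


def argmax_labels(score_map):
    """Pick the highest-scoring class at each location."""
    return [[scores.index(max(scores)) for scores in row] for row in score_map]


def upsample_nearest(label_map, factor):
    """Repeat each coarse label `factor` times along both axes."""
    out = []
    for row in label_map:
        wide = [label for label in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Hypothetical 1 x 2 feature map with 2 channels per location.
features = [[[1.0, 0.0], [0.0, 1.0]]]
weights = [[1.0, 0.0], [0.0, 1.0]]  # made-up classifier weights
bias = [0.0, 0.0]

coarse = argmax_labels(conv1x1(features, weights, bias))
full = upsample_nearest(coarse, factor=2)
print(coarse)
print(full)
```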

Deep Learning in Semantic Segmentation
1. Fully Convolutional Networks + CRF
   1. the output from the DCNN is blurry and inaccurate
   2. rediscovery of the CRF
L.-C. Chen, et al. [9]

Deep Learning in Semantic Segmentation
1. Conditional Random Fields as Recurrent Neural Networks
   1. end-to-end training: optimize(A) + optimize(B given A) < optimize(A and B together), i.e., training the two parts one after the other does worse than training them jointly
Zheng S., et al. [10]
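A toy illustration of why joint optimization can beat a staged pipeline. Here `a` and `b` stand in for the parameters of two stages. The staged pipeline first fixes `a` using its own proxy objective and then tunes `b`. The joint search tunes both against the final loss. Both loss functions and the candidate grid are invented for this sketch and have nothing to do with the actual losses in [10].

```python
def final_loss(a, b):
    """Hypothetical end-task loss that depends on both stages."""
    return (a + b - 4) ** 2 + (a - b) ** 2


def proxy_loss(a):
    """Hypothetical stand-alone objective used to train stage A in isolation."""
    return (a - 3) ** 2


candidates = [0, 1, 2, 3, 4, 5]

# Staged: optimize A alone, then optimize B with A frozen.
a_staged = min(candidates, key=proxy_loss)
b_staged = min(candidates, key=lambda b: final_loss(a_staged, b))
staged = final_loss(a_staged, b_staged)

# Joint: optimize A and B together against the final loss.
a_joint, b_joint = min(((a, b) for a in candidates for b in candidates),
                       key=lambda ab: final_loss(*ab))
joint = final_loss(a_joint, b_joint)

print("staged:", (a_staged, b_staged), staged)
print("joint: ", (a_joint, b_joint), joint)
```

Over the same search space the joint result can never be worse than the staged one, because every staged solution is also a candidate for the joint search.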

Discussions and Demos
1. Online demos of CRF-as-RNN semantic segmentation
Zheng S., et al. [10]

Discussions and Demos
1. Segmentation from Natural Language Expression
   1. What does it mean? E.g., the phrase "two men sitting on the right bench" requires segmenting only the two people on the right bench, and no one standing or sitting on another bench.
R. Hu, et al. [11]

Discussions and Demos
1. Make Probabilistic Graphical Models Great Again?
   1. What happened to DPM [12]?
      1. mixtures of multiscale deformable part models
      2. later, people found that DPM could be replaced by a CNN layer [13]
      3. no one uses DPM now
   2. What happened to object proposals in detection?
      1. human-designed proposals (selective search, edge boxes, ...) [14]
      2. later, people found that proposal generation could be replaced by a CNN layer [15, 16]
      3. no one (well, maybe still many people) uses human-designed proposals now
   3. What is happening to CRF in semantic segmentation?
      1. pairwise relationships between pixels
      2. later, people found that the CRF could be replaced by a CNN layer
      3. no one uses CRF? Well, we don't know the future.

Discussions and Demos
1. The power of deep learning
   Agent Smith: "If you can't beat us..."
   Agent Smith Clone: "Join us!"

References
[1] Arbelaez, Pablo, et al. "Contour detection and hierarchical image segmentation." IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011.
[2] Roozbeh Mottaghi, Xianjie Chen, Xiaobai Liu, Nam-Gyu Cho, Seong-Whan Lee, Sanja Fidler, Raquel Urtasun, Alan Yuille. CVPR, 2014.
[3] Shotton, Jamie, et al. "Textonboost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context." International Journal of Computer Vision, 2009.
[4] Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems, 2012.
[5] del Toro, Oscar Alfonso Jiménez, Orcun Goksel, Bjoern Menze, Henning Müller, Georg Langs, Marc-André Weber, Ivan Eggel, et al. "VISCERAL VISual Concept Extraction challenge in RAdioLogy: ISBI 2014 challenge organization." Proceedings of the VISCERAL Challenge at ISBI 1194 (2014): 6-15.
[6] Mostajabi, Mohammadreza, Payman Yadollahpour, and Gregory Shakhnarovich. "Feedforward semantic segmentation with zoom-out features." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[7] Fulkerson, Brian, Andrea Vedaldi, and Stefano Soatto. "Class segmentation and object localization with superpixel neighborhoods." In ICCV, 2009.
[8] Long, J., Shelhamer, E. and Darrell, T., 2015. Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.
[9] Chen, L.C., Papandreou, G., Kokkinos, I., Murphy, K. and Yuille, A.L., 2014. Semantic image segmentation with deep convolutional nets and fully connected CRFs. arXiv preprint arXiv:1412.7062.
[10] Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C. and Torr, P.H., 2015. Conditional random fields as recurrent neural networks. In Proceedings of the IEEE International Conference on Computer Vision.
[11] Hu, Ronghang, Marcus Rohrbach, and Trevor Darrell. "Segmentation from Natural Language Expressions." arXiv preprint arXiv:1603.06180 (2016).
[12] Felzenszwalb, Pedro, David McAllester, and Deva Ramanan. "A discriminatively trained, multiscale, deformable part model." In Computer Vision and Pattern Recognition (CVPR), 2008.
[13] Girshick, Ross, et al. "Deformable part models are convolutional neural networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[14] Uijlings, Jasper R.R., et al. "Selective search for object recognition." International Journal of Computer Vision, 2013.
[15] Girshick, Ross. "Fast R-CNN." Proceedings of the IEEE International Conference on Computer Vision, 2015.
[16] Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015).

Q&A
For those who are interested in CRFs and want to know the math, I recommend this tutorial:
[17] Nowozin, Sebastian, and Christoph H. Lampert. "Structured learning and prediction in computer vision." Foundations and Trends in Computer Graphics and Vision 6.3-4 (2011): 185-365.
(It might take a long time to understand. Good luck ;P)