Machine Learning for Computer Vision

Computer Vision Group, Prof. Daniel Cremers. Machine Learning for Computer Vision. PD Dr. Rudolph Triebel

Lecturers. PD Dr. Rudolph Triebel, rudolph.triebel@in.tum.de, room 02.09.059 (main lecture). MSc. Ioannis John Chiotellis, ioannis.chiotellis@gmail.com, room 02.09.059 (assistance and exercises).

Topics Covered. Introduction (today), Regression, Graphical Models (directed and undirected; note: special class on PGMs), Hidden Markov Models, Mixture Models and EM, Neural Networks and Deep Learning, Boosting, Kernel Methods, Gaussian Processes, Sampling Methods, Variational Inference and Expectation Propagation, Clustering.

Literature. Recommended textbook for the lecture: Christopher M. Bishop, Pattern Recognition and Machine Learning. More detailed: Rasmussen/Williams, Gaussian Processes for Machine Learning; Murphy, Machine Learning: A Probabilistic Perspective.

The Tutorials. Bi-weekly tutorial classes. Participation in the tutorial classes and submission of solved assignment sheets is entirely voluntary. Submitted solutions can be corrected and returned. In class, you have the opportunity to present your solution. Assignments will consist of theoretical and practical problems.

The Exam. No qualification is necessary for the final exam. The final exam will be oral. From a given set of known questions, some will be drawn at random; usually, a fixed number of questions appears from each part of the lecture.

Class Webpage: https://vision.in.tum.de/teaching/ss2016/mlcv16. Contains the slides and assignments for download. Also used for communication, in addition to the email list. Some further material will be developed in class.

Computer Vision Group, Prof. Daniel Cremers. 1. Introduction to Learning and Probabilistic Reasoning

Motivation. Suppose a robot stops in front of a door. It has a sensor (e.g. a camera) to measure the state of the door (open or closed). Problem: the sensor may fail.

Motivation. Question: How can we obtain knowledge about the environment from sensors that may return incorrect results? Using probabilities!

Basics of Probability Theory. Definition 1.1: A sample space $\Omega$ of a given experiment is the set of possible outcomes. Examples: a) coin toss experiment: $\Omega = \{\text{heads}, \text{tails}\}$; b) distance measurement: $\Omega = [0, \infty)$. Definition 1.2: A random variable $X$ is a function that assigns a real number to each element of $\Omega$. Example (coin toss experiment): $X(\text{heads}) = 1$, $X(\text{tails}) = 0$. Values of random variables are denoted with lowercase letters, e.g. $X = x$.
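As a minimal illustration of Definitions 1.1 and 1.2, the coin toss example can be written in a few lines of Python; the outcome names and the 0/1 encoding below are assumptions for illustration, not taken from the slide.

```python
# Sample space and random variable for the coin toss experiment (illustrative sketch).
omega = {"heads", "tails"}            # sample space: the set of possible outcomes

def X(outcome: str) -> float:
    """Random variable: assigns a real number to each element of the sample space."""
    return 1.0 if outcome == "heads" else 0.0

print({w: X(w) for w in omega})       # e.g. {'heads': 1.0, 'tails': 0.0}
```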

Discrete and Continuous. If $\Omega$ is countable, then $X$ is a discrete random variable; otherwise it is a continuous random variable. The probability that $X$ takes on a certain value $x$ is a real number between 0 and 1. It holds: $\sum_x p(X = x) = 1$ (discrete case) and $\int p(x)\,dx = 1$ (continuous case).

A Discrete Random Variable. Suppose a robot knows that it is in a room, but it does not know in which room. There are 4 possibilities: kitchen, office, bathroom, living room. Then the random variable Room is discrete, because it can take on one of four values, and its distribution assigns each room a probability, with the four probabilities summing to 1.
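A discrete random variable like Room can be represented as a simple probability table; the numbers below are assumed for illustration (the slide's concrete values are not reproduced here) and only need to be non-negative and sum to 1.

```python
# Discrete random variable "Room" as a probability table (values are assumed).
p_room = {"kitchen": 0.3, "office": 0.4, "bathroom": 0.1, "living room": 0.2}

assert abs(sum(p_room.values()) - 1.0) < 1e-9   # normalization: probabilities sum to 1
print(p_room["office"])                          # probability that the robot is in the office
```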

A Continuous Random Variable. Suppose a robot travels 5 meters forward from a given start point. Its position $X$ is a continuous random variable with a Normal distribution: $p(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right)$. Shorthand: $X \sim \mathcal{N}(\mu, \sigma^2)$.
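A small sketch of evaluating the Normal density in Python; the mean of 5 m follows the travelled distance in the example, while the standard deviation of 0.2 m is an assumed noise level.

```python
import math

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of a Normal distribution N(mu, sigma^2)."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Robot position after driving 5 m; sigma = 0.2 m is an assumed odometry noise level.
print(normal_pdf(5.0, mu=5.0, sigma=0.2))   # density at the commanded position
print(normal_pdf(5.5, mu=5.0, sigma=0.2))   # density half a meter away
```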

Joint and Conditional Probability. The joint probability of two random variables $X$ and $Y$ is the probability that the events $X = x$ and $Y = y$ occur at the same time: $p(X = x \text{ and } Y = y)$. Shorthand: $p(x, y)$. Definition 1.3: The conditional probability of $x$ given $y$ is defined as $p(x \mid y) = \frac{p(x, y)}{p(y)}$.

Independence, Sum and Product Rule. Definition 1.4: Two random variables $X$ and $Y$ are independent iff $p(x, y) = p(x)\,p(y)$. For independent random variables $X$ and $Y$ we have $p(x \mid y) = p(x)$. Furthermore, it holds: Sum Rule: $p(x) = \sum_y p(x, y)$; Product Rule: $p(x, y) = p(x \mid y)\,p(y)$.
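The sum and product rules (and the independence criterion) are easy to check numerically on a small joint table; the joint distribution below is an assumed toy example.

```python
# Toy joint distribution p(x, y) over two binary variables (assumed values).
p_xy = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

# Sum rule: p(x) = sum_y p(x, y), and likewise p(y).
p_x = {x: sum(p for (xx, _), p in p_xy.items() if xx == x) for x in (0, 1)}
p_y = {y: sum(p for (_, yy), p in p_xy.items() if yy == y) for y in (0, 1)}

# Product rule: p(x, y) = p(x | y) p(y), with p(x | y) = p(x, y) / p(y).
p_x_given_y = {(x, y): p_xy[(x, y)] / p_y[y] for (x, y) in p_xy}
assert all(abs(p_x_given_y[(x, y)] * p_y[y] - p_xy[(x, y)]) < 1e-12 for (x, y) in p_xy)

# Independence would require p(x, y) = p(x) p(y) for all x, y (not the case here).
print(all(abs(p_xy[(x, y)] - p_x[x] * p_y[y]) < 1e-12 for (x, y) in p_xy))  # False
```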

Law of Total Probability. Theorem 1.1: For two random variables $X$ and $Y$ it holds: $p(x) = \sum_y p(x \mid y)\,p(y)$ (discrete case) and $p(x) = \int p(x \mid y)\,p(y)\,dy$ (continuous case). The process of obtaining $p(x)$ from $p(x, y)$ by summing or integrating over all values of $y$ is called marginalisation.

Bayes Rule. Theorem 1.2: For two random variables $X$ and $Y$ it holds (Bayes Rule): $p(x \mid y) = \frac{p(y \mid x)\,p(x)}{p(y)}$. Proof: I. $p(x \mid y) = \frac{p(x, y)}{p(y)}$ (definition); II. $p(y \mid x) = \frac{p(x, y)}{p(x)}$ (definition); III. $p(x, y) = p(y \mid x)\,p(x)$ (from II.); substituting III. into I. gives the result.

Bayes Rule: Background Knowledge. For background knowledge $z$ it holds: $p(x \mid y, z) = \frac{p(y \mid x, z)\,p(x \mid z)}{p(y \mid z)}$. Shorthand: $p(x \mid y) = \eta\,p(y \mid x)\,p(x)$, where $\eta = p(y)^{-1}$ is the normalizer.

Computing the Normalizer. Bayes rule: $p(x \mid y) = \eta\,p(y \mid x)\,p(x)$. By the law of total probability, $p(y) = \sum_x p(y \mid x)\,p(x)$, so the normalizer $\eta = \left(\sum_x p(y \mid x)\,p(x)\right)^{-1}$ can be computed without knowing $p(y)$ in advance.

Conditional Independence. Definition 1.5: Two random variables $X$ and $Y$ are conditionally independent given a third random variable $Z$ iff $p(x, y \mid z) = p(x \mid z)\,p(y \mid z)$. This is equivalent to $p(x \mid z) = p(x \mid y, z)$ and $p(y \mid z) = p(y \mid x, z)$.

Expectation and Covariance. Definition 1.6: The expectation of a random variable $X$ is defined as $E[X] = \sum_x x\,p(x)$ (discrete case) or $E[X] = \int x\,p(x)\,dx$ (continuous case). Definition 1.7: The covariance of a random variable $X$ is defined as $\mathrm{Cov}[X] = E[(X - E[X])^2] = E[X^2] - E[X]^2$.
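For a discrete random variable, Definitions 1.6 and 1.7 translate directly into sums; the probability mass function below is an assumed toy example.

```python
# Expectation and covariance of a discrete random variable (toy pmf, assumed values).
p = {0: 0.2, 1: 0.5, 2: 0.3}

E_X  = sum(x * px for x, px in p.items())        # E[X]
E_X2 = sum(x**2 * px for x, px in p.items())     # E[X^2]
cov  = E_X2 - E_X**2                             # Cov[X] = E[X^2] - E[X]^2

cov_alt = sum((x - E_X) ** 2 * px for x, px in p.items())   # E[(X - E[X])^2]
assert abs(cov - cov_alt) < 1e-12
print(E_X, cov)
```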

Mathematical Formulation of Our Example. We define two binary random variables: open, which is true iff the door is open, and $z$, which takes the values "light on" or "light off". Our question is: what is $p(\mathrm{open} \mid z)$?

Causal vs. Diagnostic Reasoning. Searching for $p(\mathrm{open} \mid z)$ is called diagnostic reasoning; searching for $p(z \mid \mathrm{open})$ is called causal reasoning. Often causal knowledge is easier to obtain. Bayes rule allows us to use causal knowledge: $p(\mathrm{open} \mid z) = \frac{p(z \mid \mathrm{open})\,p(\mathrm{open})}{p(z)}$.

Example with Numbers. Assume we have a sensor model $p(z \mid \mathrm{open})$ and $p(z \mid \neg\mathrm{open})$, and a prior probability $p(\mathrm{open})$. Plugging these into Bayes rule shows that the measurement $z$ raises the probability that the door is open.
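Since the slide's concrete numbers are not reproduced above, here is a sketch with assumed values for the sensor model and the prior; the computation is exactly Bayes rule, with the normalizer obtained from the law of total probability.

```python
# Diagnostic reasoning for the door example (all numbers are illustrative assumptions).
p_z_given_open     = 0.6   # sensor model: p(z | open)
p_z_given_not_open = 0.3   # sensor model: p(z | not open)
p_open             = 0.5   # prior probability that the door is open

# Normalizer p(z) via the law of total probability.
p_z = p_z_given_open * p_open + p_z_given_not_open * (1.0 - p_open)

# Bayes rule: p(open | z) = p(z | open) p(open) / p(z)
p_open_given_z = p_z_given_open * p_open / p_z
print(p_open_given_z)   # about 0.67: the measurement z raises the probability
```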

Combining Evidence. Suppose our robot obtains another observation $z_2$, where the index denotes the point in time. Question: how can we integrate this new information? Formally, we want to estimate $p(\mathrm{open} \mid z_1, z_2)$. Using Bayes formula with background knowledge: $p(\mathrm{open} \mid z_1, z_2) = \frac{p(z_2 \mid \mathrm{open}, z_1)\,p(\mathrm{open} \mid z_1)}{p(z_2 \mid z_1)}$.

Markov Assumption. If we know the state of the door at time $t$, then the measurement $z_1$ does not give any further information about $z_2$. Formally: $z_1$ and $z_2$ are conditionally independent given open. This means $p(z_2 \mid \mathrm{open}, z_1) = p(z_2 \mid \mathrm{open})$. This is called the Markov assumption.

Example with Numbers. Assume we have a second sensor with its own model $p(z_2 \mid \mathrm{open})$ and $p(z_2 \mid \neg\mathrm{open})$. Combining it (via the Markov assumption) with $p(\mathrm{open} \mid z_1)$ from above shows that the measurement $z_2$ lowers the probability that the door is open.

General Form. Measurements: $z_1, \ldots, z_n$. Markov assumption: $z_n$ and $z_1, \ldots, z_{n-1}$ are conditionally independent given the state $x$. This yields the recursion $p(x \mid z_1, \ldots, z_n) = \eta_n\,p(z_n \mid x)\,p(x \mid z_1, \ldots, z_{n-1})$.
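The recursion can be implemented as a short measurement-update function applied once per observation; the binary door state and the likelihood values below are assumed for illustration.

```python
# Recursive Bayesian measurement update for a discrete state (sketch, assumed values).
def measurement_update(belief, likelihood):
    """belief: dict state -> p(x | z_1..z_{n-1}); likelihood: dict state -> p(z_n | x)."""
    unnormalized = {x: likelihood[x] * belief[x] for x in belief}
    eta = 1.0 / sum(unnormalized.values())            # normalizer eta_n
    return {x: eta * p for x, p in unnormalized.items()}

belief = {"open": 0.5, "closed": 0.5}                 # prior (assumed)
for lik in [{"open": 0.6, "closed": 0.3},             # p(z_1 | x) (assumed)
            {"open": 0.5, "closed": 0.6}]:            # p(z_2 | x) (assumed)
    belief = measurement_update(belief, lik)
print(belief)   # the first measurement raises p(open), the second lowers it again
```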

Example: Sensing and Acting. Now the robot senses the door state and acts (it opens or closes the door).

State Transitions. The outcome of an action $u$ is modeled as a random variable $X'$, where $X'$ in our case means the state after closing the door. State transition example: if the door is open, the action "close door" succeeds in 90% of all cases.

The Outcome of Actions. For a given action $u$ we want to know the probability $p(x \mid u)$. We do this by integrating over all possible previous states $x'$. If the state space is discrete: $p(x \mid u) = \sum_{x'} p(x \mid u, x')\,p(x')$. If the state space is continuous: $p(x \mid u) = \int p(x \mid u, x')\,p(x')\,dx'$.
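For a discrete state space the action update is a single sum over the previous states; the 90% success rate comes from the state-transition slide, while the remaining probabilities and the prior belief are assumed.

```python
# Action update p(x | u) = sum_x' p(x | u, x') p(x') for the action u = "close door".
p_transition = {                        # p(x_new | u = close, x_old)
    ("closed", "open"):   0.9,          # closing an open door succeeds in 90% of cases
    ("open",   "open"):   0.1,
    ("closed", "closed"): 1.0,          # an already closed door stays closed (assumed)
    ("open",   "closed"): 0.0,
}
prior = {"open": 0.625, "closed": 0.375}   # assumed belief before acting

posterior = {
    x_new: sum(p_transition[(x_new, x_old)] * prior[x_old] for x_old in prior)
    for x_new in ("open", "closed")
}
print(posterior)   # probability mass moves towards "closed"
```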

Back to the Example.

Sensor Update and Action Update. So far, we have learned two different ways to update the system state: the sensor update $p(x \mid z)$ and the action update $p(x \mid u)$. Now we want to combine both. Definition 2.1: Let $u_1, z_1, \ldots, u_t, z_t$ be a sequence of sensor measurements and actions until time $t$. Then the belief of the current state $x_t$ is defined as $\mathrm{Bel}(x_t) = p(x_t \mid u_1, z_1, \ldots, u_t, z_t)$.

Graphical Representation. We can describe the overall process using a Dynamic Bayes Network. This incorporates the following Markov assumptions: $p(z_t \mid x_{0:t}, z_{1:t-1}, u_{1:t}) = p(z_t \mid x_t)$ (measurement) and $p(x_t \mid x_{0:t-1}, z_{1:t-1}, u_{1:t}) = p(x_t \mid x_{t-1}, u_t)$ (state).

The Overall Bayes Filter.
$\mathrm{Bel}(x_t) = p(x_t \mid u_1, z_1, \ldots, u_t, z_t)$
$= \eta\,p(z_t \mid x_t, u_1, z_1, \ldots, u_t)\,p(x_t \mid u_1, z_1, \ldots, u_t)$ (Bayes)
$= \eta\,p(z_t \mid x_t)\,p(x_t \mid u_1, z_1, \ldots, u_t)$ (Markov)
$= \eta\,p(z_t \mid x_t) \int p(x_t \mid u_1, z_1, \ldots, u_t, x_{t-1})\,p(x_{t-1} \mid u_1, z_1, \ldots, u_t)\,dx_{t-1}$ (Tot. prob.)
$= \eta\,p(z_t \mid x_t) \int p(x_t \mid u_t, x_{t-1})\,p(x_{t-1} \mid u_1, z_1, \ldots, u_t)\,dx_{t-1}$ (Markov)
$= \eta\,p(z_t \mid x_t) \int p(x_t \mid u_t, x_{t-1})\,\mathrm{Bel}(x_{t-1})\,dx_{t-1}$ (Markov)

The Bayes Filter Algorithm. Algorithm Bayes_filter(Bel(x), d):
1. if $d$ is a sensor measurement $z$ then
2. $\eta = 0$
3. for all $x$ do
4. $\mathrm{Bel}'(x) = p(z \mid x)\,\mathrm{Bel}(x)$
5. $\eta = \eta + \mathrm{Bel}'(x)$
6. for all $x$ do $\mathrm{Bel}'(x) = \eta^{-1}\,\mathrm{Bel}'(x)$
7. else if $d$ is an action $u$ then
8. for all $x$ do $\mathrm{Bel}'(x) = \sum_{x'} p(x \mid u, x')\,\mathrm{Bel}(x')$
9. return $\mathrm{Bel}'(x)$
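A compact Python sketch of the algorithm for a discrete state space, following the pseudocode above; the belief, sensor model, and transition model are plain dictionaries and functions that the caller supplies.

```python
# Discrete Bayes filter (sketch following the pseudocode above).
def bayes_filter(belief, d, sensor_model=None, transition_model=None):
    """belief: dict state -> Bel(x).
    d: ("z", measurement) for a sensor reading, or ("u", action) for an action.
    sensor_model(z, x) -> p(z | x); transition_model(x, u, x_prev) -> p(x | u, x_prev)."""
    kind, value = d
    if kind == "z":                                    # sensor measurement (lines 1-6)
        new_belief = {x: sensor_model(value, x) * belief[x] for x in belief}
        eta = 1.0 / sum(new_belief.values())
        return {x: eta * b for x, b in new_belief.items()}
    else:                                              # action (lines 7-8)
        return {x: sum(transition_model(x, value, xp) * belief[xp] for xp in belief)
                for x in belief}
```

Calling it alternately with measurements and actions reproduces the sensor and action updates of the previous slides.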

Bayes Filter Variants. The Bayes filter principle is used in Kalman filters, particle filters, hidden Markov models, dynamic Bayesian networks, and partially observable Markov decision processes (POMDPs).

Summary. Probabilistic reasoning is necessary to deal with uncertain information, e.g. sensor measurements. Using Bayes rule, we can do diagnostic reasoning based on causal knowledge. The outcome of a robot's action can be described by a state transition diagram. Probabilistic state estimation can be done recursively using the Bayes filter, which combines a sensor update and a motion update. A graphical representation for the state estimation problem is the Dynamic Bayes Network.

Computer Vision Group, Prof. Daniel Cremers. 2. Introduction to Learning

Motivation. Most objects in the environment can be classified, e.g. with respect to their size, functionality, dynamic properties, etc. Robots need to interact with the objects (move around, manipulate, inspect, etc.) and with humans. For all these tasks it is necessary that the robot knows to which class an object belongs. Which object is a door?

Object Classification Applications. Two major types of applications: Object detection: for a given test data set, find all previously learned objects, e.g. pedestrians. Object recognition: find the particular kind of object as it was learned from the training data, e.g. handwritten character recognition.

Learning. A natural way to do object classification is to first learn the categories of the objects and then to infer a possible class for a new object from the learned data. The area of machine learning deals with the formulation of such problems and investigates methods to do the learning automatically. Nowadays, machine learning algorithms are used more and more in robotics and computer vision.

Mathematical Formulation. Suppose we are given a set of objects $\mathcal{X}$ and a set of object categories (classes) $\mathcal{C}$. In the learning task we search for a mapping $f: \mathcal{X} \to \mathcal{C}$ such that similar elements in $\mathcal{X}$ are mapped to similar elements in $\mathcal{C}$. Examples: object classification (chairs, tables, etc.), optical character recognition, speech recognition. Important problem: the measure of similarity!

Categories of Learning. Unsupervised learning: clustering, density estimation. Supervised learning: learning from a training data set, inference on the test data; it subdivides into Discriminant Function (no probabilistic formulation; learns a function from objects to labels), Discriminative Model (estimates the posterior for each class), and Generative Model (estimates the likelihoods and uses Bayes rule for the posterior). Reinforcement learning: no supervision, but a reward function.

Categories of Learning. Supervised learning is the main topic of this lecture! Methods used in computer vision include: regression, conditional random fields, boosting, support vector machines, Gaussian processes, and hidden Markov models.

Categories of Learning. Most unsupervised learning methods are based on clustering; they will be handled at the end of the semester.

Categories of Learning. Reinforcement learning requires an action; the reward defines the quality of an action. It is mostly used in robotics (e.g. manipulation) and can be dangerous, since actions need to be tried out. It is not handled in this course.

Generative Model: Example. Nearest-neighbor classification. Given: data points. Rule: each new data point is assigned to the class of its nearest neighbor in feature space. Steps: 1. training instances lie in feature space; 2. map the new data point into feature space; 3. compute the distances to the neighbors; 4. assign the label of the nearest training instance.

Generative Model: Example. Nearest-neighbor classification, general case: K nearest neighbors. We consider a sphere around each training instance that has a fixed volume $V$. $K_k$: number of points from class $k$ inside the sphere. $N_k$: number of all points from class $k$.

Generative Model: Example. Nearest-neighbor classification, general case: K nearest neighbors. We consider a sphere around a training / test sample that has a fixed volume $V$ and contains $K$ points in total, $K_k$ of them from class $k$; $N$ denotes the number of all points. With this we can estimate the likelihood $p(x \mid C_k) = \frac{K_k}{N_k V}$, and likewise the unconditional probability $p(x) = \frac{K}{N V}$ and the prior $p(C_k) = \frac{N_k}{N}$. Using Bayes rule, the posterior is $p(C_k \mid x) = \frac{p(x \mid C_k)\,p(C_k)}{p(x)} = \frac{K_k}{K}$.

Generative Model: Example. Nearest-neighbor classification, general case: K nearest neighbors. To classify the new data point we compute the posterior $p(C_k \mid x)$ for each class $k = 1, 2, \ldots$ and assign the label that maximizes the posterior (MAP).
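A sketch of the resulting classifier: among the K nearest training points, counting how many belong to each class gives the posterior $K_k / K$, and the MAP decision picks the class with the largest count. The toy 2D training data below are assumed.

```python
# K-nearest-neighbor classification (sketch; training data are assumed toy values).
from collections import Counter
import math

train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((4.0, 4.2), "B"), ((3.8, 4.0), "B")]

def knn_classify(x, train, K=3):
    """MAP classification: the posterior p(C_k | x) is approximated by K_k / K."""
    neighbors = sorted(train, key=lambda t: math.dist(t[0], x))[:K]
    counts = Counter(label for _, label in neighbors)   # K_k for each class k
    return counts.most_common(1)[0][0]

print(knn_classify((1.1, 0.9), train))   # -> "A" (2 of the 3 nearest neighbors are class A)
```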

Summary. Learning is usually a two-step process consisting of a training and an inference step. Learning is useful to extract semantic information, e.g. about the objects in an environment. There are three main categories of learning: unsupervised, supervised, and reinforcement learning. Supervised learning can be split into discriminant function, discriminative model, and generative model learning. An example of a generative model is nearest-neighbor classification.