Introduction to Pattern Recognition

Selim Aksoy
Department of Computer Engineering, Bilkent University
saksoy@cs.bilkent.edu.tr
CS 551, Fall 2017
© 2017, Selim Aksoy (Bilkent University)

Human Perception
Humans have developed highly sophisticated skills for sensing their environment and taking actions according to what they observe, e.g., recognizing a face, understanding spoken words, reading handwriting, and distinguishing fresh food from its smell. We would like to give similar capabilities to machines.

What is Pattern Recognition?
A pattern is an entity, vaguely defined, that could be given a name, e.g., a fingerprint image, handwritten word, human face, speech signal, or DNA sequence.
Pattern recognition is the study of how machines can observe the environment, learn to distinguish patterns of interest, and make sound and reasonable decisions about the categories of the patterns.

Human and Machine Perception
We are often influenced by the knowledge of how patterns are modeled and recognized in nature when we develop pattern recognition algorithms. Research on machine perception also helps us gain a deeper understanding of, and appreciation for, pattern recognition systems in nature. Yet we also apply many techniques that are purely numerical and have no counterpart in natural systems.

Pattern Recognition Applications
Figure 1: English handwriting recognition.
Figure 2: Chinese handwriting recognition.
Figure 3: Fingerprint recognition.
Figure 4: Biometric recognition.
Figure 5: Autonomous navigation.
Figure 6: Cancer detection and grading using microscopic tissue data.
Figure 7: Cancer detection and grading using microscopic tissue data. (left) A whole slide image with 75568 × 74896 pixels. (right) A region of interest with 7440 × 8260 pixels.
Figure 8: Land cover classification using satellite data.
Figure 9: Building and building group recognition using satellite data.
Figure 10: License plate recognition: US license plates.
Figure 11: Clustering of microarray data.

An Example
Problem: Sorting incoming fish on a conveyor belt according to species. Assume that we have only two kinds of fish: sea bass and salmon.
Figure 12: Picture taken from a camera.

An Example: Decision Process
What kind of information can distinguish one species from the other? Length, width, weight, number and shape of fins, tail shape, etc.
What can cause problems during sensing? Lighting conditions, position of the fish on the conveyor belt, camera noise, etc.
What are the steps in the process? Capture image, isolate fish, take measurements, make decision.

An Example: Selecting Features
Assume a fisherman told us that a sea bass is generally longer than a salmon. We can use length as a feature and decide between sea bass and salmon according to a threshold on length. How can we choose this threshold?

An Example: Selecting Features
Figure 13: Histograms of the length feature for two types of fish in training samples. How can we choose the threshold l to make a reliable decision?
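One simple way to pick such a threshold is to try every midpoint between neighbouring training values and keep the one with the fewest training errors. The sketch below assumes this exhaustive search as the selection rule (the slides do not prescribe one); the function name `best_threshold` and the length values are hypothetical and chosen only for illustration.

```python
def best_threshold(values, labels, positive="sea bass"):
    """Return (threshold, errors) with the fewest training errors for the
    rule: predict `positive` if value > threshold, otherwise the other class."""
    xs = sorted(set(values))
    # Candidates: one below all samples, midpoints between neighbours, one above all.
    candidates = [xs[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    candidates.append(xs[-1] + 1.0)

    best = None
    for t in candidates:
        # Count samples whose predicted class differs from the true class.
        errors = sum((v > t) != (lab == positive) for v, lab in zip(values, labels))
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


# Hypothetical training lengths (arbitrary units), not data from the slides.
lengths = [8.0, 9.5, 10.0, 11.0, 12.5, 10.5, 11.5, 13.0, 14.0, 15.5]
species = ["salmon"] * 5 + ["sea bass"] * 5

threshold, errors = best_threshold(lengths, species)
print(threshold, errors)
```

On this toy data no threshold gets below two training errors, because the two length distributions overlap.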

An Example: Selecting Features
Even though sea bass are longer than salmon on average, there are many examples of fish where this observation does not hold. Try another feature: average lightness of the fish scales.

An Example: Selecting Features
Figure 14: Histograms of the lightness feature for two types of fish in training samples. It looks easier to choose the threshold x, but we still cannot make a perfect decision.

An Example: Cost of Error
We should also consider the costs of the different errors we make in our decisions. For example, suppose the fish packing company knows that:
Customers who buy salmon will object vigorously if they see sea bass in their cans.
Customers who buy sea bass will not be unhappy if they occasionally see some expensive salmon in their cans.
How does this knowledge affect our decision?
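One way to see the effect is to minimize total cost instead of the raw error count. The sketch below assumes, purely for illustration, that sea bass tend to have higher lightness values, that the rule is "sea bass if lightness > threshold", and that a sea bass in a salmon can costs five times as much as the reverse. The function name, the cost values, and the data are all hypothetical.

```python
def min_cost_threshold(values, labels, cost_bass_as_salmon, cost_salmon_as_bass):
    """Return (threshold, total_cost) minimizing the summed misclassification cost
    for the rule: predict 'sea bass' if value > threshold, else 'salmon'."""
    xs = sorted(set(values))
    candidates = [xs[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    candidates.append(xs[-1] + 1.0)

    def total_cost(t):
        cost = 0.0
        for v, lab in zip(values, labels):
            predicted = "sea bass" if v > t else "salmon"
            if lab == "sea bass" and predicted == "salmon":
                cost += cost_bass_as_salmon   # sea bass ends up in a salmon can
            elif lab == "salmon" and predicted == "sea bass":
                cost += cost_salmon_as_bass   # salmon ends up in a sea bass can
        return cost

    best = min(candidates, key=total_cost)
    return best, total_cost(best)


# Hypothetical lightness values, not data from the slides.
lightness = [2.0, 3.0, 4.0, 5.0, 5.5, 4.5, 6.0, 7.0, 8.0, 9.0]
species = ["salmon"] * 5 + ["sea bass"] * 5

equal = min_cost_threshold(lightness, species, 1.0, 1.0)
skewed = min_cost_threshold(lightness, species, 5.0, 1.0)
print(equal, skewed)
```

With equal costs the best threshold on this toy data is 5.75. Once sea bass in salmon cans is penalized more heavily, the threshold drops to 4.25, so that fewer sea bass are labeled as salmon at the price of more salmon being labeled as sea bass.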

An Example: Multiple Features
Assume we also observed that sea bass are typically wider than salmon. We can use two features in our decision: lightness $x_1$ and width $x_2$. Each fish image is now represented as a point (feature vector) $\mathbf{x} = (x_1, x_2)^T$ in a two-dimensional feature space.

An Example: Multiple Features
Figure 15: Scatter plot of lightness and width features for training samples. We can draw a decision boundary to divide the feature space into two regions. Does it look better than using only lightness?
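As one possible illustration of a decision boundary in the two-dimensional feature space, the sketch below uses a nearest-mean classifier: each class is summarized by the mean of its training points, and a new point is assigned to the closer mean. This yields a straight-line boundary (the perpendicular bisector of the two means). It is a stand-in technique chosen for brevity, not necessarily the rule used to draw Figure 15, and the (lightness, width) values are hypothetical.

```python
def class_means(samples):
    """samples maps a class label to a list of (lightness, width) points."""
    means = {}
    for label, points in samples.items():
        n = len(points)
        means[label] = (sum(p[0] for p in points) / n,
                        sum(p[1] for p in points) / n)
    return means


def classify(x, means):
    """Assign x to the class whose mean is closest (squared Euclidean distance)."""
    def sq_dist(m):
        return (x[0] - m[0]) ** 2 + (x[1] - m[1]) ** 2
    return min(means, key=lambda label: sq_dist(means[label]))


# Hypothetical training points (lightness, width), not data from the slides.
training = {
    "salmon": [(2.0, 10.0), (3.0, 11.0), (4.0, 9.0), (3.0, 10.0)],
    "sea bass": [(6.0, 14.0), (7.0, 15.0), (8.0, 13.0), (7.0, 14.0)],
}

means = class_means(training)
print(classify((3.5, 10.5), means))
print(classify((6.5, 13.0), means))
```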

An Example: Multiple Features
Does adding more features always improve the results?
Avoid unreliable features.
Be careful about correlations with existing features.
Be careful about measurement costs.
Be careful about noise in the measurements.
Is there some curse for working in very high dimensions?
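A rough way to see why high dimensions can hurt: if each feature is quantized into a fixed number of bins, the number of histogram cells grows exponentially with the number of features, so a fixed training set covers the space ever more sparsely. The short sketch below computes the average number of samples per cell; the sample count and bin count are arbitrary assumptions.

```python
def samples_per_cell(num_samples, bins_per_feature, num_features):
    """Average number of training samples per cell of a d-dimensional histogram."""
    return num_samples / bins_per_feature ** num_features


# Assumed setting: 1000 training samples, 10 bins per feature.
for d in (1, 2, 3, 5, 10):
    print(d, samples_per_cell(1000, 10, d))
```

With three features there is on average one sample per cell; with five features, most cells are empty.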

An Example: Decision Boundaries
Can we do better with another decision rule? More complex models result in more complex boundaries.
Figure 16: We may distinguish training samples perfectly but how can we predict how well we can generalize to unknown samples?

An Example: Decision Boundaries
How can we manage the tradeoff between the complexity of decision rules and their performance on unknown samples?
Figure 17: Different criteria lead to different decision boundaries.

More on Complexity
Figure 18: Regression example: plot of 10 sample points for the input variable x along with the corresponding target variable t. Green curve is the true function that generated the data.

Figure 19: Polynomial curve fitting: plots of polynomials having various orders, shown as red curves, fitted to the set of 10 sample points. Panels: (a) 0th order polynomial, (b) 1st order polynomial, (c) 3rd order polynomial, (d) 9th order polynomial.

Figure 20: Polynomial curve fitting: plots of 9th order polynomials fitted to 15 and 100 sample points. Panels: (a) 15 sample points, (b) 100 sample points.
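The polynomial fitting experiment can be reproduced approximately with NumPy. The sketch below is a minimal version under stated assumptions: the true function is taken to be sin(2πx) and the noise is Gaussian with standard deviation 0.3, neither of which is given in the transcript. It fits polynomials of order 0, 1, 3, and 9 to 10 noisy points and reports the root-mean-square error on the training points and on a dense noise-free test grid.

```python
import numpy as np

rng = np.random.default_rng(0)


def true_function(x):
    # Assumed generating function; the slides only say a "true function" exists.
    return np.sin(2 * np.pi * x)


# 10 noisy training points and a dense noise-free test grid.
x_train = np.linspace(0.0, 1.0, 10)
t_train = true_function(x_train) + rng.normal(scale=0.3, size=x_train.shape)
x_test = np.linspace(0.0, 1.0, 200)
t_test = true_function(x_test)


def rms_error(coeffs, x, t):
    """Root-mean-square error of the polynomial with the given coefficients."""
    return float(np.sqrt(np.mean((np.polyval(coeffs, x) - t) ** 2)))


train_rms, test_rms = {}, {}
for order in (0, 1, 3, 9):
    coeffs = np.polyfit(x_train, t_train, order)  # least-squares fit
    train_rms[order] = rms_error(coeffs, x_train, t_train)
    test_rms[order] = rms_error(coeffs, x_test, t_test)
    print(order, train_rms[order], test_rms[order])
```

The 9th order polynomial has as many coefficients as there are training points, so it passes through all of them and its training error is essentially zero. That says little about its error on new points, which is the concern Figures 19 and 20 raise; refitting with more training points typically tames the oscillation.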

Pattern Recognition Systems
Figure 21: Object/process diagram of a pattern recognition system.
Recognition path: Physical environment -> Data acquisition/sensing -> Pre-processing -> Feature extraction -> Features -> Classification -> Post-processing -> Decision.
Training path: Training data -> Pre-processing -> Feature extraction/selection -> Features -> Model learning/estimation -> Model, which is used by Classification.

Pattern Recognition Systems
Data acquisition and sensing: Measurements of physical variables. Important issues: bandwidth, resolution, sensitivity, distortion, SNR, latency, etc.
Pre-processing: Removal of noise in data. Isolation of patterns of interest from the background.
Feature extraction: Finding a new representation in terms of features.

Pattern Recognition Systems
Model learning and estimation: Learning a mapping between features and pattern groups and categories.
Classification: Using features and learned models to assign a pattern to a category.
Post-processing: Evaluation of confidence in decisions. Exploitation of context to improve performance. Combination of experts.
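The stages above can be sketched as a chain of small functions. Everything in this example is an illustrative assumption rather than part of the course material: the input is a short list of lightness readings, pre-processing is a sliding median filter, the single feature is the mean reading, the model is a fixed threshold, and post-processing rejects decisions whose margin is too small.

```python
from statistics import mean, median


def preprocess(signal, width=3):
    """Pre-processing: suppress isolated noise spikes with a sliding median."""
    half = width // 2
    return [median(signal[max(0, i - half): i + half + 1])
            for i in range(len(signal))]


def extract_features(signal):
    """Feature extraction: represent the whole signal by its mean value."""
    return mean(signal)


def classify(feature, threshold):
    """Classification: threshold rule; also return the margin to the threshold."""
    label = "sea bass" if feature > threshold else "salmon"
    return label, abs(feature - threshold)


def postprocess(label, margin, min_margin=0.5):
    """Post-processing: reject low-confidence decisions."""
    return label if margin >= min_margin else "reject"


def recognize(signal, threshold):
    clean = preprocess(signal)
    feature = extract_features(clean)
    label, margin = classify(feature, threshold)
    return postprocess(label, margin)


print(recognize([1, 1, 9, 1, 1, 1], threshold=3.0))   # spike is filtered out
print(recognize([5, 5, 5, 5], threshold=3.0))
print(recognize([3.2, 3.2, 3.2, 3.2], threshold=3.0))
```

Keeping each stage as a separate function mirrors the diagram: any stage can be swapped (a different filter, more features, a learned model) without touching the others.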

The Design Cycle
Figure 22: The design cycle: Collect data -> Select features -> Select model -> Train classifier -> Evaluate classifier.
Data collection: Collecting training and testing data. How can we know when we have an adequately large and representative set of samples?

The Design Cycle
Feature selection:
Domain dependence and prior information.
Computational cost and feasibility.
Discriminative features: similar values for similar patterns, different values for different patterns.
Invariant features with respect to translation, rotation and scale.
Robust features with respect to occlusion, distortion, deformation, and variations in environment.

The Design Cycle
Model selection:
Domain dependence and prior information.
Definition of design criteria.
Parametric vs. non-parametric models.
Handling of missing features.
Computational complexity.
Types of models: templates, decision-theoretic or statistical, syntactic or structural, neural, and hybrid.
How can we know how close we are to the true model underlying the patterns?

The Design Cycle
Training: How can we learn the rule from data?
Supervised learning: a teacher provides a category label or cost for each pattern in the training set.
Unsupervised learning: the system forms clusters or natural groupings of the input patterns.
Reinforcement learning: no desired category is given, but the teacher provides feedback to the system, such as whether the decision is right or wrong.
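To make the unsupervised case concrete, the sketch below groups unlabeled one-dimensional values into two clusters with a basic k-means loop (k = 2). The choice of k-means, the initialization at the minimum and maximum values, and the data are assumptions made for illustration; no labels are used at any point.

```python
def two_means_1d(values, iterations=20):
    """Cluster 1-D values into two groups by alternating assignment and update."""
    centers = [min(values), max(values)]  # simple deterministic initialization
    groups = [[], []]
    for _ in range(iterations):
        # Assignment step: each value joins the nearer center.
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group.
        new_centers = [sum(g) / len(g) if g else c
                       for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, groups


# Hypothetical unlabeled measurements.
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, groups = two_means_1d(values)
print(centers, groups)
```

A supervised method would instead receive a label with each value and fit the rule to those labels; here the two groups emerge only from the spacing of the values.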

The Design Cycle
Evaluation: How can we estimate the performance with training samples? How can we predict the performance with future data? Problems of overfitting and generalization.
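The gap between performance on training samples and on future data can be illustrated with a held-out test set. The sketch below uses a 1-nearest-neighbor rule on a single feature as an assumed classifier: it reproduces every training label by construction, so its training error is zero, while its error on separate test samples is not. The data points are hypothetical and deliberately chosen so the classes overlap.

```python
def nearest_neighbor_predict(x, train):
    """Return the label of the training sample whose feature value is closest to x."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def error_rate(samples, train):
    """Fraction of samples whose predicted label differs from the true label."""
    wrong = sum(nearest_neighbor_predict(x, train) != y for x, y in samples)
    return wrong / len(samples)


# Hypothetical (lightness, species) pairs with overlapping classes.
train = [(2.0, "salmon"), (3.0, "salmon"), (5.5, "salmon"),
         (4.5, "sea bass"), (6.0, "sea bass"), (7.0, "sea bass")]
test = [(2.5, "salmon"), (4.4, "salmon"),
        (5.4, "sea bass"), (6.5, "sea bass")]

train_error = error_rate(train, train)  # evaluated on the data used for training
test_error = error_rate(test, train)    # evaluated on held-out data
print(train_error, test_error)
```

On this toy data the training error is 0.0 and the test error is 0.5, which is why an estimate computed on the training samples alone tends to be optimistic.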

Summary
Pattern recognition techniques find applications in many areas: machine learning, statistics, mathematics, computer science, biology, etc.
There are many sub-problems in the design process. Many of these problems can indeed be solved.
More complex learning, searching and optimization algorithms are developed with advances in computer technology.
There remain many fascinating unsolved problems.