Lectures and exercises Introduction to Pattern Recognition: Lecture 1. Goal and contents. Generalities


Lectures and exercises

8001652 Introduction to Pattern Recognition: Lecture 1
Jussi Tohka (jussi.tohka@tut.fi), Institute of Signal Processing, Tampere University of Technology

Lecturers: Jussi Tohka and Ulla Ruotsalainen
E-mail: jussi.tohka@tut.fi, ulla.ruotsalainen@tut.fi
Offices: TE309 (Jussi) and TE311 (Ulla)
Assistants: Anu Kivimäki and Edisson Alban
Lectures: Mondays 14-16 at TB104
Exercises: (English) (Finnish)
First exam: December
Homepage: http://www.cs.tut.fi/kurssit/8001652/

Generalities

Lectures 26 h (2 h / week), exercises 12 h (1 h / week). The lecture schedule will be available on the course web page.
REQUIREMENTS: Final examination and active participation in the exercises.
LITERATURE: Duda, Hart, Stork: Pattern Classification, 2nd edition, Wiley, 2001.
PREREQUISITES: Introduction to Signal Processing 2 and some basic mathematics (matrix calculus). This course is a required prerequisite for advanced courses in pattern and speech recognition.

Goal and contents

The goal is to introduce the basic methods and principles of pattern recognition. The contents:
Recap of multivariate probability and statistics
Bayesian decision theory
Parameter estimation from training data
Linear classifiers
Unsupervised classification and clustering

Exercise bonus

At least 10 % of the exercises must be done in order to pass the course. Thereafter, every additional 20 % earns an extra point in the exam:
10 %: pass the course
30 %: 1 point to the exam
50 %: 2 points to the exam
70 %: 3 points to the exam
90 %: 4 points to the exam

Course outline

This lecture (Jussi)
Basics of probability and statistics (Jussi)
Basics of probability and statistics, continued (Jussi)
Bayesian decision theory (Jussi)
Bayesian decision theory (Jussi)
Parameter estimation, maximum-likelihood estimator (Ulla)
Parzen windows (Ulla)
k-nearest neighbours rule (Ulla)
Linear discriminant functions (Ulla)
Linear discriminant functions (Ulla)
Unsupervised classification (Ulla)
Unsupervised classification (Ulla)
No-free-lunch theorem and recap (Ulla)

Briefly and broadly speaking, pattern recognition is the task of finding conceptual and relevant information in raw data. What counts as relevant information depends on the application, as does what counts as raw data. In summary, pattern recognition can mean a number of things, and finding a universal definition is hard if not impossible. However, good opinions about the meaning of pattern recognition can be given.

www.cs.uvm.edu/~snapp/teaching/cs295pr/whatispr.html:

It is generally easy for a person to differentiate the sound of a human voice from that of a violin; a handwritten numeral "3" from an "8"; and the aroma of a rose from that of an onion. However, it is difficult for a programmable computer to solve these kinds of perceptual problems. These problems are difficult because each pattern usually contains a large amount of information, and the recognition problems typically have an inconspicuous, high-dimensional structure.

Pattern recognition is the science of making inferences from perceptual data, using tools from statistics, probability, computational geometry, machine learning, signal processing, and algorithm design. Thus, it is of central importance to artificial intelligence and computer vision, and has far-reaching applications in engineering, science, medicine, and business. In particular, advances made during the last half century now allow computers to interact more effectively with humans and the natural world (e.g., speech recognition software). However, the most important problems in pattern recognition are yet to be solved.

www.um.ac.ir/~patternrec/about.htm:

Pattern recognition is the scientific discipline whose goal is the classification of objects into a number of categories or classes. Depending on the application, these objects can be images, signal waveforms, or any type of measurements that need to be classified. We will refer to these objects using the generic term "patterns". Pattern recognition has a long history, but before the 1960s it was mostly the output of theoretical research in the area of statistics. As with everything else, the advent of computers increased the demand for practical applications of pattern recognition, which in turn set new demands for further theoretical developments. As our society evolves from the industrial to the postindustrial phase, automation in industrial production and the need for information handling and retrieval are becoming increasingly important. This trend has pushed pattern recognition to the leading edge of today's engineering applications and research. Pattern recognition is an integral part of most machine intelligence systems built for decision making.

From prlab.ee.memphis.edu/frigui/elec7901/intro/intro.html:

Pattern: a description of an object. Recognition: classifying an object to a pattern class. Pattern recognition is the science that concerns the description or classification (recognition) of measurements. PR techniques are an important component of intelligent systems and are used for:
decision making,
object and pattern classification,
data preprocessing.

The course book says: "The ease with which we recognize a face, understand spoken words, read handwritten characters, identify our car keys in our pocket by feel, and decide whether an apple is ripe by its smell belies the astoundingly complex processes that underlie these acts of pattern recognition. Pattern recognition - the act of taking in raw data and making an action based on the category of the pattern - has been crucial for our survival, and over the past tens of millions of years we have evolved highly sophisticated neural and cognitive systems for such tasks."

Machine perception

The task: seek to design and build machines that can recognize patterns. Applications: biomedical (neuroscience, ECG monitoring, drug development, DNA sequences), speech recognition, fingerprint identification, optical character recognition, industrial inspection.

Examples

Differentiating between salmon and sea bass
Animal footprints
Handwritten numeral recognition
Spam identification

Pattern recognition systems

Segmentation: the partition of the whole data into single objects. Sometimes this is obvious (a mailbox vs. a single e-mail message); sometimes it is challenging (images of animal footprints vs. an image of a single animal footprint; in speech recognition, segmentation is a difficult problem).

Feature extraction

The traditional goal of the feature extractor is to characterize an object by making numerical measurements. In the animal-footprints example, the features were the squareness and solidness of the footprint shape. Good features are those whose values are similar for objects belonging to the same category and distinct for objects in different categories. Feature extraction is very problem dependent: good features for sorting fish are of little use for recognizing fingerprints. Usually one feature is not enough to differentiate between objects from different categories. Multiple features representing the same object are organized into feature vectors, and the set of all possible feature vectors is called the feature space. Invariant features are those that remain the same if something (irrelevant) is done to the sensed input.
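The idea of packing several measurements into a feature vector can be sketched in Python. The "squareness" and "solidness" definitions below are simple, hypothetical stand-ins for the footprint features mentioned above, chosen only to make the idea concrete:

```python
# Toy feature extraction: map a binary object mask to a 2-D feature vector.
# These particular definitions of squareness and solidness are illustrative
# assumptions, not the exact features used in the lecture's example.

def extract_features(mask):
    """Map a binary mask (list of rows of 0/1) to a (squareness, solidness) pair."""
    rows = [r for r in range(len(mask)) if any(mask[r])]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    area = sum(sum(row) for row in mask)
    squareness = min(height, width) / max(height, width)  # 1.0 for a square bounding box
    solidness = area / (height * width)                   # fraction of the bounding box filled
    return (squareness, solidness)

# A filled 3x3 square: its bounding box is square and completely filled.
square = [[0, 0, 0, 0],
          [0, 1, 1, 1],
          [0, 1, 1, 1],
          [0, 1, 1, 1]]
print(extract_features(square))  # -> (1.0, 1.0)
```

A less square or less solid shape would land elsewhere in this 2-D feature space, which is what makes the two measurements useful together as a feature vector.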

Classification

The task of the classifier component is to use the feature vector provided by the feature extractor to assign the object to a category. Classification is the main topic of this course. The abstraction provided by the feature vector representation of the input data enables the development of a largely domain-independent theory of classification. Essentially, the classifier divides the feature space into regions corresponding to the different categories.

The degree of difficulty of a classification problem depends on the variability of the feature values for objects in the same category, relative to the variation of feature values between categories. Variability is either natural or due to noise, and it can be described through statistics, leading to statistical pattern recognition. Questions: How do we design a classifier that can cope with the variability in feature values? What is the best possible performance?

Post processing

The post-processor uses the output of the classifier to decide on the recommended action. For example, in spam identification the possible actions are 1) delete the mail as spam, or 2) keep the mail in the inbox as non-spam. If a single decision on the object category corresponds to a single action, then actions can be selected by minimizing the error rate. The error rate is the ratio of wrong classifications to total classifications, that is, the probability of a wrong classification.
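A minimal sketch of these ideas in Python: a nearest-mean classifier (one of the simplest ways to partition a feature space into regions, one region per class mean), followed by the error rate computed on a small labelled set. The feature values and the salmon/sea-bass data below are invented purely for illustration:

```python
import math

# Toy training data: 2-D feature vectors with known categories.
# All numbers here are made up for the sketch.
training = {
    "salmon":   [(2.0, 1.0), (2.2, 1.1), (1.9, 0.9)],
    "sea_bass": [(4.0, 3.0), (4.1, 2.8), (3.8, 3.2)],
}

# Nearest-mean classifier: summarize each class by its mean feature vector;
# the feature space is then split into regions by distance to the means.
means = {
    label: tuple(sum(v[i] for v in vecs) / len(vecs) for i in range(2))
    for label, vecs in training.items()
}

def classify(x):
    return min(means, key=lambda label: math.dist(x, means[label]))

# Error rate on a labelled test set: wrong classifications / total classifications.
test_set = [((2.1, 1.0), "salmon"), ((3.9, 3.1), "sea_bass"), ((3.5, 2.5), "salmon")]
errors = sum(classify(x) != label for x, label in test_set)
print(errors / len(test_set))  # -> 0.3333333333333333
```

The third test pattern sits close to the sea-bass mean despite its salmon label, so it is misclassified; this is exactly the within-category variability, relative to between-category separation, that makes a classification problem hard.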

Post processing

However, sometimes different actions have different costs. For example, the spam detector should not delete important e-mails under any circumstances, whereas letting a spam message stay in the inbox is not so bad. These costs can be taken into account when designing an optimal classification system. The post-processor can also exploit context or combine the results of several classifiers.

The design cycle

Learning and adaptation

In the broadest sense, any method that incorporates information from training samples in the design of a classifier employs learning. Because of the complexity of classification problems, we cannot guess the best classification decision ahead of time; we need to learn it. Creating classifiers then involves positing some general form of model, or form of the classifier, and using examples to learn the complete classifier.

Unsupervised and supervised learning

In supervised learning, a teacher provides a category label for each pattern in a training set. These are then used to train a classifier, which can thereafter solve similar classification problems by itself. In unsupervised learning, or clustering, there is no explicit teacher or training data; the system forms natural clusters of the input patterns and classifies them based on the clusters they belong to. In reinforcement learning, a teacher only tells the classifier whether it is right or wrong when it suggests a category for a pattern; the teacher does not say what the correct category is.
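The unsupervised case can be sketched in Python with a tiny k-means-style loop: the same data a supervised learner would need labels for is grouped into natural clusters using no labels at all. The 1-D data values are invented for illustration:

```python
# Unsupervised learning sketch: cluster 1-D patterns into k = 2 natural groups
# with a few iterations of the k-means update. No category labels are used.
data = [1.0, 1.5, 0.5, 5.0, 5.5, 4.5]
centers = [data[0], data[3]]  # simple initialization from the data itself

for _ in range(10):
    # Assignment step: each pattern joins the nearest cluster center.
    clusters = [[], []]
    for x in data:
        nearest = min(range(2), key=lambda j: abs(x - centers[j]))
        clusters[nearest].append(x)
    # Update step: each center moves to the mean of its cluster.
    centers = [sum(c) / len(c) for c in clusters]

print(sorted(centers))  # -> [1.0, 5.0]
```

A supervised learner would instead be handed a (pattern, label) pair for every training example; here the two groups emerge purely from the structure of the data.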

Summary

Pattern recognition systems aim to decide on an action based on the data provided. Classification is an important step in such systems: a classifier uses feature values to assign an object to a category. Feature values contain variability, which needs to be modeled with statistics. Constructing good classifiers is challenging; nevertheless, many problems can be solved.