Biometric Fusion. Venu Govindaraju. Center for Unified Biometrics and Sensors, University at Buffalo

Field of Fusion
- Classifier combination vs. other fusion applications
- Non-ensemble combinations vs. classifier ensembles
- Small vs. large number of classes
- BIOMETRICS: classifier combination with a large number of classes

Fixed Approaches (Transformation Based)
Kittler et al., "On Combining Classifiers," 1998: six rules are justified under different assumptions. With s_j(i) the score assigned to class i by classifier j:
- Sum rule
- Product rule
- Max rule
- Min rule
- Median rule
- Majority vote
Different rules can be best in different applications.
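
A minimal sketch of these fixed rules in Python. The example score matrix and its normalization are illustrative assumptions; the rules presume scores that are comparable across matchers (e.g., posterior-like values in [0, 1]):

```python
import numpy as np

def combine(scores: np.ndarray, rule: str) -> np.ndarray:
    """Fixed combination rules; scores[j, i] is the score assigned
    to class i by matcher j."""
    if rule == "majority":
        # Each matcher votes for its top class; the vote counts act
        # as the combined scores.
        votes = scores.argmax(axis=1)
        return np.bincount(votes, minlength=scores.shape[1]).astype(float)
    rules = {
        "sum":     scores.sum(axis=0),
        "product": scores.prod(axis=0),
        "max":     scores.max(axis=0),
        "min":     scores.min(axis=0),
        "median":  np.median(scores, axis=0),
    }
    return rules[rule]

scores = np.array([[0.7, 0.2, 0.1],    # matcher 1
                   [0.4, 0.5, 0.1]])   # matcher 2
print(combine(scores, "sum").argmax())  # identified class under the sum rule
```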

Combination Methods

Approach                    Fixed (statistical)            Machine learning
Description                 Use predetermined rules;       Combination function learned
                            choose the best one            from training data
Ease of use                 Easy                           Difficult
Training data requirements  Average                        High
Optimality of combination   Somewhat                       Yes
Overfitting                 No                             Maybe

Outline [Tulyakov and Govindaraju 2008: IJPRAI, IEEE TIFS]
- PAMI 1997, Kittler et al.: justify the different fixed combination rules; statistical learning.
- PAMI 2005, Snelick et al.: fixed rules with adaptive normalization and user weighting; explicit use of matching score set statistics.
- PR 2002, Prabhakar and Jain; PAMI 2008, Nandakumar et al.: decision-level fusion; likelihood ratio-based biometric score fusion.
- These are not optimal for identification tasks: explore iterative methods, and score set statistics as an implicit quality measure.

Verification vs. Identification Task

Verification task:
- Each matcher scores the claimed identity (e.g., fingerprint 26, signature 0.31, face 5.54).
- The combination algorithm produces one combined score (e.g., 0.95), which is thresholded to accept or reject the hypothesis.
- Optimization: minimize FRR for a given FAR. Performance indicator: ROC curve.

Identification task:
- Each matcher scores every enrolled class (e.g., fingerprint: Alice 26, Bob 12; signature: Alice 0.31, Bob 0.45; face: Alice 5.54, Bob 7.81).
- The class with the maximum combined score is chosen (e.g., Alice 0.95 vs. Bob 0.11).
- Optimization: maximize the correct identification rate. Performance indicator: correct ID rate.
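
The two tasks differ only in the final decision step. A toy illustration, reusing the slide's Alice/Bob combined scores (the threshold value is an assumption):

```python
# Combined (fused) score per enrolled class, from the slide's example.
combined = {"Alice": 0.95, "Bob": 0.11}

# Verification: threshold the combined score for the claimed identity.
theta = 0.5                                  # operating point chosen on the ROC
accept = combined["Alice"] >= theta          # verify the claim "I am Alice"

# Identification: choose the class with the maximum combined score.
identified = max(combined, key=combined.get)

print(accept, identified)                    # True 'Alice'
```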

Principled Statistical Approach
- Combination function: map the M x N matrix of scores (s_j(i) = score of class i by classifier j, 1 <= i <= N, matchers 1..M) to N combined scores.
- Learning this mapping directly is possible if N and M are small. Handwritten digit recognition: 10 classes, 2 OCR algorithms; neural networks with 10x2 inputs and 10 outputs.
- In biometrics, the number of classes N is large and the number of matchers M is large.
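
A sketch of the small-N, small-M case described above. The synthetic scores, network size, and training setup are illustrative assumptions, not the original experiment:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n, N, M = 5000, 10, 2                       # samples, classes, matchers

# Synthetic stand-in for two OCR matchers scoring 10 digit classes:
# the genuine class gets a score boost so the mapping is learnable.
labels = rng.integers(N, size=n)
scores = rng.random((n, N, M))
scores[np.arange(n), labels] += 0.5

# The network sees all 10x2 = 20 scores at once and outputs the class.
net = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300)
net.fit(scores.reshape(n, N * M), labels)
print(net.score(scores.reshape(n, N * M), labels))
```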

Principled Statistical Approach: combination function
- Verification task: the combined score is compared with a threshold θ to accept or reject the hypothesis.
- Identification task: the class with the maximum combined score is chosen.

Principled Statistical Approach: combination function, Architecture 1. (Figure: score matrix, matchers 1..M by classes 1..i..N.)

Principled Statistical Approach: combination function, Architecture 2. (Figure: score matrix, matchers 1..M by classes 1..i..N.)

Find the Optimal Combination Function
- Architecture 1: use a column of the score matrix (the scores assigned to one class by matchers 1..M, e.g., fingerprint, face, iris).
- Architecture 2: use the entire score matrix.

Find the Optimal Combination Function: using the column or using the entire matrix, for the verification task and for the identification task.

4 Architectures (score set matrix: M biometrics, N users)
(Figure: four score-matrix diagrams, one per architecture.)
- Low, Medium I, Medium II, High: ordered by the complexity of the family of combination functions, i.e., by the dimensionality of the input they accept; cf. the VC (Vapnik-Chervonenkis) dimension [Tulyakov 06].

Independence of Matchers (Low vs. Medium II complexity): are the scores produced by the different matchers (fingerprint, face, iris) independent?

Independence of Scores: within a single identification trial, are the scores independent? Conclusion: they are dependent, for the fingerprint matcher and for the iris matcher alike.

RESEARCH AGENDA
Find the optimal trainable combination function:
- given the score vectors (Low complexity),
- given the entire score matrix (Medium II complexity),
for the two tasks: verification and identification.

Likelihood Ratio Function: Verification Task
- Pattern classification approach with two classes: genuine and impostor verification attempts. (Figure: genuine and impostor regions in the biometric score 1 vs. biometric score 2 plane.)
- Minimum risk criteria: the optimal decision boundaries coincide with the contours of the likelihood ratio function; well known [Prabhakar, Jain 02], [Nandakumar, Jain, Dass 08].
- Density-based or classification-based methods (NN, SVM, etc.) are possible.
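
A minimal density-based sketch of this idea. The Gaussian training data and array shapes are illustrative assumptions; a real system would fit the densities on genuine and impostor matcher scores:

```python
import numpy as np
from scipy.stats import gaussian_kde

# Stand-in training score pairs (matcher 1, matcher 2), shape (n, 2).
genuine_train = np.random.default_rng(0).normal([1.0, 1.0], 0.5, (500, 2))
impostor_train = np.random.default_rng(1).normal([0.0, 0.0], 0.5, (500, 2))

# Kernel density estimates of the two class-conditional densities.
p_gen = gaussian_kde(genuine_train.T)
p_imp = gaussian_kde(impostor_train.T)

def likelihood_ratio(score_pair):
    """Combined verification score: p(scores | genuine) / p(scores | impostor)."""
    s = np.asarray(score_pair, dtype=float).reshape(2, 1)
    return float(p_gen(s) / p_imp(s))

# Accept when the LR exceeds a threshold set by the desired FAR.
print(likelihood_ratio([0.9, 1.1]))
```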

Verification Task results: li & C score set (given columns). (ROC figure.)

Verification Task results: li & G score set (given columns). (ROC figure.)

Optimal Combination Functions: Likelihood Ratio (using columns)

Identification task results (top choice correct rate):
- CMR is correct: 54.8%; WMR is correct: 77.2%
- Both are correct: 48.9%; either is correct: 83.0%
- Likelihood ratio: 69.8%; weighted sum: 81.6%
The LR combination is worse than a single matcher. Verification task results: ROC (figure).

Dependence of Scores. In a single trial: are the scores of matcher 1 (fingerprint) through matcher M (iris) dependent?

Optimal Trainable Combination Function. Minimize the misclassification cost of classifying a trial as one class rather than another, under the assumption that the scores assigned to different classes are independent.

Example (Low complexity)
Hypothetical densities of genuine and impostor scores, generated as follows:
- Matcher 1: genuine and impostor scores sampled independently from the corresponding densities.
- Matcher 2: in every identification trial a score dependency exists.
- Verification task: matchers 1 and 2 have the same performance.
- Identification task: matcher 2 has perfect performance, because its genuine score is always on top.
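
A simulation sketch of this construction. The densities and the rejection-sampling trick for matcher 2 are illustrative assumptions; the trick only approximately preserves matcher 2's genuine marginal:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_impostors = 2000, 99

def genuine(size):  return rng.normal(1.0, 1.0, size)   # genuine density
def impostor(size): return rng.normal(0.0, 1.0, size)   # impostor density

correct_m1 = correct_m2 = 0
for _ in range(n_trials):
    # Matcher 1: all scores drawn independently.
    g1, i1 = genuine(1)[0], impostor(n_impostors)
    correct_m1 += g1 > i1.max()

    # Matcher 2: scores are dependent within a trial; the genuine score
    # is forced above the best impostor by rejection sampling.
    i2 = impostor(n_impostors)
    g2 = genuine(1)[0]
    while g2 <= i2.max():
        g2 = genuine(1)[0]
    correct_m2 += g2 > i2.max()

print(f"Matcher 1 top-choice rate: {correct_m1 / n_trials:.3f}")
print(f"Matcher 2 top-choice rate: {correct_m2 / n_trials:.3f}")  # 1.0 by construction
```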

Example: Identification Task
Likelihood ratio combination can be worse in an identification system than a single matcher. (Figure: an identification trial in the matcher 1 vs. matcher 2 score plane; the trial fails because the impostor (class 2) obtains a higher LR score than the genuine class (class 1).)

Training Iterative Methods for Identification Tasks
(Figures: genuine and impostor score distributions in the biometric score 1 vs. biometric score 2 plane.)
No! Traditional training mixes the genuine and impostor scores from different trials.

Training Iterative Methods for Identification Tasks
(Figures: genuine and impostor score distributions and the resulting model.)
Training MUST process the scores from one identification trial as a single training sample.

Iterative Algorithms for Identification Tasks
- Initialize a combination function.
- For all identification trials: get the scores from the same identification trial, and adjust the combination function using the criterion that the genuine score should be better than any impostor score.
- Train the impostor density iteratively, using best impostors.
- Sum of logistic functions (monotonic): coefficients are chosen so that the genuine score is separated from the best impostor score in the current iteration.

Algorithm (Best Impostor LR), 2 matchers
1. Initialize the training set by selecting a random impostor score pair from each training identification trial.
2. For each training identification trial, find the impostor score pair with the biggest value of the combined score according to the currently trained function.
3. Update the training set by replacing that trial's impostor score pair with this current best impostor score pair.
4. Repeat steps 2-3 for all training identification trials.
5. Repeat steps 2-4 for a predetermined number of training epochs.
The algorithm converges fast, after 2-3 training epochs. A sketch follows.
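
A minimal sketch of this loop under assumed data shapes. The array names, shapes, and the KDE density estimator are illustrative assumptions, not the paper's implementation:

```python
import numpy as np
from scipy.stats import gaussian_kde

def train_best_impostor_lr(genuine, impostors, epochs=3):
    """genuine: (n_trials, 2) genuine score pairs;
    impostors: (n_trials, n_classes - 1, 2) impostor score pairs."""
    gen_kde = gaussian_kde(genuine.T)            # genuine density, fixed

    # Step 1: initialize with one random impostor pair per trial.
    rng = np.random.default_rng(0)
    idx = rng.integers(impostors.shape[1], size=impostors.shape[0])
    chosen = impostors[np.arange(len(impostors)), idx]

    for _ in range(epochs):                      # step 5
        imp_kde = gaussian_kde(chosen.T)         # impostor density from current set
        for t in range(len(impostors)):          # steps 2-4
            pairs = impostors[t]
            lr = gen_kde(pairs.T) / imp_kde(pairs.T)
            chosen[t] = pairs[np.argmax(lr)]     # best impostor under current LR
    return gen_kde, gaussian_kde(chosen.T)

def combine(gen_kde, imp_kde, score_pairs):
    """Combined score = likelihood ratio of each (matcher 1, matcher 2) pair."""
    s = np.atleast_2d(score_pairs)
    return gen_kde(s.T) / imp_kde(s.T)
```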

Iterative Methods
- Instead of using all impostor scores in an identification trial, use the best impostor score.
- The best impostor score is determined using the currently trained combination function.
- Approaches considered: best impostor likelihood ratio; sum of logistic functions; neural networks utilizing best impostor scores.

Preliminary results (correct identification rate, %):

            Likelihood  Weighted  Best Impostor     Logistic  Neural
            Ratio       Sum       Likelihood Ratio  Sum       Network
CMR & WMR   69.84       81.58     80.07             81.43     81.67
li & C      97.24       97.23     97.01             97.34     97.39
li & G      95.90       95.47     95.99             96.17     96.29

Future research:
- Theoretically, we still do not know whether any of the proposed algorithms is optimal.
- Practically, more experiments and possibly other algorithms are needed.

Summary
(Figure: verification task and identification task, each with Low, Medium I, Medium II, and High architectures; combinations (a), (b), and (c) marked.)
- Principled approach to fusion: 8 different combination methods from 4 architectures and 2 operation modes.
- Theoretically proved the difference in optimal combination functions for pairs (a) and (b).
- Medium I and Medium II combinations: utilization of score set statistics.
- Different score set statistics for (c) (BTAS08).
- Identification task combinations: iterative methods.

Thank You. venu@cubs.buffalo.edu