Biometric Fusion. Venu Govindaraju. Center for Unified Biometrics and Sensors, University at Buffalo


Field of Fusion
- Classifier combination
  - Non-ensemble combinations
    - Large number of classes: BIOMETRICS
    - Small number of classes
  - Classifier ensembles
- Other fusion applications

Fixed Approaches (Transformation Based)
Kittler et al., On Combining Classifiers, 1998
6 rules are justified under different assumptions. Input: the score assigned to class i by classifier j.
- Sum rule
- Product rule
- Max rule
- Min rule
- Median rule
- Majority vote
Different rules can be best in different applications.
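The six rules can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the talk: the function name `combine_fixed` and the layout of `scores` (one row per classifier, one column per class) are assumptions, and the scores are assumed to be already normalized to a comparable range with higher meaning a better match.

```python
import numpy as np

def combine_fixed(scores, rule):
    """Combine an (M, N) score array: scores[j, i] is the score that
    classifier j assigns to class i. Returns N combined scores."""
    scores = np.asarray(scores, dtype=float)
    if rule == "sum":
        return scores.sum(axis=0)
    if rule == "product":
        return scores.prod(axis=0)
    if rule == "max":
        return scores.max(axis=0)
    if rule == "min":
        return scores.min(axis=0)
    if rule == "median":
        return np.median(scores, axis=0)
    if rule == "majority":
        # Each classifier votes for its top-scoring class;
        # the combined score of a class is its vote count.
        votes = scores.argmax(axis=1)
        return np.bincount(votes, minlength=scores.shape[1])
    raise ValueError(f"unknown rule: {rule}")
```

For identification, the chosen class would be `combine_fixed(scores, rule).argmax()`; ties (possible with the min rule or majority vote) resolve to the lowest class index.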

Combination Methods

Approach                     Fixed                                  Statistical
Description                  Use predetermined rules; choose best   Machine learning of combination function, learned from training data
Ease of use                  Easy                                   Difficult
Training data requirements   Somewhat                               High
Optimality of combination    Average                                Yes
Overfitting                  No                                     Maybe

Outline [Tulyakov and Govindaraju 2008] IJPRAI, IEEE TIFS
PAMI 1998, Kittler et al.
- Justify different fixed combination rules
- Statistical learning
PAMI 2005, Snelick et al.
- Fixed rules with adaptive normalization and user weighting
- Explicit use of matching score set statistics
PR 2002, Prabhakar and Jain; PAMI 2008, Nandakumar et al.
- Decision-level fusion; likelihood ratio-based biometric score fusion
- Not optimal for identification tasks
- Explore iterative methods
- Score set statistics for implicit quality measure

Verification vs Identification Task

Verification Task
- Matchers: fingerprint matching, signature matching, face matching
- Example: matcher scores 26 (fingerprint), 0.31 (signature), 5.54 (face); the combination algorithm outputs 0.95
- The combined score is thresholded to accept or reject the verification hypothesis
- Optimization: minimize FRR for a given FAR
- Performance indicator: ROC curve

Identification Task
- The class with the maximum combined score is chosen
- Optimization: maximize the correct identification rate
- Performance indicator: correct ID rate
- Example:

Matcher                 Alice   Bob
Fingerprint matching    26      12
Signature matching      0.31    0.45
Face matching           5.54    7.81
Combination algorithm   0.95    0.11
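The two decision rules can be contrasted in a short sketch that reuses the combined scores from the slide (Alice 0.95, Bob 0.11). The function names and the threshold value 0.5 are hypothetical, and higher scores are assumed to indicate better matches. FAR and FRR are computed by their usual definitions: the fraction of impostor attempts accepted and the fraction of genuine attempts rejected.

```python
import numpy as np

def verify(combined_score, threshold):
    """Verification: accept the claimed identity if the combined
    score reaches the threshold."""
    return combined_score >= threshold

def identify(combined_scores):
    """Identification: return the class with the maximum combined score.
    combined_scores maps class name -> combined score."""
    return max(combined_scores, key=combined_scores.get)

def far_frr(genuine_scores, impostor_scores, threshold):
    """False accept rate and false reject rate at one threshold."""
    genuine = np.asarray(genuine_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    far = float(np.mean(impostor >= threshold))  # impostors accepted
    frr = float(np.mean(genuine < threshold))    # genuine users rejected
    return far, frr

# Combined scores taken from the slide's example.
combined = {"Alice": 0.95, "Bob": 0.11}
accepted = verify(combined["Alice"], threshold=0.5)  # threshold is hypothetical
identified = identify(combined)
```

Sweeping the threshold in `far_frr` over the observed score range would trace out the ROC curve used as the verification performance indicator.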

Principled Statistical Approach: Combination function
- Map the scores of M matchers x N classes to N combined scores (each input is the score of class i by classifier j)
- Learn the mapping: possible if N and M are small
- Example: handwritten digit recognition with 10 classes and 2 OCR algorithms; NNs with 10x2 inputs and 10 outputs
- Biometrics: number of classes N is large; number of matchers M is large
[Figure: score matrix indexed by classes 1 ... i ... N and matchers 1 ... j ... M]

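Principled Statistical Approach: Combination function
- Verification task: the combined score is thresholded (threshold Ɵ) to accept or reject the verification hypothesis
- Identification task: the class with the maximum combined score is chosen
[Figure: score matrix indexed by classes 1 ... i ... N and matchers 1 ... j ... M]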

Principled Statistical Approach: Combination function, Architecture 1
[Figure: score matrix indexed by classes 1 ... i ... N and matchers 1 ... j ... M]

Principled Statistical Approach: Combination function, Architecture 2
[Figure: score matrix indexed by classes 1 ... i ... N and matchers 1 ... j ... M]

Find the Optimal Combination Function
- Using the column (Architecture 1)
- Using the matrix (Architecture 2)
- For both the verification task and the identification task
[Figure: score matrix indexed by classes 1 ... i ... N and matchers 1 (fingerprint), j (face), M (iris), with the column and the matrix labeled]

4 Architectures (Parameters)
- Score set matrix: M biometrics; N users
- Complexity levels: Low, Medium I, Medium II, High
- Complexity: that of the family of functions accepting input of a given dimension; VC (Vapnik-Chervonenkis) dimension [Tulyakov 06]
[Figure: four score matrices over classes 1 ... i ... N, one per complexity level]

Independence of Matchers (Low Complexity, Medium II Complexity)
[Figure: score matrix indexed by classes 1 ... i ... N and matchers 1 (fingerprint), j (face), M (iris)]
Question posed between Matcher 1 (fingerprint) and Matcher M (iris): independent?

Independence of Scores (in a single trial)
Question posed for the scores of Matcher 1 (fingerprint) and of Matcher M (iris) over classes 1 ... i ... N: independent?
Answer shown for both matchers: dependent.

RESEARCH AGENDA
Find the optimal trainable function
- Given the score vectors (Low Complexity)
- Given the entire score matrix (Medium II Complexity)
- For the two tasks: verification and identification

Likelihood Ratio Function: Verification Tasks
- Pattern classification approach with 2 classes: genuine and impostor verification attempts
- Under the minimum risk criterion, the optimal decision boundaries coincide with the contours of the likelihood ratio function; this is well known [Prabhakar, Jain 02] [Nandakumar, Jain, Das 08]
- Density-based OR multiple classification-based methods (NN, SVM, etc.) are possible
[Figure: genuine and impostor samples plotted as Biometric score 1 vs Biometric score 2]
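A density-based likelihood ratio can be sketched as follows. The slides do not say which density estimator is used, so the diagonal Gaussian here is purely an assumption chosen for brevity, and all function names are hypothetical. The decision is then a threshold on the log-likelihood ratio.

```python
import numpy as np

def fit_diag_gaussian(x):
    """Fit an axis-aligned Gaussian to the rows of x (one row per attempt,
    one column per matcher). Returns (mean, variance) per matcher."""
    x = np.asarray(x, dtype=float)
    return x.mean(axis=0), x.var(axis=0) + 1e-9  # small floor avoids division by zero

def log_pdf(x, mean, var):
    """Log density of the diagonal Gaussian at x (a vector or rows of vectors)."""
    x = np.asarray(x, dtype=float)
    return -0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var), axis=-1)

def log_likelihood_ratio(x, genuine_model, impostor_model):
    """log p(x | genuine) - log p(x | impostor); larger favors 'genuine'."""
    return log_pdf(x, *genuine_model) - log_pdf(x, *impostor_model)
```

Accepting whenever `log_likelihood_ratio(x, g, i) >= t` gives decision boundaries that follow the contours of the estimated likelihood ratio; varying `t` trades FAR against FRR.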

Verification Task li & C (given columns)

Verification Task li & G (given columns)

Optimal Combination Functions: Likelihood Ratio (Using Columns)
Verification task results: ROC
Identification task results: top choice correct rate
- CMR is correct: 54.8%
- WMR is correct: 77.2%
- Both are correct: 48.9%
- Either is correct: 83.0%
- Likelihood Ratio: 69.8%
- Weighted Sum: 81.6%
LR combination is worse than a single matcher.

Dependence of Scores (in a single trial)
[Figure: score matrix over classes 1 ... i ... N; the scores of Matcher 1 (fingerprint) and of Matcher M (iris) are each marked "Dependent?"]

Optimal Trainable Combination Function
- Criterion: minimize the misclassification cost
- Decision: classify as one class rather than another
- Assumption: scores assigned to different classes are independent
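The independence assumption can be made concrete with a short derivation. This is a sketch under assumptions the slide does not spell out: equal priors over the N classes, 0-1 misclassification cost, and a score vector $\mathbf{s}_i$ for class $i$ that is drawn from a genuine density $p_{\mathrm{gen}}$ when $i$ is the true class $\omega_i$ and from an impostor density $p_{\mathrm{imp}}$ otherwise. The symbols are introduced here for illustration.

```latex
\begin{aligned}
P(\omega_i \mid \mathbf{s}_1,\dots,\mathbf{s}_N)
  &\propto p_{\mathrm{gen}}(\mathbf{s}_i)\prod_{k \neq i} p_{\mathrm{imp}}(\mathbf{s}_k)
   = \frac{p_{\mathrm{gen}}(\mathbf{s}_i)}{p_{\mathrm{imp}}(\mathbf{s}_i)}
     \prod_{k=1}^{N} p_{\mathrm{imp}}(\mathbf{s}_k), \\
\hat{\imath}
  &= \arg\max_{i}\, \frac{p_{\mathrm{gen}}(\mathbf{s}_i)}{p_{\mathrm{imp}}(\mathbf{s}_i)} .
\end{aligned}
```

The product over all $k$ does not depend on $i$, so under these assumptions choosing the class with the largest likelihood ratio minimizes the misclassification cost. When the scores within one identification trial are dependent, the factorization in the first line no longer holds, and the likelihood ratio need not be optimal.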

Example (Low complexity)
Hypothetical densities of genuine and impostor scores, generated as follows:
- Matcher 1: genuine and impostor scores are sampled independently from the corresponding densities
- Matcher 2: a score dependency exists in every identification trial
- Verification task: Matchers 1 and 2 have the same performance
- Identification task: Matcher 2 has perfect performance; the genuine score is always on top

Example: Identification Task
Likelihood ratio combination can perform worse in an identification system than a single matcher.
[Figure: scores of class 1 (genuine) and class 2 (impostor) plotted as Matcher 1 vs Matcher 2]
Identification trial failure: the impostor (class 2) obtains a higher LR score than the genuine (class 1), so class 2 is identified.

Training Iterative Methods for Identification Tasks
[Figure: genuine and impostor scores plotted as Biometric score 1 vs Biometric score 2, shown per identification trial and pooled into one model]
- No! Traditional training mixes the genuine and impostor scores from different trials.
- Training MUST process the scores from one identification trial as a single training sample.

Iterative Algorithms: Identification Tasks
- Initialize a combination function
- Do for all identification trials:
  - Get scores from the same identification trial
  - Adjust the combination function using the criterion: the genuine score should be better than any impostor score
- Train the impostor density using best impostors iteratively
- Sum of logistic functions (monotonic): coefficients are chosen so that the genuine score is separated from the best impostor score in the current iteration

Algorithm (Best Impostor LR), 2 matchers
1. Initialize the impostor density estimate by selecting a random impostor score pair from each training identification trial
2. For each training identification trial, find the impostor score pair with the biggest value of the combined score according to the currently trained combination function
3. Update the impostor density estimate by replacing the impostor score pair of this training identification trial with the current best impostor score pair
4. Repeat steps 2-3 for all training identification trials
5. Repeat steps 2-4 for a predetermined number of training epochs
The algorithm converges fast: after 2-3 training epochs.
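The five steps can be sketched in NumPy as below. This is a hedged illustration rather than the authors' implementation: the slides do not name a density estimator, so a full-covariance Gaussian is assumed, the genuine density is fitted once from all genuine pairs, and every function name and array layout is hypothetical.

```python
import numpy as np

def fit_gaussian(x):
    """Fit a Gaussian (mean, covariance) to the rows of x."""
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False) + 1e-6 * np.eye(x.shape[1])  # regularized
    return mean, cov

def log_density(x, mean, cov):
    """Log Gaussian density at each row of x."""
    d = x - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ni,ij,nj->n", d, inv, d)
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2 * np.pi))

def combined_score(x, gen_model, imp_model):
    """Log-likelihood ratio used as the combined score."""
    return log_density(x, *gen_model) - log_density(x, *imp_model)

def train_best_impostor_lr(genuine, impostors, epochs=3, seed=0):
    """genuine: (T, 2) genuine score pairs, one per identification trial.
    impostors: (T, K, 2) impostor score pairs, K per trial."""
    rng = np.random.default_rng(seed)
    T, K, _ = impostors.shape
    gen_model = fit_gaussian(genuine)
    # Step 1: one random impostor pair per trial initializes the density.
    chosen = impostors[np.arange(T), rng.integers(K, size=T)].copy()
    imp_model = fit_gaussian(chosen)
    for _ in range(epochs):                      # step 5
        for t in range(T):                       # step 4
            # Step 2: best impostor of trial t under the current function.
            best = np.argmax(combined_score(impostors[t], gen_model, imp_model))
            # Step 3: swap it in and refit the impostor density.
            chosen[t] = impostors[t, best]
            imp_model = fit_gaussian(chosen)
    return gen_model, imp_model
```

Refitting after every trial follows the step order on the slide; refitting once per epoch would be cheaper for large training sets, at the cost of departing from that order.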

Iterative Methods
- Instead of using all impostor scores in an identification trial, use the best impostor score
- The best impostor score is determined using the currently trained combination function

Considered approaches
- Best impostor likelihood ratio
- Sum of logistic functions
- Neural networks utilizing best impostor scores

Preliminary results (correct identification rate, %)

Data set   Likelihood Ratio   Weighted Sum   Best Impostor Likelihood Ratio   Logistic Sum   Neural Network
CMR&WMR    69.84              81.58          80.07                            81.43          81.67
li & C     97.24              97.23          97.01                            97.34          97.39
li & G     95.90              95.47          95.99                            96.17          96.29

Future Research
- Theoretically: still do not know whether any of the proposed algorithms is optimal
- Practically: need more experiments and possibly other algorithms

Summary
[Figure: the verification task and the identification task, each shown over the Low, Medium I, Medium II, and High complexity architectures, with groups marked (a), (b), and (c)]
Principled Approach to Fusion
- 8 different combination methods: 4 architectures and 2 operation modes
- Theoretically proved difference in optimal combination functions for pairs (a) and (b)
- Medium I and Medium II combinations: utilization of score set statistics
- Different score set statistics for (c) (BTAS08)
- Identification task combinations: iterative methods

Thank You Venu@cubs.buffalo.edu