Spot Me - A Smart Attendance System Based On Face Recognition

Kacela D'Silva (1), Sharvari Shanbhag (2), Ankita Chaudhari (3), Ms. Pranali Patil (4)
(1, 2, 3) Student, Department of Computer Engineering, New Horizon Institute of Technology and Management (Thane), Mumbai, Maharashtra.
(4) Professor, Department of Computer Engineering, New Horizon Institute of Technology and Management (Thane), Mumbai, Maharashtra.

Abstract - Face detection and recognition have a wide range of possible applications, and several approaches for both have been proposed in the last decade. In recent years, deep convolutional neural networks (CNNs) have gained a lot of popularity. Traditional face recognition methods, such as Eigenface, are sensitive to lighting, noise, gestures, expressions, etc. Hence, we utilize a CNN to implement face recognition. A deep learning-based face recognition attendance system is proposed in this paper. Because CNNs achieve the best results on larger datasets, image augmentation is applied to the original images to extend the small dataset. The attendance record is maintained in an Excel sheet which is updated automatically by the system.

Key Words: Face Detection, Face Recognition, Deep Learning, Attendance System, Convolutional Neural Network

1. INTRODUCTION

Traditionally, students' attendance records are taken manually by teachers through roll calling in class, or the students are required to physically sign the attendance sheet for each class. To surmount the shortcomings of traditional attendance taking, we propose a more convenient and intelligent attendance system implemented using face recognition.

Face recognition is one of the most intensively studied technologies in computer vision, with new approaches and encouraging results reported every year. Face recognition approaches are generally classified as feature-based or holistic. In holistic approaches, recognition is based on global features of faces, whereas in feature-based approaches, faces are recognized using local features. Holistic approaches use statistics and Artificial Intelligence (AI) to learn from and perform well on a dataset of facial images. Statistical methods include Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA), while AI-based methods utilize neural networks and machine learning techniques to recognize faces. AI-based approaches provide more accuracy than statistical ones, and hence we propose a deep learning-based face recognition system.

Deep learning is a sub-field of machine learning dealing with algorithms inspired by the structure and function of the brain, called artificial neural networks; in other words, it mirrors the functioning of our brains. A neural network is a collection of layers that transforms the input in some way to produce an output. Each layer in the neural network consists of weights and biases, which are just numbers that modify the input. A deep neural network is simply a neural network with many layers. One of the differences between machine learning and deep learning models lies in feature extraction: in classical machine learning, features are engineered by humans, whereas deep learning models learn them on their own. The face recognition process can be divided into two parts: face detection and face recognition.
Face detection refers to computer technology that is able to identify the presence of people's faces within digital images. Face recognition goes well beyond detecting that a human face is present: it attempts to establish whose face it is. The approach we use for face recognition is straightforward. The key is to get a deep neural network to produce a set of numbers that describes a face, known as a face encoding. When we pass in two different images of the same person, the network should return similar outputs for both images, and when we pass in images of two different people, it should return clearly different outputs. This means the neural network needs to be trained to automatically identify different features of faces and compute the encoding from them.
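
Purely to illustrate this idea (this is not the classifier actually used later in the paper), comparing two face encodings can be as simple as thresholding the Euclidean distance between them. The helper name, the dummy 128-dimensional encodings and the 0.6 threshold below are assumptions for this sketch.

    import numpy as np

    def same_person(encoding_a: np.ndarray, encoding_b: np.ndarray,
                    threshold: float = 0.6) -> bool:
        # Encodings of the same person should lie close together, encodings of
        # different people far apart. The threshold is a hypothetical value,
        # not one tuned for this system.
        return np.linalg.norm(encoding_a - encoding_b) < threshold

    # Dummy encodings standing in for network outputs.
    enc1 = np.random.rand(128)
    enc2 = enc1 + 0.01 * np.random.rand(128)   # slight variation of the same "face"
    print(same_person(enc1, enc2))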

2. LITERATURE SURVEY

In the existing system, the lecturer takes attendance during lecture time by calling out each and every student or by passing an attendance sheet around the class. This method is time consuming, and there is always a chance of proxies. Moreover, attendance records are difficult to handle and preserve for the long term. As a result of the active progress in software technologies, there are now many different types of computerized monitoring and attendance systems, which mostly differ in the core technology they use. Several classical methods are still used extensively, such as Eigenface [1], presented by M. Turk and A. Pentland. Eigenface provides a simple and cheap way to perform face recognition and adequately reduces the statistical complexity of face image representation. However, its limitations are also obvious: it is sensitive to lighting and scale and requires a highly controlled environment, whereas in real scenes factors such as light intensity, pose variation, facial expressions and image noise cannot be controlled.

Kadry et al. [2] presented a wireless attendance system based on iris recognition, using an eye-scan sensor and Daugman's algorithm. Patil et al. [3] applied face recognition to classroom attendance; they used Eigenface for the recognition, but the overall accuracy of the system was not reported. Tharanga et al. [4] used Principal Component Analysis (PCA) for the face recognition in their attendance system, achieving an accuracy of 68%. In recent years, there has been tremendous progress in deep learning, and the accuracy of face recognition has drastically improved through the use of deep CNNs. Schroff et al. [5] presented the FaceNet system, which relies on a Deep Neural Network (DNN) for the face recognition task and achieved 99.63% accuracy on the Labeled Faces in the Wild (LFW) dataset. Arsenovic et al. [6] used the OpenFace library and a pre-trained FaceNet model to develop a face recognition model that achieved an accuracy of 95%. Motivated by these results, this paper proposes a deep learning-based face recognition attendance system.

3. PROPOSED METHODOLOGY

The development of the deep learning-based attendance system is explained in detail below. The procedure is divided into several important stages: obtaining the training dataset and augmenting it, training the model, and finally taking the attendance and storing it in an Excel sheet.

Fig -1: Block diagram

3.1 Dataset Preparation and Augmentation

The first step is to prepare the dataset of students. The students were photographed using a webcam, and approximately 10-12 images were taken of each student with changes in movement and expression. The images of each student were saved in a separate folder labelled with the student's name. After this step, we performed augmentation by applying different transformations to the images, such as adding noise, flipping and blurring. For this purpose we used scikit-image, a Python library that includes algorithms for image transformation and manipulation. The purpose of augmentation is to take images that are already in the training dataset and manipulate them to create many altered versions of the same image. This both provides more images to train on and helps expose the model to a wider variety of lighting and coloring situations, making it more robust.

Fig -2: Speckled and flipped images
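
A minimal sketch of this augmentation step is given below, assuming the folder-per-student layout described above and RGB webcam photos. The particular scikit-image transforms and parameters (speckle noise, horizontal flip, Gaussian blur with sigma = 1) are illustrative choices rather than the authors' exact settings, and the channel_axis argument requires a recent scikit-image release.

    import os
    import numpy as np
    from skimage import io, util, filters, img_as_float, img_as_ubyte

    def augment_student_folder(folder: str) -> None:
        # Write a speckled, a flipped and a blurred copy of every photo in the
        # student's folder. Transforms and parameters are illustrative.
        for name in os.listdir(folder):
            base, ext = os.path.splitext(name)
            image = img_as_float(io.imread(os.path.join(folder, name)))  # RGB assumed
            variants = {
                "speckle": util.random_noise(image, mode="speckle"),
                "flip": np.fliplr(image),
                "blur": filters.gaussian(image, sigma=1, channel_axis=-1),
            }
            for tag, variant in variants.items():
                out = img_as_ubyte(np.clip(variant, 0.0, 1.0))
                io.imsave(os.path.join(folder, f"{base}_{tag}{ext}"), out)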
3.2 Face Detection and Alignment

Face detection and alignment are based on the paper "Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks" by Kaipeng Zhang, Zhanpeng Zhang, Zhifeng Li and Yu Qiao [7]. The model consists of three convolutional networks (P-Net, R-Net and O-Net), and the method works well even on non-frontal faces. Once the faces have been detected, they are cropped and stored for further processing.

Fig -3: Bounding box on detected faces
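
The snippet below is a rough sketch of this detection-and-cropping step. It uses the standalone mtcnn Python package as one implementation of Zhang et al.'s detector; the paper does not state which implementation was used, and the margin value is an assumption.

    from mtcnn import MTCNN          # standalone package implementing Zhang et al.'s detector
    from skimage import io

    detector = MTCNN()

    def detect_and_crop(image_path: str, margin: int = 10):
        # Return the cropped face regions found in one image, with a small
        # margin around each detected bounding box (margin is illustrative).
        image = io.imread(image_path)            # RGB array
        crops = []
        for face in detector.detect_faces(image):
            x, y, w, h = face["box"]
            x, y = max(x - margin, 0), max(y - margin, 0)
            crops.append(image[y:y + h + 2 * margin, x:x + w + 2 * margin])
        return crops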

3.3 Generating Embeddings

To perform facial recognition, we need to represent each face uniquely. FaceNet is a deep learning architecture based on a convolutional neural network; it returns a 128-dimensional vector embedding for each face and is trained using the concept of triplet loss.

Fig -4: The triplet loss function minimizes the distance between an anchor and a positive, both of which have the same identity, and maximizes the distance between the anchor and a negative of a different identity

3.4 Training the Classifier

The final step is training a classifier on the embeddings generated from the students' dataset. These embeddings are used as feature inputs to scikit-learn's SVM classifier. The classifier finds the person in our dataset of known people whose measurements are closest to those of the test image.

3.5 Marking the Attendance

Once the face recognition process is completed, an Excel sheet is generated and attendance is marked for the recognized students for that particular day.
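
A minimal sketch of the classification step in Section 3.4 follows, assuming the FaceNet embeddings for the training images and for the faces detected in a test photo have already been computed. The linear kernel and the probability threshold are illustrative choices; the list of names returned by recognize is what gets marked present in Section 3.5.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.preprocessing import LabelEncoder

    def train_classifier(train_embeddings: np.ndarray, train_names: list):
        # train_embeddings: (n_samples, d) FaceNet embeddings, one per training image.
        # train_names: the student name for each embedding (assumed precomputed).
        encoder = LabelEncoder()
        labels = encoder.fit_transform(train_names)
        clf = SVC(kernel="linear", probability=True)
        clf.fit(train_embeddings, labels)
        return clf, encoder

    def recognize(clf, encoder, face_embeddings: np.ndarray, min_prob: float = 0.5):
        # Return the predicted student name for each face found in a test image,
        # skipping low-confidence predictions (the threshold is illustrative).
        present = []
        for row in clf.predict_proba(face_embeddings):
            best = int(np.argmax(row))
            if row[best] >= min_prob:
                present.append(encoder.inverse_transform([best])[0])
        return present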

4. IMPLEMENTATION

We used two FaceNet models for training on our dataset. Two pre-trained FaceNet models [8] are available: the 2017 model and the 2018 model. The major difference between them is that the dimension of the embedding vector was increased from 128 to 512. Initially, both models were trained on a small number of images per student, i.e. the dataset consisted of 10-15 images per person; the dataset comprises a total of 20 students. After training both models, we tested them on a group of 11-15 students and, of the two, the 2018 model performed better. We therefore decided to go ahead with the 2018 pre-trained model. To further improve the accuracy, we performed augmentation and then trained the model with the enlarged dataset on Google Colab, which provides free GPU cloud service. During testing, some of the images were predicted incorrectly. False predictions were mostly made on images with bad lighting in which the face was not completely visible, or where the face was turned strongly sideways. In general, it is very hard to reach 100% accuracy. After training and prediction, an Excel sheet is generated in which the recognized students are marked present and the rest are marked absent. The attendance is marked as per the timetable, which is stored in the form of an Excel sheet, and for each month a different worksheet is activated.

5. RESULT

As mentioned earlier, we used two FaceNet models for training. Initially, we trained both models with the non-augmented dataset and achieved the following results.

Fig -5: Image tested on the 20170511-185253 model

With the 2017 model, 2 students were recognized incorrectly. The same image was then tested with the 2018 model.

Fig -6: Image tested on the 20180402-114759 model

Here only 1 student was recognized incorrectly. In this way, we tested both models on many different test images and found the 2018 model performing better than the 2017 model. In order to further improve the results, we performed augmentation and trained the 2018 model again.

Fig -7: Image tested on the 20180402-114759 model with augmentation

Fig -8: Another image tested on the 20180402-114759 model with augmentation

In both images, all the students were recognized correctly. From this we concluded that training the model with an augmented dataset gives better results. To test the accuracy of the model even further, we decided to test it on twins.

Fig -9: Twins test image 1

In the first test image, the model was able to distinguish between the twins, but as seen below in the second test image, it failed to recognize the twins correctly.

Fig -10: Twins test image 2

In this image, the student's face is turned sideways and is not completely visible; hence the model fails to recognize it correctly.

After performing face recognition, the next step is marking the attendance. The attendance is taken as per the stored timetable. All those recognized in the picture are marked as present and the rest as absent. For each subject there is a separate attendance sheet, and once the sheet has been created, subsequent attendance for the same subject is marked date-wise in a new column. The attendance for the next month for a particular subject is stored in the same Excel document but in a different worksheet labelled with the month number; for example, the March worksheet is labelled "3".
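
As a rough sketch of the bookkeeping just described (one workbook per subject, one worksheet per month, one column per lecture date), the snippet below uses openpyxl. The file naming, the "P"/"A" markers and the helper name are assumptions; the paper does not state which library it uses for the Excel output.

    import os
    from datetime import date
    from openpyxl import Workbook, load_workbook

    def mark_attendance(subject: str, all_students: list, present: list) -> None:
        # Append one date-wise column to the worksheet for the current month in
        # the subject's workbook; assumes all_students is in a stable order.
        path = f"{subject}.xlsx"
        wb = load_workbook(path) if os.path.exists(path) else Workbook()
        sheet_name = str(date.today().month)           # e.g. March -> "3"
        if sheet_name in wb.sheetnames:
            ws = wb[sheet_name]
        else:
            ws = wb.create_sheet(title=sheet_name)     # default "Sheet" left as-is
            for i, name in enumerate(all_students, start=2):
                ws.cell(row=i, column=1, value=name)   # first column: student names
        col = ws.max_column + 1                        # new column for today's lecture
        ws.cell(row=1, column=col, value=date.today().isoformat())
        for i, name in enumerate(all_students, start=2):
            ws.cell(row=i, column=col, value="P" if name in present else "A")
        wb.save(path)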

Fig -11: Attendance sheet for Fig -7

6. CONCLUSION

The traditional method of taking attendance is very time consuming, and attendance records are difficult to handle and preserve for a long time. In this paper, a deep learning-based face recognition attendance system has been proposed. Because deep learning models (CNNs) achieve better results with larger datasets, augmentation was performed to extend the small dataset. As mentioned earlier, the model sometimes fails to recognize a person correctly; it may perform better if the quality of the training dataset is improved, and using a higher-quality camera to capture the images of the students may also lead to better results. The future scope of the system is to implement it commercially so that the product can be useful in schools, colleges and universities.

REFERENCES

[1] M. Turk and A. Pentland, "Eigenfaces for recognition," Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.
[2] S. Kadry and M. Smaili, "Wireless attendance management system based on iris recognition," Scientific Research and Essays, vol. 5, no. 12, pp. 1428-1435, 2013.
[3] A. Patil and M. Shukla, "Implementation of classroom attendance system based on face recognition in class," International Journal of Advances in Engineering & Technology, vol. 7, no. 3, 2014.
[4] J. G. R. Tharanga et al., "SMART ATTENDANCE USING REAL TIME FACE RECOGNITION (SMART-FR)," Department of Electronic and Computer Engineering, Sri Lanka Institute of Information Technology (SLIIT), Malabe, Sri Lanka.
[5] F. Schroff, D. Kalenichenko and J. Philbin, "FaceNet: A unified embedding for face recognition and clustering," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.
[6] M. Arsenovic, S. Sladojevic, A. Anderla and D. Stefanovic, "FaceTime - Deep learning based face recognition attendance system," IEEE 15th International Symposium on Intelligent Systems and Informatics (SISY), 2017.
[7] K. Zhang, Z. Zhang, Z. Li and Y. Qiao, "Joint face detection and alignment using multitask cascaded convolutional networks," IEEE Signal Processing Letters, vol. 23, no. 10, pp. 1499-1503, 2016.
[8] D. Sandberg, pre-trained FaceNet models, https://github.com/davidsandberg/facenet