CHAPTER 2: LITERATURE SURVEY

2.1. Machine learning

Machine learning is a branch of artificial intelligence that aims at solving real-life engineering problems. It allows a system to learn without being explicitly programmed and is based on the concept of learning from data. It is so ubiquitous that we may use it dozens of times a day without knowing it. The advantage of machine learning (ML) methods [1] is that they use mathematical models, heuristic learning, knowledge acquisition and decision trees for decision making. Thus, they provide controllability, observability and stability. A model is easily updated by adding a new patient's record. The application of machine learning models [2] to human disease diagnosis aids medical experts in diagnosing at an early stage on the basis of symptoms, even though some diseases exhibit similar symptoms.

One of the important problems in multivariate techniques is selecting relevant features from the available set of attributes [3]. The common feature selection techniques include wrapper subset evaluation, filtering and embedded models. Embedded models use classifiers to construct ensembles, the wrapper subset evaluation method provides ranks to features based on their importance, and filter methods rank the features based on statistical measurements.
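The filter approach described above can be sketched in a few lines. The example below is only an illustration under stated assumptions: the statistical measurement is taken to be the absolute Pearson correlation between each feature and the class label (the chapter does not name a specific statistic), and the patient records, the feature names `age`, `resting_bp` and `noise`, and the helper `rank_features` are all hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no ranking information
    return cov / sqrt(var_x * var_y)

def rank_features(rows, labels, names):
    """Filter-style ranking: each feature is scored without training a classifier."""
    scored = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        scored.append((name, abs(pearson(column, labels))))
    # Highest score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical toy records (assumption, not data from the thesis).
# Columns: age, resting blood pressure, an unrelated "noise" attribute.
rows = [
    (63, 145, 3), (41, 120, 7), (58, 150, 6),
    (35, 118, 2), (67, 160, 5), (44, 122, 4),
]
labels = [1, 0, 1, 0, 1, 0]  # 1 = ill, 0 = healthy
names = ["age", "resting_bp", "noise"]

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A wrapper method would instead retrain a classifier on candidate feature subsets, which is costlier but takes feature interactions into account.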

2.2. Computer aided diagnosis systems

A computer aided medical diagnosis system [4] generally consists of a knowledge base and a method for solving an intended problem. On the basis of the query posted to the system, it assists physicians in diagnosing patients accurately. The knowledge base of such medical systems relies on inputs drawn from the clinical experience of field experts. Knowledge acquisition is the process of transferring the knowledge and skills that human experts have acquired through clinical practice into software. It is quite a time-consuming and labor-intensive task. Common methods like Case Based Reasoning (CBR) solve the knowledge acquisition problem to some extent because past records are maintained in a database, including possible remedies, past clinical decisions, preventive measures and expected diagnostic outcome measures. During patient diagnosis, the clinical database is searched for analogous past patient records to support suitable decisions. Some of the major problems faced during the development of an expert diagnosis system are: medical experts are less interested in sharing their knowledge with others; experiential knowledge (so-called common sense) is practically impossible to separate out; and designing a single expert system for diagnosing all diseases is difficult.

2.3. Software reliability

Software reliability [5] is defined as the probability that a system will not have a failure over a specified period of time under specific conditions. Knowledge of software reliability is vital in critical systems because it indicates the design perfection [6]. In this work, the primary aim is to enhance the software reliability of computer aided diagnosis systems using machine learning algorithms. Providing quality treatment and preventing misdiagnosis are the prime motivations for developing a medical diagnosis system. Diagnosing a patient's disease accurately is a great challenge in the medical field. A huge amount is spent on advanced primary health care devices based on software reliability research, as they are considered critical systems. There are several software reliability models available in the literature; however, none of the models is perfect. An important research issue is choosing a suitable estimation model for a specific application. One advantage of software over hardware with respect to reliability is that a mechanical part inevitably ages and suffers wear and tear with time and usage, whereas software does not rust or wear out. Software reliability is a vital parameter for software quality, functionality and performance. Some common software reliability models are prediction and estimation models like the bathtub curve, exponential, Putnam etc.

2.4. Supervised learning

Supervised learning is the most common form of machine learning scheme used in solving engineering problems [7]. It can be thought of as the most appropriate way of mapping a set of input variables to a set of output variables. The system learns to infer a function from a collection of labeled training data. The training dataset contains a set of input features and several instance values for each feature. The predictive accuracy of a machine learning algorithm depends on the supervised learning scheme [8]. The aim of the inferred function may be to solve a regression or a classification problem. Several metrics are used to measure the learning task, such as accuracy, sensitivity, specificity, kappa value, area under the curve etc. In this work, the aim is to classify patients as healthy or ill based on past medical records. Before solving any engineering problem, it is vital to choose a suitable algorithm for training. The selection of a method depends primarily on the type of the data, as the field of machine learning is data driven. The next important aspect is the optimization of the chosen machine learning algorithms.

2.5. Classification task

The classification task [9] is a classical problem in the field of data mining which deals with assigning a pre-specified class to an unseen data instance. A learning model is built based on the relationship between the predictor attribute values and the value of the target [10]. The challenge is to correctly predict the class based on learning from past data. In machine learning, this kind of classification problem is referred to as supervised learning. Hence, we need to provide a training data set containing instances with known classes and a test data set for which the class has to be determined. The success of the classification largely depends on the quality of the data provided for learning and on the type of machine learning algorithm used [11]. For example, classification techniques can be used to identify fraudulent customers who apply for a bank loan, to classify mangoes as good or bad, and in many other real-life applications. The most common type of classification problem is binary classification, where the target has two possible values like good or bad, yes or no etc. There are several methods for measuring classification performance, like the confusion matrix, the lift curve, the receiver operating characteristic (ROC) curve etc.

2.6. Optimization

Every machine learning algorithm has a specific learning technique whose behaviour depends on the values of its parameters. When an algorithm is applied to a classification problem with different sets of parameters, the classification accuracy can differ sharply from case to case [12]. The challenge in machine learning is to find the parameter values with which the algorithm solves an engineering problem in the best possible way in terms of performance metrics. Therefore, one has to fine-tune the algorithm parameters to best suit the problem. There are several optimization techniques like genetic algorithms, particle swarm optimization [13], tabu search methods etc. The focus of the study is to calibrate the algorithm parameters using the design of experiments method.
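As a rough illustration of how parameter calibration and the confusion-matrix measures from Section 2.5 fit together, the sketch below runs a small full factorial design over two parameters of a classifier and scores every run. Everything in it is an assumption made for illustration: the chapter does not say which experimental design or classifier is used, so a k-nearest-neighbour classifier stands in here, with the number of neighbours `k` and the Minkowski exponent `p` as the two factors, and the data points are hypothetical.

```python
from itertools import product

def knn_predict(train, query, k, p):
    """Majority vote among the k nearest training points (Minkowski-p distance)."""
    def distance(a, b):
        return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)
    nearest = sorted(train, key=lambda item: distance(item[0], query))[:k]
    positive_votes = sum(label for _, label in nearest)
    return 1 if 2 * positive_votes > k else 0

def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) with 1 = ill as the positive class."""
    tp = sum(1 for t, q in zip(y_true, y_pred) if t == 1 and q == 1)
    tn = sum(1 for t, q in zip(y_true, y_pred) if t == 0 and q == 0)
    fp = sum(1 for t, q in zip(y_true, y_pred) if t == 0 and q == 1)
    fn = sum(1 for t, q in zip(y_true, y_pred) if t == 1 and q == 0)
    return tp, tn, fp, fn

def scores(tp, tn, fp, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, sensitivity, specificity

# Hypothetical labeled data (assumption): two features per patient, 1 = ill.
train = [
    ((1.0, 1.2), 0), ((1.5, 0.8), 0), ((0.9, 1.9), 0), ((2.0, 1.5), 0),
    ((3.8, 4.1), 1), ((4.2, 3.6), 1), ((3.5, 4.4), 1), ((4.6, 4.0), 1),
    ((2.4, 2.2), 1),  # an outlying record inside the healthy region
]
test = [((1.2, 1.0), 0), ((2.2, 2.0), 0), ((3.9, 3.9), 1), ((3.0, 3.2), 1)]
y_true = [label for _, label in test]

# Factor levels; odd k avoids tied votes in the binary case.
k_levels = (1, 3, 5)
p_levels = (1, 2)

results = []
for k, p in product(k_levels, p_levels):  # full factorial: every combination
    y_pred = [knn_predict(train, x, k, p) for x, _ in test]
    accuracy, sensitivity, specificity = scores(*confusion_counts(y_true, y_pred))
    results.append({"k": k, "p": p, "accuracy": accuracy,
                    "sensitivity": sensitivity, "specificity": specificity})

best = max(results, key=lambda run: run["accuracy"])
for run in results:
    print(run)
print("best setting:", best)
```

A full factorial design grows multiplicatively with the number of factors and levels, so fractional designs or the search heuristics listed above become attractive once more than a few parameters are involved.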

References

1. Mandal, I., Sairam, N. Accurate prediction of coronary artery disease using reliable diagnosis system (2012) Journal of Medical Systems, 36 (5), pp. 3353-3373. DOI: 10.1007/s10916-012-9828-0
2. Mandal, I., Sairam, N. Enhanced classification performance using computational intelligence (2011) Communications in Computer and Information Science, 204 CCIS, pp. 384-391. DOI: 10.1007/978-3-642-24043-0_39
3. Mandal, I., Sairam, N. New machine-learning algorithms for prediction of Parkinson's disease (2014) International Journal of Systems Science, 45 (3), pp. 647-666. DOI: 10.1080/00207721.2012.724114
4. Mandal, I., Sairam, N. Accurate telemonitoring of Parkinson's disease diagnosis using robust inference system (2013) International Journal of Medical Informatics, 82 (5), pp. 359-377. DOI: 10.1016/j.ijmedinf.2012.10.006
5. Torrado, N., Wiper, M.P., Lillo, R.E. Software reliability modeling with software metrics data via Gaussian processes (2013) IEEE Transactions on Software Engineering, 39 (8), art. no. 6392172, pp. 1179-1186. DOI: 10.1109/TSE.2012.87
6. Kumar, P., Singh, Y. A study on software reliability prediction models using soft computing techniques (2013) International Journal of Information and Communication Technology, 5 (2), pp. 187-204. DOI: 10.1504/IJICT.2013.053119
7. Xu, X., Yang, G. Robust manifold classification based on semi supervised learning (2013) International Journal of Advancements in Computing Technology, 5 (8), pp. 174-183. DOI: 10.4156/ijact.vol5.issue6.21
8. Alajlan, N., Bazi, Y., Melgani, F., Yager, R.R. Fusion of supervised and unsupervised learning for improved classification of hyperspectral images (2012) Information Sciences, 217, pp. 39-55. DOI: 10.1016/j.ins.2012.06.031
9. Škrinárová, J., Huraj, L., Siládi, V. A neural tree model for classification of computing grid resources using PSO tasks scheduling (2013) Neural Network World, 23 (3), pp. 223-241.
10. Silva, J.D.A., Hruschka, E.R. An experimental study on the use of nearest neighbor-based imputation algorithms for classification tasks (2013) Data and Knowledge Engineering, 84, pp. 47-58. DOI: 10.1016/j.datak.2012.12.006
11. Gao, S., Xu, S., Fang, Y., Fang, J. Prediction of core cancer genes using multi-task classification framework (2013) Journal of Theoretical Biology, 317, pp. 62-70. DOI: 10.1016/j.jtbi.2012.09.027
12. Feng, G., Qian, Z., Zhang, X. Evolutionary selection extreme learning machine optimization for regression (2012) Soft Computing, 16 (9), pp. 1485-1491. DOI: 10.1007/s00500-012-0823-7
13. Han, F., Yao, H.-F., Ling, Q.-H. An improved evolutionary extreme learning machine based on particle swarm optimization (2013) Neurocomputing, 116, pp. 87-93. DOI: 10.1016/j.neucom.2011.12.062