Childhood Obesity epidemic analysis using classification algorithms
|
|
- David Hamilton
- 6 years ago
- Views:
Transcription
1 Childhood Obesity epidemic analysis using classification algorithms Suguna. M M.Phil. Scholar Trichy, Tamilnadu, India suguna15.9@gmail.com Abstract Obesity is the one of the most serious public health challenges of 21 st century globally there are more than one billion overweight adults and it is increasing three fold. In Today s world people get affected by many diseases where such disease leads to many other diseases. Obesity is such a condition that leads to many diseases. In this paper several data mining classification algorithms are used to detect the obesity and overweight conditions in children and detect the cause further. Keywords: Data mining, classification, childhood obesity, J48, CART, Naive Bayes I. INTRODUCTION Data mining is the process of extracting knowledge from large data. It is also called as knowledge in database. It is performed in iterative sequence. Classification maps data into predefined groups or classes often referred as supervised learning as classes are determined prior to examine. A classification task begins with a data set in which the class assignments are known. Medical diagnosis is a classification process. Fat is one of the main macronutrients. Fat plays an important role in preserving vital organs and it also maintains body temperature and it also store calories for future use. But however when it is stored in excess lead to obesity. Obesity is a serious worldwide health epidemic that affects one in four Americans. This phenomenon is global and about 30 million Indian are found to be obese and found to be double in next five years. Obesity is the condition that arise when body fat has accumulated to the extent that may have a negative effect on health, and it also leads to many other disease such as diabetes, sleep apnea, heart disease, stroke, gout, gallbladder disease, high blood pressure,osteoarthritis and even certain type of cancer. Childhood obesity is one of the most serious public health challenges of the 21st century. Children and adolescents who are obese are likely to be obese as adults and they are likely to have many health related and psychological problems. Childhood obesity is doubled in children and quadrupled in adults in the past years. The paper is organized as follows: section II as Related works, section III as Data set description, section IV as Proposed model, section V as Experimental results and finally section VI as Conclusion II. RELATED WORKS Adnan M.H.M [4] et al in their work provides the review of various data mining methods and its utilization in predictions of obesity and points outs the merits and demerits of particular data mining techniques. Adnan M.H.M [5] et al in their paper provides a novel framework for predictions of childhood obesity predictions using NBtree. Abduh Elbanna[6] et al in gastroenterology and surgery using data mining in obesity related gastrointestinal motility disorder; (IBS) and surgical bariatric approaches. Naive Bayes algorithm is used to obtain result and the rapid miner tool was employed. Alex Aussem [7] et al in their paper provided the framework for the analysis of visceral obesity and its determinants in women using Bayesians Networks. Archana Bhattarai [8] et al in their research work use text classification to predict obesity and its co-morbidities. Various algorithms such as Naive Bayes, Support Vector Machine (SVM), J48 and J48 and extraction were used. Syed Taghi Heydari[10] et al in their work performed a comparative study on detection of obesity using Artificial Neural Network and Logistics Regression. They predicted that Artificial Neural Network provides better result than Logistics Regression however they cited the requirement of a customized classifier for the Neural Networks.Sivaranjani.T [11] provides a comparative study on the prediction of obesity by using KNN and ID3 algorithms.she predicted that the ID3 algorithm works efficiently and provide more accurate result than the KNN algorithm by using Rapid Miner tool.shaoyan Zhang et al[12]in their paper provided a comparison of logistic regression with six data mining techniques such as decision trees (C4.5), association rules, Artificial Neural Networks (ANN), Naive Bayes, Bayesian networks and Support Vector Machines (SVM).They concluded that Support Vector Machine (SVM) provides more accurate result than the rest five data mining techniques. Sunita Soni et al [13] in their work discussed about the Case Based Reasoning (CBR).Three Data Mining techniques such as Nearest Neighborhood, Decision Tree and Bayesian Classification, were applied on distributed case bases for Case retrieval and Case adaptation. their work talks about the use of data mining technology in RES Publication 2012 Page 22 III. DATA SET DESCRIPTION The data are collected from the Child and Adolescent Health Measurement Initiative (CAHMI). DRC Indicator Dataset: 2007 National Survey of Children s Health. Data Resource Center for Child and Adolescent Health, ww.childhealthdata.org.. The objective of the data set is to
2 analysis the obesity among children whose age is between 10 to 17. The data set consists of 12 attributes that are used to predict the obesity. The detailed description of the data set are given in the below table data set description No Name of the Description attribute 1 CID Children Id 2 Age Age of the children 3 Gender Gender of the children 4 Weight Weight (pounds) 5 Height Height (mts) 6 Sleep hour Sleep time 7 Exercise time Exercise time 8 Time on Time spend in Computer computer 9 Watching TV Time spend on hours watching TV 10 Depression Depression level 11 Health status Status of health 12 Body Mass Index Value 13 CLS class 14 Category Category of the The attributes given here are based on the data type. The data type used here is numerical and nominal, here category and gender takes nominal values and the rest of the data takes the numerical values. IV. PROPOSED MODEL In the proposed method mainly decision tree is used for predicting the obesity from the given data set instances. Here the framework can be given as below, In the proposed model three different types of decision tree algorithms such as Simple Cart, J48 and NB Tree are applied on type obesity dataset in the WEKA tool and the performance is calculated. 4.1 Simple Cart Table 4.1: Proposed Framework INPUT DATA PREPROCESSING COLLECT DB DATA CLEANING DECISION TREE ALGORITHM OUTPUT CART is the Classification And Regression Tree it is greedy algorithm, it is used to build binary decision tree in that it chooses the locally best discriminatory feature at each stage in the process. As with ID3 entropy is used as a measure to choose the best splitting attribute and criterion. Here child is created for each sub category, only two children are created. The splitting is performed around what is determined to be the best split point. At each step an exhaustive search is performed to determine the best split. CART handles the missing data by simply ignoring that record in calculating the goodness of a split on that attribute. The tree stops growing when no split will improve the performance, it also contains a pruning strategy. 4.2 J48 Algorithm J48 builds decision trees from a set of labeled training data using the concept of information entropy. It uses the fact that each attribute of the data can be used to make a decision by splitting the data into smaller subsets. J48 examines the normalized information gain (difference in entropy) that results from choosing an attribute for splitting the data. To make the decision, the attribute with the highest RES Publication 2012 Page 23
3 normalized information gain is used. Then the algorithm recurs on the smaller subsets. The splitting procedure stops if all instances in a subset belong to the same class. Then a leaf node is created in the decision tree telling to choose that class. But it can also happen that none of the features give any information gain. In this case J48 creates a decision node higher up in the tree using the expected value of the class.j48 can handle both continuous and discrete attributes, training data with missing attribute value sand attributes with differing costs. Further it provides an option for pruning trees after creation. Fig 5.1 Performance of the Algorithms based on the time taken 4.3 NB Tree Naive bayes classification is a type of Bayesian classification. It is a simple classification technique. It assumes that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence. It exhibits high accuracy and speed when applied to large database. It is used to compare it results with decision tree networks and neural networks. V. EXPERIMENTAL RESULTS The given three types of decision tree algorithms like Simple Cart, J48 and NB Tree are applied on the obesity data set in WEKA and the performance of the algorithm are given based various factors. The performance can be obtained based on the time taken to build the tree and correctly classified instances Fig 5.2 Performance of the Algorithms based on the accuracy Table 5.1 Time taken by the algorithms to build the decision tree Name of the Algorithm Simple Cart Time Taken to build the decision tree 0.03 seconds J seconds NB Tree 0.45seconds RES Publication 2012 Page 24
4 The decision tree model for the obesity is given by the tree structure. Fig 5.3 Decision tree for obesity dataset Body Mass Index Table 5.3 confusion matrix for Simple CART,NB Tree and J <= Under weight The above decision tree algorithm predicts the class label. The final output will be patterns which are used to find the children fall in which Class based on it they are classified as Underweight, Normal, Overweight and obese or not. A Confusion Matrix is a useful visualization tool for analyzing the classifier accuracy. Structure of the confusion matrix can be given as below Where, Normal CLASS Table 5.2: Structure of the Confusion Matrix TP FP >1 <= >3 Over weight TN TP is True Positive, obese children correctly diagnosed. FP is False Positive, Normal people incorrectly identified as obese. TN is True Negative, Normal people correctly identified as healthy. FN is False Negative, obesity people incorrectly identified as healthy. The confusion matrix based on the execution of decision tree classification algorithm such as Simple Cart, NB tree and J48 FN Obese From the various factors such as time taken to build the decision tree, accuracy and confusion matrix we arrive at a conclusion that J48 algorithm provide better result. using WEKA tool are given below, RES Publication 2012 Page 25 VI. CONCLUSION Data mining plays a vital role in mining large database and its usage in determining hidden information is very useful in medical mining. In medical field classification techniques have high utility. This experimental model is built based as a test case on the training dataset. This experiment is successfully performed in children health training dataset with several data mining classification algorithms and found that J48 algorithm provides better performance with 100% of accuracy and minimum time taken. It is believed that the data mining can help in the obesity research and improve the health care of people with overweight and obesity. It can also be implemented using several classification techniques. REFERENCE [1] Han J. Kamber. M, Data Mining; Concepts and Techniques, Morgan Kaufmann Publishers. [2] Margaret H. Dunham, Data Mining Techniques and Algorithms, Prentice Hall Publishers. [3] Arun.K.Pujari, Data mining Techniques, University Press( India) Private Limited,2001 [4] Adnan, M.H.M., Husain.W, Rashid, N.A.A.Rashid, A survey on utilization of data mining for childhood obesity prediction, Information and Telecommunication Technologies(APSITT), th Asia -pacific symposium on, vol., no., pp.1,6, July [5] Adnan, M.H.M,Husain. W., Rashid, N.A.A.Rashid, A framework for childhood obesity classifications and predictions using NBtree, Information Technology in Asia (CITA 11), th International Conference, vol., no., pp.1,6, July [6] AbduhElbanna, AbdElrazek M., AlyAbdElrazek Is Advanced Statistical Computing Technology a Clue in Applied Medicine? A Study using Data Mining as a Predictor Technology in Gastroenterology & Bariatric Surgery; Novel Elbanna Operations, Global Journal of Computer Science
5 and Technology Software & Data EngineeringVolume 13 Issue 12 Version 1.0 Year [7] Aussem et al., Analysis of lifestyle and metabolic predictors of visceral obesity with Bayesian Networks. BMC Bioinformatics :487. [8] ArchanaBhattarai, VasileRus, DipankarDasgupta., Classification of Clinical Conditions: A Case Study on Prediction of obesity and its Co-morbidities, Department of Computer Science,The university of Memphis,209 Dunn Hall. [9] LalithaSarojaThota et al., A Review on Information Technology in Obesity Epidemic: Prediction and Prevention, International Journal of Advanced Research in Computer Science and Software Engineering 3(9), September , pp [10] SeyedTaghiHeydari et al., Comparison of Artificial Neural Networks with Logistic Regression for Detection of Obesity Springer April [11] T.Sivaranjani, Comparative Study on Obesity Based on ID3 and KNN International Journal of Advanced Research in Computer Science And Management Studies pp 389,396vol.2, Issue 9, September [12] Shaoyan Zhang, C.T., XiaojunZeng, Hong Qiao, Iain Buchan, John Keane, Comparing data mining methods withlogistic regression in childhood obesity prediction. Information Systems Frontiers, (4): p. 51. [13] Soni, S.; Pillai, J., "Usage of Nearest Neighborhood, Decision Tree and Bayesian Classification Techniques in Development of Weight Management Counseling System," Emerging Trends in Engineering and Technology, ICETET '08. First International Conference, vol., no., pp.691, 694, July BIOGRAPHY M.Suguna, M.Sc, B.Ed, M.Phil, completed M.phil in Bishop Heber College Trichy ( ). Paper Published: A Technical review on Obesity Analysis using classification algorithms, International Journal of Applied Engineering Research, ISSN Vol. 10 No.55 (2015). RES Publication 2012 Page 26
Rule Learning With Negation: Issues Regarding Effectiveness
Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United
More informationRule Learning with Negation: Issues Regarding Effectiveness
Rule Learning with Negation: Issues Regarding Effectiveness Stephanie Chua, Frans Coenen, and Grant Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX
More informationComparison of EM and Two-Step Cluster Method for Mixed Data: An Application
International Journal of Medical Science and Clinical Inventions 4(3): 2768-2773, 2017 DOI:10.18535/ijmsci/ v4i3.8 ICV 2015: 52.82 e-issn: 2348-991X, p-issn: 2454-9576 2017, IJMSCI Research Article Comparison
More informationReducing Features to Improve Bug Prediction
Reducing Features to Improve Bug Prediction Shivkumar Shivaji, E. James Whitehead, Jr., Ram Akella University of California Santa Cruz {shiv,ejw,ram}@soe.ucsc.edu Sunghun Kim Hong Kong University of Science
More informationThe 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X
The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, 2013 10.12753/2066-026X-13-154 DATA MINING SOLUTIONS FOR DETERMINING STUDENT'S PROFILE Adela BÂRA,
More informationApplications of data mining algorithms to analysis of medical data
Master Thesis Software Engineering Thesis no: MSE-2007:20 August 2007 Applications of data mining algorithms to analysis of medical data Dariusz Matyja School of Engineering Blekinge Institute of Technology
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationScienceDirect. A Framework for Clustering Cardiac Patient s Records Using Unsupervised Learning Techniques
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 98 (2016 ) 368 373 The 6th International Conference on Current and Future Trends of Information and Communication Technologies
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More informationMining Association Rules in Student s Assessment Data
www.ijcsi.org 211 Mining Association Rules in Student s Assessment Data Dr. Varun Kumar 1, Anupama Chadha 2 1 Department of Computer Science and Engineering, MVN University Palwal, Haryana, India 2 Anupama
More informationAustralian Journal of Basic and Applied Sciences
AENSI Journals Australian Journal of Basic and Applied Sciences ISSN:1991-8178 Journal home page: www.ajbasweb.com Feature Selection Technique Using Principal Component Analysis For Improving Fuzzy C-Mean
More informationTwitter Sentiment Classification on Sanders Data using Hybrid Approach
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders
More informationLearning From the Past with Experiment Databases
Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University
More informationEDEXCEL FUNCTIONAL SKILLS PILOT TEACHER S NOTES. Maths Level 2. Chapter 4. Working with measures
EDEXCEL FUNCTIONAL SKILLS PILOT TEACHER S NOTES Maths Level 2 Chapter 4 Working with measures SECTION G 1 Time 2 Temperature 3 Length 4 Weight 5 Capacity 6 Conversion between metric units 7 Conversion
More informationPython Machine Learning
Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled
More informationOPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS
OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,
More informationOCR for Arabic using SIFT Descriptors With Online Failure Prediction
OCR for Arabic using SIFT Descriptors With Online Failure Prediction Andrey Stolyarenko, Nachum Dershowitz The Blavatnik School of Computer Science Tel Aviv University Tel Aviv, Israel Email: stloyare@tau.ac.il,
More informationProcess Evaluations for a Multisite Nutrition Education Program
Process Evaluations for a Multisite Nutrition Education Program Paul Branscum 1 and Gail Kaye 2 1 The University of Oklahoma 2 The Ohio State University Abstract Process evaluations are an often-overlooked
More informationLecture 1: Machine Learning Basics
1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3
More informationDisambiguation of Thai Personal Name from Online News Articles
Disambiguation of Thai Personal Name from Online News Articles Phaisarn Sutheebanjard Graduate School of Information Technology Siam University Bangkok, Thailand mr.phaisarn@gmail.com Abstract Since online
More informationCLASSIFICATION OF TEXT DOCUMENTS USING INTEGER REPRESENTATION AND REGRESSION: AN INTEGRATED APPROACH
ISSN: 0976-3104 Danti and Bhushan. ARTICLE OPEN ACCESS CLASSIFICATION OF TEXT DOCUMENTS USING INTEGER REPRESENTATION AND REGRESSION: AN INTEGRATED APPROACH Ajit Danti 1 and SN Bharath Bhushan 2* 1 Department
More informationLecture 1: Basic Concepts of Machine Learning
Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010
More informationModeling function word errors in DNN-HMM based LVCSR systems
Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford
More informationAssignment 1: Predicting Amazon Review Ratings
Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for
More informationIssues in the Mining of Heart Failure Datasets
International Journal of Automation and Computing 11(2), April 2014, 162-179 DOI: 10.1007/s11633-014-0778-5 Issues in the Mining of Heart Failure Datasets Nongnuch Poolsawad 1 Lisa Moore 1 Chandrasekhar
More informationSpeech Emotion Recognition Using Support Vector Machine
Speech Emotion Recognition Using Support Vector Machine Yixiong Pan, Peipei Shen and Liping Shen Department of Computer Technology Shanghai JiaoTong University, Shanghai, China panyixiong@sjtu.edu.cn,
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationQuickStroke: An Incremental On-line Chinese Handwriting Recognition System
QuickStroke: An Incremental On-line Chinese Handwriting Recognition System Nada P. Matić John C. Platt Λ Tony Wang y Synaptics, Inc. 2381 Bering Drive San Jose, CA 95131, USA Abstract This paper presents
More informationSwitchboard Language Model Improvement with Conversational Data from Gigaword
Katholieke Universiteit Leuven Faculty of Engineering Master in Artificial Intelligence (MAI) Speech and Language Technology (SLT) Switchboard Language Model Improvement with Conversational Data from Gigaword
More informationHuman Emotion Recognition From Speech
RESEARCH ARTICLE OPEN ACCESS Human Emotion Recognition From Speech Miss. Aparna P. Wanare*, Prof. Shankar N. Dandare *(Department of Electronics & Telecommunication Engineering, Sant Gadge Baba Amravati
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationCalibration of Confidence Measures in Speech Recognition
Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE
More informationIntroduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition
Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Todd Holloway Two Lecture Series for B551 November 20 & 27, 2007 Indiana University Outline Introduction Bias and
More informationContent-based Image Retrieval Using Image Regions as Query Examples
Content-based Image Retrieval Using Image Regions as Query Examples D. N. F. Awang Iskandar James A. Thom S. M. M. Tahaghoghi School of Computer Science and Information Technology, RMIT University Melbourne,
More informationIndian Institute of Technology, Kanpur
Indian Institute of Technology, Kanpur Course Project - CS671A POS Tagging of Code Mixed Text Ayushman Sisodiya (12188) {ayushmn@iitk.ac.in} Donthu Vamsi Krishna (15111016) {vamsi@iitk.ac.in} Sandeep Kumar
More informationModeling function word errors in DNN-HMM based LVCSR systems
Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford
More informationA Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and
A Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and Planning Overview Motivation for Analyses Analyses and
More informationActivity Recognition from Accelerometer Data
Activity Recognition from Accelerometer Data Nishkam Ravi and Nikhil Dandekar and Preetham Mysore and Michael L. Littman Department of Computer Science Rutgers University Piscataway, NJ 08854 {nravi,nikhild,preetham,mlittman}@cs.rutgers.edu
More informationWellness Committee Action Plan. Developed in compliance with the Child Nutrition and Women, Infant and Child (WIC) Reauthorization Act of 2004
Wellness Committee Action Plan Developed in compliance with the Child Nutrition and Women, Infant and Child (WIC) Reauthorization Act of 2004 PREAMBLE Overweight children and adolescents have become a
More informationMachine Learning from Garden Path Sentences: The Application of Computational Linguistics
Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,
More informationAUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS
AUTOMATED TROUBLESHOOTING OF MOBILE NETWORKS USING BAYESIAN NETWORKS R.Barco 1, R.Guerrero 2, G.Hylander 2, L.Nielsen 3, M.Partanen 2, S.Patel 4 1 Dpt. Ingeniería de Comunicaciones. Universidad de Málaga.
More informationFeature Selection based on Sampling and C4.5 Algorithm to Improve the Quality of Text Classification using Naïve Bayes
Feature Selection based on Sampling and C4.5 Algorithm to Improve the Quality of Text Classification using Naïve Bayes Viviana Molano 1, Carlos Cobos 1, Martha Mendoza 1, Enrique Herrera-Viedma 2, and
More informationData Fusion Through Statistical Matching
A research and education initiative at the MIT Sloan School of Management Data Fusion Through Statistical Matching Paper 185 Peter Van Der Puttan Joost N. Kok Amar Gupta January 2002 For more information,
More informationCSL465/603 - Machine Learning
CSL465/603 - Machine Learning Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Introduction CSL465/603 - Machine Learning 1 Administrative Trivia Course Structure 3-0-2 Lecture Timings Monday 9.55-10.45am
More informationA NEW ALGORITHM FOR GENERATION OF DECISION TREES
TASK QUARTERLY 8 No 2(2004), 1001 1005 A NEW ALGORITHM FOR GENERATION OF DECISION TREES JERZYW.GRZYMAŁA-BUSSE 1,2,ZDZISŁAWS.HIPPE 2, MAKSYMILIANKNAP 2 ANDTERESAMROCZEK 2 1 DepartmentofElectricalEngineeringandComputerScience,
More informationNetpix: A Method of Feature Selection Leading. to Accurate Sentiment-Based Classification Models
Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models 1 Netpix: A Method of Feature Selection Leading to Accurate Sentiment-Based Classification Models James B.
More informationCross-lingual Short-Text Document Classification for Facebook Comments
2014 International Conference on Future Internet of Things and Cloud Cross-lingual Short-Text Document Classification for Facebook Comments Mosab Faqeeh, Nawaf Abdulla, Mahmoud Al-Ayyoub, Yaser Jararweh
More informationFuzzy rule-based system applied to risk estimation of cardiovascular patients
Fuzzy rule-based system applied to risk estimation of cardiovascular patients Jan Bohacik, Department of Computer Science, University of Hull, Hull, HU6 7RX, United Kingdom and Department of Informatics,
More informationUniversidade do Minho Escola de Engenharia
Universidade do Minho Escola de Engenharia Universidade do Minho Escola de Engenharia Dissertação de Mestrado Knowledge Discovery is the nontrivial extraction of implicit, previously unknown, and potentially
More informationTHE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING
SISOM & ACOUSTICS 2015, Bucharest 21-22 May THE ROLE OF DECISION TREES IN NATURAL LANGUAGE PROCESSING MarilenaăLAZ R 1, Diana MILITARU 2 1 Military Equipment and Technologies Research Agency, Bucharest,
More informationPredicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks
Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Devendra Singh Chaplot, Eunhee Rhim, and Jihie Kim Samsung Electronics Co., Ltd. Seoul, South Korea {dev.chaplot,eunhee.rhim,jihie.kim}@samsung.com
More informationExperiment Databases: Towards an Improved Experimental Methodology in Machine Learning
Experiment Databases: Towards an Improved Experimental Methodology in Machine Learning Hendrik Blockeel and Joaquin Vanschoren Computer Science Dept., K.U.Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium
More informationWord Segmentation of Off-line Handwritten Documents
Word Segmentation of Off-line Handwritten Documents Chen Huang and Sargur N. Srihari {chuang5, srihari}@cedar.buffalo.edu Center of Excellence for Document Analysis and Recognition (CEDAR), Department
More informationImpact of Cluster Validity Measures on Performance of Hybrid Models Based on K-means and Decision Trees
Impact of Cluster Validity Measures on Performance of Hybrid Models Based on K-means and Decision Trees Mariusz Łapczy ski 1 and Bartłomiej Jefma ski 2 1 The Chair of Market Analysis and Marketing Research,
More informationAnalysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 2, Ver.1 (Mar - Apr.2015), PP 55-61 www.iosrjournals.org Analysis of Emotion
More informationSemi-Supervised Face Detection
Semi-Supervised Face Detection Nicu Sebe, Ira Cohen 2, Thomas S. Huang 3, Theo Gevers Faculty of Science, University of Amsterdam, The Netherlands 2 HP Research Labs, USA 3 Beckman Institute, University
More informationEvolutive Neural Net Fuzzy Filtering: Basic Description
Journal of Intelligent Learning Systems and Applications, 2010, 2: 12-18 doi:10.4236/jilsa.2010.21002 Published Online February 2010 (http://www.scirp.org/journal/jilsa) Evolutive Neural Net Fuzzy Filtering:
More informationA Model to Predict 24-Hour Urinary Creatinine Level Using Repeated Measurements
Virginia Commonwealth University VCU Scholars Compass Theses and Dissertations Graduate School 2006 A Model to Predict 24-Hour Urinary Creatinine Level Using Repeated Measurements Donna S. Kroos Virginia
More informationData Fusion Models in WSNs: Comparison and Analysis
Proceedings of 2014 Zone 1 Conference of the American Society for Engineering Education (ASEE Zone 1) Data Fusion s in WSNs: Comparison and Analysis Marwah M Almasri, and Khaled M Elleithy, Senior Member,
More informationTruth Inference in Crowdsourcing: Is the Problem Solved?
Truth Inference in Crowdsourcing: Is the Problem Solved? Yudian Zheng, Guoliang Li #, Yuanbing Li #, Caihua Shan, Reynold Cheng # Department of Computer Science, Tsinghua University Department of Computer
More informationOn-Line Data Analytics
International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob
More informationImproving Simple Bayes. Abstract. The simple Bayesian classier (SBC), sometimes called
Improving Simple Bayes Ron Kohavi Barry Becker Dan Sommereld Data Mining and Visualization Group Silicon Graphics, Inc. 2011 N. Shoreline Blvd. Mountain View, CA 94043 fbecker,ronnyk,sommdag@engr.sgi.com
More informationMachine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler
Machine Learning and Data Mining Ensembles of Learners Prof. Alexander Ihler Ensemble methods Why learn one classifier when you can learn many? Ensemble: combine many predictors (Weighted) combina
More informationLearning Methods in Multilingual Speech Recognition
Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex
More informationMultivariate k-nearest Neighbor Regression for Time Series data -
Multivariate k-nearest Neighbor Regression for Time Series data - a novel Algorithm for Forecasting UK Electricity Demand ISF 2013, Seoul, Korea Fahad H. Al-Qahtani Dr. Sven F. Crone Management Science,
More informationMulti-Lingual Text Leveling
Multi-Lingual Text Leveling Salim Roukos, Jerome Quin, and Todd Ward IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 {roukos,jlquinn,tward}@us.ibm.com Abstract. Determining the language proficiency
More informationComputerized Adaptive Psychological Testing A Personalisation Perspective
Psychology and the internet: An European Perspective Computerized Adaptive Psychological Testing A Personalisation Perspective Mykola Pechenizkiy mpechen@cc.jyu.fi Introduction Mixed Model of IRT and ES
More informationUsing Web Searches on Important Words to Create Background Sets for LSI Classification
Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract
More informationAnalyzing sentiments in tweets for Tesla Model 3 using SAS Enterprise Miner and SAS Sentiment Analysis Studio
SCSUG Student Symposium 2016 Analyzing sentiments in tweets for Tesla Model 3 using SAS Enterprise Miner and SAS Sentiment Analysis Studio Praneth Guggilla, Tejaswi Jha, Goutam Chakraborty, Oklahoma State
More informationSAGES 2017 ANNUAL MEETING SESSION DESIGN FORM - SAMPLE
Fictitious Examples are listed below in orange: SAGES 2017 ANNUAL MEETING SESSION DESIGN FORM - SAMPLE SESSION CHAIR: Raul Rosenthal SESSION CO-CHAIR: Daniel Jones SESSION TITLE: PG Bariatric: What Every
More informationAnalysis of Hybrid Soft and Hard Computing Techniques for Forex Monitoring Systems
Analysis of Hybrid Soft and Hard Computing Techniques for Forex Monitoring Systems Ajith Abraham School of Business Systems, Monash University, Clayton, Victoria 3800, Australia. Email: ajith.abraham@ieee.org
More informationMining Student Evolution Using Associative Classification and Clustering
Mining Student Evolution Using Associative Classification and Clustering 19 Mining Student Evolution Using Associative Classification and Clustering Kifaya S. Qaddoum, Faculty of Information, Technology
More informationIterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages
Iterative Cross-Training: An Algorithm for Learning from Unlabeled Web Pages Nuanwan Soonthornphisaj 1 and Boonserm Kijsirikul 2 Machine Intelligence and Knowledge Discovery Laboratory Department of Computer
More informationGRADUATE STUDENT HANDBOOK Master of Science Programs in Biostatistics
2017-2018 GRADUATE STUDENT HANDBOOK Master of Science Programs in Biostatistics Entrance requirements, program descriptions, degree requirements and other program policies for Biostatistics Master s Programs
More informationMYCIN. The MYCIN Task
MYCIN Developed at Stanford University in 1972 Regarded as the first true expert system Assists physicians in the treatment of blood infections Many revisions and extensions over the years The MYCIN Task
More informationMulti-label Classification via Multi-target Regression on Data Streams
Multi-label Classification via Multi-target Regression on Data Streams Aljaž Osojnik 1,2, Panče Panov 1, and Sašo Džeroski 1,2,3 1 Jožef Stefan Institute, Jamova cesta 39, Ljubljana, Slovenia 2 Jožef Stefan
More informationINPE São José dos Campos
INPE-5479 PRE/1778 MONLINEAR ASPECTS OF DATA INTEGRATION FOR LAND COVER CLASSIFICATION IN A NEDRAL NETWORK ENVIRONNENT Maria Suelena S. Barros Valter Rodrigues INPE São José dos Campos 1993 SECRETARIA
More informationCOMPUTER-ASSISTED INDEPENDENT STUDY IN MULTIVARIATE CALCULUS
COMPUTER-ASSISTED INDEPENDENT STUDY IN MULTIVARIATE CALCULUS L. Descalço 1, Paula Carvalho 1, J.P. Cruz 1, Paula Oliveira 1, Dina Seabra 2 1 Departamento de Matemática, Universidade de Aveiro (PORTUGAL)
More informationA Biological Signal-Based Stress Monitoring Framework for Children Using Wearable Devices
Article A Biological Signal-Based Stress Monitoring Framework for Children Using Wearable Devices Yerim Choi 1, Yu-Mi Jeon 2, Lin Wang 3, * and Kwanho Kim 2, * 1 Department of Industrial and Management
More informationSoftprop: Softmax Neural Network Backpropagation Learning
Softprop: Softmax Neural Networ Bacpropagation Learning Michael Rimer Computer Science Department Brigham Young University Provo, UT 84602, USA E-mail: mrimer@axon.cs.byu.edu Tony Martinez Computer Science
More informationRule discovery in Web-based educational systems using Grammar-Based Genetic Programming
Data Mining VI 205 Rule discovery in Web-based educational systems using Grammar-Based Genetic Programming C. Romero, S. Ventura, C. Hervás & P. González Universidad de Córdoba, Campus Universitario de
More informationA Neural Network GUI Tested on Text-To-Phoneme Mapping
A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis
More informationFragment Analysis and Test Case Generation using F- Measure for Adaptive Random Testing and Partitioned Block based Adaptive Random Testing
Fragment Analysis and Test Case Generation using F- Measure for Adaptive Random Testing and Partitioned Block based Adaptive Random Testing D. Indhumathi Research Scholar Department of Information Technology
More informationLarge-Scale Web Page Classification. Sathi T Marath. Submitted in partial fulfilment of the requirements. for the degree of Doctor of Philosophy
Large-Scale Web Page Classification by Sathi T Marath Submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy at Dalhousie University Halifax, Nova Scotia November 2010
More informationSemi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.
Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link
More informationSystem Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering
More informationPOLA: a student modeling framework for Probabilistic On-Line Assessment of problem solving performance
POLA: a student modeling framework for Probabilistic On-Line Assessment of problem solving performance Cristina Conati, Kurt VanLehn Intelligent Systems Program University of Pittsburgh Pittsburgh, PA,
More informationExperiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling
Experiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling Notebook for PAN at CLEF 2013 Andrés Alfonso Caurcel Díaz 1 and José María Gómez Hidalgo 2 1 Universidad
More informationProduct Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments
Product Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments Vijayshri Ramkrishna Ingale PG Student, Department of Computer Engineering JSPM s Imperial College of Engineering &
More informationCS 446: Machine Learning
CS 446: Machine Learning Introduction to LBJava: a Learning Based Programming Language Writing classifiers Christos Christodoulopoulos Parisa Kordjamshidi Motivation 2 Motivation You still have not learnt
More informationCS4491/CS 7265 BIG DATA ANALYTICS INTRODUCTION TO THE COURSE. Mingon Kang, PhD Computer Science, Kennesaw State University
CS4491/CS 7265 BIG DATA ANALYTICS INTRODUCTION TO THE COURSE Mingon Kang, PhD Computer Science, Kennesaw State University Self Introduction Mingon Kang, PhD Homepage: http://ksuweb.kennesaw.edu/~mkang9
More informationUsing dialogue context to improve parsing performance in dialogue systems
Using dialogue context to improve parsing performance in dialogue systems Ivan Meza-Ruiz and Oliver Lemon School of Informatics, Edinburgh University 2 Buccleuch Place, Edinburgh I.V.Meza-Ruiz@sms.ed.ac.uk,
More informationBeyond the Pipeline: Discrete Optimization in NLP
Beyond the Pipeline: Discrete Optimization in NLP Tomasz Marciniak and Michael Strube EML Research ggmbh Schloss-Wolfsbrunnenweg 33 69118 Heidelberg, Germany http://www.eml-research.de/nlp Abstract We
More informationExecutive Guide to Simulation for Health
Executive Guide to Simulation for Health Simulation is used by Healthcare and Human Service organizations across the World to improve their systems of care and reduce costs. Simulation offers evidence
More informationAUTOMATIC DETECTION OF PROLONGED FRICATIVE PHONEMES WITH THE HIDDEN MARKOV MODELS APPROACH 1. INTRODUCTION
JOURNAL OF MEDICAL INFORMATICS & TECHNOLOGIES Vol. 11/2007, ISSN 1642-6037 Marek WIŚNIEWSKI *, Wiesława KUNISZYK-JÓŹKOWIAK *, Elżbieta SMOŁKA *, Waldemar SUSZYŃSKI * HMM, recognition, speech, disorders
More informationLinking the Ohio State Assessments to NWEA MAP Growth Tests *
Linking the Ohio State Assessments to NWEA MAP Growth Tests * *As of June 2017 Measures of Academic Progress (MAP ) is known as MAP Growth. August 2016 Introduction Northwest Evaluation Association (NWEA
More informationGenre classification on German novels
Genre classification on German novels Lena Hettinger, Martin Becker, Isabella Reger, Fotis Jannidis and Andreas Hotho Data Mining and Information Retrieval Group, University of Würzburg Email: {hettinger,
More informationHistorical maintenance relevant information roadmap for a self-learning maintenance prediction procedural approach
IOP Conference Series: Materials Science and Engineering PAPER OPEN ACCESS Historical maintenance relevant information roadmap for a self-learning maintenance prediction procedural approach To cite this
More informationPh.D in Advance Machine Learning (computer science) PhD submitted, degree to be awarded on convocation, sept B.Tech in Computer science and
Name Qualification Sonia Thomas Ph.D in Advance Machine Learning (computer science) PhD submitted, degree to be awarded on convocation, sept. 2016. M.Tech in Computer science and Engineering. B.Tech in
More informationGUIDELINES FOR COMBINED TRAINING IN PEDIATRICS AND MEDICAL GENETICS LEADING TO DUAL CERTIFICATION
GUIDELINES FOR COMBINED TRAINING IN PEDIATRICS AND MEDICAL GENETICS LEADING TO DUAL CERTIFICATION PREAMBLE This document is intended to provide educational guidance to program directors in pediatrics and
More information