Gene-Expression Microarrays Classification using Feature Selection and Support Vector Machines
Darcy Davis, Allison Hanuschak, Alina Lazar
Department of Computer Science and Information Systems
Youngstown State University

1. Introduction

Every living organism carries genetic material inside its cells, and this material is transmitted from one generation to the next. The genetic material in each cell is composed of a nucleic acid (DNA). The DNA molecule is organized into segments called genes. An organism has the same genes in all its cells, but a given gene can be in different states at different times. The genetic information stored in DNA may be transcribed into complementary RNA molecules, which in turn may be translated into proteins. Many complex human diseases, especially cancer, are correlated with abnormal function at this level.

After 1996, a new technology called DNA microarrays made it possible for researchers to assemble a global picture of gene activity in the cell. The study of gene-expression microarrays is a reasonably new development in biology that allows thousands of genes to be studied simultaneously. Each microarray is a silicon chip on which gene probes are aligned in a grid pattern. Measurements are made by fluorescent detection. The fact that a single microarray can now be used to measure all the human genes has led to advances in the diagnosis and prognosis of diseases and in drug discovery [1].

However, the amount of data in each microarray is too large for manual analysis, since a single sample often contains measurements for around 10,000 genes. Producing results efficiently from this much information requires automatic, computer-controlled analysis of the data. By using machine learning techniques [2, 3], the computer can be trained to recognize patterns that classify the microarrays biologically.
For example, if half of the instances come from healthy patients and half from patients with a disease such as cancer, machine learning algorithms can find gene combinations that separate the healthy patients from the sick ones.

The main data analysis techniques [4] currently used in biomedical applications of microarrays are classification, clustering and gene selection. In contrast with data sets from other fields, a typical microarray data set has a large number of genes (~10000) and a small number of samples (~100). However, we can expect that not all the genes carry relevant information for a particular classification task. The process of selecting only the important components is called gene or feature selection [5]. Supervised learning, or classification, algorithms can be used to classify and predict disease outcomes. Unlike classification, clustering does not use a tissue annotation as a decision attribute; it is used to discover new biological classes.

Several machine learning algorithms [6] have previously been used to classify microarray data sets, including decision trees, Fisher linear discriminant analysis, nearest neighbor, neural networks, Bayesian networks and support vector machines. A supervised machine learning method known as the support vector machine (SVM) [7, 8] has been used to analyze preexisting microarray data sets and diagnose cancer. As it is unreasonable to expect perfect diagnosis with limited knowledge of cancer, the goal is to optimize the correctness of the diagnosis by employing different methods for using and training the SVM. Applying machine learning algorithms to DNA microarray data sets is of great importance for future medical research on gene expression analysis for disease classification and on genotyping for diagnosis and drug discovery.

2. Specific Questions

The goal of our proposed research is to use supervised learning to classify and predict cancer or other diseases, based on the gene expressions collected from microarrays. These microarrays give us information about the rate at which a certain gene is expressed, in other words the rate at which its DNA is transcribed into RNA and then translated into the corresponding protein. Today, many public microarray data sets are freely available to analyze and use in our research. Table 1 summarizes the data sets that will be used in the present research.

Table 1. Publicly Available Microarray Datasets

Name                                                      URL
National Center for Biotechnology Information
Stanford Microarray Database
University of Pittsburgh Microarray Dataset Collection
Kent Ridge Bio-medical Data Set Repository

Known sets of data will be used to train the machine learning protocols to categorize cancer patients according to their prognosis. The accuracy of the routines developed will then be tested against a separate set of known data. The outcome of this study will provide information about the efficiency of machine learning techniques, in particular SVM methods, in discovering patterns related to genetic disorders, and will also allow the identification of relevant types of gene expression. These could be abnormal expression rates for a particular gene, the presence or absence of a particular gene or sequence of genes, or a pattern of unusual expression across a gene subset. Subsequently, SVM methods with different parameters will be applied to identify the best ones in terms of accuracy, efficiency and fewest false positives [9]. It is envisioned that this would help guide physicians in determining the best treatment for a patient, for example how aggressive a course of treatment should be.

3. Methods

Two of the most important and difficult problems in microarray data analysis relate to the dimensionality of the data and to noise. Because many data analysis techniques involve exhaustive search over the object space, their running time is very sensitive to the size of the data. In the case of microarrays, the solution is to reduce the search space vertically (in terms of genes) by using a feature selection method.
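As an illustration of reducing the search space in terms of genes, the following Python sketch ranks genes with a simple filter score and keeps the top k. The signal-to-noise score, the function names (signal_to_noise, select_genes) and the toy expression values are assumptions made for this example. The proposal does not fix a particular feature selection method, and the concave-minimization approach of [5] is a different technique.

```python
import statistics


def signal_to_noise(values, labels):
    """Score one gene: |mean_1 - mean_0| / (std_0 + std_1).

    values: expression levels of the gene, one per sample.
    labels: class label per sample (0 = healthy, 1 = diseased).
    Each class needs at least two samples so a standard deviation exists.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = abs(statistics.mean(pos) - statistics.mean(neg))
    spread = statistics.stdev(pos) + statistics.stdev(neg)
    if spread == 0:
        # No within-class variation: a perfect marker if the means differ.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def select_genes(expression, labels, k):
    """Return the names of the k highest-scoring genes.

    expression: dict mapping gene name -> list of values, one per sample.
    """
    ranked = sorted(
        expression,
        key=lambda gene: signal_to_noise(expression[gene], labels),
        reverse=True,
    )
    return ranked[:k]


# Toy data (hypothetical): three genes measured on six samples.
labels = [0, 0, 0, 1, 1, 1]
expression = {
    "gene_a": [1.0, 1.2, 0.9, 5.0, 5.3, 4.8],  # clearly separates the classes
    "gene_b": [2.0, 3.5, 1.0, 2.2, 3.1, 1.4],  # mostly noise
    "gene_c": [0.5, 0.4, 0.6, 0.1, 0.9, 0.5],  # similar class means
}
top = select_genes(expression, labels, 1)
print(top)
```

A filter score like this treats each gene independently, so it is cheap on ~10000 genes but cannot detect genes that are informative only in combination.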
The other problem is that errors occur during data collection; these are referred to as noise in the data.

Supervised learning methods based on statistical learning theory, for classification and regression, provide good generalization and classification accuracy on real data. However, their inherent trade-off is their computational expense. Recently, support vector machines (SVM) [10] have become a popular learning tool, since they map the input data into a higher-dimensional feature space where the instances are linearly separable. In SVM methods a kernel, which can be considered a similarity measure, is used to recode the input data. The kernel is accompanied by a map function Φ. Even though the mathematics behind the SVM is straightforward, finding the best choices for the kernel function and its parameters can be challenging on real data sets. We will use LIBSVM, developed by Chang and Lin [11].

The kernel function usually recommended [12] for nonlinear problems is the Gaussian radial basis function, because it resembles the sigmoid kernel for certain parameters and it requires fewer parameters than a polynomial kernel. The kernel parameter γ and the parameter C, which controls the trade-off between the complexity of the decision function and the minimization of the training error, can be determined by running a two-dimensional grid search: pairs of parameter values (C, γ) are generated in a predefined interval with a fixed step, the performance of each combination is computed, and the best pair is selected.

The non-sparse property of the solution leads to a very slow evaluation process. Thus, for microarray data sets a data reduction [13] can be done in terms of the genes, or features, of the data set. Redundant or highly correlated features can be replaced with a smaller number of uncorrelated features that capture most of the information.
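The two-dimensional grid search over (C, γ) can be sketched in Python as follows. The sketch assumes a fixed step in the exponent (powers of two), illustrative ranges, and a hypothetical toy_score function standing in for the cross-validated accuracy that an SVM library such as LIBSVM would report. None of these choices are specified in the proposal.

```python
import itertools
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def log_grid(start_exp, stop_exp, step):
    """Return 2**start_exp, 2**(start_exp + step), ..., up to 2**stop_exp."""
    count = int((stop_exp - start_exp) // step) + 1
    return [2.0 ** (start_exp + i * step) for i in range(count)]


def grid_search(score, c_values, gamma_values):
    """Evaluate score(C, gamma) on every pair; return (best_score, C, gamma)."""
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        s = score(c, gamma)
        if best is None or s > best[0]:
            best = (s, c, gamma)
    return best


def toy_score(c, gamma):
    """Hypothetical stand-in for cross-validated accuracy.

    It peaks at C = 2**3 and gamma = 2**-5; in practice this would train
    and validate an SVM for the given pair and return its accuracy.
    """
    return -((math.log2(c) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)


c_values = log_grid(-5, 15, 2)       # 2^-5, 2^-3, ..., 2^15 (assumed range)
gamma_values = log_grid(-15, 3, 2)   # 2^-15, 2^-13, ..., 2^3 (assumed range)
best_score, best_c, best_gamma = grid_search(toy_score, c_values, gamma_values)
print(best_c, best_gamma)
```

Because every pair is evaluated independently, the cost grows with the product of the two grid sizes (here 11 × 10 = 110 evaluations), which is one reason reducing the number of features first makes the search more practical.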
This reduction is done by applying Principal Component Analysis (PCA) before using the SVM algorithm. PCA is performed by solving an eigenvector problem or by using iterative algorithms, and the result is a set of orthogonal vectors called principal components. The larger set is mapped into the new, smaller set by projecting the initial instances onto the principal components. The first principal component is defined as the direction of the line that best fits the input data in the orthogonal least-squares sense; this direction holds the maximum variance in the input data. The second component is orthogonal to the first, uncorrelated with it, and defined to maximize the remaining variance. This procedure is repeated until the last vector is obtained.

The envisioned research will follow the main steps of knowledge discovery processes:

- Gene selection - the irrelevant attributes (genes) are removed and the selected data is represented as a two-dimensional table.
- Preprocessing - if the selected table contains missing values or empty cell entries, the table must be preprocessed in order to remove some of the incompleteness. Statistics should be run to obtain more information about the data.
- Training and validation sample - the initial table is divided into at least two tables by using a cross-validation procedure. One will be used in the training step, the other in the validation or testing step.
- Interpretation and evaluation - the validation or test data set is then used to test the classification performance of the methods in terms of efficiency and accuracy.

A time projection for the project is given in the next table.

Table 2. Project Time Table

Task Name (schedule columns: S O N D J F M A M J J A)
- Literature review: research about gene expression data, support vector machine techniques and feature selection algorithms.
- Developing programs that automatically test machine learning algorithms for classification and prediction.
- Full-scale integration of the successful algorithms to large gene expression datasets.
- Dissemination of results through papers and communications at specific conferences.
- Evaluation of the applicability of the developed algorithms to other datasets.

4. References

[1] R. Burbridge, M. Trotter, B. Buxton, and S. Holden, Drug design by machine learning: support vector machines for pharmaceutical data analysis. Computers and Chemistry 26:5-14,
[2] M. Molla, M. Waddell, D. Page, and J. Shavlik, Using Machine Learning to Design and Interpret Gene-Expression Microarrays. AI Magazine 25:23-44,
[3] Z. Wang, Y. Wang, J. Lu, S. Kung, J. Zhang, R. Lee, J. Xuan, et al., Discriminatory Mining of Gene Expression Microarray Data. The Journal of VLSI Signal Processing 35: ,
[4] W. Dubitzky, M. Granzow, and D. Berrar, Data Mining and Machine Learning Methods for Microarray Analysis. In: Lin, S.M., Johnson, K.F. (eds.) Methods of Microarray Data Analysis - Papers from CAMDA 2000, Boston. Kluwer Academic Publishers,
[5] P. S. Bradley and O. L. Mangasarian, Feature Selection via Concave Minimization and Support Vector Machines. In Machine Learning Proceedings of the Fifteenth International Conference (ICML '98), J. Shavlik, editor, Morgan Kaufmann, San Francisco, California, 82-90,
[6] S. Cho and H. Won, Machine Learning in DNA Microarray Analysis for Cancer Classification. APBC 2003: ,
[7] B. Schölkopf and A. Smola, Learning with Kernels. MIT Press, Cambridge, Massachusetts,
[8] V. N. Vapnik, The Nature of Statistical Learning Theory, 2nd edition, Springer-Verlag, New York, NY,
[9] J.B. Tobler, M.N. Molla, E.F. Nuwaysir, R.D. Green, and J.W. Shavlik, Evaluating machine learning approaches for aiding probe selection for gene-expression arrays. Bioinformatics 18: ,
[10] T. Joachims, Making large-scale SVM learning practical. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods: Support Vector Learning, pp. , MIT Press, Cambridge, MA,
[11] C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines. Software available at
[12] N. Cristianini and J. Shawe-Taylor, An Introduction to Support Vector Machines and Other Kernel-based Learning Methods, Cambridge University Press, Cambridge, England,
[13] Y.-J. Lee and O.L. Mangasarian, RSVM: Reduced Support Vector Machines, Proc. of the First SIAM International Conference on Data Mining, Chicago, April 5-7,

5. Impact on the Goal of CREU

The foremost goal of the CREU project is to encourage females and minorities to pursue graduate work and study in the field of computer science. This project will provide a realistic research experience for the two female undergraduates through active involvement in the planning, execution and interpretation of scientific research. Well-developed research projects can significantly enrich the educational experience of undergraduate students. Working on this research project, students will be able to enhance their computer and programming skills, apply those skills to investigate scientific problems, learn how to formulate questions and problems, and participate in the discovery of new knowledge. A good research experience can foster an enthusiasm for lifelong learning and a desire to continue education beyond the baccalaureate.
Successful scientific instruction should develop in students a sense of wonder and curiosity about the world. The students will be exposed to both sides of scientific investigation: hypothesis testing and the development of theoretical explanations of observations. No science education is complete without research-related activities, technical writing and oral presentations.

Darcy Davis feels that this project will certainly support the goals of CREU. As an undergraduate female who intends to pursue graduate work in computer science, she expects this project to give her a useful introduction to the practical applications of her studies for research, with a focus on artificial intelligence. The project can potentially be a foundation for her senior thesis, and the mathematical concepts will be a good basis for the presentations she intends to make at this year's mathematics conferences.

Allison Hanuschak believes that this research project will introduce her to the world of graduate research and will be an exceptional opportunity to gain valuable research experience. In addition, she thinks that completing the CREU project will give her a distinct advantage for admission to the graduate school of her choice. She also hopes to encourage fellow female students by setting an example for them and being a positive role model for continuing study in computer science. All in all, this experience will be beneficial for her and will aid her in pursuing graduate study.

Both students intend to present this project at the 2005 YSU QUEST conference.

6. Student Activity and Responsibilities

Specific tasks for the two participating students will include: literature search and review, reading and discussing research articles, designing and implementing data mining and machine learning algorithms, data processing, data analysis and interpretation, summarizing and preparing results for presentations and publications, participation at YSU QUEST 2005, and writing the final report. The primary responsibility of the two students is to participate in all phases of the project: proposal, development, experiments, and dissemination. The students will be required to do weekly independent work and to schedule team meetings. It is important that they work together as a team. The faculty advisor will meet with the students every other week. ... will be used for questions, announcements and document interchange.

7. Faculty Activity and Responsibilities

As faculty advisor for the proposed project, Dr. Alina Lazar will actively mentor the two students and continuously supervise their progress during the one-year period. She will meet with the students on a regular basis to guide their activities and answer their questions related to the project. Dr. Lazar has extensive experience in data mining, machine learning and artificial intelligence, and she has written several papers related to the subject of this proposal. Her knowledge will make this project an enjoyable research experience for the undergraduate students. The department will supply a small computer lab, and Dr. Lazar will provide the necessary software from funds previously obtained through the university. She guided the students in developing and writing the present proposal, and she will help them with the final report and with the preparation of a conference paper. The overall guidance and mentoring will not be limited to this project; it will also provide insights about how to apply to and succeed in graduate school, about being a female computer scientist, and about the options after graduate school.

8. Budget

For the proposed project we are requesting $2000 for the two participating female students. An additional $500 will be used to buy computer media, books and other materials necessary for the project. While working on the project, the students will be encouraged to apply for the Undergraduate Research Grant Award sponsored by Youngstown State University and for other scholarships.
More informationSoftware Maintenance
1 What is Software Maintenance? Software Maintenance is a very broad activity that includes error corrections, enhancements of capabilities, deletion of obsolete capabilities, and optimization. 2 Categories
More informationAgent-Based Software Engineering
Agent-Based Software Engineering Learning Guide Information for Students 1. Description Grade Module Máster Universitario en Ingeniería de Software - European Master on Software Engineering Advanced Software
More informationProgram in Molecular Medicine
Graduate Program in Life Sciences Program in Molecular Medicine Student and Faculty Handbook 2017-2018 UNIVERSITY OF MARYLAND GRADUATE SCHOOL UNIVERSITY OF MARYLAND SCHOOL OF MEDICINE Graduate Program
More informationA project-based learning approach to protein biochemistry suitable for both face-to-face and distance education students
A project-based learning approach to protein biochemistry suitable for both face-to-face and distance education students R.J. Prior, School of Health Studies, University of Canberra, Australia J.K. Forwood,
More informationPH.D. IN COMPUTER SCIENCE PROGRAM (POST M.S.)
PH.D. IN COMPUTER SCIENCE PROGRAM (POST M.S.) OVERVIEW ADMISSION REQUIREMENTS PROGRAM REQUIREMENTS OVERVIEW FOR THE PH.D. IN COMPUTER SCIENCE Overview The doctoral program is designed for those students
More informationCS4491/CS 7265 BIG DATA ANALYTICS INTRODUCTION TO THE COURSE. Mingon Kang, PhD Computer Science, Kennesaw State University
CS4491/CS 7265 BIG DATA ANALYTICS INTRODUCTION TO THE COURSE Mingon Kang, PhD Computer Science, Kennesaw State University Self Introduction Mingon Kang, PhD Homepage: http://ksuweb.kennesaw.edu/~mkang9
More informationTwitter Sentiment Classification on Sanders Data using Hybrid Approach
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders
More informationClassroom Assessment Techniques (CATs; Angelo & Cross, 1993)
Classroom Assessment Techniques (CATs; Angelo & Cross, 1993) From: http://warrington.ufl.edu/itsp/docs/instructor/assessmenttechniques.pdf Assessing Prior Knowledge, Recall, and Understanding 1. Background
More informationEvolutive Neural Net Fuzzy Filtering: Basic Description
Journal of Intelligent Learning Systems and Applications, 2010, 2: 12-18 doi:10.4236/jilsa.2010.21002 Published Online February 2010 (http://www.scirp.org/journal/jilsa) Evolutive Neural Net Fuzzy Filtering:
More informationSemi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.
Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link
More informationWHEN THERE IS A mismatch between the acoustic
808 IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, VOL. 14, NO. 3, MAY 2006 Optimization of Temporal Filters for Constructing Robust Features in Speech Recognition Jeih-Weih Hung, Member,
More informationA survey of multi-view machine learning
Noname manuscript No. (will be inserted by the editor) A survey of multi-view machine learning Shiliang Sun Received: date / Accepted: date Abstract Multi-view learning or learning with multiple distinct
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More information(Sub)Gradient Descent
(Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include
More informationarxiv: v1 [cs.lg] 3 May 2013
Feature Selection Based on Term Frequency and T-Test for Text Categorization Deqing Wang dqwang@nlsde.buaa.edu.cn Hui Zhang hzhang@nlsde.buaa.edu.cn Rui Liu, Weifeng Lv {liurui,lwf}@nlsde.buaa.edu.cn arxiv:1305.0638v1
More informationRule discovery in Web-based educational systems using Grammar-Based Genetic Programming
Data Mining VI 205 Rule discovery in Web-based educational systems using Grammar-Based Genetic Programming C. Romero, S. Ventura, C. Hervás & P. González Universidad de Córdoba, Campus Universitario de
More informationA Vector Space Approach for Aspect-Based Sentiment Analysis
A Vector Space Approach for Aspect-Based Sentiment Analysis by Abdulaziz Alghunaim B.S., Massachusetts Institute of Technology (2015) Submitted to the Department of Electrical Engineering and Computer
More informationComparison of EM and Two-Step Cluster Method for Mixed Data: An Application
International Journal of Medical Science and Clinical Inventions 4(3): 2768-2773, 2017 DOI:10.18535/ijmsci/ v4i3.8 ICV 2015: 52.82 e-issn: 2348-991X, p-issn: 2454-9576 2017, IJMSCI Research Article Comparison
More informationMassachusetts Institute of Technology Tel: Massachusetts Avenue Room 32-D558 MA 02139
Hariharan Narayanan Massachusetts Institute of Technology Tel: 773.428.3115 LIDS har@mit.edu 77 Massachusetts Avenue http://www.mit.edu/~har Room 32-D558 MA 02139 EMPLOYMENT Massachusetts Institute of
More informationComputational Data Analysis Techniques In Economics And Finance
Computational Data Analysis Techniques In Economics And Finance If searched for a ebook Computational Data Analysis Techniques in Economics and Finance in pdf format, in that case you come on to correct
More informationBiology 10 - Introduction to the Principles of Biology Spring 2017
Biology 10 - Introduction to the Principles of Biology Spring 2017 Welcome to Bio 10! Lecture: Monday and Wednesday Lab: Monday 7:00 10:00pm or 5:30-7:00pm Wednesday 7:00 10:00pm Room: 2004 Lark Hall Room:
More informationRule-based Expert Systems
Rule-based Expert Systems What is knowledge? is a theoretical or practical understanding of a subject or a domain. is also the sim of what is currently known, and apparently knowledge is power. Those who
More informationExperiment Databases: Towards an Improved Experimental Methodology in Machine Learning
Experiment Databases: Towards an Improved Experimental Methodology in Machine Learning Hendrik Blockeel and Joaquin Vanschoren Computer Science Dept., K.U.Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium
More informationEvolution of Symbolisation in Chimpanzees and Neural Nets
Evolution of Symbolisation in Chimpanzees and Neural Nets Angelo Cangelosi Centre for Neural and Adaptive Systems University of Plymouth (UK) a.cangelosi@plymouth.ac.uk Introduction Animal communication
More informationKnowledge Elicitation Tool Classification. Janet E. Burge. Artificial Intelligence Research Group. Worcester Polytechnic Institute
Page 1 of 28 Knowledge Elicitation Tool Classification Janet E. Burge Artificial Intelligence Research Group Worcester Polytechnic Institute Knowledge Elicitation Methods * KE Methods by Interaction Type
More informationObserving Teachers: The Mathematics Pedagogy of Quebec Francophone and Anglophone Teachers
Observing Teachers: The Mathematics Pedagogy of Quebec Francophone and Anglophone Teachers Dominic Manuel, McGill University, Canada Annie Savard, McGill University, Canada David Reid, Acadia University,
More information*Net Perceptions, Inc West 78th Street Suite 300 Minneapolis, MN
From: AAAI Technical Report WS-98-08. Compilation copyright 1998, AAAI (www.aaai.org). All rights reserved. Recommender Systems: A GroupLens Perspective Joseph A. Konstan *t, John Riedl *t, AI Borchers,
More informationJeff Walker Office location: Science 476C (I have a phone but is preferred) 1 Course Information. 2 Course Description
BIO 221 Human Physiology I Jeff Walker Office location: Science 476C E-mail: walker@maine.edu (I have a phone but e-mail is preferred) Fall 2017 1 Course Information Room Science 105 Class meetings are
More informationCS 446: Machine Learning
CS 446: Machine Learning Introduction to LBJava: a Learning Based Programming Language Writing classifiers Christos Christodoulopoulos Parisa Kordjamshidi Motivation 2 Motivation You still have not learnt
More informationBeyond the Blend: Optimizing the Use of your Learning Technologies. Bryan Chapman, Chapman Alliance
901 Beyond the Blend: Optimizing the Use of your Learning Technologies Bryan Chapman, Chapman Alliance Power Blend Beyond the Blend: Optimizing the Use of Your Learning Infrastructure Facilitator: Bryan
More informationLinking Task: Identifying authors and book titles in verbose queries
Linking Task: Identifying authors and book titles in verbose queries Anaïs Ollagnier, Sébastien Fournier, and Patrice Bellot Aix-Marseille University, CNRS, ENSAM, University of Toulon, LSIS UMR 7296,
More informationWhat can I learn from worms?
What can I learn from worms? Stem cells, regeneration, and models Lesson 7: What does planarian regeneration tell us about human regeneration? I. Overview In this lesson, students use the information that
More informationTo link to this article: PLEASE SCROLL DOWN FOR ARTICLE
This article was downloaded by: [Dr Brian Winkel] On: 19 November 2014, At: 04:59 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer
More informationMaster's Programme Biomedicine and Biotechnology
Master's Programme Biomedicine and Biotechnology Translation of the curriculum, published June 2 nd, 2009 in the bulletin ( Mitteilungsblatt ) of the University of Veterinary Medicine, Vienna. University
More informationLearning Methods in Multilingual Speech Recognition
Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex
More informationMachine Learning from Garden Path Sentences: The Application of Computational Linguistics
Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,
More informationMathematics. Mathematics
Mathematics Program Description Successful completion of this major will assure competence in mathematics through differential and integral calculus, providing an adequate background for employment in
More informationMulti-tasks Deep Learning Model for classifying MRI images of AD/MCI Patients
Multi-tasks Deep Learning Model for classifying MRI images of AD/MCI Patients S.Sambath Kumar 1, Dr M. Nandhini 2, 1 Research scholar, 2 Assistant Professor 1,2 Department of Computer Science, Pondicherry
More informationStephanie Ann Siler. PERSONAL INFORMATION Senior Research Scientist; Department of Psychology, Carnegie Mellon University
Stephanie Ann Siler PERSONAL INFORMATION Senior Research Scientist; Department of Psychology, Carnegie Mellon University siler@andrew.cmu.edu Home Address Office Address 26 Cedricton Street 354 G Baker
More informationSemi-Supervised Face Detection
Semi-Supervised Face Detection Nicu Sebe, Ira Cohen 2, Thomas S. Huang 3, Theo Gevers Faculty of Science, University of Amsterdam, The Netherlands 2 HP Research Labs, USA 3 Beckman Institute, University
More informationPh.D in Advance Machine Learning (computer science) PhD submitted, degree to be awarded on convocation, sept B.Tech in Computer science and
Name Qualification Sonia Thomas Ph.D in Advance Machine Learning (computer science) PhD submitted, degree to be awarded on convocation, sept. 2016. M.Tech in Computer science and Engineering. B.Tech in
More information