Lecture 9: Classification and algorithmic methods
|
|
- Agatha Harrison
- 6 years ago
- Views:
Transcription
1 1/28 Lecture 9: Classification and algorithmic methods Måns Thulin Department of Mathematics, Uppsala University Multivariate Methods 17/5 2011
2 2/28 Outline What are algorithmic methods? Algorithmic methods for classification knn classification Decision trees Algorithmic versus probabilistic methods
3 3/28 Probabilistic methods Previously, we have looked at probabilistic methods for classification methods based on statistical theory and model assumptions. In a statistical problem, the basic situation is the following: Nature {Black box} Data The probabilistic approach is to assume a model for what happens in the black box (normal distribution, ARIMA time series, linear model, Markov chains...). We hope that the models describe the black box accurately enough. All models are wrong, but some are useful. - George Box Some statisticians, and indeed people from other fields as well, argue that it is time to think outside the box.
4 4/28 Algorithmic methods Suppose that we have a set of data with known classes. Without any model assumptions, we can use heuristics and good ideas to come up with new methods. We can create algorithms that creates rules for classifying new points using the given training data. By splitting the given data into a training set and a test set, we can evaluate the performance of our algorithmic method. All models are wrong, and increasingly you can succeed without them. - Peter Norvig, research director at Google As a motivating example, we ll look at a situation where it is more or less clear that we don t need fancy methods or model assumptions to classify new observations.
5 A toy example Example: consider a data set with two groups: red and blue. knn classification y 5/28
6 A toy example How should we classify the new black point? knn classification y 6/28
7 A toy example It seems reasonable to classify the point as being blue! knn classification y 7/28
8 A toy example How should we classify the new black point? knn classification y 8/28
9 A toy example It seems reasonable to classify the point as being red! knn classification y 9/28
10 A less nice example But what about this point? knn classification y 10/28
11 11/28 knn: basic idea In the first two examples, we could easily classify the point since all points in its neighbourhood had the same colour. What should we do when there is more than one colour in the neighbourhood? The knn algorithm classifies the new point by letting the k Nearest Neighbours the k points that are the closest to the point vote about the class of the new point.
12 knn: basic idea Look at the k = 1 closest neighbours. The point is classified as being blue, since the nearest neighbour is blue. knn classification k=1 y 12/28
13 knn: basic idea Look at the k = 2 closest neighbours. It is not clear how to classify the point (no colour has a majority). knn classification k=2 y 13/28
14 knn: basic idea Look at the k = 3 closest neighbours. The point is classified as being blue (2 votes against 1). knn classification k=3 y 14/28
15 15/28 knn: choosing k Clearly, the choice of k is very important. If k is too small, the algorithm becomes sensitive to noise points and outliers. If k is too large, the neighbourhood will probably include points from other classes. How should we choose k? This is a difficult question! There is no right answer. Often a test data set is used to investigate the performance for different k. Typically, we choose the k that has the lowest misclassification rate for the test data.
16 16/28 knn: no majority In our example, we encountered a problem when k = 2: no colour had a majority. What should we do in such cases? Flip a coin? This ignores some of the information that we have gathered! Let the closest neighbour decide? Or the k 1 closest? A better solution is probably to use weighted votes, so that the votes from closer neighbours are seen as more important. This idea could be used in all cases, and not just when there is no majority. Look at k + 1 neighbours instead? Essentially, this means that when we don t have enough information to make a decision, we gather more information.
17 17/28 knn: some last comments knn is essentially a rank method: we measure the distance to all points in the data set and rank them accordingly. The k points with the lowest ranks are used to classify the new point. An important question is what we mean by close. Which distance measure should we use? Euclidean distance? Statistical distance? Mahalanobis? Should we look at standardized data? Is it meaningful to use distance measures if the data is binary or categorical? If some of the variables are categorical and some are continuous measurements? Are more general similarity measures useful?
18 18/28 Decision trees: basic idea Another popular algorithmic classification method is decision trees. Have you ever played the game 20 questions? Decision trees is more or less that game! The idea is to classify the new observation by asking a series of questions. Depending on what the answer to the first question is, different second questions are asked, and so on. Questions are asked until a conclusion is reached.
19 19/28 Decision trees: basic idea Consider the following data set with vertebrate data: Name Body temp Gives birth Has legs Class Human warm-blooded yes yes mammal Whale warm-blooded yes no mammal Cat warm-blooded yes yes mammal Cow warm-blooded yes yes mammal Python cold-blooded no no reptile Komodo dragon cold-blooded no yes reptile Turtle cold-blooded no yes reptile Salmon cold-blooded no no fish Eel cold-blooded no no fish Pigeon warm-blooded no yes bird Penguin warm-blooded no yes bird Decision tree example: see blackboard!
20 20/28 Decision trees: building the tree Given training data, how can we build the decision tree? There are many algorithms for building the tree. One of the earliest is Hunt s algorithm: Let D t be a set of observations belonging to a node t. 1. If all observations in D t are of the same class i then t is a leaf node labeled as i. 2. Otherwise, use some condition to partition the observations into two smaller subsets. A child node is created for each outcome of the condition and the observations are distributed to the children based on the outcomes. When should the splitting stop? Other criterions are sometimes used, but a simple and reasonable stopping criterion is to stop splitting when all remaining nodes are leaf nodes. How, then, do we choose the condition for partitioning?
21 21/28 Decision trees: the best split Let p(i t) be the fraction of observations in class i at the node t and let c be the number of classes. The Gini for node t is defined as Gini(t) = 1 c (p(i t)) 2 Gini is a measure of impurity. If all observations belong to the same class, then i=1 Gini(t) = = 0. The Gini is maximized when all classes have the same number of observations at t. One criterion for splitting could be to minimize the Gini in the next level of the tree. That way we will get purer nodes.
22 22/28 Decision trees: the best split The situation becomes a bit more complicated if we take into account that the children can have different numbers of observations. To account for this, we try to maximize the gain: Gain = Gini(t) k j=1 n vj n t Gini(v j ) where v j are the children and n i are the number of observations at node i. This is equivalent to minimizing k n vj j=1 n t Gini(v j ). Vertebrate example: see blackboard! Sometimes other impurity measures than Gini are used. One example is the entropy: c Entropy(t) = p(i t) log 2 p(i t). i=1
23 23/28 Decision trees: vertebrate data Name Body temp Gives birth Has legs Class Human warm-blooded yes yes mammal Whale warm-blooded yes no mammal Cat warm-blooded yes yes mammal Cow warm-blooded yes yes mammal Python cold-blooded no no reptile Komodo dragon cold-blooded no yes reptile Turtle cold-blooded no yes reptile Salmon cold-blooded no no fish Eel cold-blooded no no fish Pigeon warm-blooded no yes bird Penguin warm-blooded no yes bird
24 24/28 Decision trees: extensions Some further remarks: In our example, we only used binary splits, were each internal node has two children. It is also possible to use non-binary splits, where each internal node can have more than two children. When the data is continuous, it is perhaps not as easy to choose the split criterions. Example: animal weight. A node question could be: is weight <10 kg? Is this a better question than is weight <11 kg? or is weight <9 kg? Having looked at two algorithmic methods, we will now compare the merits of algorithmic and probabilistic method.
25 25/28 Algorithmic versus probabilistic methods: pros Probabilistic methods Mathematical/probabilistic foundation. Possible to derive optimal methods. Often gives nice interpretations of the results. Possible to control error rates by choosing significance levels. Algorithmic methods No need for model assumptions. Is optimized using the test data. Often has a good heuristic foundation. Some methods work well when p > n.
26 26/28 Algorithmic versus probabilistic methods: cons Probabilistic methods May be based on asymptotic results, that do not work well when the sample size is small. The model may be a poor description of nature. The conclusions are only about the model s mechanism and not about the true mechanism. Evaluating the model fit can be difficult, especially in higher dimensions. Algorithmic methods Relies heavily on the training data, which may not be representative. Difficult or impossible to find optimal methods. Likely not as good as the probabilistic method if the model is accurate. Some methods lack solid theoretical support.
27 27/28 Algorithmic versus probabilistic methods: discussion A paper by Leo Breiman from 2001 (Statistical modeling: the two cultures, Statistical Science, Vol. 16) discusses the use of algorithmic methods in modern statistics. Breiman argues that: The data and the problem at hand should lead to the solution not prior ideas about what kind of methods are good. The statistician should focus on finding a good solution regardless of whether that solution uses algorithmic or probabilistic methods. How good a method is should be judged by the predictive accuracy of the method on the test data. This last point is perhaps controversial; we often judge probabilistic method by theoretical properties.
28 28/28 Algorithmic versus probabilistic methods: discussion Some further comments and questions: Are algorithmic methods simply even more non-parametric non-parametric methods? Today it is not uncommon for new probabilistic methods to be published with nothing but simulation results to back them up (as the underlying mathematics can be quite complicated). Is this any different from the support for algorithmic methods? There are some very interesting research problems in trying to provide probabilistic support for algorithmic methods. Regardless of how we feel about algorithmic methods, we should not be afraid to introduce new tools to our statistical toolbox!
Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition
Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Todd Holloway Two Lecture Series for B551 November 20 & 27, 2007 Indiana University Outline Introduction Bias and
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More information(Sub)Gradient Descent
(Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include
More informationLecture 1: Machine Learning Basics
1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3
More informationA Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and
A Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and Planning Overview Motivation for Analyses Analyses and
More informationAnalysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 2, Ver.1 (Mar - Apr.2015), PP 55-61 www.iosrjournals.org Analysis of Emotion
More informationOCR for Arabic using SIFT Descriptors With Online Failure Prediction
OCR for Arabic using SIFT Descriptors With Online Failure Prediction Andrey Stolyarenko, Nachum Dershowitz The Blavatnik School of Computer Science Tel Aviv University Tel Aviv, Israel Email: stloyare@tau.ac.il,
More informationMachine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler
Machine Learning and Data Mining Ensembles of Learners Prof. Alexander Ihler Ensemble methods Why learn one classifier when you can learn many? Ensemble: combine many predictors (Weighted) combina
More informationLearning Methods in Multilingual Speech Recognition
Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex
More informationUsing Web Searches on Important Words to Create Background Sets for LSI Classification
Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract
More informationSETTING STANDARDS FOR CRITERION- REFERENCED MEASUREMENT
SETTING STANDARDS FOR CRITERION- REFERENCED MEASUREMENT By: Dr. MAHMOUD M. GHANDOUR QATAR UNIVERSITY Improving human resources is the responsibility of the educational system in many societies. The outputs
More informationArtificial Neural Networks written examination
1 (8) Institutionen för informationsteknologi Olle Gällmo Universitetsadjunkt Adress: Lägerhyddsvägen 2 Box 337 751 05 Uppsala Artificial Neural Networks written examination Monday, May 15, 2006 9 00-14
More informationCurriculum Design Project with Virtual Manipulatives. Gwenanne Salkind. George Mason University EDCI 856. Dr. Patricia Moyer-Packenham
Curriculum Design Project with Virtual Manipulatives Gwenanne Salkind George Mason University EDCI 856 Dr. Patricia Moyer-Packenham Spring 2006 Curriculum Design Project with Virtual Manipulatives Table
More informationThe Good Judgment Project: A large scale test of different methods of combining expert predictions
The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania
More informationINPE São José dos Campos
INPE-5479 PRE/1778 MONLINEAR ASPECTS OF DATA INTEGRATION FOR LAND COVER CLASSIFICATION IN A NEDRAL NETWORK ENVIRONNENT Maria Suelena S. Barros Valter Rodrigues INPE São José dos Campos 1993 SECRETARIA
More informationRule Learning With Negation: Issues Regarding Effectiveness
Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United
More informationBuild on students informal understanding of sharing and proportionality to develop initial fraction concepts.
Recommendation 1 Build on students informal understanding of sharing and proportionality to develop initial fraction concepts. Students come to kindergarten with a rudimentary understanding of basic fraction
More informationMADERA SCIENCE FAIR 2013 Grades 4 th 6 th Project due date: Tuesday, April 9, 8:15 am Parent Night: Tuesday, April 16, 6:00 8:00 pm
MADERA SCIENCE FAIR 2013 Grades 4 th 6 th Project due date: Tuesday, April 9, 8:15 am Parent Night: Tuesday, April 16, 6:00 8:00 pm Why participate in the Science Fair? Science fair projects give students
More informationComparison of EM and Two-Step Cluster Method for Mixed Data: An Application
International Journal of Medical Science and Clinical Inventions 4(3): 2768-2773, 2017 DOI:10.18535/ijmsci/ v4i3.8 ICV 2015: 52.82 e-issn: 2348-991X, p-issn: 2454-9576 2017, IJMSCI Research Article Comparison
More informationAlgebra 2- Semester 2 Review
Name Block Date Algebra 2- Semester 2 Review Non-Calculator 5.4 1. Consider the function f x 1 x 2. a) Describe the transformation of the graph of y 1 x. b) Identify the asymptotes. c) What is the domain
More informationObjectives. Chapter 2: The Representation of Knowledge. Expert Systems: Principles and Programming, Fourth Edition
Chapter 2: The Representation of Knowledge Expert Systems: Principles and Programming, Fourth Edition Objectives Introduce the study of logic Learn the difference between formal logic and informal logic
More informationUsing dialogue context to improve parsing performance in dialogue systems
Using dialogue context to improve parsing performance in dialogue systems Ivan Meza-Ruiz and Oliver Lemon School of Informatics, Edinburgh University 2 Buccleuch Place, Edinburgh I.V.Meza-Ruiz@sms.ed.ac.uk,
More informationOn-Line Data Analytics
International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob
More informationPython Machine Learning
Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview
More informationBest Practices in Internet Ministry Released November 7, 2008
Best Practices in Internet Ministry Released November 7, 2008 David T. Bourgeois, Ph.D. Associate Professor of Information Systems Crowell School of Business Biola University Best Practices in Internet
More informationMeasurement. When Smaller Is Better. Activity:
Measurement Activity: TEKS: When Smaller Is Better (6.8) Measurement. The student solves application problems involving estimation and measurement of length, area, time, temperature, volume, weight, and
More informationThe 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X
The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, 2013 10.12753/2066-026X-13-154 DATA MINING SOLUTIONS FOR DETERMINING STUDENT'S PROFILE Adela BÂRA,
More informationMathematics Success Grade 7
T894 Mathematics Success Grade 7 [OBJECTIVE] The student will find probabilities of compound events using organized lists, tables, tree diagrams, and simulations. [PREREQUISITE SKILLS] Simple probability,
More informationGetting Started with Deliberate Practice
Getting Started with Deliberate Practice Most of the implementation guides so far in Learning on Steroids have focused on conceptual skills. Things like being able to form mental images, remembering facts
More informationRule Learning with Negation: Issues Regarding Effectiveness
Rule Learning with Negation: Issues Regarding Effectiveness Stephanie Chua, Frans Coenen, and Grant Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX
More informationDefragmenting Textual Data by Leveraging the Syntactic Structure of the English Language
Defragmenting Textual Data by Leveraging the Syntactic Structure of the English Language Nathaniel Hayes Department of Computer Science Simpson College 701 N. C. St. Indianola, IA, 50125 nate.hayes@my.simpson.edu
More informationDegreeWorks Advisor Reference Guide
DegreeWorks Advisor Reference Guide Table of Contents 1. DegreeWorks Basics... 2 Overview... 2 Application Features... 3 Getting Started... 4 DegreeWorks Basics FAQs... 10 2. What-If Audits... 12 Overview...
More informationIssues in the Mining of Heart Failure Datasets
International Journal of Automation and Computing 11(2), April 2014, 162-179 DOI: 10.1007/s11633-014-0778-5 Issues in the Mining of Heart Failure Datasets Nongnuch Poolsawad 1 Lisa Moore 1 Chandrasekhar
More informationChapter 4 - Fractions
. Fractions Chapter - Fractions 0 Michelle Manes, University of Hawaii Department of Mathematics These materials are intended for use with the University of Hawaii Department of Mathematics Math course
More informationUNIVERSITY OF CALIFORNIA SANTA CRUZ TOWARDS A UNIVERSAL PARAMETRIC PLAYER MODEL
UNIVERSITY OF CALIFORNIA SANTA CRUZ TOWARDS A UNIVERSAL PARAMETRIC PLAYER MODEL A thesis submitted in partial satisfaction of the requirements for the degree of DOCTOR OF PHILOSOPHY in COMPUTER SCIENCE
More informationShort vs. Extended Answer Questions in Computer Science Exams
Short vs. Extended Answer Questions in Computer Science Exams Alejandro Salinger Opportunities and New Directions April 26 th, 2012 ajsalinger@uwaterloo.ca Computer Science Written Exams Many choices of
More informationSwitchboard Language Model Improvement with Conversational Data from Gigaword
Katholieke Universiteit Leuven Faculty of Engineering Master in Artificial Intelligence (MAI) Speech and Language Technology (SLT) Switchboard Language Model Improvement with Conversational Data from Gigaword
More informationK-Medoid Algorithm in Clustering Student Scholarship Applicants
Scientific Journal of Informatics Vol. 4, No. 1, May 2017 p-issn 2407-7658 http://journal.unnes.ac.id/nju/index.php/sji e-issn 2460-0040 K-Medoid Algorithm in Clustering Student Scholarship Applicants
More informationEXECUTIVE SUMMARY. Online courses for credit recovery in high schools: Effectiveness and promising practices. April 2017
EXECUTIVE SUMMARY Online courses for credit recovery in high schools: Effectiveness and promising practices April 2017 Prepared for the Nellie Mae Education Foundation by the UMass Donahue Institute 1
More informationEvolutive Neural Net Fuzzy Filtering: Basic Description
Journal of Intelligent Learning Systems and Applications, 2010, 2: 12-18 doi:10.4236/jilsa.2010.21002 Published Online February 2010 (http://www.scirp.org/journal/jilsa) Evolutive Neural Net Fuzzy Filtering:
More informationActive Learning. Yingyu Liang Computer Sciences 760 Fall
Active Learning Yingyu Liang Computer Sciences 760 Fall 2017 http://pages.cs.wisc.edu/~yliang/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed by Mark Craven,
More informationName: Class: Date: ID: A
Name: Class: _ Date: _ Test Review Multiple Choice Identify the choice that best completes the statement or answers the question. 1. Members of a high school club sold hamburgers at a baseball game to
More informationGenerative models and adversarial training
Day 4 Lecture 1 Generative models and adversarial training Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University What is a generative model?
More informationData Stream Processing and Analytics
Data Stream Processing and Analytics Vincent Lemaire Thank to Alexis Bondu, EDF Outline Introduction on data-streams Supervised Learning Conclusion 2 3 Big Data what does that mean? Big Data Analytics?
More informationCHAPTER 4: REIMBURSEMENT STRATEGIES 24
CHAPTER 4: REIMBURSEMENT STRATEGIES 24 INTRODUCTION Once state level policymakers have decided to implement and pay for CSR, one issue they face is simply how to calculate the reimbursements to districts
More informationTesting A Moving Target: How Do We Test Machine Learning Systems? Peter Varhol Technology Strategy Research, USA
Testing A Moving Target: How Do We Test Machine Learning Systems? Peter Varhol Technology Strategy Research, USA Testing a Moving Target How Do We Test Machine Learning Systems? Peter Varhol, Technology
More informationGenevieve L. Hartman, Ph.D.
Curriculum Development and the Teaching-Learning Process: The Development of Mathematical Thinking for all children Genevieve L. Hartman, Ph.D. Topics for today Part 1: Background and rationale Current
More informationAQUA: An Ontology-Driven Question Answering System
AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.
More informationExecutive Guide to Simulation for Health
Executive Guide to Simulation for Health Simulation is used by Healthcare and Human Service organizations across the World to improve their systems of care and reduce costs. Simulation offers evidence
More informationMultivariate k-nearest Neighbor Regression for Time Series data -
Multivariate k-nearest Neighbor Regression for Time Series data - a novel Algorithm for Forecasting UK Electricity Demand ISF 2013, Seoul, Korea Fahad H. Al-Qahtani Dr. Sven F. Crone Management Science,
More informationCharacteristics of the Text Genre Realistic fi ction Text Structure
LESSON 14 TEACHER S GUIDE by Oscar Hagen Fountas-Pinnell Level A Realistic Fiction Selection Summary A boy and his mom visit a pond and see and count a bird, fish, turtles, and frogs. Number of Words:
More informationSoftware Maintenance
1 What is Software Maintenance? Software Maintenance is a very broad activity that includes error corrections, enhancements of capabilities, deletion of obsolete capabilities, and optimization. 2 Categories
More informationIT Students Workshop within Strategic Partnership of Leibniz University and Peter the Great St. Petersburg Polytechnic University
IT Students Workshop within Strategic Partnership of Leibniz University and Peter the Great St. Petersburg Polytechnic University 06.11.16 13.11.16 Hannover Our group from Peter the Great St. Petersburg
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationFRAMEWORK FOR IDENTIFYING THE MOST LIKELY SUCCESSFUL UNDERPRIVILEGED TERTIARY STUDY BURSARY APPLICANTS
South African Journal of Industrial Engineering August 2017 Vol 28(2), pp 59-77 FRAMEWORK FOR IDENTIFYING THE MOST LIKELY SUCCESSFUL UNDERPRIVILEGED TERTIARY STUDY BURSARY APPLICANTS R. Steynberg 1 * #,
More informationAnalysis of Hybrid Soft and Hard Computing Techniques for Forex Monitoring Systems
Analysis of Hybrid Soft and Hard Computing Techniques for Forex Monitoring Systems Ajith Abraham School of Business Systems, Monash University, Clayton, Victoria 3800, Australia. Email: ajith.abraham@ieee.org
More informationChapter 2 Rule Learning in a Nutshell
Chapter 2 Rule Learning in a Nutshell This chapter gives a brief overview of inductive rule learning and may therefore serve as a guide through the rest of the book. Later chapters will expand upon the
More informationLecture 1: Basic Concepts of Machine Learning
Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010
More informationGo fishing! Responsibility judgments when cooperation breaks down
Go fishing! Responsibility judgments when cooperation breaks down Kelsey Allen (krallen@mit.edu), Julian Jara-Ettinger (jjara@mit.edu), Tobias Gerstenberg (tger@mit.edu), Max Kleiman-Weiner (maxkw@mit.edu)
More informationTwitter Sentiment Classification on Sanders Data using Hybrid Approach
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders
More informationSINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)
SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,
More informationInstructor: Mario D. Garrett, Ph.D. Phone: Office: Hepner Hall (HH) 100
San Diego State University School of Social Work 610 COMPUTER APPLICATIONS FOR SOCIAL WORK PRACTICE Statistical Package for the Social Sciences Office: Hepner Hall (HH) 100 Instructor: Mario D. Garrett,
More informationAnalysis: Evaluation: Knowledge: Comprehension: Synthesis: Application:
In 1956, Benjamin Bloom headed a group of educational psychologists who developed a classification of levels of intellectual behavior important in learning. Bloom found that over 95 % of the test questions
More informationCalibration of Confidence Measures in Speech Recognition
Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE
More informationDate: 25 January 2012 Issue: 11
PDO School, PO Box 81, Muscat 100, Sultanate of Oman. Tel: [+968] 24677279 Website: http:// www.pdoschool.org Newsletter Date: 25 January 2012 Issue: 11 Dear Parents, We wish all of our Scottish families
More informationPractical Learning Tools (Communication Tools for the Trainer)
Practical Learning Tools (Communication Tools for the Trainer) The following practical learning tools described in detailed below were sourced from the various community work already conducted by the University
More informationThe Evolution of Random Phenomena
The Evolution of Random Phenomena A Look at Markov Chains Glen Wang glenw@uchicago.edu Splash! Chicago: Winter Cascade 2012 Lecture 1: What is Randomness? What is randomness? Can you think of some examples
More informationPurdue Data Summit Communication of Big Data Analytics. New SAT Predictive Validity Case Study
Purdue Data Summit 2017 Communication of Big Data Analytics New SAT Predictive Validity Case Study Paul M. Johnson, Ed.D. Associate Vice President for Enrollment Management, Research & Enrollment Information
More information1. Listen carefully as your teacher assigns you two or more rows of the Biome Jigsaw Chart (page S2) to fill in.
Biome Bags - Student Guide In this activity, you will explore ecology and biodiversity will be to work cooperatively to: in seven terrestrial biomes. Your task 1. Research and share information to complete
More informationOPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS
OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,
More informationNumeracy Medium term plan: Summer Term Level 2C/2B Year 2 Level 2A/3C
Numeracy Medium term plan: Summer Term Level 2C/2B Year 2 Level 2A/3C Using and applying mathematics objectives (Problem solving, Communicating and Reasoning) Select the maths to use in some classroom
More informationImpact of Cluster Validity Measures on Performance of Hybrid Models Based on K-means and Decision Trees
Impact of Cluster Validity Measures on Performance of Hybrid Models Based on K-means and Decision Trees Mariusz Łapczy ski 1 and Bartłomiej Jefma ski 2 1 The Chair of Market Analysis and Marketing Research,
More informationAnswer Key For The California Mathematics Standards Grade 1
Introduction: Summary of Goals GRADE ONE By the end of grade one, students learn to understand and use the concept of ones and tens in the place value number system. Students add and subtract small numbers
More informationMining Student Evolution Using Associative Classification and Clustering
Mining Student Evolution Using Associative Classification and Clustering 19 Mining Student Evolution Using Associative Classification and Clustering Kifaya S. Qaddoum, Faculty of Information, Technology
More informationLearning Methods for Fuzzy Systems
Learning Methods for Fuzzy Systems Rudolf Kruse and Andreas Nürnberger Department of Computer Science, University of Magdeburg Universitätsplatz, D-396 Magdeburg, Germany Phone : +49.39.67.876, Fax : +49.39.67.8
More informationReading Horizons. Organizing Reading Material into Thought Units to Enhance Comprehension. Kathleen C. Stevens APRIL 1983
Reading Horizons Volume 23, Issue 3 1983 Article 8 APRIL 1983 Organizing Reading Material into Thought Units to Enhance Comprehension Kathleen C. Stevens Northeastern Illinois University Copyright c 1983
More informationLesson Plan Title Aquatic Ecology
Lesson Plan Title Aquatic Ecology Name (last, first): Larson, Don Scientific Theme(s): C-2: develop an understanding of the structure, function, behavior, development, life cycles, and diversity of living
More informationMedical Complexity: A Pragmatic Theory
http://eoimages.gsfc.nasa.gov/images/imagerecords/57000/57747/cloud_combined_2048.jpg Medical Complexity: A Pragmatic Theory Chris Feudtner, MD PhD MPH The Children s Hospital of Philadelphia Main Thesis
More informationLanguage Acquisition Chart
Language Acquisition Chart This chart was designed to help teachers better understand the process of second language acquisition. Please use this chart as a resource for learning more about the way people
More informationMaximizing Learning Through Course Alignment and Experience with Different Types of Knowledge
Innov High Educ (2009) 34:93 103 DOI 10.1007/s10755-009-9095-2 Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge Phyllis Blumberg Published online: 3 February
More informationPractice Examination IREB
IREB Examination Requirements Engineering Advanced Level Elicitation and Consolidation Practice Examination Questionnaire: Set_EN_2013_Public_1.2 Syllabus: Version 1.0 Passed Failed Total number of points
More informationThe Use of Statistical, Computational and Modelling Tools in Higher Learning Institutions: A Case Study of the University of Dodoma
International Journal of Computer Applications (975 8887) The Use of Statistical, Computational and Modelling Tools in Higher Learning Institutions: A Case Study of the University of Dodoma Gilbert M.
More informationMYCIN. The MYCIN Task
MYCIN Developed at Stanford University in 1972 Regarded as the first true expert system Assists physicians in the treatment of blood infections Many revisions and extensions over the years The MYCIN Task
More informationSARDNET: A Self-Organizing Feature Map for Sequences
SARDNET: A Self-Organizing Feature Map for Sequences Daniel L. James and Risto Miikkulainen Department of Computer Sciences The University of Texas at Austin Austin, TX 78712 dljames,risto~cs.utexas.edu
More informationAlignment of Australian Curriculum Year Levels to the Scope and Sequence of Math-U-See Program
Alignment of s to the Scope and Sequence of Math-U-See Program This table provides guidance to educators when aligning levels/resources to the Australian Curriculum (AC). The Math-U-See levels do not address
More informationLearning From the Past with Experiment Databases
Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University
More informationTitle: Knowledge assessment of trainees and trainers in General Practice in a neighboring country. Making a case for international collaboration.
Author's response to reviews Title: Knowledge assessment of trainees and trainers in General Practice in a neighboring country. Making a case for international collaboration. Authors: Roy Remmen (roy.remmen@ua.ac.be)
More informationAnalyzing sentiments in tweets for Tesla Model 3 using SAS Enterprise Miner and SAS Sentiment Analysis Studio
SCSUG Student Symposium 2016 Analyzing sentiments in tweets for Tesla Model 3 using SAS Enterprise Miner and SAS Sentiment Analysis Studio Praneth Guggilla, Tejaswi Jha, Goutam Chakraborty, Oklahoma State
More informationSight Word Assessment
Make, Take & Teach Sight Word Assessment Assessment and Progress Monitoring for the Dolch 220 Sight Words What are sight words? Sight words are words that are used frequently in reading and writing. Because
More informationA Version Space Approach to Learning Context-free Grammars
Machine Learning 2: 39~74, 1987 1987 Kluwer Academic Publishers, Boston - Manufactured in The Netherlands A Version Space Approach to Learning Context-free Grammars KURT VANLEHN (VANLEHN@A.PSY.CMU.EDU)
More informationUsing computational modeling in language acquisition research
Chapter 8 Using computational modeling in language acquisition research Lisa Pearl 1. Introduction Language acquisition research is often concerned with questions of what, when, and how what children know,
More informationProof Theory for Syntacticians
Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax
More informationstateorvalue to each variable in a given set. We use p(x = xjy = y) (or p(xjy) as a shorthand) to denote the probability that X = x given Y = y. We al
Dependency Networks for Collaborative Filtering and Data Visualization David Heckerman, David Maxwell Chickering, Christopher Meek, Robert Rounthwaite, Carl Kadie Microsoft Research Redmond WA 98052-6399
More information16.1 Lesson: Putting it into practice - isikhnas
BAB 16 Module: Using QGIS in animal health The purpose of this module is to show how QGIS can be used to assist in animal health scenarios. In order to do this, you will have needed to study, and be familiar
More informationLinguistics 220 Phonology: distributions and the concept of the phoneme. John Alderete, Simon Fraser University
Linguistics 220 Phonology: distributions and the concept of the phoneme John Alderete, Simon Fraser University Foundations in phonology Outline 1. Intuitions about phonological structure 2. Contrastive
More informationFunctional Maths Skills Check E3/L x
Functional Maths Skills Check E3/L1 Name: Date started: The Four Rules of Number + - x May 2017. Kindly contributed by Nicola Smith, Gloucestershire College. Search for Nicola on skillsworkshop.org Page
More informationInquiry Learning Methodologies and the Disposition to Energy Systems Problem Solving
Inquiry Learning Methodologies and the Disposition to Energy Systems Problem Solving Minha R. Ha York University minhareo@yorku.ca Shinya Nagasaki McMaster University nagasas@mcmaster.ca Justin Riddoch
More informationCase study Norway case 1
Case study Norway case 1 School : B (primary school) Theme: Science microorganisms Dates of lessons: March 26-27 th 2015 Age of students: 10-11 (grade 5) Data sources: Pre- and post-interview with 1 teacher
More information