WEB SITE/TRITONED UPDATES
|
|
- Anastasia Oliver
- 5 years ago
- Views:
Transcription
1 CLASS 4, APRIL 2018 CHAPTER 9 CLASSIFICATION AND REGRESSION TREES DAY 2 PREDICTING PRICES OF TOYOTA CARS ROGER BOHN APRIL 2018 Notes based on: Data Mining for Business Analytics. Shmueli, et al + Data Mining with Rattle and R, G. Williams Middle section of slides is almost same as for previous class. 1 WEB SITE/TRITONED UPDATES New entries: Resources for studying R; Sony Entertainment + 1 more project description; trade ideas and looking for teammates on projects. Homework due Friday noon for today, and Saturday for your project proposals. Lecture notes, before or after class. Menu item. bda2020.wordpress.com/2018/04/04/latest-syllabusassignments-and-notes/
2 SESSION LEARNING GOALS Demonstrate the analytic flow for analyzing big data. Begin to practice it. BDA as an art, as well as a science. Introduce decision tree models = CART = very different than classical econometric regression models Key concepts of BDA Holdout sample Transform the data to physically meaningful, or useful, or both Many others C A R T = CLASSIFICATION TREES One of a dozen common mining models. (Algorithms) Data + Algorithm Predictions Predictions - Actual Model performance Relatively straightforward
3 TREES AND RULES Goal: Classify or predict an outcome based on a set of predictors The output is a set of rules Example: Goal: classify a record as will accept credit card offer or will not accept Rule might be IF (Income > 92.5) AND (Education < 1.5) AND (Family <= 2.5) THEN Class = 0 (nonacceptor) Also called CART, Decision Trees, or just Trees Rules are represented by tree diagrams 5 6
4 KEY IDEAS Recursive partitioning: Repeatedly split the records into two parts To achieve maximum homogeneity within the new parts Choosing the next variable Pruning the tree: Simplify the tree by pruning minor branches to avoid overfitting 7 RECURSIVE PARTITIONING 8
5 RECURSIVE PARTITIONING STEPS Pick one of the predictor variables, x i Pick a value of x i, say s i, that divides the training data into two (not necessarily equal) portions Measure how pure or homogeneous each of the resulting portions are Pure = containing records of mostly one class Algorithm tries different values of x i, and s i to maximize purity in initial split After you get a maximum purity split, repeat the process for a second split, and so on 9 EXAMPLE: RIDING MOWERS Goal: Classify 24 households as owning or not owning riding mowers Predictors = Income, Lot Size 10
6 Income Lot_Size Ownership owner owner owner owner owner owner owner owner owner owner owner owner non-owner non-owner non-owner non-owner non-owner non-owner non-owner non-owner non-owner non-owner non-owner non-owner 11 12
7 HOW TO SPLIT Order records according to one variable, say income Take a predictor value, say 60 (the first record) and divide records into those with income >= 60 and those < 60 Measure resulting purity (homogeneity) of class in each resulting portion Try all other split values Repeat for other variable(s) Select the one variable & split that yields the most purity increase THE FIRST SPLIT: INCOME = 60
8 SECOND SPLIT: LOT SIZE = 21 AFTER ALL SPLITS
9 WHAT ABOUT CATEGORICAL VARIABLES? Examine all possible ways in which the categories can be split. E.g., categories A, B, C can be split 3 ways {A} and {B, C} {B} and {A, C} {C} and {A, B} With many categories, # of splits explodes (Toyota car models) Computation will bog down. How many ways to split 30 models? ~ MEASURING IMPURITY 18
10 GINI INDEX I(A) = 1 - p = proportion of cases in rectangle A that belong to class k Gini Index for rectangle A containing m records I(A) = 0 when all cases belong to same class Max value when all classes are equally represented (= 0.50 in binary case) Note: XLMiner uses a variant called delta splitting rule 19 ENTROPY p = proportion of cases (out of m) in rectangle A that belong to class k Entropy ranges between 0 (most pure) and log 2 (m) (equal representation of classes) 20
11 IMPURITY AND RECURSIVE PARTITIONING Obtain overall impurity measure (weighted avg. of individual rectangles) At each successive stage, compare this measure across all possible splits in all variables Choose the split that reduces impurity the most Which variable, and where to split it. Chosen split points become nodes on the tree 21 FIRST SPLIT THE TREE
12 TREE AFTER ALL SPLITS The first split is on Income, then the next split is on Lot Size for both the low income group (at lot size 21) and the high income split (at lot size 20) Decision node Terminal node (leaf)
13 class in this portion of the first split (those with income The next >= 60) is split owner for this 11 owners group and of 165 nonowners on the will be basis of lot size, splitting at 20 TREE STRUCTURE Split points become nodes on tree (circles with split value in center) Rectangles represent leaves (terminal points, no further splits, classification value noted) Numbers on lines between nodes indicate # cases Read down tree to derive rule E.g., If lot size < 19, and if income > 84.75, then class = owner 26
14 Read down the tree to derive rules If Income < 60 AND Lot Size < 21, classify as Nonowner DETERMINING LEAF NODE LABEL Each leaf node label is determined by voting of the records within it, and by the cutoff value Records within each leaf node are from the training data Default cutoff=0.5 means that the leaf node s label is the majority class. Cutoff = 0.75: requires majority of 75% or more 1 records in the leaf to label it a 1 node 28
15 THE OVERFITTING PROBLEM 29 FULL TREES ARE COMPLEX AND OVERFIT THE DATA Natural end of process is 100% purity in each leaf This overfits the data, which end up fitting noise in the data Consider Example 2, Loan Acceptance with more records and more variables than the Riding Mower data the full tree is very complex
16 Full trees are too complex they end up fitting noise, overfitting the data OVERFITTING PRODUCES POOR PREDICTIVE PERFORMANCE PAST A CERTAIN POINT IN TREE COMPLEXITY, THE ERROR RATE ON NEW DATA STARTS TO INCREASE
17 PRUNING CART lets tree grow to full extent, then prunes it back Idea is to find that point at which the validation error is at a minimum Generate successively smaller trees by pruning leaves At each pruning stage, multiple trees are possible Use cost complexity to choose the best tree at that stage WHICH BRANCH TO CUT AT EACH STAGE OF PRUNING? CC(T) = Err(T) + α L(T) CC(T) = cost complexity of a tree Err(T) = proportion of misclassified records α = penalty factor attached to tree size (set by user) Among trees of given size, choose the one with lowest CC Do this for each size of tree (stage of pruning)
18 TREE INSTABILITY If 2 or more variables are of roughly equal importance, which one CART chooses for the first split can depend on the initial partition into training and validation A different partition into training/validation could lead to a different initial split This can cascade down and produce a very different tree from the first training/validation partition Solution is to try many different training/validation splits cross validation With future data, grow tree to 7 splits: estimated cv error std. error of the estimate smallest tree within 1 xstd of min. error (it has 7 splits) minimum error
19 ADVANTAGES OF TREES Easy to use, understand Produce rules that are easy to interpret & implement Variable selection & reduction is automatic Do not require the assumptions of statistical models. Completely distribution-free aka non-parametric Can work without extensive handling of missing data CART DISADVANTAGES May not perform well where there is structure in the data that is not well captured by horizontal or vertical splits Very simple, don t always give best fits Disadvantage of single trees: instability and poor predictive performance We will improve CART later in course with Random Forests. 38
20 SUMMARY Classification and Regression Trees are an easily understandable and transparent method for predicting or classifying new records A tree is a graphical representation of a set of rules Trees must be pruned to avoid over-fitting of the training data As trees do not make any assumptions about the data structure, they usually require large samples 39 40
21 TOYOTA COROLLA PRICES Case analysis Higher or lower than median price? 1436 records, 38 attributes 41 LOTS OF VARIABLES. WHICH TO USE? Look for unimportant variables Little variation in the data Zero correlation to final outcome (not always safe) Look for groups of variables (with high correlation) Probably irrelevant to price (use domain knowledge )
22 43
23 Corrgram of mtcars intercorrelations gear am drat mpg vs qsec wt disp cyl hp carb Figure Corrgram of the correlations among the variables in the mtcars data frame. Rows and columns have been reordered using principal components analysis. 5/28/2014 ggpairs3.png ( ) 46
24 GGPAIRS 1#,Uncomment,these,lines,and,install,if,necessary: 2 #install.packages('ggally') 3 #install.packages('ggplot2') 4 #install.packages('scales') 5 #install.packages('memisc') 6 7 library(ggplot2) 8 library(ggally) 9 library(scales) 10 data(diamonds) diasamp = diamonds[sample(1:length(diamonds$price),10000),] 13 ggpairs(diasamp,params = c(shape = I("."),outlier.shape=I("."))) 48
25 COROLLA TREE: AGE, KM, 49 50
26 EVALUATE RESULT: CONFUSION MATRIX Calculate model using only training data Evaluate model using only validation data. 51 HOW TO BUILD IN PRACTICE Start with every plausible variable Throw out obviously unimportant (radio) Let the algorithm decide what belongs in final model Do NOT screen heavily, unless you have to Do Throw out obvious junk Think hard about categorical variables with lots of categories. They become lots of dummy variables! Blows up Consolidate categories based on causal similarity, or small sample size Consider pruning highly correlated variables (at least at first) All choices are tentative 52
27 REPORT WHAT YOU DID Don t omit too many variables unless sure they don t matter. For some variables, yes you can be sure! (Radio) 53 TYPICAL RESULTS: WHAT ARE KEY VARIABLES Age (in months) Km traveled Air conditioning Weight Do these match our understanding of cars? Our domain knowledge? 54
28 AIR CONDITIONING AC Yes/No Automatic AC Yes/no Model thinks this is 2 independent variables Use outside knowledge: Convert this to 1 variable with 3 levels 55 56
29 SESSION LEARNING GOALS Demonstrate the analytic flow for analyzing big data. Begin to practice it. BDA as an art, as well as a science. Introduce decision tree models = CART = very different than classical econometric regression models Key concepts of BDA Holdout sample Transform the data to physically meaningful, or useful, or both Many others KEY CONCEPTS Decision trees Key concept #1: Use a holdout sample to evaluate performance. Train/validate/test Confusion matrix for classification problems (discrete results) Key concept #2: Overfitting. Models always overfit. Holdout sample tells how badly Concept: tuning a model for better results. Key concept #0: Variables have physical/economic/business meanings Key concept #3: transforming the data for better fits and better insight/ understanding of result. aka Feature creation Concept: cleaning data to get rid of data errors. Key concept # 4: Nonlinearity. Key concept #5: Knowing causality is wonderful, but for many purposes not necessary. The data mining process flow.
CS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationMachine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler
Machine Learning and Data Mining Ensembles of Learners Prof. Alexander Ihler Ensemble methods Why learn one classifier when you can learn many? Ensemble: combine many predictors (Weighted) combina
More informationPython Machine Learning
Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled
More informationA Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and
A Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and Planning Overview Motivation for Analyses Analyses and
More informationIntroduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition
Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Todd Holloway Two Lecture Series for B551 November 20 & 27, 2007 Indiana University Outline Introduction Bias and
More informationAssignment 1: Predicting Amazon Review Ratings
Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for
More informationRule Learning With Negation: Issues Regarding Effectiveness
Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United
More informationLecture 1: Machine Learning Basics
1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3
More informationOPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS
OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,
More information(Sub)Gradient Descent
(Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include
More informationOn-Line Data Analytics
International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob
More informationMathematics Success Level E
T403 [OBJECTIVE] The student will generate two patterns given two rules and identify the relationship between corresponding terms, generate ordered pairs, and graph the ordered pairs on a coordinate plane.
More informationFunctional Skills Mathematics Level 2 assessment
Functional Skills Mathematics Level 2 assessment www.cityandguilds.com September 2015 Version 1.0 Marking scheme ONLINE V2 Level 2 Sample Paper 4 Mark Represent Analyse Interpret Open Fixed S1Q1 3 3 0
More informationlearning collegiate assessment]
[ collegiate learning assessment] INSTITUTIONAL REPORT 2005 2006 Kalamazoo College council for aid to education 215 lexington avenue floor 21 new york new york 10016-6023 p 212.217.0700 f 212.661.9766
More informationAGS THE GREAT REVIEW GAME FOR PRE-ALGEBRA (CD) CORRELATED TO CALIFORNIA CONTENT STANDARDS
AGS THE GREAT REVIEW GAME FOR PRE-ALGEBRA (CD) CORRELATED TO CALIFORNIA CONTENT STANDARDS 1 CALIFORNIA CONTENT STANDARDS: Chapter 1 ALGEBRA AND WHOLE NUMBERS Algebra and Functions 1.4 Students use algebraic
More informationCHAPTER 4: REIMBURSEMENT STRATEGIES 24
CHAPTER 4: REIMBURSEMENT STRATEGIES 24 INTRODUCTION Once state level policymakers have decided to implement and pay for CSR, one issue they face is simply how to calculate the reimbursements to districts
More informationObjectives. Chapter 2: The Representation of Knowledge. Expert Systems: Principles and Programming, Fourth Edition
Chapter 2: The Representation of Knowledge Expert Systems: Principles and Programming, Fourth Edition Objectives Introduce the study of logic Learn the difference between formal logic and informal logic
More informationProbability and Statistics Curriculum Pacing Guide
Unit 1 Terms PS.SPMJ.3 PS.SPMJ.5 Plan and conduct a survey to answer a statistical question. Recognize how the plan addresses sampling technique, randomization, measurement of experimental error and methods
More informationActivities, Exercises, Assignments Copyright 2009 Cem Kaner 1
Patterns of activities, iti exercises and assignments Workshop on Teaching Software Testing January 31, 2009 Cem Kaner, J.D., Ph.D. kaner@kaner.com Professor of Software Engineering Florida Institute of
More informationFRAMEWORK FOR IDENTIFYING THE MOST LIKELY SUCCESSFUL UNDERPRIVILEGED TERTIARY STUDY BURSARY APPLICANTS
South African Journal of Industrial Engineering August 2017 Vol 28(2), pp 59-77 FRAMEWORK FOR IDENTIFYING THE MOST LIKELY SUCCESSFUL UNDERPRIVILEGED TERTIARY STUDY BURSARY APPLICANTS R. Steynberg 1 * #,
More informationRule Learning with Negation: Issues Regarding Effectiveness
Rule Learning with Negation: Issues Regarding Effectiveness Stephanie Chua, Frans Coenen, and Grant Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX
More informationOCR for Arabic using SIFT Descriptors With Online Failure Prediction
OCR for Arabic using SIFT Descriptors With Online Failure Prediction Andrey Stolyarenko, Nachum Dershowitz The Blavatnik School of Computer Science Tel Aviv University Tel Aviv, Israel Email: stloyare@tau.ac.il,
More informationUnderstanding and Interpreting the NRC s Data-Based Assessment of Research-Doctorate Programs in the United States (2010)
Understanding and Interpreting the NRC s Data-Based Assessment of Research-Doctorate Programs in the United States (2010) Jaxk Reeves, SCC Director Kim Love-Myers, SCC Associate Director Presented at UGA
More informationThe Good Judgment Project: A large scale test of different methods of combining expert predictions
The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania
More informationIndividual Differences & Item Effects: How to test them, & how to test them well
Individual Differences & Item Effects: How to test them, & how to test them well Individual Differences & Item Effects Properties of subjects Cognitive abilities (WM task scores, inhibition) Gender Age
More informationMathematics Success Grade 7
T894 Mathematics Success Grade 7 [OBJECTIVE] The student will find probabilities of compound events using organized lists, tables, tree diagrams, and simulations. [PREREQUISITE SKILLS] Simple probability,
More informationTwitter Sentiment Classification on Sanders Data using Hybrid Approach
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders
More informationIntroduction to Simulation
Introduction to Simulation Spring 2010 Dr. Louis Luangkesorn University of Pittsburgh January 19, 2010 Dr. Louis Luangkesorn ( University of Pittsburgh ) Introduction to Simulation January 19, 2010 1 /
More informationImpact of Cluster Validity Measures on Performance of Hybrid Models Based on K-means and Decision Trees
Impact of Cluster Validity Measures on Performance of Hybrid Models Based on K-means and Decision Trees Mariusz Łapczy ski 1 and Bartłomiej Jefma ski 2 1 The Chair of Market Analysis and Marketing Research,
More informationChapter 2 Rule Learning in a Nutshell
Chapter 2 Rule Learning in a Nutshell This chapter gives a brief overview of inductive rule learning and may therefore serve as a guide through the rest of the book. Later chapters will expand upon the
More informationAnalysis of Hybrid Soft and Hard Computing Techniques for Forex Monitoring Systems
Analysis of Hybrid Soft and Hard Computing Techniques for Forex Monitoring Systems Ajith Abraham School of Business Systems, Monash University, Clayton, Victoria 3800, Australia. Email: ajith.abraham@ieee.org
More informationGenerating Test Cases From Use Cases
1 of 13 1/10/2007 10:41 AM Generating Test Cases From Use Cases by Jim Heumann Requirements Management Evangelist Rational Software pdf (155 K) In many organizations, software testing accounts for 30 to
More informationTIMSS ADVANCED 2015 USER GUIDE FOR THE INTERNATIONAL DATABASE. Pierre Foy
TIMSS ADVANCED 2015 USER GUIDE FOR THE INTERNATIONAL DATABASE Pierre Foy TIMSS Advanced 2015 orks User Guide for the International Database Pierre Foy Contributors: Victoria A.S. Centurino, Kerry E. Cotter,
More informationUsing focal point learning to improve human machine tacit coordination
DOI 10.1007/s10458-010-9126-5 Using focal point learning to improve human machine tacit coordination InonZuckerman SaritKraus Jeffrey S. Rosenschein The Author(s) 2010 Abstract We consider an automated
More informationChinese Language Parsing with Maximum-Entropy-Inspired Parser
Chinese Language Parsing with Maximum-Entropy-Inspired Parser Heng Lian Brown University Abstract The Chinese language has many special characteristics that make parsing difficult. The performance of state-of-the-art
More informationLearning From the Past with Experiment Databases
Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University
More informationGrade 6: Correlated to AGS Basic Math Skills
Grade 6: Correlated to AGS Basic Math Skills Grade 6: Standard 1 Number Sense Students compare and order positive and negative integers, decimals, fractions, and mixed numbers. They find multiples and
More informationSection 3.4. Logframe Module. This module will help you understand and use the logical framework in project design and proposal writing.
Section 3.4 Logframe Module This module will help you understand and use the logical framework in project design and proposal writing. THIS MODULE INCLUDES: Contents (Direct links clickable belo[abstract]w)
More informationData Stream Processing and Analytics
Data Stream Processing and Analytics Vincent Lemaire Thank to Alexis Bondu, EDF Outline Introduction on data-streams Supervised Learning Conclusion 2 3 Big Data what does that mean? Big Data Analytics?
More informationInvestment in e- journals, use and research outcomes
Investment in e- journals, use and research outcomes David Nicholas CIBER Research Limited, UK Ian Rowlands University of Leicester, UK Library Return on Investment seminar Universite de Lyon, 20-21 February
More informationLANGUAGE DIVERSITY AND ECONOMIC DEVELOPMENT. Paul De Grauwe. University of Leuven
Preliminary draft LANGUAGE DIVERSITY AND ECONOMIC DEVELOPMENT Paul De Grauwe University of Leuven January 2006 I am grateful to Michel Beine, Hans Dewachter, Geert Dhaene, Marco Lyrio, Pablo Rovira Kaltwasser,
More informationLahore University of Management Sciences. FINN 321 Econometrics Fall Semester 2017
Instructor Syed Zahid Ali Room No. 247 Economics Wing First Floor Office Hours Email szahid@lums.edu.pk Telephone Ext. 8074 Secretary/TA TA Office Hours Course URL (if any) Suraj.lums.edu.pk FINN 321 Econometrics
More informationLinking the Ohio State Assessments to NWEA MAP Growth Tests *
Linking the Ohio State Assessments to NWEA MAP Growth Tests * *As of June 2017 Measures of Academic Progress (MAP ) is known as MAP Growth. August 2016 Introduction Northwest Evaluation Association (NWEA
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationEDCI 699 Statistics: Content, Process, Application COURSE SYLLABUS: SPRING 2016
EDCI 699 Statistics: Content, Process, Application COURSE SYLLABUS: SPRING 2016 Instructor: Dr. Katy Denson, Ph.D. Office Hours: Because I live in Albuquerque, New Mexico, I won t have office hours. But
More informationAlgebra 2- Semester 2 Review
Name Block Date Algebra 2- Semester 2 Review Non-Calculator 5.4 1. Consider the function f x 1 x 2. a) Describe the transformation of the graph of y 1 x. b) Identify the asymptotes. c) What is the domain
More information*In Ancient Greek: *In English: micro = small macro = large economia = management of the household or family
ECON 3 * *In Ancient Greek: micro = small macro = large economia = management of the household or family *In English: Microeconomics = the study of how individuals or small groups of people manage limited
More informationCalibration of Confidence Measures in Speech Recognition
Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview
More informationOn Human Computer Interaction, HCI. Dr. Saif al Zahir Electrical and Computer Engineering Department UBC
On Human Computer Interaction, HCI Dr. Saif al Zahir Electrical and Computer Engineering Department UBC Human Computer Interaction HCI HCI is the study of people, computer technology, and the ways these
More informationIntroduction to Questionnaire Design
Introduction to Questionnaire Design Why this seminar is necessary! Bad questions are everywhere! Don t let them happen to you! Fall 2012 Seminar Series University of Illinois www.srl.uic.edu The first
More informationLearning goal-oriented strategies in problem solving
Learning goal-oriented strategies in problem solving Martin Možina, Timotej Lazar, Ivan Bratko Faculty of Computer and Information Science University of Ljubljana, Ljubljana, Slovenia Abstract The need
More informationCHEM 6487: Problem Seminar in Inorganic Chemistry Spring 2010
CHEM 6487: Problem Seminar in Inorganic Chemistry Spring 2010 Instructor: Dr. Stephen M. Holmes Course Time: 10 AM Friday Office Location: 418 Benton Hall Course Location: 451 Benton Hall Email: holmesst@umsl.edu
More informationAn Empirical Analysis of the Effects of Mexican American Studies Participation on Student Achievement within Tucson Unified School District
An Empirical Analysis of the Effects of Mexican American Studies Participation on Student Achievement within Tucson Unified School District Report Submitted June 20, 2012, to Willis D. Hawley, Ph.D., Special
More informationThe Evolution of Random Phenomena
The Evolution of Random Phenomena A Look at Markov Chains Glen Wang glenw@uchicago.edu Splash! Chicago: Winter Cascade 2012 Lecture 1: What is Randomness? What is randomness? Can you think of some examples
More informationOn-the-Fly Customization of Automated Essay Scoring
Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,
More informationSoftware Maintenance
1 What is Software Maintenance? Software Maintenance is a very broad activity that includes error corrections, enhancements of capabilities, deletion of obsolete capabilities, and optimization. 2 Categories
More informationAsian Development Bank - International Initiative for Impact Evaluation. Video Lecture Series
Asian Development Bank - International Initiative for Impact Evaluation Video Lecture Series Impact evaluations of social protection- Project and Programmes: considering cash transfers and educational
More informationSystem Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering
More informationSTA 225: Introductory Statistics (CT)
Marshall University College of Science Mathematics Department STA 225: Introductory Statistics (CT) Course catalog description A critical thinking course in applied statistical reasoning covering basic
More informationSession 2B From understanding perspectives to informing public policy the potential and challenges for Q findings to inform survey design
Session 2B From understanding perspectives to informing public policy the potential and challenges for Q findings to inform survey design Paper #3 Five Q-to-survey approaches: did they work? Job van Exel
More informationAnalyzing sentiments in tweets for Tesla Model 3 using SAS Enterprise Miner and SAS Sentiment Analysis Studio
SCSUG Student Symposium 2016 Analyzing sentiments in tweets for Tesla Model 3 using SAS Enterprise Miner and SAS Sentiment Analysis Studio Praneth Guggilla, Tejaswi Jha, Goutam Chakraborty, Oklahoma State
More informationIntroduction to Causal Inference. Problem Set 1. Required Problems
Introduction to Causal Inference Problem Set 1 Professor: Teppei Yamamoto Due Friday, July 15 (at beginning of class) Only the required problems are due on the above date. The optional problems will not
More informationMathematics process categories
Mathematics process categories All of the UK curricula define multiple categories of mathematical proficiency that require students to be able to use and apply mathematics, beyond simple recall of facts
More informationMiami-Dade County Public Schools
ENGLISH LANGUAGE LEARNERS AND THEIR ACADEMIC PROGRESS: 2010-2011 Author: Aleksandr Shneyderman, Ed.D. January 2012 Research Services Office of Assessment, Research, and Data Analysis 1450 NE Second Avenue,
More informationEDIT 576 (2 credits) Mobile Learning and Applications Fall Semester 2015 August 31 October 18, 2015 Fully Online Course
GEORGE MASON UNIVERSITY COLLEGE OF EDUCATION AND HUMAN DEVELOPMENT INSTRUCTIONAL DESIGN AND TECHNOLOGY PROGRAM EDIT 576 (2 credits) Mobile Learning and Applications Fall Semester 2015 August 31 October
More informationCSL465/603 - Machine Learning
CSL465/603 - Machine Learning Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Introduction CSL465/603 - Machine Learning 1 Administrative Trivia Course Structure 3-0-2 Lecture Timings Monday 9.55-10.45am
More informationCal s Dinner Card Deals
Cal s Dinner Card Deals Overview: In this lesson students compare three linear functions in the context of Dinner Card Deals. Students are required to interpret a graph for each Dinner Card Deal to help
More informationLearning Methods in Multilingual Speech Recognition
Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex
More informationpreassessment was administered)
5 th grade Math Friday, 3/19/10 Integers and Absolute value (Lesson taught during the same period that the integer preassessment was administered) What students should know and be able to do at the end
More informationEDIT 576 DL1 (2 credits) Mobile Learning and Applications Fall Semester 2014 August 25 October 12, 2014 Fully Online Course
GEORGE MASON UNIVERSITY COLLEGE OF EDUCATION AND HUMAN DEVELOPMENT GRADUATE SCHOOL OF EDUCATION INSTRUCTIONAL DESIGN AND TECHNOLOGY PROGRAM EDIT 576 DL1 (2 credits) Mobile Learning and Applications Fall
More informationGrammar Lesson Plan: Yes/No Questions with No Overt Auxiliary Verbs
Grammar Lesson Plan: Yes/No Questions with No Overt Auxiliary Verbs DIALOGUE: Hi Armando. Did you get a new job? No, not yet. Are you still looking? Yes, I am. Have you had any interviews? Yes. At the
More informationDefragmenting Textual Data by Leveraging the Syntactic Structure of the English Language
Defragmenting Textual Data by Leveraging the Syntactic Structure of the English Language Nathaniel Hayes Department of Computer Science Simpson College 701 N. C. St. Indianola, IA, 50125 nate.hayes@my.simpson.edu
More informationMaximizing Learning Through Course Alignment and Experience with Different Types of Knowledge
Innov High Educ (2009) 34:93 103 DOI 10.1007/s10755-009-9095-2 Maximizing Learning Through Course Alignment and Experience with Different Types of Knowledge Phyllis Blumberg Published online: 3 February
More informationActive Learning. Yingyu Liang Computer Sciences 760 Fall
Active Learning Yingyu Liang Computer Sciences 760 Fall 2017 http://pages.cs.wisc.edu/~yliang/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed by Mark Craven,
More informationINTERMEDIATE ALGEBRA Course Syllabus
INTERMEDIATE ALGEBRA Course Syllabus This syllabus gives a detailed explanation of the course procedures and policies. You are responsible for this information - ask your instructor if anything is unclear.
More informationMulti-label classification via multi-target regression on data streams
Mach Learn (2017) 106:745 770 DOI 10.1007/s10994-016-5613-5 Multi-label classification via multi-target regression on data streams Aljaž Osojnik 1,2 Panče Panov 1 Sašo Džeroski 1,2,3 Received: 26 April
More information*Net Perceptions, Inc West 78th Street Suite 300 Minneapolis, MN
From: AAAI Technical Report WS-98-08. Compilation copyright 1998, AAAI (www.aaai.org). All rights reserved. Recommender Systems: A GroupLens Perspective Joseph A. Konstan *t, John Riedl *t, AI Borchers,
More informationCreating a Test in Eduphoria! Aware
in Eduphoria! Aware Login to Eduphoria using CHROME!!! 1. LCS Intranet > Portals > Eduphoria From home: LakeCounty.SchoolObjects.com 2. Login with your full email address. First time login password default
More informationThe Indices Investigations Teacher s Notes
The Indices Investigations Teacher s Notes These activities are for students to use independently of the teacher to practise and develop number and algebra properties.. Number Framework domain and stage:
More informationChapter 4 - Fractions
. Fractions Chapter - Fractions 0 Michelle Manes, University of Hawaii Department of Mathematics These materials are intended for use with the University of Hawaii Department of Mathematics Math course
More informationWE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT
WE GAVE A LAWYER BASIC MATH SKILLS, AND YOU WON T BELIEVE WHAT HAPPENED NEXT PRACTICAL APPLICATIONS OF RANDOM SAMPLING IN ediscovery By Matthew Verga, J.D. INTRODUCTION Anyone who spends ample time working
More informationMay To print or download your own copies of this document visit Name Date Eurovision Numeracy Assignment
1. An estimated one hundred and twenty five million people across the world watch the Eurovision Song Contest every year. Write this number in figures. 2. Complete the table below. 2004 2005 2006 2007
More informationCollege Pricing and Income Inequality
College Pricing and Income Inequality Zhifeng Cai U of Minnesota and FRB Minneapolis Jonathan Heathcote FRB Minneapolis OSU, November 15 2016 The views expressed herein are those of the authors and not
More informationCS 1103 Computer Science I Honors. Fall Instructor Muller. Syllabus
CS 1103 Computer Science I Honors Fall 2016 Instructor Muller Syllabus Welcome to CS1103. This course is an introduction to the art and science of computer programming and to some of the fundamental concepts
More informationOntologies vs. classification systems
Ontologies vs. classification systems Bodil Nistrup Madsen Copenhagen Business School Copenhagen, Denmark bnm.isv@cbs.dk Hanne Erdman Thomsen Copenhagen Business School Copenhagen, Denmark het.isv@cbs.dk
More informationCollege Pricing. Ben Johnson. April 30, Abstract. Colleges in the United States price discriminate based on student characteristics
College Pricing Ben Johnson April 30, 2012 Abstract Colleges in the United States price discriminate based on student characteristics such as ability and income. This paper develops a model of college
More information12- A whirlwind tour of statistics
CyLab HT 05-436 / 05-836 / 08-534 / 08-734 / 19-534 / 19-734 Usable Privacy and Security TP :// C DU February 22, 2016 y & Secu rivac rity P le ratory bo La Lujo Bauer, Nicolas Christin, and Abby Marsh
More informationTrends in College Pricing
Trends in College Pricing 2009 T R E N D S I N H I G H E R E D U C A T I O N S E R I E S T R E N D S I N H I G H E R E D U C A T I O N S E R I E S Highlights Published Tuition and Fee and Room and Board
More informationFirms and Markets Saturdays Summer I 2014
PRELIMINARY DRAFT VERSION. SUBJECT TO CHANGE. Firms and Markets Saturdays Summer I 2014 Professor Thomas Pugel Office: Room 11-53 KMC E-mail: tpugel@stern.nyu.edu Tel: 212-998-0918 Fax: 212-995-4212 This
More informationPragmatic Use Case Writing
Pragmatic Use Case Writing Presented by: reducing risk. eliminating uncertainty. 13 Stonebriar Road Columbia, SC 29212 (803) 781-7628 www.evanetics.com Copyright 2006-2008 2000-2009 Evanetics, Inc. All
More informationStudents Understanding of Graphical Vector Addition in One and Two Dimensions
Eurasian J. Phys. Chem. Educ., 3(2):102-111, 2011 journal homepage: http://www.eurasianjournals.com/index.php/ejpce Students Understanding of Graphical Vector Addition in One and Two Dimensions Umporn
More informationEvolutive Neural Net Fuzzy Filtering: Basic Description
Journal of Intelligent Learning Systems and Applications, 2010, 2: 12-18 doi:10.4236/jilsa.2010.21002 Published Online February 2010 (http://www.scirp.org/journal/jilsa) Evolutive Neural Net Fuzzy Filtering:
More informationMINUTE TO WIN IT: NAMING THE PRESIDENTS OF THE UNITED STATES
MINUTE TO WIN IT: NAMING THE PRESIDENTS OF THE UNITED STATES THE PRESIDENTS OF THE UNITED STATES Project: Focus on the Presidents of the United States Objective: See how many Presidents of the United States
More informationData Structures and Algorithms
CS 3114 Data Structures and Algorithms 1 Trinity College Library Univ. of Dublin Instructor and Course Information 2 William D McQuain Email: Office: Office Hours: wmcquain@cs.vt.edu 634 McBryde Hall see
More informationNumeracy Medium term plan: Summer Term Level 2C/2B Year 2 Level 2A/3C
Numeracy Medium term plan: Summer Term Level 2C/2B Year 2 Level 2A/3C Using and applying mathematics objectives (Problem solving, Communicating and Reasoning) Select the maths to use in some classroom
More informationQuantitative analysis with statistics (and ponies) (Some slides, pony-based examples from Blase Ur)
Quantitative analysis with statistics (and ponies) (Some slides, pony-based examples from Blase Ur) 1 Interviews, diary studies Start stats Thursday: Ethics/IRB Tuesday: More stats New homework is available
More informationInteractive Whiteboard
50 Graphic Organizers for the Interactive Whiteboard Whiteboard-ready graphic organizers for reading, writing, math, and more to make learning engaging and interactive by Jennifer Jacobson & Dottie Raymer
More informationAP Statistics Summer Assignment 17-18
AP Statistics Summer Assignment 17-18 Welcome to AP Statistics. This course will be unlike any other math class you have ever taken before! Before taking this course you will need to be competent in basic
More informationMany instructors use a weighted total to calculate their grades. This lesson explains how to set up a weighted total using categories.
Weighted Totals Many instructors use a weighted total to calculate their grades. This lesson explains how to set up a weighted total using categories. Set up your grading scheme in your syllabus Your syllabus
More information