Decision Tree. Machine Learning. Hamid Beigy. Sharif University of Technology. Fall 1396
1 Decision Tree Machine Learning Hamid Beigy Sharif University of Technology Fall 1396 Hamid Beigy (Sharif University of Technology) Decision Tree Fall / 24
2 Table of contents
1. Introduction
2. Decision tree classification
3. Building decision trees
4. ID3 Algorithm
10 Introduction
1. The decision tree is a classic and natural model of learning.
2. It is closely related to the notion of divide and conquer.
3. A decision tree partitions the instance space into axis-parallel regions, each labeled with a class value.
4. Why decision trees?
   - Interpretable; popular in medical applications because they mimic the way a doctor thinks.
   - Can model discrete outcomes nicely.
   - Can be very powerful; can be as complex as you need them to be.
   - C4.5 and CART decision trees are very popular.
14 Decision tree classification
1. Structure of decision trees
   - Each internal node tests an attribute.
   - Each branch corresponds to an attribute value.
   - Each leaf node assigns a classification.
15 Decision tree classification
2. Decision tree for PlayTennis

Outlook? [9+,5-]
  Sunny [2+,3-]: Humidity?
    High: No [0+,3-]
    Normal: Yes [2+,0-]
  Overcast [4+,0-]: Yes
  Rain [3+,2-]: Wind?
    Strong: No [0+,2-]
    Light: Yes [3+,0-]
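Classifying an example with the PlayTennis tree amounts to following one branch per tested attribute until a leaf is reached. The following is a minimal sketch, assuming a nested-dict encoding of the tree; the names TREE and classify are illustrative and do not come from the lecture.

```python
# Hypothetical nested-dict encoding of the PlayTennis tree:
# an internal node is a dict, a leaf is a class label (string).
TREE = {
    "attribute": "Outlook",
    "branches": {
        "Sunny": {
            "attribute": "Humidity",
            "branches": {"High": "No", "Normal": "Yes"},
        },
        "Overcast": "Yes",
        "Rain": {
            "attribute": "Wind",
            "branches": {"Strong": "No", "Light": "Yes"},
        },
    },
}


def classify(tree, example):
    """Walk from the root to a leaf, testing one attribute per internal node."""
    node = tree
    while isinstance(node, dict):
        value = example[node["attribute"]]  # value of the tested attribute
        node = node["branches"][value]      # follow the matching branch
    return node


print(classify(TREE, {"Outlook": "Sunny", "Humidity": "High", "Wind": "Light"}))
```

Each lookup corresponds to one internal-node test, so the number of tests is bounded by the depth of the tree.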
16 Decision surface
The idea of binary classification trees is not unlike that of the histogram: partition the feature space [...]
[Figure 9.2: (a) Histogram classifier; (b) Linear classifier; (c) Tree classifier.]
17 Building decision trees
1. Decision trees recursively subdivide the feature space.

[...] poor generalization characteristics, and then prune this tree, to avoid overfitting.

Growing Trees
The growing process is based on recursively subdividing the feature space. Usually the subdivisions are splits of existing regions into two smaller regions (i.e., binary splits). For simplicity, the splits are perpendicular to one of the feature axes. An example of such a construction is depicted in Figure 9.3.

[Figure 9.3: Growing a recursive binary tree (X = [0, 1]^2).]

Often the splitting process is based on the training data, and is designed to separate data with different labels as much as possible. In such constructions the splits, and hence the tree structure itself, are data dependent. This causes major difficulties for the analysis (and tuning) of these methods. Alternatively, the splitting and subdivision could be taken independent of the training data. The latter approach is the one we are going to investigate in detail, since it is more amenable to analysis, and we will consider Dyadic Decision Trees and Recursive Dyadic Partitions (depicted in Figure 9.4) in particular.
18 Building decision trees
1. Decision trees recursively subdivide the feature space.
2. The test variable specifies the division.

Until now we have been referring to trees, but did not make clear how trees relate to partitions. It turns out that any decision tree can be associated with a partition of the input space X and vice versa. In particular, a Recursive Dyadic Partition (RDP) can be associated with a (binary) tree. In fact, this is the most efficient way of describing an RDP. Figure 9.4 illustrates the procedure. Each leaf of the tree corresponds to a cell of the partition. The nodes in the tree correspond to the various partition cells that are generated in the construction of the tree. The orientation of the dyadic split alternates between the levels of the tree (in the example of Figure 9.4, at the root level the split is done along the horizontal axis, at the level below that (the level of nodes 2 and 3) the split is done along the vertical axis, and so on). The tree is called dyadic because the splits of cells are always at the midpoint along one coordinate axis, and consequently the side lengths of all cells are dyadic (i.e., powers of 2).

[Figure 9.4: Example of Recursive Dyadic Partition (RDP) growing (X = [0, 1]^2).]

In the following we are going to consider the 2-dimensional case, but all the results can be [...]
19 Building decision trees (example)
Training examples for the concept PlayTennis

Day  Outlook   Temperature  Humidity  Wind    PlayTennis?
1    Sunny     Hot          High      Light   No
2    Sunny     Hot          High      Strong  No
3    Overcast  Hot          High      Light   Yes
4    Rain      Mild         High      Light   Yes
5    Rain      Cool         Normal    Light   Yes
6    Rain      Cool         Normal    Strong  No
7    Overcast  Cool         Normal    Strong  Yes
8    Sunny     Mild         High      Light   No
9    Sunny     Cool         Normal    Light   Yes
10   Rain      Mild         Normal    Light   Yes
11   Sunny     Mild         Normal    Strong  Yes
12   Overcast  Mild         High      Strong  Yes
13   Overcast  Hot          Normal    Light   Yes
14   Rain      Mild         High      Strong  No

ID3: Build-DT using Gain(·).
How will ID3 construct a decision tree?
26 Building decision trees (cont.)
How to build a decision tree?
1. Start at the top of the tree.
2. Grow it by splitting on attributes one by one.
3. Assign leaf nodes.
4. When we get to the bottom, prune the tree to prevent overfitting.

How do we choose a test variable for an internal node?
Choosing different measures results in different algorithms. We describe ID3.

Figure: node impurity measures (Gini index, entropy, misclassification error) for two-class classification, as a function of the proportion p in class 2. Cross-entropy has been scaled to pass through ...
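The three impurity measures named in the figure can be written directly for the two-class case. The following is a small sketch, not taken from the slides: the function names are illustrative, p is the proportion of one class, and entropy is measured in bits.

```python
import math


def entropy(p):
    """Two-class entropy in bits for class proportion p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure node has zero entropy (0 * log 0 is taken as 0)
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def gini(p):
    """Two-class Gini index: 1 - p^2 - (1 - p)^2 = 2p(1 - p)."""
    return 2 * p * (1 - p)


def misclassification(p):
    """Error rate of predicting the majority class."""
    return min(p, 1 - p)


for p in (0.0, 0.1, 0.25, 0.5):
    print(p, round(entropy(p), 3), round(gini(p), 3), misclassification(p))
```

All three measures are zero for a pure node and largest at p = 0.5, which is why any of them can serve as a splitting criterion.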
28 Building decision trees (cont.)
ID3 uses information gain to choose a test variable for an internal node.
The information gain of D relative to attribute A is the expected reduction in entropy due to splitting on A:

$$\mathrm{Gain}(D, A) = H(D) - \sum_{v \in \mathrm{values}(A)} \frac{|D_v|}{|D|} \, H(D_v)$$

where $D_v = \{x \in D : x.A = v\}$ is the set of examples in D for which attribute A has value v.
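The gain formula translates almost line for line into code. The following is a minimal sketch under stated assumptions: examples are dicts, the names entropy and information_gain are illustrative, and the toy rows only reproduce the Humidity class counts of the PlayTennis data (High: 3 Yes, 4 No; Normal: 6 Yes, 1 No).

```python
import math
from collections import Counter


def entropy(labels):
    """H(D) in bits, computed from the class labels of the examples in D."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Gain(D, A) = H(D) - sum over v of |D_v|/|D| * H(D_v)."""
    gain = entropy([r[target] for r in rows])
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]  # D_v
        gain -= len(subset) / len(rows) * entropy(subset)
    return gain


# Toy rows matching the Humidity counts in PlayTennis (assumed encoding).
ROWS = (
    [{"Humidity": "High", "PlayTennis": "Yes"}] * 3
    + [{"Humidity": "High", "PlayTennis": "No"}] * 4
    + [{"Humidity": "Normal", "PlayTennis": "Yes"}] * 6
    + [{"Humidity": "Normal", "PlayTennis": "No"}] * 1
)

print(round(information_gain(ROWS, "Humidity", "PlayTennis"), 4))
```

Computed without intermediate rounding, the gain is about 0.1518 bits; hand calculations that round each entropy to three digits give 0.151.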
29 ID3 Algorithm
An illustrative example using the PlayTennis training examples (logarithms are base 2):

$H(D) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} = 0.94$ bits
$H(D, \text{Humidity} = \text{High}) = -\frac{3}{7}\log_2\frac{3}{7} - \frac{4}{7}\log_2\frac{4}{7} = 0.985$ bits
$H(D, \text{Humidity} = \text{Normal}) = -\frac{6}{7}\log_2\frac{6}{7} - \frac{1}{7}\log_2\frac{1}{7} = 0.592$ bits
$\mathrm{Gain}(D, \text{Humidity}) = 0.94 - \frac{7}{14}(0.985) - \frac{7}{14}(0.592) = 0.151$ bits
$\mathrm{Gain}(D, \text{Wind}) = 0.94 - \frac{8}{14}(0.811) - \frac{6}{14}(1.0) = 0.048$ bits
30 ID3 Algorithm
Constructing a decision tree for PlayTennis using ID3: selecting the root attribute

Gain(D, Humidity) = 0.151 bits
Gain(D, Wind) = 0.048 bits
Gain(D, Temperature) = 0.029 bits
Gain(D, Outlook) = 0.246 bits

Outlook has the highest gain, so it becomes the root of the tree:

Outlook [9+,5-]
  Sunny [2+,3-]
  Overcast [4+,0-]
  Rain [3+,2-]
31 ID3 Algorithm
Selecting the test attribute for the Sunny branch:

Gain(D_Sunny, Humidity) = 0.97 bits
Gain(D_Sunny, Wind) = 0.02 bits
Gain(D_Sunny, Temperature) = 0.57 bits

Resulting tree:

Outlook? [9+,5-]
  Sunny [2+,3-]: Humidity?
    High: No [0+,3-]
    Normal: Yes [2+,0-]
  Overcast [4+,0-]: Yes
  Rain [3+,2-]: Wind?
    Strong: No [0+,2-]
    Light: Yes [3+,0-]

Incremental learning of decision trees
- ID3 cannot be trained incrementally.
- ID4, ID5, and ID5R are examples of incremental induction of decision trees.
- Reading: P.E. Utgoff, Incremental Induction of Decision Trees, Machine Learning, Vol. 4, p. 186, 1989.
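The whole construction can be sketched as a short recursive procedure: pick the attribute with the highest gain, split, and recurse until a subset is pure. This is an illustrative sketch rather than the lecture's implementation; the dict-based tree encoding, the function names, and the majority-label fallback when attributes run out are assumptions.

```python
import math
from collections import Counter

COLUMNS = ("Outlook", "Temperature", "Humidity", "Wind", "PlayTennis")
RAW = """
Sunny Hot High Light No
Sunny Hot High Strong No
Overcast Hot High Light Yes
Rain Mild High Light Yes
Rain Cool Normal Light Yes
Rain Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Light No
Sunny Cool Normal Light Yes
Rain Mild Normal Light Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Light Yes
Rain Mild High Strong No
"""
# The 14 PlayTennis training examples, one dict per day.
DATA = [dict(zip(COLUMNS, line.split())) for line in RAW.strip().splitlines()]


def entropy(rows, target):
    """Entropy in bits of the target labels in rows."""
    n = len(rows)
    counts = Counter(r[target] for r in rows)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def gain(rows, attribute, target):
    """Expected reduction in entropy from splitting rows on attribute."""
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset, target)
    return entropy(rows, target) - remainder


def id3(rows, attributes, target):
    """Return a leaf label, or {attribute: {value: subtree}}."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:   # pure subset: make a leaf
        return labels[0]
    if not attributes:          # nothing left to test: majority label (assumption)
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: gain(rows, a, target))
    rest = [a for a in attributes if a != best]
    return {
        best: {
            value: id3([r for r in rows if r[best] == value], rest, target)
            for value in sorted({r[best] for r in rows})
        }
    }


tree = id3(DATA, ["Outlook", "Temperature", "Humidity", "Wind"], "PlayTennis")
print(tree)
```

On these 14 examples the sketch selects Outlook at the root, Humidity under Sunny, and Wind under Rain, matching the tree derived by hand.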
41 Inductive Bias in ID3
Types of biases
1. Preference (search) bias: puts a priority on which hypotheses are chosen.
2. Language bias: puts a restriction on the set of hypotheses considered.

Which bias is better?
1. Preference bias is more desirable,
2. because the learner works within a complete hypothesis space that is assured to contain the unknown concept.

Inductive bias of ID3
1. Shorter trees are preferred over longer trees.
2. Occam's razor: prefer the simplest hypothesis that fits the data.
3. Trees that place high-information-gain attributes close to the root are preferred over those that do not.
56 Overfitting in ID3
How can we avoid overfitting?
1. Prevention
   - Stop training (growing) before the tree reaches the point where it overfits.
   - Select attributes that are relevant (i.e., will be useful in the decision tree).
   - Requires some predictive measure of relevance.
2. Avoidance
   - Allow the tree to overfit, then improve its generalization capability.
   - Hold out a validation set (test set).
3. Detection and recovery
   - Let the problem happen, detect when it does, and recover afterward.
   - Build the model, then remove (prune) elements that contribute to overfitting.

How to select the best tree?
1. Training and validation set: use a separate set of examples (distinct from the training set) for testing.
2. Statistical test: use all data for training, but apply a statistical test to estimate the overfitting.
3. Define a measure of complexity: halt growth when this measure is minimized.
59 Pruning algorithms
Reduced-Error Pruning (cross-validation approach)

Split the data into training and validation sets.

Prune(T, node)
  Remove the subtree rooted at node.
  Make node a leaf (with the majority label of the associated examples).

Reduced-Error-Pruning(D)
  Partition D into D_train (training / growing) and D_validation (validation / pruning).
  Build tree T using ID3 on D_train.
  UNTIL accuracy on D_validation decreases DO
    FOR each non-leaf node candidate in T
      Temp[candidate] <- Prune(T, candidate)
      Accuracy[candidate] <- Test(Temp[candidate], D_validation)
    T <- the tree in Temp with the best value of Accuracy (best increase; greedy)
  RETURN (pruned) T

The effect of reduced-error pruning on error:
[Figure: error versus size (number of nodes in tree), showing true error and training error.]
[Figure: accuracy versus size (number of nodes in tree), on training data, on test data, and for the post-pruned tree.]
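The greedy loop of reduced-error pruning can be sketched as follows. This is an illustration under assumptions, not the lecture's code: each internal node is a dict that stores the majority label of its training examples, the example tree and validation rows are hypothetical, and pruning continues as long as validation accuracy does not decrease.

```python
import copy


def classify(tree, example):
    """Follow branches to a leaf; unseen values fall back to the majority label."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(example[tree["attr"]], tree["majority"])
    return tree


def accuracy(tree, rows, target):
    return sum(classify(tree, r) == r[target] for r in rows) / len(rows)


def internal_paths(tree, path=()):
    """Yield the branch-value path leading to every non-leaf node."""
    if isinstance(tree, dict):
        yield path
        for value, child in tree["branches"].items():
            yield from internal_paths(child, path + (value,))


def prune(tree, path):
    """Copy the tree and replace the node at path by its majority-label leaf."""
    tree = copy.deepcopy(tree)
    if not path:
        return tree["majority"]
    parent = tree
    for value in path[:-1]:
        parent = parent["branches"][value]
    parent["branches"][path[-1]] = parent["branches"][path[-1]]["majority"]
    return tree


def reduced_error_pruning(tree, validation, target):
    best_acc = accuracy(tree, validation, target)
    while isinstance(tree, dict):
        candidates = [prune(tree, p) for p in internal_paths(tree)]
        scored = [(accuracy(c, validation, target), c) for c in candidates]
        acc, best = max(scored, key=lambda s: s[0])
        if acc < best_acc:  # every candidate hurts validation accuracy: stop
            break
        tree, best_acc = best, acc
    return tree


# Hypothetical overfit tree: the Rain branch tests Temperature.
TREE = {"attr": "Outlook", "majority": "Yes", "branches": {
    "Sunny": {"attr": "Humidity", "majority": "No",
              "branches": {"High": "No", "Normal": "Yes"}},
    "Overcast": "Yes",
    "Rain": {"attr": "Temperature", "majority": "Yes",
             "branches": {"Mild": "Yes", "Cool": "No"}},
}}
VALIDATION = [  # hypothetical held-out examples
    {"Outlook": "Sunny", "Humidity": "High", "Temperature": "Hot", "PlayTennis": "No"},
    {"Outlook": "Sunny", "Humidity": "Normal", "Temperature": "Cool", "PlayTennis": "Yes"},
    {"Outlook": "Overcast", "Humidity": "High", "Temperature": "Hot", "PlayTennis": "Yes"},
    {"Outlook": "Rain", "Humidity": "Normal", "Temperature": "Cool", "PlayTennis": "Yes"},
    {"Outlook": "Rain", "Humidity": "High", "Temperature": "Mild", "PlayTennis": "Yes"},
]

pruned = reduced_error_pruning(TREE, VALIDATION, "PlayTennis")
print(pruned["branches"]["Rain"])
```

In this toy case the Rain subtree is collapsed into a leaf because doing so raises validation accuracy, while the Sunny subtree is kept because pruning it would lower accuracy.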
68 Pruning algorithms
1. Reduced Error Pruning
2. Pessimistic Error Pruning
3. Minimum Error Pruning
4. Critical Value Pruning
5. Cost-Complexity Pruning

Reading
1. F. Esposito, D. Malerba, and G. Semeraro, A Comparative Analysis of Methods for Pruning Decision Trees, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 19, No. 5, May.
2. S. R. Safavian and D. Landgrebe, A Survey of Decision Tree Classifier Methodology, IEEE Trans. on Systems, Man, and Cybernetics, Vol. 21, No. 3.
Continuous Valued Attributes
Two methods for handling continuous attributes
1 Discretization (e.g., histogramming): break real-valued attributes into ranges in advance.
Example: high = {Temp > 35C}, med = {10C < Temp ≤ 35C}, low = {Temp ≤ 10C}
2 Using thresholds for splitting nodes
Example: the test A ≤ a produces the subsets A ≤ a and A > a.
Information gain is calculated the same way as for discrete splits.
How to find the split with the highest Gain?
Example: [Table: length values with their labels and the candidate thresholds; entries not recoverable]
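One common way to search for the threshold with the highest Gain can be sketched as follows. This is a minimal illustration, not the lecture's own procedure: it assumes that candidate thresholds are the midpoints between consecutive distinct sorted values, and the names `entropy`, `best_threshold`, and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Return (a, gain) for the test A <= a with the highest information gain."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy(labels)
    best = (None, 0.0)
    for i in range(1, n):
        # Assumption: only midpoints between distinct neighbouring values are tried.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        a = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]    # examples with A <= a
        right = [y for _, y in pairs[i:]]   # examples with A > a
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best[1]:
            best = (a, gain)
    return best

# Hypothetical toy data (not the slide's lost "length / label" table).
toy_values = [1, 2, 3, 10, 11, 12]
toy_labels = ["N", "N", "N", "Y", "Y", "Y"]
threshold, threshold_gain = best_threshold(toy_values, toy_labels)
```

On this toy data the classes separate cleanly between 3 and 10, so the search settles on the midpoint between them.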
Missing Data
Problem: What if some examples are missing values of A?
Attribute values may be unknown during training or testing (e.g., Blood-Test = ?), sometimes because the measurement has low priority or its cost is too high.
Training: evaluate Gain(D, A) where for some x ∈ D, a value for A is not given.
Testing: classify a new example x without knowing the value of A.
Solutions: incorporate a guess into the calculation of Gain(D, A).
Consider the dataset:

Day  Outlook   Temperature  Humidity  Wind    PlayTennis?
1    Sunny     Hot          High      Light   No
2    Sunny     Hot          High      Strong  No
3    Overcast  Hot          High      Light   Yes
4    Rain      Mild         High      Light   Yes
5    Rain      Cool         Normal    Light   Yes
6    Rain      Cool         Normal    Strong  No
7    Overcast  Cool         Normal    Strong  Yes
8    Sunny     Mild         ???       Light   No
9    Sunny     Cool         Normal    Light   Yes
10   Rain      Mild         Normal    Light   Yes
11   Sunny     Mild         Normal    Strong  Yes
12   Overcast  Mild         High      Strong  Yes
13   Overcast  Hot          Normal    Light   Yes
14   Rain      Mild         High      Strong  No

What is the decision tree?
[Figure: root [9+, 5-] split on Outlook: Sunny [2+, 3-], Overcast [4+, 0-], Rain [3+, 2-]]
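As an illustration of incorporating a guess into Gain(D, A), the sketch below fills the missing Humidity value of day 8 before computing the gain. The guessing rule used here (the most common value among training examples with the same class label) is an assumption chosen for illustration, one of several possible strategies; `fill_missing` and `gain` are hypothetical helper names. Only the Outlook, Humidity, and PlayTennis columns of the table are used.

```python
import math
from collections import Counter

# (Outlook, Humidity, PlayTennis) for days 1..14; None marks the missing value on day 8.
rows = [
    ("Sunny", "High", "No"), ("Sunny", "High", "No"), ("Overcast", "High", "Yes"),
    ("Rain", "High", "Yes"), ("Rain", "Normal", "Yes"), ("Rain", "Normal", "No"),
    ("Overcast", "Normal", "Yes"), ("Sunny", None, "No"), ("Sunny", "Normal", "Yes"),
    ("Rain", "Normal", "Yes"), ("Sunny", "Normal", "Yes"), ("Overcast", "High", "Yes"),
    ("Overcast", "Normal", "Yes"), ("Rain", "High", "No"),
]

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain(data, col):
    """Information gain of the attribute in column `col`; the label is the last column."""
    n = len(data)
    remainder = 0.0
    for v in set(r[col] for r in data):
        subset = [r[-1] for r in data if r[col] == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy([r[-1] for r in data]) - remainder

def fill_missing(data, col):
    """Assumed rule: replace None by the most common value among same-class examples."""
    filled = []
    for r in data:
        if r[col] is None:
            same_class = [q[col] for q in data if q[-1] == r[-1] and q[col] is not None]
            guess = Counter(same_class).most_common(1)[0][0]
            r = r[:col] + (guess,) + r[col + 1:]
        filled.append(r)
    return filled

filled = fill_missing(rows, 1)
humidity_gain = gain(filled, 1)   # Gain(D, Humidity) after the guess
outlook_gain = gain(filled, 0)    # Gain(D, Outlook), unaffected by the guess
```

With this rule, day 8 receives the Humidity value that is most frequent among the other "No" examples, and the gain is then computed exactly as for complete data.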
Attributes with Many Values
Problem: If an attribute has many values, such as Date, Gain() will select it (why?)
One approach: use GainRatio instead of Gain

$$Gain(D, A) = H(D) - \sum_{v \in values(A)} \frac{|D_v|}{|D|} H(D_v)$$

$$GainRatio(D, A) = \frac{Gain(D, A)}{SplitInformation(D, A)}$$

$$SplitInformation(D, A) = - \sum_{v \in values(A)} \frac{|D_v|}{|D|} \log \frac{|D_v|}{|D|}$$

SplitInformation grows with the number of values |values(A)|, i.e., it penalizes attributes with more values.
What is its inductive bias? A preference bias (for a lower branching factor) expressed via GainRatio(.)
Alternative attribute selection measure: Gini Index
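The effect of GainRatio on a many-valued attribute can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the four-example dataset, the attribute names `date` and `wind`, and the convention of returning 0 when SplitInformation is 0 are all hypothetical choices, not taken from the lecture.

```python
import math
from collections import Counter

def entropy(items):
    """Shannon entropy (base 2) of the empirical distribution of `items`."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())

def gain(values, labels):
    """Gain(D, A) = H(D) - sum_v |D_v|/|D| * H(D_v)."""
    n = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += (len(subset) / n) * entropy(subset)
    return entropy(labels) - remainder

def gain_ratio(values, labels):
    """GainRatio(D, A) = Gain(D, A) / SplitInformation(D, A)."""
    split_info = entropy(values)  # SplitInformation is the entropy of the attribute values
    if split_info == 0:
        return 0.0                # assumed convention for a single-valued attribute
    return gain(values, labels) / split_info

# Hypothetical toy data: `date` is unique per example, `wind` has two values.
labels = ["Yes", "Yes", "No", "No"]
date = ["d1", "d2", "d3", "d4"]
wind = ["Light", "Light", "Strong", "Strong"]

date_gain, wind_gain = gain(date, labels), gain(wind, labels)
date_ratio, wind_ratio = gain_ratio(date, labels), gain_ratio(wind, labels)
```

Both attributes reach the same Gain on this data, because every subset they produce is pure, but the Date-like attribute has the larger SplitInformation and therefore the smaller GainRatio.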
Handling Attributes With Different Costs
Problem: In some learning tasks the instance attributes may have associated costs.
Solutions
1 Extended ID3: $\frac{Gain(S, A)}{Cost(A)}$
2 Tan and Schlimmer: $\frac{Gain^2(S, A)}{Cost(A)}$
3 Nunez: $\frac{2^{Gain(S, A)} - 1}{(Cost(A) + 1)^w}$, where $w \in [0, 1]$ is a constant.
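The three cost-sensitive selection measures can be compared side by side with a small sketch. It assumes each measure is a ratio of a gain term to a cost term (the fraction bars are not visible in the slide text, so this reading is an assumption), and the function names and the example gains and costs are hypothetical.

```python
def extended_id3(gain, cost):
    """Assumed form: Gain(S, A) / Cost(A)."""
    return gain / cost

def tan_schlimmer(gain, cost):
    """Gain^2(S, A) / Cost(A)."""
    return gain ** 2 / cost

def nunez(gain, cost, w):
    """(2^Gain(S, A) - 1) / (Cost(A) + 1)^w, with constant w in [0, 1]."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return (2 ** gain - 1) / (cost + 1) ** w

# Hypothetical attributes: a cheap, weakly informative test versus an expensive, strong one.
cheap = {"gain": 0.5, "cost": 1.0}
costly = {"gain": 1.0, "cost": 8.0}

scores = {
    name: (extended_id3(a["gain"], a["cost"]),
           tan_schlimmer(a["gain"], a["cost"]),
           nunez(a["gain"], a["cost"], w=1.0))
    for name, a in (("cheap", cheap), ("costly", costly))
}
```

In the Nunez measure, w controls how strongly cost matters: with w = 0 the denominator is 1 and cost is ignored, while w = 1 applies the full cost penalty.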
Regression Tree
In a regression tree, the goodness of a split is measured by the mean square error from the estimated value, computed over the subset of X reaching node m. This is the appropriate impurity measure for regression.
[Figure: a regression tree node testing attribute A, with True and False branches leading to leaf functions F_1 and F_2]
References
1 L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Belmont, CA; Wadsworth International Group, (1984).
2 D. Malerba, F. Esposito, M. Ceci, and A. Appice, Top-Down Induction of Model Trees with Regression and Splitting Nodes, IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 25, No. 5, May.
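A split criterion based on mean square error can be sketched for a single numerical attribute as follows. This is an illustration under assumptions, not the lecture's algorithm: each side of the split is estimated by the mean of its targets, candidate thresholds are midpoints between distinct sorted values, and `mse`, `best_regression_split`, and the toy data are hypothetical.

```python
def mse(ys):
    """Mean square error of targets `ys` around their mean (the estimated value)."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)

def best_regression_split(xs, ys):
    """Return (a, error) for the test x <= a minimizing the size-weighted MSE."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best = (None, mse(ys))  # baseline: no split at all
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal attribute values
        a = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        error = (len(left) * mse(left) + len(right) * mse(right)) / n
        if error < best[1]:
            best = (a, error)
    return best

# Hypothetical toy data: the target jumps from 1.0 to 5.0 between x = 3 and x = 10.
toy_x = [1, 2, 3, 10, 11, 12]
toy_y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
split_at, split_error = best_regression_split(toy_x, toy_y)
```

A split is accepted here only if it lowers the error below the unsplit baseline, so constant targets yield no threshold.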
Other types of decision trees
Univariate trees
In univariate trees, the test at each internal node uses only one of the input attributes.
Multivariate trees
In multivariate trees, the test at each internal node can use all input attributes.
For example, consider a data set with numerical attributes. The test can be made using a weighted linear combination of some input attributes.
For example, let X = (x_1, x_2) be the input attributes. Then f(X) = w_0 + w_1 x_1 + w_2 x_2 can be used for the test at an internal node, such as f(X) > 0.
[Figure: internal node with the test (w_0 + w_1 x_1 + w_2 x_2) > 0; the True branch leads to YES and the False branch to NO]
Reading
C. E. Brodley and P. E. Utgoff, Multivariate Decision Trees, Machine Learning, Vol. 19.
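The linear test at a multivariate node can be written in a few lines. The sketch below is illustrative only: the weights are hypothetical values picked by hand rather than learned, and `linear_test` and `classify` are assumed names.

```python
def linear_test(x, w):
    """Evaluate f(X) = w_0 + w_1*x_1 + ... + w_d*x_d > 0 for w = (w_0, w_1, ..., w_d)."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)) > 0

# Hypothetical weights: the node tests x_1 + x_2 - 1 > 0.
weights = (-1.0, 1.0, 1.0)

def classify(x):
    """One-node multivariate tree: the True branch predicts YES, the False branch NO."""
    return "YES" if linear_test(x, weights) else "NO"

above = classify((1.0, 1.0))   # f = 1.0
below = classify((0.2, 0.3))   # f = -0.5
```

A univariate test is the special case in which all but one of the weights w_1, ..., w_d are zero, which gives an axis-parallel split.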
Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition
Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Todd Holloway Two Lecture Series for B551 November 20 & 27, 2007 Indiana University Outline Introduction Bias and
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationLecture 1: Machine Learning Basics
1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3
More information(Sub)Gradient Descent
(Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include
More informationLearning Methods in Multilingual Speech Recognition
Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex
More informationImpact of Cluster Validity Measures on Performance of Hybrid Models Based on K-means and Decision Trees
Impact of Cluster Validity Measures on Performance of Hybrid Models Based on K-means and Decision Trees Mariusz Łapczy ski 1 and Bartłomiej Jefma ski 2 1 The Chair of Market Analysis and Marketing Research,
More informationRule Learning With Negation: Issues Regarding Effectiveness
Rule Learning With Negation: Issues Regarding Effectiveness S. Chua, F. Coenen, G. Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX Liverpool, United
More informationMachine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler
Machine Learning and Data Mining Ensembles of Learners Prof. Alexander Ihler Ensemble methods Why learn one classifier when you can learn many? Ensemble: combine many predictors (Weighted) combina
More informationA Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and
A Decision Tree Analysis of the Transfer Student Emma Gunu, MS Research Analyst Robert M Roe, PhD Executive Director of Institutional Research and Planning Overview Motivation for Analyses Analyses and
More informationAGS THE GREAT REVIEW GAME FOR PRE-ALGEBRA (CD) CORRELATED TO CALIFORNIA CONTENT STANDARDS
AGS THE GREAT REVIEW GAME FOR PRE-ALGEBRA (CD) CORRELATED TO CALIFORNIA CONTENT STANDARDS 1 CALIFORNIA CONTENT STANDARDS: Chapter 1 ALGEBRA AND WHOLE NUMBERS Algebra and Functions 1.4 Students use algebraic
More informationA Version Space Approach to Learning Context-free Grammars
Machine Learning 2: 39~74, 1987 1987 Kluwer Academic Publishers, Boston - Manufactured in The Netherlands A Version Space Approach to Learning Context-free Grammars KURT VANLEHN (VANLEHN@A.PSY.CMU.EDU)
More informationGrade 6: Correlated to AGS Basic Math Skills
Grade 6: Correlated to AGS Basic Math Skills Grade 6: Standard 1 Number Sense Students compare and order positive and negative integers, decimals, fractions, and mixed numbers. They find multiples and
More informationData Stream Processing and Analytics
Data Stream Processing and Analytics Vincent Lemaire Thank to Alexis Bondu, EDF Outline Introduction on data-streams Supervised Learning Conclusion 2 3 Big Data what does that mean? Big Data Analytics?
More informationLecture 1: Basic Concepts of Machine Learning
Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010
More informationChapter 2 Rule Learning in a Nutshell
Chapter 2 Rule Learning in a Nutshell This chapter gives a brief overview of inductive rule learning and may therefore serve as a guide through the rest of the book. Later chapters will expand upon the
More informationLearning goal-oriented strategies in problem solving
Learning goal-oriented strategies in problem solving Martin Možina, Timotej Lazar, Ivan Bratko Faculty of Computer and Information Science University of Ljubljana, Ljubljana, Slovenia Abstract The need
More informationAnalysis of Hybrid Soft and Hard Computing Techniques for Forex Monitoring Systems
Analysis of Hybrid Soft and Hard Computing Techniques for Forex Monitoring Systems Ajith Abraham School of Business Systems, Monash University, Clayton, Victoria 3800, Australia. Email: ajith.abraham@ieee.org
More informationOCR for Arabic using SIFT Descriptors With Online Failure Prediction
OCR for Arabic using SIFT Descriptors With Online Failure Prediction Andrey Stolyarenko, Nachum Dershowitz The Blavatnik School of Computer Science Tel Aviv University Tel Aviv, Israel Email: stloyare@tau.ac.il,
More informationRule Learning with Negation: Issues Regarding Effectiveness
Rule Learning with Negation: Issues Regarding Effectiveness Stephanie Chua, Frans Coenen, and Grant Malcolm University of Liverpool Department of Computer Science, Ashton Building, Ashton Street, L69 3BX
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationA Reinforcement Learning Variant for Control Scheduling
A Reinforcement Learning Variant for Control Scheduling Aloke Guha Honeywell Sensor and System Development Center 3660 Technology Drive Minneapolis MN 55417 Abstract We present an algorithm based on reinforcement
More informationAssignment 1: Predicting Amazon Review Ratings
Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for
More informationSTA 225: Introductory Statistics (CT)
Marshall University College of Science Mathematics Department STA 225: Introductory Statistics (CT) Course catalog description A critical thinking course in applied statistical reasoning covering basic
More informationProof Theory for Syntacticians
Department of Linguistics Ohio State University Syntax 2 (Linguistics 602.02) January 5, 2012 Logics for Linguistics Many different kinds of logic are directly applicable to formalizing theories in syntax
More informationPython Machine Learning
Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled
More informationThe 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X
The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, 2013 10.12753/2066-026X-13-154 DATA MINING SOLUTIONS FOR DETERMINING STUDENT'S PROFILE Adela BÂRA,
More informationCS 101 Computer Science I Fall Instructor Muller. Syllabus
CS 101 Computer Science I Fall 2013 Instructor Muller Syllabus Welcome to CS101. This course is an introduction to the art and science of computer programming and to some of the fundamental concepts of
More informationGACE Computer Science Assessment Test at a Glance
GACE Computer Science Assessment Test at a Glance Updated May 2017 See the GACE Computer Science Assessment Study Companion for practice questions and preparation resources. Assessment Name Computer Science
More informationProbability and Statistics Curriculum Pacing Guide
Unit 1 Terms PS.SPMJ.3 PS.SPMJ.5 Plan and conduct a survey to answer a statistical question. Recognize how the plan addresses sampling technique, randomization, measurement of experimental error and methods
More informationMULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Ch 2 Test Remediation Work Name MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) High temperatures in a certain
More informationCS 1103 Computer Science I Honors. Fall Instructor Muller. Syllabus
CS 1103 Computer Science I Honors Fall 2016 Instructor Muller Syllabus Welcome to CS1103. This course is an introduction to the art and science of computer programming and to some of the fundamental concepts
More informationMachine Learning from Garden Path Sentences: The Application of Computational Linguistics
Machine Learning from Garden Path Sentences: The Application of Computational Linguistics http://dx.doi.org/10.3991/ijet.v9i6.4109 J.L. Du 1, P.F. Yu 1 and M.L. Li 2 1 Guangdong University of Foreign Studies,
More informationCSL465/603 - Machine Learning
CSL465/603 - Machine Learning Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Introduction CSL465/603 - Machine Learning 1 Administrative Trivia Course Structure 3-0-2 Lecture Timings Monday 9.55-10.45am
More informationNotes on The Sciences of the Artificial Adapted from a shorter document written for course (Deciding What to Design) 1
Notes on The Sciences of the Artificial Adapted from a shorter document written for course 17-652 (Deciding What to Design) 1 Ali Almossawi December 29, 2005 1 Introduction The Sciences of the Artificial
More informationSystem Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering
More informationMulti-label Classification via Multi-target Regression on Data Streams
Multi-label Classification via Multi-target Regression on Data Streams Aljaž Osojnik 1,2, Panče Panov 1, and Sašo Džeroski 1,2,3 1 Jožef Stefan Institute, Jamova cesta 39, Ljubljana, Slovenia 2 Jožef Stefan
More informationConstructive Induction-based Learning Agents: An Architecture and Preliminary Experiments
Proceedings of the First International Workshop on Intelligent Adaptive Systems (IAS-95) Ibrahim F. Imam and Janusz Wnek (Eds.), pp. 38-51, Melbourne Beach, Florida, 1995. Constructive Induction-based
More informationChinese Language Parsing with Maximum-Entropy-Inspired Parser
Chinese Language Parsing with Maximum-Entropy-Inspired Parser Heng Lian Brown University Abstract The Chinese language has many special characteristics that make parsing difficult. The performance of state-of-the-art
More informationOn-Line Data Analytics
International Journal of Computer Applications in Engineering Sciences [VOL I, ISSUE III, SEPTEMBER 2011] [ISSN: 2231-4946] On-Line Data Analytics Yugandhar Vemulapalli #, Devarapalli Raghu *, Raja Jacob
More informationCalibration of Confidence Measures in Speech Recognition
Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE
More informationApplications of data mining algorithms to analysis of medical data
Master Thesis Software Engineering Thesis no: MSE-2007:20 August 2007 Applications of data mining algorithms to analysis of medical data Dariusz Matyja School of Engineering Blekinge Institute of Technology
More informationSoftprop: Softmax Neural Network Backpropagation Learning
Softprop: Softmax Neural Networ Bacpropagation Learning Michael Rimer Computer Science Department Brigham Young University Provo, UT 84602, USA E-mail: mrimer@axon.cs.byu.edu Tony Martinez Computer Science
More informationRANKING AND UNRANKING LEFT SZILARD LANGUAGES. Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A ER E P S I M S
N S ER E P S I M TA S UN A I S I T VER RANKING AND UNRANKING LEFT SZILARD LANGUAGES Erkki Mäkinen DEPARTMENT OF COMPUTER SCIENCE UNIVERSITY OF TAMPERE REPORT A-1997-2 UNIVERSITY OF TAMPERE DEPARTMENT OF
More informationEdexcel GCSE. Statistics 1389 Paper 1H. June Mark Scheme. Statistics Edexcel GCSE
Edexcel GCSE Statistics 1389 Paper 1H June 2007 Mark Scheme Edexcel GCSE Statistics 1389 NOTES ON MARKING PRINCIPLES 1 Types of mark M marks: method marks A marks: accuracy marks B marks: unconditional
More informationMulti-label classification via multi-target regression on data streams