Machine Learning CSL 302 ARTIFICIAL INTELLIGENCE SPRING 2014
Reference Material CSL 302 ARTIFICIAL INTELLIGENCE, INDIAN INSTITUTE OF TECHNOLOGY ROPAR 2
What is Learning?
  In the human context?
What is Learning?
  Webster
    o gain knowledge or understanding of or skill in by study, instruction or experience
    o memorize
  Sleep learning? (www.links999.net)
  Synonym: Discovery
    o obtain knowledge of for the first time
    o may imply acquiring knowledge with little effort or conscious intention (as by simply being told) or it may imply study and practice
  Knowledge
    o knowing something with familiarity gained through experience or association
    o facts or ideas acquired by study, investigation, observation, or experience
What is Machine Learning?
  Herbert Simon (1970)
    o any process by which a system improves its performance.
  Tom Mitchell (1990)
    o a computer program that improves its performance at some task through experience.
  Ethem Alpaydin (2010)
    o programming computers to optimize a performance criterion using example data or past experience.
Why Do Machine Learning?
  Automated knowledge engineering
    o expertise is scarce
    o codification of expertise is difficult
    o expertise frequently consists of a set of test cases
  Only one computer has to learn, then Ctrl+C, Ctrl+V
  Discover new knowledge
  Understand human learning
Applications
  Medical diagnosis
  Autonomous driving
  Speech recognition
  Recommendation systems
  Prediction (financial, climate, health, energy)
  Fraud/intrusion detection
  Activity recognition
  Computer vision
  And the list goes on and on
Related Disciplines
  Statistics
  Pattern recognition
  Signal processing
  Artificial intelligence
  Data mining
  Neuroscience
  Cognitive science
  Psychology
Approaches
  Supervised Learning
    o Classification
    o Regression
  Unsupervised Learning
    o Clustering
    o Rule Mining
  Semi-supervised Learning
  Reinforcement Learning
  Transfer Learning
  Active Learning
  Online Learning
  Dimensionality Reduction
Other ML Issues
  Evaluation
    o which learning approach is better
  Theoretical bounds
    o what is and is not learnable
  Scalability
    o learning from massive datasets
Resources
  Software
    o Weka (www.cs.waikato.ac.nz/~ml/weka)
    o Machine learning open-source software (mloss.org)
  Data
    o UCI ML Repository (archive.ics.uci.edu/ml)
    o UCI KDD Repository (kdd.ics.uci.edu)
    o Challenges: KDD-Cup, Netflix, ...
Resources
  Conferences
    o International Conference on Machine Learning (ICML)
    o Knowledge Discovery and Data Mining (KDD)
    o IEEE Conference on Data Mining (ICDM)
    o SIAM Data Mining Conference (SDM)
    o Association for the Advancement of AI (AAAI) conference
    o International Joint Conference on AI (IJCAI)
    o International Conference on Machine Learning Applications
    o Many more
Resources
  Journals
    o Machine Learning Journal
    o Journal of Machine Learning Research
    o Data Mining and Knowledge Discovery
    o Many more
  WWW
    o www.kdnuggets.com (subscribe!)
Supervised Learning
  Learning task
    o learn to classify cars into one of two classes: family car or other
    o each car is represented by two features (attributes): engine power and price
    o given several training examples of already-classified cars
    o output classifier that accurately classifies cars
Example: Family Car
Definitions
  Feature (attribute): $x_i$
    o A property of the object to be classified
    o Discrete or continuous
    o e.g., engine power, price
  Instance: $x = [x_1, x_2, \ldots, x_d]$
    o The feature values for a specific object
    o e.g., engine power = 100, price = high
  Instance space: $I$
    o Space of all possible instances
  Class: $C$
    o Categorical feature of an object
    o Set of instances of objects in this category
    o e.g., family car
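To make the instance and instance-space definitions concrete, here is a minimal sketch. It assumes both features are discrete; the value lists below are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical discrete feature values, assumed only for illustration.
ENGINE_POWER_VALUES = ["low", "medium", "high"]
PRICE_VALUES = ["low", "high"]

# An instance holds one value per feature: x = [x_1, x_2].
# The instance space I is every possible combination of feature values.
instance_space = list(product(ENGINE_POWER_VALUES, PRICE_VALUES))

x = ("medium", "high")  # one specific instance

print(len(instance_space))  # size of I: 3 * 2 combinations
print(x in instance_space)  # every valid instance lies in I
```

With continuous features the instance space is infinite and cannot be enumerated this way; it is then described by ranges instead.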
Example: Family Car Class
Definitions
  Example: $(x, r)$
    o Instance along with its class membership $r$
    o Positive example: member of class ($r = 1$)
    o Negative example: not a member of class ($r = 0$)
  Training set: $X = \{x^t, r^t\}$, $1 \le t \le N$
    o Set of $N$ examples
  Target concept ($C$)
    o Correct expression of class
    o E.g., $(e_1 \le \text{engine power} \le e_2)$ and $(p_1 \le \text{price} \le p_2)$
  Concept class
    o Space of all possible target concepts
    o E.g., axis-aligned rectangles in instance space
    o E.g., power set of instance space
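The rectangle-shaped target concept and a labelled training set can be sketched as follows. The numeric bounds and the four cars are hypothetical; the slides give no concrete values.

```python
from typing import List, Tuple

Instance = Tuple[float, float]  # (engine power, price)
Example = Tuple[Instance, int]  # (x, r) with r in {0, 1}

# Hypothetical bounds, assumed only for illustration.
E1, E2 = 80.0, 150.0  # engine power bounds e_1, e_2
P1, P2 = 10.0, 25.0   # price bounds p_1, p_2


def target_concept(x: Instance) -> int:
    """C(x) = 1 if x lies inside the axis-aligned rectangle, else 0."""
    power, price = x
    return int(E1 <= power <= E2 and P1 <= price <= P2)


# Hypothetical cars; each is labelled by the target concept to form (x, r).
instances: List[Instance] = [(100.0, 15.0), (200.0, 40.0), (90.0, 30.0), (120.0, 20.0)]
training_set: List[Example] = [(x, target_concept(x)) for x in instances]

for x, r in training_set:
    print(x, "positive" if r == 1 else "negative")
```

In a real learning problem the target concept is unknown and the labels come from an external source; it is coded explicitly here only to show what the training set encodes.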
Definitions
  Hypothesis: $h(x) \in \{0, 1\}$
    o approximation to target concept
  Hypothesis class: $H$
    o space of all possible hypotheses
    o e.g., axis-aligned rectangles
    o e.g., axis-aligned ellipses
  Learning goal
    o find hypothesis $h \in H$ that closely approximates target concept $C$
    o $h$ is the output classifier
    o target concept may not be in $H$
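As an illustration of picking a hypothesis from the class of axis-aligned rectangles, the sketch below fits the tightest rectangle around the positive examples. This particular rule is an assumption made for the example, not a method stated in the slides, and the function names and training data are hypothetical.

```python
from typing import List, Optional, Tuple

Instance = Tuple[float, float]  # (engine power, price)
Example = Tuple[Instance, int]  # (x, r) with r in {0, 1}
Rectangle = Tuple[float, float, float, float]  # (power_lo, power_hi, price_lo, price_hi)


def fit_tightest_rectangle(examples: List[Example]) -> Optional[Rectangle]:
    """Return the smallest axis-aligned rectangle containing all positive examples."""
    positives = [x for x, r in examples if r == 1]
    if not positives:
        return None  # no positive example: predict 0 everywhere
    powers = [power for power, _ in positives]
    prices = [price for _, price in positives]
    return (min(powers), max(powers), min(prices), max(prices))


def h(rect: Optional[Rectangle], x: Instance) -> int:
    """Hypothesis h(x) in {0, 1}: 1 if x falls inside the fitted rectangle."""
    if rect is None:
        return 0
    power_lo, power_hi, price_lo, price_hi = rect
    return int(power_lo <= x[0] <= power_hi and price_lo <= x[1] <= price_hi)


# Hypothetical training data, assumed only for illustration.
examples: List[Example] = [
    ((100.0, 15.0), 1),
    ((200.0, 40.0), 0),
    ((90.0, 30.0), 0),
    ((120.0, 20.0), 1),
]
rect = fit_tightest_rectangle(examples)
print(rect)
print([h(rect, x) for x, _ in examples])
```

The fitted rectangle classifies every training example correctly here, but it may still disagree with the target concept on instances it has not seen.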
Example: Hypothesis Error
  Note: h is consistent with the training set, but not the target concept C.
Definitions
  Empirical error
    o how well h classifies training set X
      $$E(h \mid X) = \frac{1}{N} \sum_{t=1}^{N} 1\big(h(x^t) \ne r^t\big)$$
  Generalization error
    o how well h classifies instances not in X
  True error
    o how well h classifies entire instance space
      $$E(h) = \frac{1}{|I|} \sum_{x \in I} 1\big(h(x) \ne C(x)\big)$$
  $1(\text{expr}) = 1$ if expr is true, else 0
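The two error formulas can be computed directly when the instance space is finite. The sketch below assumes a small 10 x 10 integer grid as the instance space; the rectangles used for C and h and the training examples are hypothetical.

```python
from itertools import product
from typing import Callable, List, Tuple

Instance = Tuple[int, int]  # (engine power, price) on a small integer grid
Example = Tuple[Instance, int]


def empirical_error(h: Callable[[Instance], int], training_set: List[Example]) -> float:
    """E(h | X): fraction of training examples with h(x^t) != r^t."""
    return sum(h(x) != r for x, r in training_set) / len(training_set)


def true_error(h: Callable[[Instance], int], C: Callable[[Instance], int],
               instance_space: List[Instance]) -> float:
    """E(h): fraction of the whole instance space with h(x) != C(x)."""
    return sum(h(x) != C(x) for x in instance_space) / len(instance_space)


# Hypothetical finite instance space: a 10 x 10 grid of feature values.
instance_space = list(product(range(10), range(10)))


def C(x: Instance) -> int:
    # Hypothetical target concept: a 5 x 5 rectangle.
    return int(2 <= x[0] <= 6 and 3 <= x[1] <= 7)


def h(x: Instance) -> int:
    # Hypothetical hypothesis: a smaller 4 x 4 rectangle inside C.
    return int(3 <= x[0] <= 6 and 3 <= x[1] <= 6)


training_set: List[Example] = [(x, C(x)) for x in [(4, 4), (0, 0), (5, 5), (9, 9)]]

print(empirical_error(h, training_set))
print(true_error(h, C, instance_space))
```

Here h makes no mistakes on the training set yet still disagrees with C on part of the instance space, which is the situation of a hypothesis that is consistent with the training set but not with the target concept.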