Lecture 1: Basic Concepts of Machine Learning


Cognitive Systems - Machine Learning
Ute Schmid (lecture), Johannes Rabold (practice)
Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010 by Martin Sticht and 2014 by Christian Reißner
Applied Computer Science, Bamberg University
Last change: October 18, 2017
Ute Schmid (CogSys, WIAI), ML Basic Concepts, October 18, 2017

Organization of the Course
- Homepage: http://www.uni-bamberg.de/kogsys/teaching/courses/lernende-systeme/
- Sign up in the VC course!
- Textbook: Tom Mitchell (1997). Machine Learning. McGraw Hill. A classic, based more on an AI background than on a purely statistical treatment of ML.
- For current statistical/probabilistic approaches, see the textbook of Bishop and, in part, the AI book of Russell and Norvig.
- Practice: paper-and-pencil exercises, programming assignments, RapidMiner
- Marked exercise sheets and extra points for the exam

Outline of the Course
- Basic Concepts of Machine Learning
- Basic Approaches to Classification Learning
- Foundations of Classification Learning
- Decision Trees
- Perceptrons and Multilayer-Perceptrons
- Human Concept Learning
- Special Aspects of Classification/Inductive Learning
- Inductive Logic Programming
- Genetic Algorithms
- Instance-based Learning
- Bayesian Learning
- Kernel Methods (SVMs)
- Hidden Markov Models

Outline of the Course (continued)
- Theoretical Aspects of Learning
- Evaluating Hypotheses (Computational Learning Theory)
- Learning Programs and Strategies
- Reinforcement Learning
- Inductive Function Synthesis (Analytical Learning)
- Unsupervised Learning
- Cluster Analysis
- Further Topics and Applications in Machine Learning (e.g. data mining)

Course Objectives
- Introduce central approaches of machine learning
- Point out relations to human learning
- Provide an understanding of the fundamental structure of learning problems and processes
- Explore algorithms that solve such problems

Some Quotes as Motivation
"If an expert system, brilliantly designed, engineered and implemented, cannot learn not to repeat its mistakes, it is not as intelligent as a worm or a sea anemone or a kitten."
Oliver G. Selfridge, from The Gardens of Learning
"If we are ever to make claims of creating an artificial intelligence, we must address issues in natural language, automated reasoning, and machine learning."
George F. Luger

What is Machine Learning? Some Definitions
"Machine learning refers to a system capable of the autonomous acquisition and integration of knowledge. This capacity to learn from experience, analytical observation, and other means results in a system that can continuously self-improve and thereby offer increased efficiency and effectiveness."
Source: http://www.aaai.org/aitopics/html/machine.html
"The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience."
Tom M. Mitchell, Machine Learning (1997)

ML as Multidisciplinary Field
Machine learning is inherently a multidisciplinary field:
- artificial intelligence
- probability theory, statistics
- computational complexity theory
- information theory
- philosophy
- psychology
- neurobiology
- ...
Example: CALD (Center for Automated Learning and Discovery at CMU)

Knowledge-based vs. Learning Systems
Knowledge-based systems: acquisition and modeling of common-sense knowledge and expert knowledge
- limited to the given knowledge base and rule set
- Inference: deduction generates no new knowledge but makes implicitly given knowledge explicit
- Top-down: from rules to facts
Learning systems: extraction of knowledge and rules from examples/experience
- teach the system vs. program the system
- learning as an inductive process
- Bottom-up: from facts to rules

Knowledge-based vs. Learning Systems
A flexible and adaptive organism cannot rely on a fixed set of behavior rules but must learn (over its complete life span)!
This is the motivation for learning systems.

Knowledge Acquisition Bottleneck (Feigenbaum, 1983)
- Breakthrough in computer chess with Deep Blue: the evaluation function came from chess grandmaster Joel Benjamin. Deep Blue cannot change the evaluation function by itself!
- Experts are often unable to verbalize their special knowledge.
- Indirect methods: extraction of knowledge from expert behavior in example situations (diagnosis of X-rays, controlling a chemical plant, ...)

Merit of Machine Learning
Great practical value in many application domains:
- Data mining: large databases may contain valuable implicit regularities that can be discovered automatically (outcomes of medical treatments, consumer preferences)
- Poorly understood domains where humans might not have the knowledge needed to develop efficient algorithms (human face recognition from images)
- Domains where the program must dynamically adapt to changing conditions (controlling manufacturing processes under changing supply stocks)

Learning as Induction
Deduction:
- All humans are mortal. (Axiom)
- Socrates is human. (Fact)
- Conclusion: Socrates is mortal.
Induction:
- Socrates is human. (Background knowledge)
- Socrates is mortal. (Observation(s))
- Generalization: All humans are mortal.
Deduction: from general to specific, with proven correctness
Induction: from specific to general, with (unproven) knowledge gain
Induction generates hypotheses, not knowledge!

Epistemological Problems
Epistemological problems are addressed with pragmatic solutions.
- Confirmation theory: a hypothesis obtained by generalization gets supported by new observations (not proven!).
- Grue paradox: "All emeralds are grue." Something is grue if it is green before a future time t and blue thereafter. Not learnable from examples!

Inductive Learning Hypothesis
- As shown above, inductive learning is not proven correct.
- The learning task is to determine a hypothesis $h \in H$ identical to the target concept $c$ for all possible instances in instance space $X$: $(\forall x \in X)[h(x) = c(x)]$
- Only training examples $D \subseteq X$ are available.
- Inductive algorithms can at best guarantee that the output hypothesis $h$ fits the target concept over $D$: $(\forall x \in D)[h(x) = c(x)]$
- Inductive Learning Hypothesis: Any hypothesis found to approximate the target concept well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.
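The gap between fitting $D$ and fitting all of $X$ can be made concrete with a small sketch. Everything in it is a hypothetical toy setup, not taken from the lecture: instances are triples of booleans, the target concept `c` and the two candidate hypotheses `h_a` and `h_b` are invented for illustration.

```python
from itertools import product

# Hypothetical toy instance space X: all triples of booleans (8 instances).
X = list(product([False, True], repeat=3))

def c(x):
    """Target concept (assumed for illustration; unknown to a real learner)."""
    return x[0] and x[1]

def h_a(x):
    """Candidate hypothesis A: identical to the target concept."""
    return x[0] and x[1]

def h_b(x):
    """Candidate hypothesis B: adds an extra, wrong condition on x[2]."""
    return x[0] and x[1] and not x[2]

# Training examples D, a small subset of X.
D = [(True, True, False), (False, True, False), (False, False, True)]

def agrees(h, instances):
    """True if h(x) = c(x) for every x in the given instances."""
    return all(h(x) == c(x) for x in instances)

candidates = {"h_a": h_a, "h_b": h_b}
fits_D = {name: agrees(h, D) for name, h in candidates.items()}
fits_X = {name: agrees(h, X) for name, h in candidates.items()}
print("fits D:", fits_D)
print("fits X:", fits_X)
```

Both candidates agree with `c` on `D`, but only `h_a` agrees on all of `X`: the training examples alone cannot tell the two apart.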

Concept and Classification Learning
Concept learning: objects are clustered in concepts.
- Extensional: (infinite) set $X$ of all exemplars
- Intensional: finite characterization, e.g. $T = \{x \mid \text{has-3/4-legs}(x), \text{has-top}(x)\}$
- Construction of a finite characterization from a subset of examples in $X$ (training set $D$)
- $h : X \to \{0, 1\}$, $c(x) \in \{0, 1\}$
Naturally extended to classes:
- Identification of relevant attributes and their interrelation, which characterize an object as a member of a class
- $h : X \to K$, $c(x) \in \{k_1, \ldots, k_n\}$

Constituents of Classification Learning
- A set of training examples $D \subseteq X$
- Each example is represented by an n-ary feature vector $x \in X$ and associated with a class $c(x) \in K$: $\langle x, c(x) \rangle$
- A learning algorithm constructing a hypothesis $h \in H$
- A set of new objects, also represented by feature vectors, which can be classified according to $h$
Examples for features and values:
- Sky $\in$ {sunny, rainy}
- AirTemp $\in$ {warm, cold}
- Humidity $\in$ {normal, high}
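These constituents can be written down as a minimal sketch. The feature names and values come from the slide; the class labels in `training_set`, the hand-written hypothesis `h`, and the `new_objects` are assumptions made up for illustration (no learning algorithm is run here).

```python
from itertools import product

# Features and their admissible values, as listed above.
FEATURES = {
    "Sky": ("sunny", "rainy"),
    "AirTemp": ("warm", "cold"),
    "Humidity": ("normal", "high"),
}

# Instance space X: every combination of feature values (2 * 2 * 2 instances).
instance_space = [
    dict(zip(FEATURES, values)) for values in product(*FEATURES.values())
]

# Training examples as pairs <x, c(x)>; the labels are hypothetical.
training_set = [
    ({"Sky": "sunny", "AirTemp": "warm", "Humidity": "normal"}, 1),
    ({"Sky": "rainy", "AirTemp": "cold", "Humidity": "high"}, 0),
]

def h(x):
    """Hand-written stand-in for a learned hypothesis h: X -> {0, 1}."""
    return 1 if x["Sky"] == "sunny" and x["AirTemp"] == "warm" else 0

# New objects, represented by the same feature vectors, classified by h.
new_objects = [
    {"Sky": "sunny", "AirTemp": "warm", "Humidity": "high"},
    {"Sky": "sunny", "AirTemp": "cold", "Humidity": "normal"},
]
predictions = [h(x) for x in new_objects]
print(len(instance_space), predictions)
```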

Concept Learning / Examples
- Occurrence of Tse-Tse fly yes/no, given geographic and climatic attributes
- Risk of cardiac arrest yes/no, given medical data
- Credit-worthiness of customer yes/no, given personal and customer data
- Safe chemical process yes/no, given physical and chemical measurements
Generalization of pre-classified example data, application for prognosis

Learning Terminology
- Supervised learning: pre-classified examples
- Unsupervised learning: no classification available (data exploration)
Different approaches:
- Concept/classification vs. policy learning
- Symbolic vs. statistical/neural network learning
- Inductive vs. analytical learning
Some general learning strategies:
- rote learning / learning by being told (no generalization/induction)
- learning by analogy (generalization over base and target problem)
- learning from discovery (unsupervised learning)
- learning from experience
- learning from examples (classical inductive approach)

Further Example Learning Problems
- Handwriting recognition
- Playing checkers
- Robot driving

Designing a Learning System
Learning system: A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
Example: handwriting recognition
- T: recognizing and classifying handwritten words within images
- P: percent of words correctly classified
- E: database of handwritten words with given classifications
In the following, consider designing a program that learns to recognize handwritten words, in order to illustrate some of the basic design issues and approaches to machine learning.
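The roles of T, P, and E can be illustrated with a deliberately simplistic sketch. It assumes a rote learner that only memorizes labeled items and is scored on a fixed evaluation set; the image identifiers, the words, and the function names are hypothetical, and nothing here resembles a real handwriting recognizer.

```python
def percent_correct(predict, evaluation_set):
    """Performance measure P: percent of items classified correctly."""
    hits = sum(1 for x, label in evaluation_set if predict(x) == label)
    return 100.0 * hits / len(evaluation_set)

def train_rote(experience):
    """Rote learner: memorize the labeled items; unseen items yield None."""
    table = dict(experience)
    return lambda x: table.get(x)

# Hypothetical database of labeled "handwritten words" (experience E).
data = [("img1", "cat"), ("img2", "dog"), ("img3", "sun"), ("img4", "tree")]

# P measured on the same fixed set while E grows from 0 to 4 items.
scores = [percent_correct(train_rote(data[:n]), data) for n in range(len(data) + 1)]
print(scores)  # [0.0, 25.0, 50.0, 75.0, 100.0]
```

The score rises with the amount of experience, which is all the definition requires; a rote learner does not generalize, so it says nothing about performance on words it has never seen.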

Designing a Learning System
1. Choosing the Training Experience
   - direct or indirect feedback
   - degree to which the learner controls the sequence of training examples
   - representativeness of the distribution of the training examples
   - significant impact on success or failure
2. Choosing the Target Function
   - determine what type of knowledge will be learned
   - the most obvious form is some kind of combination of feature values which can be associated with a class (word/letter)
3. Choosing a Representation for the Target Function
   - e.g. a large table, a set of rules, a linear function, an arbitrary function
4. Choosing a Learning Algorithm
   - Decision Tree, Multi-Layer Perceptron, ...
5. Presenting Training Examples
   - all at once
   - incrementally

Recapitulation: Notation
- Instance space $X$: set of all possible examples over which the concept is defined (possibly attribute vectors)
- Target concept $c : X \to \{0, 1\}$: concept or function to be learned
- Target class $c : X \to \{k_1, \ldots, k_n\}$
- Training example: $x \in X$ of the form $\langle x, c(x) \rangle$
- Training set $D$: set of all available training examples
- Hypothesis space $H$: set of all possible hypotheses according to the hypothesis language
- Hypothesis $h \in H$: function of the form $X \to \{0, 1\}$ (boolean-valued) or $X \to K$
The goal is to find an $h \in H$ such that $(\forall x \in X)[h(x) = c(x)]$.

Hypothesis Language
- $H$ is determined by the predefined language in which hypotheses can be formulated, e.g. conjunctions of feature values vs. disjunctions of conjunctions vs. matrices of real numbers vs. Horn clauses ...
- Hypothesis language and learning algorithm are highly interdependent.
- Each hypothesis language implies a bias!

Properties of Hypotheses: General-to-Specific Ordering
- naturally occurring order over $H$
- learning algorithms can be designed to search $H$ exhaustively without explicitly enumerating each hypothesis $h$
- $h_i$ is more general than or equal to $h_k$ (written $h_i \geq_g h_k$) iff $(\forall x \in X)[(h_k(x) = 1) \rightarrow (h_i(x) = 1)]$
- $h_i$ is (strictly) more general than $h_k$ (written $h_i >_g h_k$) iff $(h_i \geq_g h_k) \wedge (h_k \not\geq_g h_i)$
- $\geq_g$ defines a partial ordering over the hypothesis space $H$

Running Example
- Example target concept Enjoy: days on which Aldo enjoys his favorite sport
- Set of example days $D$, each represented by a set of attributes

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  Enjoy
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes

The task is to learn to predict the value of Enjoy for an arbitrary day, based on the values of its other attributes.

Properties of Hypotheses - Example
- $h_1$ = Aldo loves playing tennis if the sky is sunny
- $h_2$ = Aldo loves playing tennis if the water is warm
- $h_3$ = Aldo loves playing tennis if the sky is sunny and the water is warm
$h_1 >_g h_3$, $h_2 >_g h_3$, $h_2 \not\geq_g h_1$, $h_1 \not\geq_g h_2$
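The ordering relations for $h_1$, $h_2$, and $h_3$ can be checked mechanically over a finite instance space. The sketch below assumes an instance space restricted to the two attributes these hypotheses mention (Sky and Water, with the values from the running example); the function names are illustrative.

```python
from itertools import product

# Assumed instance space: only the attributes used by h1, h2, h3.
X = [{"Sky": s, "Water": w}
     for s, w in product(("Sunny", "Rainy"), ("Warm", "Cool"))]

def h1(x):
    return x["Sky"] == "Sunny"

def h2(x):
    return x["Water"] == "Warm"

def h3(x):
    return h1(x) and h2(x)

def more_general_or_equal(h_i, h_k, instances):
    """h_i >=_g h_k: every instance covered by h_k is also covered by h_i."""
    return all(h_i(x) for x in instances if h_k(x))

def strictly_more_general(h_i, h_k, instances):
    """h_i >_g h_k: h_i >=_g h_k holds, but h_k >=_g h_i does not."""
    return (more_general_or_equal(h_i, h_k, instances)
            and not more_general_or_equal(h_k, h_i, instances))

print(strictly_more_general(h1, h3, X))   # h1 >_g h3
print(strictly_more_general(h2, h3, X))   # h2 >_g h3
print(more_general_or_equal(h2, h1, X))   # h2 >=_g h1 ?
print(more_general_or_equal(h1, h2, X))   # h1 >=_g h2 ?
```

Since neither $h_1 \geq_g h_2$ nor $h_2 \geq_g h_1$ holds, the two hypotheses are incomparable, which is why $\geq_g$ is only a partial ordering.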

Properties of Hypotheses: Consistency
- A hypothesis $h$ is consistent with a set of training examples $D$ iff $h(x) = c(x)$ for each example $\langle x, c(x) \rangle$ in $D$:
  $Consistent(h, D) \equiv (\forall \langle x, c(x) \rangle \in D)[h(x) = c(x)]$
- That is, every example in $D$ is classified correctly by the hypothesis.

Properties of Hypotheses - Example
$h_1$ is consistent with $D$.
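The consistency check can be sketched directly from the definition. The data below are the Sky, Water, and Enjoy columns of the four example days in the running example; the three hypotheses are read as "sky is sunny", "water is warm", and the conjunction of both, and the function names are illustrative.

```python
# Sky, Water, and Enjoy columns of the four example days.
D = [
    ({"Sky": "Sunny", "Water": "Warm"}, True),   # example 1
    ({"Sky": "Sunny", "Water": "Warm"}, True),   # example 2
    ({"Sky": "Rainy", "Water": "Warm"}, False),  # example 3
    ({"Sky": "Sunny", "Water": "Cool"}, True),   # example 4
]

def h1(x):
    return x["Sky"] == "Sunny"

def h2(x):
    return x["Water"] == "Warm"

def h3(x):
    return h1(x) and h2(x)

def consistent(h, examples):
    """Consistent(h, D): h(x) = c(x) for every <x, c(x)> in D."""
    return all(h(x) == label for x, label in examples)

results = {name: consistent(h, D) for name, h in [("h1", h1), ("h2", h2), ("h3", h3)]}
print(results)
```

Under this reading only `h1` is consistent: `h2` misclassifies examples 3 and 4, and `h3` misclassifies example 4.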

Learning Involves Search
- Learning means searching through a space of possible hypotheses to find the hypothesis that best fits the available training examples and other prior constraints or knowledge.
- Different learning methods search different hypothesis spaces.
- Learning methods can be characterized by the conditions under which these search methods converge toward an optimal hypothesis.
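For a very small hypothesis space, the search can be carried out by brute force. The sketch below assumes a hypothesis language of conjunctions in which each attribute is either fixed to one value or left open ("?"), and it assumes that each attribute's domain consists only of the values that occur in the four example days. Explicit enumeration like this is feasible only for tiny spaces; it is an illustration of the search view, not a practical learning algorithm.

```python
from itertools import product

ATTRIBUTES = ("Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast")

# The four example days of the running example with their Enjoy label.
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]

# Assumption: attribute domains are the values observed in D.
domains = [sorted({x[i] for x, _ in D}) for i in range(len(ATTRIBUTES))]

def covers(h, x):
    """A conjunctive hypothesis covers x if every constraint is '?' or matches."""
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))

def consistent(h, examples):
    return all(covers(h, x) == label for x, label in examples)

# Hypothesis space H: one value or '?' per attribute.
H = list(product(*[values + ["?"] for values in domains]))

# Search: keep every hypothesis that fits all training examples.
consistent_hypotheses = [h for h in H if consistent(h, D)]

print(len(H), len(consistent_hypotheses))
for h in consistent_hypotheses:
    print(dict(zip(ATTRIBUTES, h)))
```

Several hypotheses remain consistent with the same four examples, so the training data alone do not single out one answer; which one a learner returns depends on its search strategy and bias.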

Summary
- Machine learning (ML) is automated knowledge acquisition and improvement.
- Typically, ML is a process of inductive reasoning. In contrast to deductive knowledge extraction, ML means acquisition of new, generalized, hypothetical knowledge from sample experience.
- The inductive learning hypothesis states that if a hypothesis approximates a target concept reasonably well over the training examples, it will also work reasonably well over unobserved examples.
- Concept learning is a special case of classification learning with only two classes (belongs to the concept / does not belong to the concept).
- Important concepts of ML are: instance space and hypothesis space, training set and target class.
- Some hypothesis languages allow a general-to-specific ordering of hypotheses.
- A hypothesis is called consistent with a training set if all examples can be classified correctly (in many cases, we do not want to learn such overfitting hypotheses, as we will discuss later).
- In general, ML can be characterized as search in hypothesis space.