DM825 (5 ECTS - 4th Quarter) Introduction to Machine Learning Introduktion til maskinlæring


DM825 (5 ECTS - 4th Quarter) Introduction to Machine Learning Introduktion til maskinlæring
Marco Chiarandini, assistant professor (adjunkt), IMADA
www.imada.sdu.dk/~marco/


Machine Learning
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." Tom M. Mitchell (1997), Machine Learning, p. 2
Core objective of a learner: to generalize from its experience. Training examples come from an unknown probability distribution, and the learner has to extract from them something that produces a useful answer in new cases.

Contents
- Classification and Regression via Linear Models
- Neural Networks
- Graphical Models
- Bayesian Networks
- Hidden Markov Models
- Mixture Models and Expectation Maximization
- Support Vector Machines
- Assessment and Selection
- Unsupervised Learning (association rules, cluster analysis, principal components)

Perceptron algorithm

Multilayered Neural Networks

Applications
Handwritten digit recognition:
- Humans: 0.2% to 2.5% error
- 400-300-10 unit MLP: 1.6% error
- LeNet, 768-192-30-10 unit MLP: 0.9% error

Graphical Models
Graphical models allow us to represent our prior knowledge and to use a general suite of algorithms to make inferences and to improve our models for a specific application domain.
Complex systems involve uncertainty => probability framework: interrelated aspects of the system are modelled as random variables.

Example: Medical diagnosis
- Two diseases, Flu and Hayfever; they are not mutually exclusive.
- Season might be correlated with them.
- Symptoms such as Congestion and Muscle Pain.
5 random variables:
Flu = {true, false}
Hayfever = {true, false}
Season = {fall, winter, spring, summer}
Congestion = {true, false}
MusclePain = {true, false}
The joint distribution has 2x2x4x2x2 = 64 possible probability values, which are needed to answer queries such as P(Flu=true | Season=fall, Congestion=true, MusclePain=false).
If the number of variables grows, the problem becomes intractable.
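The count of 64 can be checked by enumerating every joint assignment of the five variables. The following is a minimal sketch; the names `domains` and `joint_table_size` are chosen here for illustration and do not come from the course material.

```python
from itertools import product

# Domains of the five variables in the medical diagnosis example.
domains = {
    "Flu": [True, False],
    "Hayfever": [True, False],
    "Season": ["fall", "winter", "spring", "summer"],
    "Congestion": [True, False],
    "MusclePain": [True, False],
}

# A full joint table needs one probability per joint assignment.
assignments = list(product(*domains.values()))
n_entries = len(assignments)


def joint_table_size(n_binary_vars):
    """Entries in a full joint table over n binary variables (hypothetical helper)."""
    return 2 ** n_binary_vars


print(n_entries)             # size of the table for this example
print(joint_table_size(30))  # the table size doubles with every extra binary variable
```

The second call illustrates why the full table becomes intractable: each additional binary variable doubles the number of entries.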

Example: continued
Graphical models use a graph-based representation to encode independencies.
[Figure: graph over the nodes Season, Flu, Hayfever, Congestion, MusclePain]
- F and H independent given S
- C and S independent given F and H
- M and H, C independent given F
- M and C independent given F
We thus only need to define 3 + 4 + 4 + 4 + 2 = 17 parameters:
P(S,F,H,C,M) = P(S) P(F|S) P(H|S) P(C|F,H) P(M|F)
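The parameter count of the factorization can be reproduced from the parent sets alone. This sketch assumes the structure implied by the factors (S has no parents, F and H depend on S, C depends on F and H, M depends on F); the dictionary and function names are illustrative.

```python
from math import prod

# Number of values each variable can take.
cardinality = {"S": 4, "F": 2, "H": 2, "C": 2, "M": 2}

# Parent sets read off the factors P(S) P(F|S) P(H|S) P(C|F,H) P(M|F).
parents = {"S": [], "F": ["S"], "H": ["S"], "C": ["F", "H"], "M": ["F"]}


def free_parameters(var):
    """(|var| - 1) free probabilities for each configuration of the parents."""
    n_parent_configs = prod(cardinality[p] for p in parents[var])
    return (cardinality[var] - 1) * n_parent_configs


per_variable = {v: free_parameters(v) for v in cardinality}
total_factored = sum(per_variable.values())

# A full joint table has one entry per assignment, minus one because
# the probabilities must sum to 1.
total_full_joint = prod(cardinality.values()) - 1

print(per_variable)
print(total_factored, total_full_joint)
```

Under these assumptions the per-variable counts are 3, 4, 4, 4 and 2, which sum to the 17 parameters stated above, compared with 63 free parameters for the unrestricted joint table.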


Bayesian Learning
What can we do from here?
- Inference: complexity issues, O(2^n)
- Learning (parameters and structure)
Thumbtack Experiment
Flip the thumbtack in the air and observe the number of times it lands heads and tails. We wish to learn how much the probability of heads deviates from 0.5.
Suppose we observe 3 heads in 10 tosses. With no prior knowledge we would set p = 3/10 = 0.3. With a prior of 10 heads over 20 tosses we would set p = (3+10)/(10+20) = 13/30 = 0.43.
However, if we obtain more data the effect of the prior diminishes: with 300 heads in 1000 tosses, (300+1)/(1000+2) = 0.30 and (300+10)/(1000+20) = 0.30.
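The thumbtack estimates can be recomputed by treating the prior as extra imagined tosses that are added to the observed counts. This is a minimal sketch of that arithmetic only; the function name `smoothed_estimate` and its parameters are hypothetical.

```python
def smoothed_estimate(heads, tosses, prior_heads=0, prior_tosses=0):
    """Estimate P(heads) after adding prior pseudo-counts to the observed counts."""
    return (heads + prior_heads) / (tosses + prior_tosses)


# Small sample: 3 heads in 10 tosses.
no_prior = smoothed_estimate(3, 10)
strong_prior = smoothed_estimate(3, 10, prior_heads=10, prior_tosses=20)

# Large sample: 300 heads in 1000 tosses, with a weak and a strong prior.
large_weak = smoothed_estimate(300, 1000, prior_heads=1, prior_tosses=2)
large_strong = smoothed_estimate(300, 1000, prior_heads=10, prior_tosses=20)

for name, value in [
    ("no prior, 10 tosses", no_prior),
    ("prior 10/20, 10 tosses", strong_prior),
    ("prior 1/2, 1000 tosses", large_weak),
    ("prior 10/20, 1000 tosses", large_strong),
]:
    print(f"{name}: {value:.3f}")
```

With 10 tosses the prior moves the estimate from 0.3 to about 0.43, whereas with 1000 tosses both priors leave it within about 0.004 of 0.3.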

Course Organization
Prerequisites:
- MM501 Calculus I
- MM505 Linear Algebra
- Basics of Probability Calculus
Final Assessment (5 ECTS):
- Mandatory assignments, pass/fail, internal evaluation by the teacher. They include programming work in R.
- 3-hour written exam, Danish 7-mark scale, external examiner

Course Material
- Textbook: C.M. Bishop. Pattern Recognition and Machine Learning. Springer, 2006
- Slides
- Source code and data sets
www.imada.sdu.dk/~marco/dm825