T-61.5140 Machine Learning: Advanced Probabilistic Methods

T-61.5140 Machine Learning: Advanced Probabilistic Methods Jaakko Hollmén Department of Information and Computer Science Helsinki University of Technology, Finland e-mail: Jaakko.Hollmen@tkk.fi Web: http://www.cis.hut.fi/opinnot/t-61.5140/ January 17, 2008

Course Organization: Personnel Lecturer: Jaakko Hollmén, D.Sc.(Tech.) Lectures on Thursdays, from 10.15-12.00 in T3 Course Assistant: Tapani Raiko, D.Sc.(Tech.) Problem sessions on Fridays, from 10.15-12.00 in T3 For the schedule, holidays and special program, see http://www.cis.hut.fi/opinnot/t-61.5140/

Course Material Lecture slides and lectures Lecture notes (aid the presentation in the lectures) Lecture notes (contain extra material) Course book Christopher M. Bishop: Pattern Recognition and Machine Learning, Springer, 2006 Chapters 8, 9, 10, 11, and 13 covered during the course Problem sessions Problems and solutions Demonstrations

Participating in the Course Interest in machine learning Student number at TKK needed Course registration on the WebTopi System: https://webtopi.tkk.fi Prerequisites: T-61.3050 Machine Learning: Basic Principles, taught in Autumn by Kai Puolamäki, and the necessary prerequisites for that course

Passing the Course (5 ECTS credit points) Attend the lectures and the exercise sessions for the best learning experience :-) Browse the material before attending the lectures and complete the exercises Complete the term project, which requires solving a machine learning problem by programming Pass the examination, next exam scheduled: Thursday, 15th of May, morning Requirements: passed exam and an acceptable term project, bonus for active participation and an excellent term project (+1)

Relation to Other Courses This course replaces the old course T-61.5040 Learning Models and Methods: no further lectures, last exam in March 2008 Little overlap expected in parts with courses like T-61.3050 Machine Learning: Basic Principles T-61.5130 Machine Learning and Neural Networks T-61.3020 Principles of Pattern Recognition Some overlap is good!

Resources on Machine Learning Machine Learning: Basic Principles course book Ethem Alpaydin: Introduction to Machine Learning, MIT Press, 2004 Conferences on Machine Learning: European Conference on Machine Learning (ECML), co-located with the Principles and Practice of Knowledge Discovery in Databases (PKDD) International Conference on Machine Learning (ICML), in Helsinki in July 2008, see for details: http://icml2008.cs.helsinki.fi/ Uncertainty in Artificial Intelligence (UAI), in Helsinki in July 2008, see for details: http://uai2008.cs.helsinki.fi/

Resources on Machine Learning Journals in Machine Learning Machine Learning, Journal of Machine Learning Research, IEEE Transactions on Pattern Analysis and Machine Intelligence, Pattern Recognition, Pattern Recognition Letters, Neural Computing, Neural Computation, and many others Also domain-related journals: BMC Bioinformatics, Bioinformatics, etc. Community-based resources Mailing lists: UAI, connectionists, ML-news, ml-list, kdnuggets, etc. http://en.wikipedia.org/wiki/machine_learning

What is machine learning? Machine learning people develop algorithms for computers to learn from data. We don't cover all of machine learning! The modern approach to machine learning: the probabilistic approach The probabilistic approach to machine learning Generative models, finite mixture models Graphical models, Bayesian networks Inference and learning Expectation Maximization algorithm

Topics covered on the course Central topics Random variables Independence and conditional independence Bayes's rule Naive Bayes classifier, finite mixture models, k-means clustering Expectation Maximization algorithm for inference and learning Computational algorithms for exact inference Computational algorithms for approximate inference Sampling techniques Bayesian modeling

Three simple examples Simple coin tossing with one coin A game for two players: coin tossing with two coins Naive Bayes classification in a bioinformatics application

Simple coin tossing with one coin Throw a coin. The coin lands either on heads (H) or tails (T). We don't know the outcome before the experiment. We model the outcome with a random variable X, with X ∈ {H, T}: P(X = H) = ?, P(X = T) = 1 − ? Perform an experiment and estimate the unknown probability. Parameterization: P(X = T) = θ, P(X = H) = 1 − θ. The fixed parameters tell about the properties of the coin

Simple coin tossing with one coin After the experiment, we have X_1 = x_1, ..., X_12 = x_12. The likelihood function is the probability of the observed data: P(x_1, ..., x_12; θ_1, θ_2, ..., θ_12). What can we assume? What do we want to assume? A fair coin? Coin tosses are independent and identically distributed random variables, so the likelihood function factorizes to P(x_1; θ)P(x_2; θ) ... P(x_12; θ). The maximum likelihood estimator gives a parameter value that maximizes the likelihood
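The maximum likelihood step above can be sketched in a few lines of Python. Under the i.i.d. assumption, the likelihood θ^k (1 − θ)^(n − k) is maximized at k/n, the observed fraction of tails. The toss sequence below is illustrative, not data from the lecture:

```python
# Maximum likelihood estimate of theta = P(X = T) from 12 coin tosses,
# assuming independent and identically distributed Bernoulli outcomes.
tosses = ["T", "H", "T", "T", "H", "T", "H", "T", "T", "H", "T", "T"]

k = tosses.count("T")   # number of tails
n = len(tosses)         # number of tosses
theta_mle = k / n       # the MLE is the sample fraction of tails
print(theta_mle)        # 8 tails out of 12 -> 0.666...
```

Note that the estimator uses only the count k, not the order of the tosses: under the i.i.d. assumption, the count is a sufficient statistic for θ.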

Guessing game with two coins Description of the game: Player one, player two Coin number one: P(X_1 = T) = θ_1 (unknown) Coin number two: P(X_2 = T) = θ_2 (unknown) Player one chooses a coin randomly, either one or two; model the choice as a random variable Choose coin: P(C = c_1) = π_1, P(C = c_2) = π_2, with π_1 + π_2 = 1, so π_2 = 1 − π_1

Guessing game with two coins We would like to do better than guessing, so let's model the situation. Outcome of a toss of coin j: P(X | C = j). Ingredients: P(X | C = 1), P(X | C = 2), P(C). First the coin is chosen (secretly), then it is thrown. The outcome of the toss depends on the choice: P(X, C) = P(C)P(X | C), and P(X) = Σ_{j=1}^{2} P(C = j)P(X | C = j). What is the probability of heads?
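The marginalization P(X) = Σ_j P(C = j)P(X | C = j) can be computed directly. The parameter values below are hypothetical, chosen only to make the sum concrete:

```python
# Probability of heads when a coin is first chosen at random and then
# thrown, marginalizing over the (hidden) coin choice C.
pi = {1: 0.5, 2: 0.5}        # P(C = j): hypothetical choice probabilities
p_tails = {1: 0.3, 2: 0.8}   # theta_j = P(X = T | C = j), hypothetical

# P(X = H) = sum over j of P(C = j) * P(X = H | C = j)
p_heads = sum(pi[j] * (1 - p_tails[j]) for j in (1, 2))
print(p_heads)  # 0.5 * 0.7 + 0.5 * 0.2 = 0.45
```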

Guessing game with two coins Guess which coin it was: P(C = j | X)? We know P(C), P(X | C), and P(X). Use Bayes's rule: P(C | X) = P(C)P(X | C) / P(X). Which coin was more probably used if you observed heads?
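Bayes's rule answers the guessing question numerically. Reusing the same hypothetical parameters as above (not values from the lecture), the posterior over the coin given an observed head is:

```python
# P(C = j | X = H) = P(C = j) * P(X = H | C = j) / P(X = H), Bayes's rule.
pi = {1: 0.5, 2: 0.5}             # prior P(C = j), hypothetical
p_heads_given = {1: 0.7, 2: 0.2}  # P(X = H | C = j), hypothetical

p_heads = sum(pi[j] * p_heads_given[j] for j in (1, 2))  # evidence, 0.45
posterior = {j: pi[j] * p_heads_given[j] / p_heads for j in (1, 2)}
print(posterior)  # coin 1 is more probable: {1: 0.777..., 2: 0.222...}
```

With these numbers, observing heads makes coin 1 the better guess, since it is the coin more likely to produce heads and both coins were equally likely a priori.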

Naive Bayes classification Classify gastric cancers using DNA copy number amplification data X_1, ..., X_6. The observed data: X_i ∈ {0, 1}, i = 1, ..., 6. Class labels: C ∈ {1, 2}. The joint probability distribution: P(X_1, X_2, X_3, X_4, X_5, X_6, C). Assumptions creep in... X_i and X_j are conditionally independent given C: P(X_1, X_2, X_3, X_4, X_5, X_6, C) = P(C)P(X_1 | C)P(X_2 | C) ... P(X_6 | C). Interest is in P(C | X_1, X_2, ..., X_6). Demo here!
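A minimal naive Bayes classifier for six binary features follows directly from the factorization above. All probability tables here are made up for illustration; the lecture's gastric cancer data is not reproduced:

```python
# Naive Bayes with six binary features X_1..X_6 and two classes:
# P(C, X) = P(C) * prod_i P(X_i | C), then normalize to get P(C | X).
prior = {1: 0.6, 2: 0.4}  # P(C), hypothetical
# P(X_i = 1 | C = c) for each of the six features, hypothetical values
p_feature = {1: [0.9, 0.1, 0.8, 0.2, 0.7, 0.3],
             2: [0.2, 0.8, 0.3, 0.7, 0.4, 0.6]}

def posterior(x):
    """Return P(C = c | x_1, ..., x_6) for both classes."""
    joint = {}
    for c in (1, 2):
        p = prior[c]
        for xi, p1 in zip(x, p_feature[c]):
            p *= p1 if xi == 1 else 1 - p1  # P(X_i = x_i | C = c)
        joint[c] = p
    z = sum(joint.values())                 # evidence P(x)
    return {c: joint[c] / z for c in joint}

print(posterior([1, 0, 1, 0, 1, 0]))  # strongly favors class 1
```

The conditional-independence assumption is what keeps the model tractable: six conditional probabilities per class instead of a full table over 2^6 feature configurations.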

Problem sessions Schedule for the problem sessions: First problem session: 25th of January, 10.15-12.00 Problems are posted on the Web site one week before the session