Machine Learning and Pattern Recognition Introduction

Giovanni Maria Farinella, gfarinella@dmi.unict.it, www.dmi.unict.it/farinella

What is ML & PR?
An interdisciplinary field focusing on both the mathematical foundations and practical applications of systems that learn, reason and act.

Definition of Machine Learning
Arthur Samuel (1959). Machine Learning: field of study that gives computers the ability to learn without being explicitly programmed.
Tom Mitchell (1998). Well-posed Learning Problem: a computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.

Applying Mitchell's definition to a spam filter:
Suppose your email program watches which emails you do or do not mark as spam, and based on that learns how to better filter spam.
What is the task T in this setting? Classifying emails as spam or not spam.
What is the experience E in this problem? Watching the emails you have labeled as spam or not spam.
What is the performance measure P in this setting? The number of emails correctly classified as spam/not spam.
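The performance measure P in the spam example can be sketched in a few lines. This is only an illustration: the label lists are hypothetical, and the function reports the fraction (rather than the raw number) of correctly classified emails, which is an assumption made here so that the value does not depend on how many emails are checked.

```python
def accuracy(predicted, actual):
    """Performance measure P: fraction of emails classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical labels: 1 = spam, 0 = not spam.
user_labels = [1, 0, 0, 1, 0]    # experience E: what the user marked
filter_output = [1, 0, 1, 1, 0]  # task T: the filter's classifications

p = accuracy(filter_output, user_labels)  # 4 of the 5 emails agree
```

A program "learns" in Mitchell's sense if this value tends to improve as more labeled emails are observed.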

Why are ML and PR useful?

Learning Approaches
Imagine an organism/machine which experiences a series of sensory inputs: $x^{(1)}, x^{(2)}, \dots, x^{(n)}$.
Supervised learning: the organism/machine is also given desired outputs $y^{(1)}, y^{(2)}, \dots, y^{(n)}$, and its goal is to learn to produce the correct output given a new input.
Unsupervised learning: the goal of the organism/machine is to build a model of $x$ that can be used for reasoning, decision making, predicting things, etc.
Reinforcement learning: the organism/machine can also produce actions $a^{(1)}, a^{(2)}, \dots, a^{(n)}$, which affect the state of the world, and receives rewards (or punishments) $r^{(1)}, r^{(2)}, \dots, r^{(n)}$. Its goal is to learn to act in a way that maximizes rewards (or minimizes punishments) in the long term.

Data Representation

Features and Classes
A feature is the specification of an attribute. It is a measurement that represents the data. For example, color is an attribute; "Color is blue" is a feature of an example.
The statistical model to be employed is crucially dependent on the choice of features. Hence it is useful to consider alternative representations of the same measurements (i.e. different features).

Our Goal

Feature Types
Categorical: a finite number of discrete values. The type nominal denotes that there is no ordering between the values, such as last names and colors. The type ordinal denotes that there is an ordering, such as in an attribute taking on the values low, medium, or high.
Continuous (quantitative): commonly a subset of the real numbers, where there is a measurable difference between the possible values. Integers are usually treated as continuous in practical problems.
Hybrid: a set of features composed of both categorical and continuous features.
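The three feature types can be turned into a single numeric vector in several ways. The sketch below is one possibility, not something prescribed by the slides: it assumes one-hot encoding for nominal values and integer ranks for ordinal values, and the attribute names (`color`, `risk`, `height_cm`) are hypothetical.

```python
# Hypothetical attribute domains, chosen only for illustration.
COLORS = ["red", "green", "blue"]                    # nominal: no ordering
ORDINAL_LEVELS = {"low": 0, "medium": 1, "high": 2}  # ordinal: ordered


def one_hot(value, categories):
    """Encode a nominal value as a 0/1 indicator vector."""
    return [1.0 if value == c else 0.0 for c in categories]


def encode(example):
    """Build a hybrid feature vector: nominal + ordinal + continuous."""
    vec = one_hot(example["color"], COLORS)
    vec.append(float(ORDINAL_LEVELS[example["risk"]]))  # rank keeps the order
    vec.append(float(example["height_cm"]))             # continuous, used as is
    return vec


example = {"color": "blue", "risk": "medium", "height_cm": 172.5}
features = encode(example)
```

One-hot encoding avoids suggesting an order among colors, while the integer rank preserves the low < medium < high ordering.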

Examples
(Height, Weight) → BMI
(Height, Weight) → {normal, overweight}
(Prod1, Prod2, ..., ProdK) → Find Groups
(Rating1, Rating2, ..., RatingN) → Find Groups
Q1, Q2, ..., QN → {PD, PDL, M5S, ...}
Q1, Q2, ..., QM → %
Image → Like

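Learning Approach: Supervised Learning
Notation: $x^{(i)} = (x_1^{(i)}, x_2^{(i)}) = (\mathrm{long}^{(i)}, \mathrm{lat}^{(i)})$, $i = 1, \dots, n$, with $n = 8$ and $d = 2$; $y^{(i)} \in \{0, 1\} = \{\text{Italian}, \text{Japanese}\}$.
[Figure: training set plotted with axes $x_1^{(i)}$ and $x_2^{(i)}$.]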

Typical Supervised Learning Problems
Regression: the desired output consists of one or more continuous variables.
Classification/Recognition: the aim is to assign each input vector to one of a finite number of categories.
Recommendation systems: seek to predict the rating or preference that a user would give to an item, given past ratings/preferences.

Supervised Learning Problem: Regression
Supervised learning: the "right answers" are given in the training set.
Regression: predict a continuous-valued output (price).
Hypotheses = {$h_1$, $h_2$}.
[Figure: training set with two candidate hypotheses $h_1$ and $h_2$; horizontal axis $x_1^{(i)}$: Size in m² (0 to 2500); vertical axis $y^{(i)}$: Price ($) in 1000s (0 to 400).]

Supervised Learning Problem: Regression
[Figure: same plot; horizontal axis $x_1^{(i)}$: Size in m² (0 to 2500); vertical axis $y^{(i)}$: Price ($) in 1000s (0 to 400).]
$x^{(i)} = (x_1^{(i)}, x_2^{(i)}, \dots, x_d^{(i)}) = (?, ?, \dots, ?)$, $i = 1, \dots, n$
$n = ?$, $d = ?$, $y^{(i)} = ?$
E = Experience = ?
T = Task = ?
P = Performance = ?

Supervised Learning Problem: Regression
[Figure: same plot; horizontal axis $x_1^{(i)}$: Size in m² (0 to 2500); vertical axis $y^{(i)}$: Price ($) in 1000s (0 to 400).]
$x^{(i)} = (x_1^{(i)}) = (\text{Size in m}^2)^{(i)}$, $i = 1, \dots, n$
$n = 11$, $d = 1$, $y^{(i)} = \text{Price}^{(i)}$
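With $d = 1$, a hypothesis can be as simple as a straight line through the training points. The sketch below fits such a line by ordinary least squares; the choice of least squares and the four data points are assumptions made for illustration (the slide's own training set has $n = 11$ points whose values are not recoverable here).

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training set: sizes in m^2 and prices in thousands.
sizes = [500, 1000, 1500, 2000]
prices = [100, 200, 300, 400]

slope, intercept = fit_line(sizes, prices)


def h(size):
    """Learned hypothesis: predicted price for a given size."""
    return slope * size + intercept


predicted = h(1250)  # prediction for a size not in the training set
```

Because the made-up points lie exactly on a line, the fit reproduces them; real data would leave a nonzero residual.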

Stages

Supervised Learning Problem: Classification/Recognition
[Figure: training set plotted with axes Tumor Size and Age.]
More significant features: Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape.
Homework:
$x^{(i)} = (x_1^{(i)}, x_2^{(i)}, \dots, x_d^{(i)}) = (?, ?, \dots, ?)$, $i = 1, \dots, n$
$n = ?$, $d = ?$, $y^{(i)} = ?$
E = Experience = ?
T = Task = ?
P = Performance = ?

Training and Test Sets
The training set is used for learning; the test set is used for evaluation.

Training, Validation and Test Sets
The training set is used for learning, the validation set for tuning, and the test set for evaluation.
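A three-way split can be sketched as follows. The 60/20/20 proportions, the fixed seed, and the function name `split_dataset` are all assumptions for illustration; the slides do not specify split sizes.

```python
import random


def split_dataset(examples, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle once, then cut into (train, validation, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n = len(shuffled)
    n_test = int(n * test_fraction)
    n_val = int(n * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


data = list(range(10))  # stand-in for ten labeled examples
train, val, test = split_dataset(data)
```

The three parts are disjoint, so the test set plays no role in learning or tuning.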

Evaluation - Common Data Split
5-fold cross-validation: iterate over the choice of which fold is the validation fold, taking each of folds 1-5 in turn. The best model is then evaluated on the test data.
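The fold rotation can be sketched as an index generator. This is a minimal illustration under the assumption that folds are contiguous blocks of an already shuffled training set; `k_fold_indices` is a hypothetical helper, not a library function.

```python
def k_fold_indices(n_examples, k=5):
    """Yield (train_idx, val_idx) pairs, one per choice of validation fold."""
    indices = list(range(n_examples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_examples // k + (1 if f < n_examples % k else 0)
                  for f in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, k=5))
first_train, first_val = folds[0]
```

Each candidate model would be trained on `train_idx` and scored on `val_idx` for every fold, and the scores averaged; the test data stays untouched until the final evaluation.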

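Supervised Learning Problem: Recommender Systems
Predicting movie ratings: each user rates movies using zero to five stars.
Training set ("?" marks a missing rating to be predicted):

Movie                  Alice (1)  Bob (2)  Carol (3)  Dave (4)
Love at last               5         5        0          0
Romance forever            5         ?        ?          0
Cute puppies of love       ?         4        0          ?
Nonstop car chases         0         0        5          4
Swords vs. karate          0         0        5          ?

[Slide annotations: movie groups Romance and Action; labels Nice and Boring; predicted values such as 4.5 shown next to some missing ratings.]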

Supervised Learning: Take Home Message
Training set: (input, output) pairs.
[Figure: training set plotted with axes $x_1^{(i)}$ and $x_2^{(i)}$.]

Learning Approach: Unsupervised Learning
Build a model of $x$ that can be used for reasoning, decision making, predicting things.
Example: clustering.
[Figure: training set plotted with axes $x_1^{(i)}$ and $x_2^{(i)}$.]

Typical Unsupervised Learning Problems
Market segmentation: divide a broad target market into subsets of consumers who have common needs and priorities, and then design and implement strategies to target them.
Social network analysis: views social relationships in terms of network theory, consisting of nodes, representing individual actors within the network, and ties, which represent relationships between the individuals, such as friendship, kinship, organizations and sexual relationships.

Classification by Retrieval
K-Nearest Neighbor (KNN): the training set must be stored, and we need a distance measure.
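Both requirements show up directly in a small KNN sketch: the classifier is just the stored training set plus a distance function. The 2-D points, the choice of k = 3, and majority voting are assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def knn_classify(query, training_set, k=3, distance=euclidean):
    """Label the query by majority vote among its k nearest stored examples."""
    neighbors = sorted(training_set,
                       key=lambda pair: distance(query, pair[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical stored training set: (feature vector, class label) pairs.
training_set = [
    ((1.0, 1.0), 0), ((1.2, 0.8), 0), ((0.9, 1.1), 0),
    ((5.0, 5.0), 1), ((5.2, 4.9), 1), ((4.8, 5.1), 1),
]

label_a = knn_classify((1.1, 1.0), training_set)
label_b = knn_classify((5.1, 5.0), training_set)
```

There is no training step beyond storing the examples; all the work happens at query time, and swapping the `distance` argument changes what "nearest" means.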

The Importance of Representation
KNN with Euclidean distance in pixel space is not suitable! We need a better representation of the data.
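One commonly cited reason, offered here as an illustration rather than as the slide's own argument, is that pixel-wise distance is very sensitive to small shifts. The toy 1-D "images" below are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length pixel vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


# Hypothetical 1-D "images": a bright bar on a dark background.
bar = [0, 0, 1, 1, 0, 0, 0, 0]
bar_shifted = [0, 0, 0, 0, 1, 1, 0, 0]  # same bar, moved two pixels right
blank = [0] * 8                         # no bar at all

d_shift = euclidean(bar, bar_shifted)   # four pixels differ
d_blank = euclidean(bar, blank)         # only two pixels differ
```

In pixel space the shifted copy of the same pattern ends up farther from the original than an empty image does, so a nearest-neighbor search would prefer the wrong match.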