Machine Learning: Week 1


Topics: terminology, supervised and unsupervised learning, data preparation, cross validation, overfitting.

What is Machine Learning?
Machine Learning is the common name for algorithms that can model a problem from its data. There are many approaches: some of them focus on prediction and estimation, while others focus on classification.

Machine Learning Methods
Machine learning methods differ from one another in how they approach a problem, and therefore they may achieve different success on different problems.

Machine Learning Terms
Prediction: used when the desired outputs are quantitative (continuous).
Classification: used when the desired outputs are qualitative (discrete).

Prediction and Estimation
Because the two terms have similar meanings, their usage is often confused in the literature. In statistics, however, "estimation" refers to determining the model, while "prediction" is the computation of an unknown value of a random variable using an estimated equation.

For example, suppose a linear function is fitted to a set of (red) data points. "Estimation" is determining the equation that represents the relation between x and y; "prediction" is the term that should be preferred for computing the y value corresponding to a given x value.

Estimation Example
For some problem, let the following dataset be given. The goal is to produce an equation that can compute the y value corresponding to any x input. It is easy to see that the solution of this simple problem is y = 3x.

 X    Y
 3    9
 5   15
 8   24
10   30
19   57
24   72
27   81
31   93
38  114
43  129

Prediction Example
Using the estimated equation y = 3x, if the y value is desired for the input x = 50, it is easy to compute y = 150.
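As a minimal sketch of the two steps above (the use of NumPy and its polyfit routine is my own choice, not part of the lecture), the line y = 3x can first be estimated from the table and then used to predict y for a new input:

```python
import numpy as np

# Dataset from the table above
x = np.array([3, 5, 8, 10, 19, 24, 27, 31, 38, 43])
y = np.array([9, 15, 24, 30, 57, 72, 81, 93, 114, 129])

# Estimation: determine the model, here a line y = a*x + b (least squares)
a, b = np.polyfit(x, y, deg=1)
print(f"estimated model: y = {a:.2f}*x + {b:.2f}")  # ~ y = 3.00*x + 0.00

# Prediction: evaluate the estimated equation at an unseen input
x_new = 50
print("predicted y for x = 50:", a * x_new + b)     # ~ 150.0
```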

Classification
The goal is to divide the whole problem space into a certain number of classes. In the figure, each color represents a class. By means of classification techniques, the entire space, even areas without any data, can be painted.

Classification Example
Because the Y values are all discrete, this is a classification problem. The classification equation can be written easily using a threshold value:

Y = 0 if X < 20
Y = 1 if X >= 20

 X   Y
 2   0
 5   0
 9   0
13   0
19   0
20   1
27   1
33   1
39   1
47   1
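A minimal sketch of this threshold classifier in plain Python (the function name is my own):

```python
def classify(x, threshold=20):
    """Class 0 below the threshold, class 1 at or above it."""
    return 0 if x < threshold else 1

# Reproduces the table above
for x in [2, 5, 9, 13, 19, 20, 27, 33, 39, 47]:
    print(x, classify(x))
```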

Supervised vs. Unsupervised
Depending on the dataset, learning is performed in two different ways.
Supervised learning: the data is organized as input and output parameters. The aim is to compute the output values from the inputs with minimum error.
Unsupervised learning: the aim is to discover hidden groups in the data without any output information.

Reinforcement Learning
Sometimes the supervisor does not give the expected result directly to the learning machine; instead, partial feedback such as "true / false" is sent to the system for the results it produces. This learning method is called reinforcement learning. The Boltzmann machine, LVQ, and genetic algorithms can be considered examples.
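The practical difference shows up in the shape of the data each setting consumes. A small sketch (the arrays are invented purely for illustration):

```python
import numpy as np

# Supervised learning: inputs paired with desired outputs
X = np.array([[1.0], [2.0], [3.0]])   # input vectors
d = np.array([0, 0, 1])               # desired classes, known in advance

# Unsupervised learning: inputs only; any grouping must be
# discovered without output information
X_unlabeled = np.array([[1.0], [1.1], [5.0], [5.2]])
```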

Clustering
All classification and estimation methods can be considered supervised learning methods, while clustering methods are described under the unsupervised learning heading. But what, in detail, is clustering?

Clustering, which deals with unlabeled data, is the process of organizing similar objects into the same groups and dissimilar objects into different groups. In the clustering literature, the term "similarity" is used in the opposite sense of distance.

Clustering
Because of their similarity, samples that are close to each other are placed in the same cluster; likewise, distant samples are located in different clusters. The number of clusters is usually provided by experts.

Supervised Clustering
This is a form of supervised learning that can use class information and sample similarities at the same time. On the other hand, clustering before classical classification usually increases classification accuracy. (Figure: clustering vs. supervised clustering.)
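A minimal k-means-style sketch of the idea that nearby samples share a cluster (the lecture does not prescribe an algorithm; the implementation details here are my own, and K is supplied up front as the slide notes):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Assign each sample to its nearest centroid, then move the centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Euclidean distance of every sample to every centroid
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)     # nearest = most similar centroid
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

X = np.array([[1.0, 1.0], [1.2, 0.9], [5.0, 5.1], [4.8, 5.3]])
labels, _ = kmeans(X, k=2)
print(labels)   # the two nearby pairs fall into the same cluster
```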

Notation
In these lectures, the following notation is preferred: D for the desired classes, Y for the estimated outputs, X_j for each feature of the input, x_i for each example, and X for the whole input data set.

Learning Schedule
Online learning: used when the learning process must be sustained continuously while the system runs.
Offline learning: the system is first trained, then loaded and started. Systems of this kind do not have to learn continuously.
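A sketch contrasting the two schedules on a toy task, computing a mean (the task and names are illustrative, not from the lecture):

```python
import numpy as np

data = [2.0, 4.0, 6.0, 8.0]

# Offline learning: train once on the full dataset, then deploy
offline_model = np.mean(data)

# Online learning: update the model one sample at a time while running
online_model, n = 0.0, 0
for sample in data:                                # in a live system this stream never ends
    n += 1
    online_model += (sample - online_model) / n   # incremental mean update
print(offline_model, online_model)                 # both 5.0
```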

Learning Rules
Although many training algorithms have been proposed, learning algorithms can be divided into four groups according to their underlying learning rule: Hebb, Delta, Hopfield, and Kohonen.

Hebb Learning Rule
This is the first learning rule, developed in 1949. It is based on the principle that "a cell affects its neighbors." Different learning rules have since been developed by improving this rule.
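A minimal sketch of a Hebbian update, Δw = η · y · x: connections between co-active cells are strengthened (the learning rate, initialization, and input are my own illustration; note that the bare rule lets weights grow without bound):

```python
import numpy as np

eta = 0.1                              # learning rate
rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)      # small random initial weights
x = np.array([1.0, 0.0, 1.0])          # presynaptic activity (input)

for _ in range(5):
    y = w @ x                          # postsynaptic activity (output)
    w += eta * y * x                   # strengthen links between co-active cells
print(w)   # weights on the active inputs change; the silent input's weight does not
```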

Delta Learning Rule
The squared difference between the desired and the computed results is the error of the system. To reduce this error, the connections between cells are continuously updated. Multilayer perceptron networks are trained in accordance with this rule.

Hopfield Learning Rule
When the desired result is the same as the computed one, the connection between the related cells is strengthened by a specific ratio; otherwise, the connection is weakened. Recurrent Elman networks are trained with this rule.
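A minimal sketch of the delta rule, Δw = η · (d − y) · x, which reduces the squared error between the desired and computed outputs (the toy dataset and constants are my own):

```python
import numpy as np

eta = 0.05
w = np.zeros(2)
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # toy inputs
d = np.array([1.0, 1.0, 2.0])                       # desired outputs (y = x1 + x2)

for _ in range(200):
    for x, target in zip(X, d):
        y = w @ x                       # computed output
        w += eta * (target - y) * x     # update the connections against the error
print(w)   # approaches [1.0, 1.0]
```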

Kohonen Learning Rule
In this unsupervised model, the cells compete with each other: the cell that produces the greatest output wins the race, and all connections of the winning cell are strengthened. ART (Adaptive Resonance Theory) and the SOM (Self-Organizing Map) developed by Kohonen are examples of this rule.

Data Preparation
Before the learning process starts, a dataset that can represent the whole problem space should be prepared.
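A minimal sketch of the winner-take-all step behind the Kohonen rule (bare competitive learning, without the neighborhood function a full SOM adds; all names and constants are my own):

```python
import numpy as np

eta = 0.5
rng = np.random.default_rng(1)
cells = rng.random((3, 2))             # one weight vector per competing cell

def train_step(x):
    scores = cells @ x                 # each cell's response to the input
    winner = scores.argmax()           # the cell with the greatest output wins
    cells[winner] += eta * (x - cells[winner])   # only the winner is strengthened

for x in [np.array([1.0, 0.0]), np.array([0.0, 1.0])] * 20:
    train_step(x)
print(cells)   # winning cells drift toward the inputs they respond to
```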

Data Preparation
The data prepared for the solution is divided into two sets, because it is used in both the training and the testing processes.

Validation
The main aim of machine learning studies is to ensure that a system trained on one dataset can answer any unseen question from the same problem. Therefore, the limited available data should be used both for training and for testing the system. The methods known as cross-validation are successful at this.
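A minimal sketch of the split (the 70/30 ratio is my own choice for illustration; the lecture does not fix one):

```python
import numpy as np

rng = np.random.default_rng(42)
X = np.arange(20).reshape(10, 2)       # 10 samples, 2 features
y = np.arange(10)

idx = rng.permutation(len(X))          # shuffle before splitting
cut = int(0.7 * len(X))                # 70% train, 30% test
X_train, y_train = X[idx[:cut]], y[idx[:cut]]
X_test,  y_test  = X[idx[cut:]], y[idx[cut:]]
print(len(X_train), len(X_test))       # 7 3
```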

Cross Validation
Several variants of this method have been proposed, but they all share the same basic logic. To measure the success of the system, the available dataset is divided into two parts: one is used for training (the training set), and the other, which the system never sees during training, represents possible future examples (the test set). The system learns the training set with the selected training algorithm, and the success of the trained system is then computed on the test set.

Three types of cross-validation have been proposed: random sampling, K-fold, and leave one out. (A K-fold index sketch is given below.)
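A minimal K-fold sketch that only produces the index splits; training and scoring a real model on each split is left out for brevity (the function name and seed are my own):

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold serves as the test set once."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(idx, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

for train_idx, test_idx in k_fold_indices(n_samples=10, k=5):
    print("train:", sorted(train_idx), "test:", sorted(test_idx))
```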

Random Sampling
(Figure: four independent random train/test splits.)

K-Fold
(Figure: red folds show the test set; blue folds show the training set.)

Leave One Out
This is a special case of K-fold with K = N. For a dataset with N samples, if the number of folds K is chosen equal to the number of samples N, then K-fold runs as the leave-one-out method.

Overfitting
All iterative learning machines must be stopped at the right time. Otherwise, the system starts to memorize the examples in the training data, which decreases the prediction ability of the system on unknown samples. This kind of excessive training is called overfitting.
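The usual remedy is early stopping: watch the error on a held-out validation set during training and stop once it starts to rise. A schematic sketch (the error values are invented to show the pattern, not measured):

```python
# Invented per-iteration errors: training error keeps falling,
# validation error turns upward once memorization begins.
train_err = [0.90, 0.50, 0.30, 0.20, 0.15, 0.12, 0.10, 0.09]
valid_err = [0.95, 0.60, 0.40, 0.30, 0.28, 0.30, 0.35, 0.42]

best, best_iter = float("inf"), 0
for i, err in enumerate(valid_err):
    if err < best:
        best, best_iter = err, i
    elif i - best_iter >= 2:           # patience: no improvement for 2 steps
        print(f"stop at iteration {i}; keep the model from iteration {best_iter}")
        break
```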