Introduction to Machine Learning

Hamed Pirsiavash
CMSC 678
http://www.csee.umbc.edu/~hpirsiav/courses/ml_fall17
The slides are closely adapted from Subhransu Maji's slides.

Course background
What is the course about?
- Finding (and exploiting) patterns in data
- Replacing humans writing code with humans supplying data
- System figures out what the person wants based on examples
- Need to abstract from training examples to test examples
- Most central issue in ML: generalization
Why is machine learning so cool?
- Broad applicability: finance, robotics, vision, machine translation, medicine, etc.
- Close connections between theory and practice
- Open area, lots of room for new work

Some applications
- Email spam detection
- Movie recommendation
- Person recognition
- Stock price prediction
- Handwriting recognition
- Translation
- Speech recognition
- Self-driving cars
- What are the best ads to place on this website?
- Does my DNA correspond to Alzheimer's disease?

Course goals
By the end of the semester, you should be able to:
- Look at a problem and identify if ML is an appropriate solution
- If so, identify what types of algorithms might be applicable
- Apply those algorithms
In order to get there, you will need to:
- Do a lot of math (calculus, linear algebra, probability)
- Do a fair amount of programming
- Work hard

Topics covered
- Supervised learning: learning with a teacher
- Unsupervised learning: learning without a teacher
- Complex settings: learning in a complicated world
  - Time-series models
  - Structured prediction
  - Semi-supervised learning
  - Large-scale learning
Not a zoo tour! Not an introduction to tools!
You will learn how these techniques work and how to implement them.

Topics covered
- Decision trees
- Nearest neighbor classifier
- Perceptron
- Linear regression
- Logistic regression
- Support vector machines
- Dimensionality reduction
- Neural networks
- Deep learning
- Expectation maximization

Grading
- Homework assignments: 60%
  - Include MATLAB implementation
  - Should be on time
- Final project: 40%
  - Proposal
  - Presentation
  - Report
- Maybe a final exam, in which case homework (H), project (P), exam (E): 50%, 40%, 10%
- Total of 5 days of grace period for H and P

Textbook
- Main: "A Course in Machine Learning" by Hal Daumé III, http://ciml.info/
- Another: "Machine Learning: A Probabilistic Perspective" by Kevin Murphy

Who should take this course?
Is this the right course for you?
- Do you have all the prerequisites?
- Good math and programming background
Still not sure? Talk to me after class.

Now, on to some real content (but first, questions?)

Classification
- How would you write a program to distinguish a picture of me from a picture of someone else?
  - Provide example pictures of me and pictures of other people and let a classifier learn to distinguish the two.
- How would you write a program to determine whether a sentence is grammatical or not?
  - Provide examples of grammatical and ungrammatical sentences and let a classifier learn to distinguish the two.
- How would you write a program to distinguish cancerous cells from normal cells?
  - Provide examples of cancerous and normal cells and let a classifier learn to distinguish the two.

Example dataset
Example (weather prediction)
Three principal components:
1. Class label (aka label, denoted by y)
2. Features (aka attributes)
3. Feature values (aka attribute values, denoted by x)
Feature values can be binary, nominal, or continuous.
A labeled dataset is a collection of (x, y) pairs.
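One way to picture such a labeled dataset in code is sketched below in Python. The feature names and values are made up for illustration, since the original weather table is not reproduced in this transcript; only the (x, y) structure and the three kinds of feature values come from the slide.

```python
# Hypothetical weather-style dataset; names and values are illustrative only.
# Each example is an (x, y) pair: x maps feature names to feature values,
# and y is the class label.
dataset = [
    ({"outlook": "sunny", "windy": False, "temperature": 29.5}, "no"),
    ({"outlook": "rainy", "windy": True, "temperature": 18.0}, "no"),
    ({"outlook": "overcast", "windy": False, "temperature": 22.3}, "yes"),
]


def feature_kind(value):
    """Classify a feature value as binary, nominal, or continuous."""
    if isinstance(value, bool):
        return "binary"
    if isinstance(value, str):
        return "nominal"
    return "continuous"


first_x, _ = dataset[0]
features = sorted(first_x)         # feature (attribute) names
labels = [y for _, y in dataset]   # class labels, one per example
kinds = {name: feature_kind(first_x[name]) for name in features}
```

Here `kinds` records, for each assumed feature, which of the three value types it holds.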

Example dataset
Example (weather prediction)
Task: Predict the class of this test example
Requires us to generalize from the training data
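As a rough illustration of predicting the class of an unseen example, the Python sketch below uses a 1-nearest-neighbor rule, one of the methods listed among the course topics. The slide does not say which classifier is used, and the training data and feature values here are hypothetical.

```python
# Hypothetical training data: (feature values, class label) pairs.
train = [
    (("sunny", "hot", "windy"), "no"),
    (("sunny", "mild", "calm"), "yes"),
    (("rainy", "mild", "windy"), "no"),
    (("overcast", "hot", "calm"), "yes"),
]


def mismatches(a, b):
    """Count the feature positions where two examples disagree."""
    return sum(u != v for u, v in zip(a, b))


def predict(x, training_data):
    """Return the label of the training example closest to x."""
    _, label = min(training_data, key=lambda pair: mismatches(pair[0], x))
    return label


# The test example does not appear in the training data, so the
# classifier has to generalize from similar examples.
test_example = ("overcast", "mild", "calm")
prediction = predict(test_example, train)
```

The test example matches no training example exactly; the prediction comes from the most similar one, which is the simplest form of generalization.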

Classification

Example (face recognition)
What is a good representation for images? Pixel values? Edges?

Example (chair detection)

Ingredients for classification
Whole idea: Inject your knowledge into a learning system
Sources of knowledge:
1. Feature representation
  - Not typically a focus of machine learning
  - Typically seen as problem specific
  - However, it's hard to learn from bad representations
2. Training data: labeled examples
  - Often expensive to label lots of data
  - Sometimes data is available for free
3. Model
  - No single learning algorithm is always good ("no free lunch")
  - Different learning algorithms work with different ways of representing the learned classifier

Regression
Regression is like classification except the labels are real valued
Example applications:
- Stock value prediction
- Income prediction
- CPU power consumption
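To make the "real-valued labels" point concrete, here is a minimal Python sketch that fits a straight line by ordinary least squares. The choice of a linear model and the income figures are assumptions for illustration; the slide does not prescribe a particular regression method.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: years of experience (x) and income in thousands (y).
years = [1.0, 2.0, 3.0, 4.0, 5.0]
income = [32.0, 35.0, 41.0, 44.0, 48.0]

slope, intercept = fit_line(years, income)

# The prediction is a real number, not a class label.
predicted_income = slope * 6.0 + intercept
```

Unlike the classifiers above, the output for a new input is a number on a continuous scale.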

Structured prediction

Unsupervised learning: Clustering

Reinforcement learning
- Unlike classification, regression and unsupervised learning, RL does not receive examples
- Rather, it gathers experience by interacting with the world
- RL problems always include time as a variable
Example problems:
1. Chess, Go
2. Robot control
3. Taxi driving
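The interaction loop described here can be sketched in Python as follows. The `Corridor` world, its reward, and the random policy are all hypothetical and chosen only to show experience being gathered over time steps; no learning algorithm is implemented.

```python
import random


class Corridor:
    """Toy world: positions 0..4, start at 0, reward 1 on reaching position 4."""

    def __init__(self, length=5):
        self.goal = length - 1
        self.state = 0

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (next_state, reward, done)."""
        self.state = min(max(self.state + action, 0), self.goal)
        done = self.state == self.goal
        return self.state, (1.0 if done else 0.0), done


def run_episode(env, rng, max_steps=1000):
    """Interact with env using a random policy and record the experience."""
    experience = []
    state = env.state
    for t in range(max_steps):  # time is an explicit variable
        action = rng.choice([-1, 1])
        next_state, reward, done = env.step(action)
        experience.append((t, state, action, reward, next_state))
        state = next_state
        if done:
            break
    return experience


experience = run_episode(Corridor(), random.Random(0))
```

No labeled examples are handed to the agent; every tuple in `experience` is produced by acting in the world, and a learning method would use those tuples to improve the policy.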

Why do we care about math?!
- Calculus and linear algebra
  - Techniques for finding maxima/minima of functions
  - Convenient language for high dimensional data analysis
- Probability
  - The study of the outcomes of repeated experiments
  - The study of the plausibility of some event
- Statistics
  - The analysis and interpretation of data
  - Statistics makes heavy use of probability theory
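As one example of a calculus-based technique for finding a minimum, the Python sketch below runs plain gradient descent on a simple quadratic. The function, step size, and iteration count are assumptions picked for illustration and do not come from the slides.

```python
def gradient_descent(grad, x0, step_size=0.1, n_steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(n_steps):
        x -= step_size * grad(x)
    return x


def f(x):
    """Example objective with its minimum at x = 3."""
    return (x - 3.0) ** 2 + 1.0


def f_grad(x):
    """Derivative of f, obtained by calculus."""
    return 2.0 * (x - 3.0)


x_min = gradient_descent(f_grad, x0=0.0)
```

Each step moves `x` downhill along the derivative, so `x_min` approaches the minimizer of `f`. Many learning algorithms minimize a training loss in a similar way, over many dimensions at once, which is where linear algebra comes in.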

Why do we care about probability & statistics?
- Recall, statistics is the analysis and interpretation of data
- In machine learning, we attempt to generalize from one training data set to general rules that can be applied to test data
How is machine learning different from statistics?
1. Statistics cares about the model; we care about predictions
2. Statistics cares about model fit; we care about generalization
3. Statistics tries to explain the world; we try to predict the future

Slide credit
These slides are adapted from the machine learning courses taught by:
- Hal Daumé at University of Maryland, College Park
- Subhransu Maji at University of Massachusetts, Amherst