CS545 Machine Learning


Course Introduction

Machine learning and related fields
- Machine learning: the construction and study of systems that learn from data.
- Pattern recognition: the same field, different practitioners.
- Data mining: using algorithms (often ML) to discover patterns in data.
- Statistics and probability: a lot of algorithms have a probabilistic flavor.

Example problem: handwritten digit recognition

Tasks best solved by a learning algorithm
Recognizing patterns and anomalies:
- Face recognition
- Handwritten or spoken words
- Medical images
- Unusual credit card transactions
- Unusual patterns of sensor readings (in nuclear power plants or car engines)
- Stock prices

Examples of machine learning on the web
- Spam filtering, fraud detection: the enemy adapts, so we must adapt too.
- Recommendation systems (Amazon, Netflix): lots of noisy data. Million-dollar prize!
- Information retrieval: find documents or images with similar content.

Course Objectives
The machine learning toolbox:
- Formulating a problem as an ML problem
- Understanding a variety of ML algorithms
- Running and interpreting ML experiments
- Understanding what makes ML work: theory and practice

The Book
The textbook:

Grading
Assignments are 100% of the grade.
- Around 5 assignments, worth 80%
  - Combination of implementation, running ML experiments, and theory questions
- A project assignment, worth 20%
  - You choose what you want to work on!

Course staff
- Asa Ben-Hur (instructor)
- Navini Dantanarayana (TA)

Implementation: Python
Why Python for ML?
- A concise and intuitive language
- An interpreted language allows for interactive data analysis
- Simple, easy-to-learn syntax
- Highly readable, compact code
- Supports object-oriented and functional programming
- Libraries for plotting and vector/matrix computation
- Strong support for integration with other languages (C, C++, Java)
- Dynamic typing and garbage collection
- Cross-platform compatibility
- Free
- Language of choice for many ML researchers

Why I love Python
- I am more productive! Machine performance vs. programmer performance
- Makes programming fun!
Image from: ftp://www.mindview.net/pub/eckel/LovePython.zip

Which version?
2.x or 3.x? Stick with 2.x for now. Python 3 is a non-backward-compatible version that removes a few warts from the language.

Does anyone else use Python?
One of the three official languages at Google. Peter Norvig, Director of Research at Google: "Python has been an important part of Google since the beginning, and remains so as the system grows and evolves. Today dozens of Google engineers use Python, and we're looking for more people with skills in this language."

ML in Python
We will use PyML, which was written by the instructor.
- Available on SourceForge: http://pyml.sf.net
Also:
- NumPy: operations on arrays and matrices
- Matplotlib: plotting library

How will we learn Python?
- Overview of Python/PyML in lecture
- Course website has links to Python tutorials and other resources

Labeled data

E-mail   x1   x2   Spam?
1        1    1     1
2        1    0    -1
3        0    1    -1
4        0    0     1
5        0    0    -1

- x1 and x2 are two characteristics of e-mails (e.g. the presence of the word "Viagra"). These are called features.
- "Spam?" is the label associated with each e-mail.
- This is a binary classification problem.

Binary classification
[Figure: scatter plot of labeled data with two features (dimensions); axes x1 and x2]

ML tasks
- Classification: discrete/categorical labels
- Regression: continuous labels
- Clustering: no labels

Using ML to address a learning task
[Diagram with the components: Task, Domain objects, Features, Data, Model, Output, Training data, Learning problem, Learning algorithm]
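The labeled e-mail data above can be held in NumPy arrays, with one row per e-mail and one column per feature. This is only a minimal sketch: the names `X` and `y` are illustrative choices rather than anything from the slides or from PyML, and reading the label +1 as "spam" is an assumption.

```python
import numpy as np

# Feature matrix: one row per e-mail, columns are x1 and x2 (values from the table).
X = np.array([[1, 1],
              [1, 0],
              [0, 1],
              [0, 0],
              [0, 0]])

# Labels in the same row order as X; +1 is assumed to mean spam, -1 not spam.
y = np.array([1, -1, -1, 1, -1])

n_examples, n_features = X.shape
n_spam = int(np.sum(y == 1))

# E-mails 4 and 5 have identical features but different labels, so no
# classifier that sees only x1 and x2 can label all five e-mails correctly.
conflict = bool(np.array_equal(X[3], X[4]) and y[3] != y[4])

print(n_examples, n_features, n_spam, conflict)
```

Keeping features and labels in separate arrays mirrors the split between the inputs a model sees and the output it is asked to predict.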

Types of models
- Geometric
  - Ridge regression, SVM, perceptron
  [Figure: linear decision boundary with weight vector w; w^T x + b > 0 on one side, w^T x + b < 0 on the other]
- Distance-based
  - K-nearest neighbors
- Probabilistic
  - Naïve Bayes: P(Y = spam | Viagra, lottery)
- Logical models: tree/rule based
  - Decision trees
  [Figure: decision tree that tests 'Viagra' (=0 / =1) and 'lottery' (=0 / =1); leaves are ĉ(x) = spam, ĉ(x) = ham, ĉ(x) = spam]

Types of learning tasks
- Supervised learning
  - Learn to predict output given labeled examples
- Unsupervised learning
  - Data is unlabeled
  - Create an internal representation of the input, e.g. form clusters; extract features
  - Most big datasets do not come with labels
- Reinforcement learning
  - Learn action to maximize payoff
  - Not much information in a payoff signal
  - Payoff is often delayed
  - Not covered in this course.

ML in Practice
- Understanding the domain, and goals
- Creating features, data cleaning and preprocessing
- Learning models
- Interpreting results
- Consolidating and deploying discovered knowledge
- Loop

Human vs machine learning

Human                                                Machine
Observe someone, then repeat                         Supervised Learning
Keep trying until it works (riding a bike)           Reinforcement Learning
Memorize                                             k-nearest Neighbors
20 Questions                                         Decision Tree
A network of neurons with complex interconnections   Neural networks
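To make two of the model types listed above concrete, here is a minimal NumPy sketch of a geometric model (the sign of w^T x + b) and a distance-based model (k-nearest neighbors). The function names, the toy data, and the hand-picked values of `w` and `b` are all hypothetical; nothing here is learned from data or taken from PyML.

```python
import numpy as np

def linear_predict(X, w, b):
    """Geometric model: predict +1 where w^T x + b > 0, otherwise -1."""
    scores = X.dot(w) + b
    return np.where(scores > 0, 1, -1)

def knn_predict(X_train, y_train, x, k=3):
    """Distance-based model: majority vote among the k nearest training points."""
    distances = np.sqrt(((X_train - x) ** 2).sum(axis=1))
    nearest = np.argsort(distances)[:k]
    # Labels are +1/-1, so for odd k the sign of their sum is the majority vote.
    return 1 if y_train[nearest].sum() > 0 else -1

# Hypothetical toy data: two well-separated groups in the plane.
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [3.0, 3.0], [3.0, 4.0], [4.0, 3.0]])
y_train = np.array([-1, -1, -1, 1, 1, 1])

# Hypothetical hand-picked hyperplane x1 + x2 - 4 = 0 (not learned).
w = np.array([1.0, 1.0])
b = -4.0

linear_labels = linear_predict(X_train, w, b)
knn_label = knn_predict(X_train, y_train, np.array([3.5, 3.5]), k=3)
print(linear_labels, knn_label)
```

The linear model summarizes the training data in a few parameters (w and b), while k-nearest neighbors keeps the whole training set and defers all work to prediction time.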

Generalization
- The real aim of supervised learning is to do well on test data that is not known during training.
- We want the learning machine to model the true regularities in the data and to ignore the noise in the data.
  - But the learning machine does not know which regularities are real and which are accidental quirks of the particular set of training examples we happen to pick.
- So how can we be sure that the machine will generalize correctly to new data?

A simple example: Fitting a polynomial
- The green curve is the true function (which is not a polynomial).
- The data points have noise in y.
- The measure of error (loss function) is the squared error in the prediction of y(x) from x.
- The loss for the red polynomial is the sum of the squared vertical errors.

Which model is best?
[Figures: fitted curves, with panels labeled "underfitting" and "overfitting"]
Figures from: Pattern Recognition and Machine Learning by Christopher Bishop
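The sum-of-squared-errors loss from the polynomial example can be sketched with NumPy as follows. The true function sin(2πx), the noise level, the sample size, and the random seed are assumptions made for illustration; the slides only say that the true function is not a polynomial.

```python
import numpy as np

def sum_squared_error(coeffs, x, y):
    """Loss: sum of squared vertical distances between the polynomial and the data."""
    return float(np.sum((np.polyval(coeffs, x) - y) ** 2))

rng = np.random.RandomState(0)

# Assumed setup: 10 inputs in [0, 1], targets sin(2*pi*x) plus Gaussian noise in y.
x = np.linspace(0.0, 1.0, 10)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.shape)

# Least-squares polynomial fits of increasing degree.
train_loss = {}
for degree in (0, 1, 3):
    coeffs = np.polyfit(x, y, degree)
    train_loss[degree] = sum_squared_error(coeffs, x, y)

for degree in sorted(train_loss):
    print("degree {}: training loss {:.3f}".format(degree, train_loss[degree]))
```

Because each higher-degree polynomial can reproduce any lower-degree one, the training loss cannot go up as the degree increases. That is exactly why training loss alone cannot answer "which model is best?".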

Trading off goodness of fit against model complexity
- You can only expect a model to generalize well if it explains the data surprisingly well given the complexity of the model.
- If the model has as many degrees of freedom as the data, it can fit the data perfectly. But so what?

What we'll cover
- Supervised learning
  - Linear classifiers
  - Decision trees
  - Probabilistic classifiers
  - Neural networks
  - Support vector machines
  - Ensemble models
- Unsupervised learning
  - Clustering
  - Dimensionality reduction
- Running and interpreting ML experiments
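The "But so what?" point about degrees of freedom can be checked with a small experiment: a degree-9 polynomial has 10 coefficients, so it can pass through 10 training points, yet that says nothing about fresh data. The sketch below assumes a sin(2πx) ground truth, Gaussian noise, and a fixed random seed, none of which come from the slides; the exact numbers it prints depend on those choices.

```python
import numpy as np

def true_function(x):
    # Assumed ground truth, for illustration only.
    return np.sin(2 * np.pi * x)

def mean_squared_error(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

rng = np.random.RandomState(1)

# 10 noisy training points and 100 fresh noisy test points from the same source.
x_train = np.linspace(0.0, 1.0, 10)
y_train = true_function(x_train) + rng.normal(scale=0.3, size=x_train.shape)
x_test = rng.uniform(0.0, 1.0, size=100)
y_test = true_function(x_test) + rng.normal(scale=0.3, size=x_test.shape)

results = {}
for degree in (3, 9):
    # Degree 9 means 10 coefficients: as many as there are training points.
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (mean_squared_error(coeffs, x_train, y_train),
                       mean_squared_error(coeffs, x_test, y_test))

for degree in sorted(results):
    print("degree {}: train MSE {:.4f}, test MSE {:.4f}".format(degree, *results[degree]))
```

The degree-9 fit drives the training error to essentially zero. Comparing its test error with that of the degree-3 fit is the kind of held-out evaluation the course means by running and interpreting ML experiments.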