Course Overview and Introduction
CE-717: Machine Learning
Sharif University of Technology
M. Soleymani, Fall 2014

Course Info
- Instructor: Mahdieh Soleymani
- Email: soleymani@sharif.edu
- Lectures: Sun-Tue (13:30-15:00)
- Website: http://ce.sharif.edu/cources/93-94/1/ce717-2
- TAs: Hassan Hafez, Nooshin Maghsoudi, Amin Sabzmakan, Marzieh Gheisari

Textbooks
- Pattern Recognition and Machine Learning, C. Bishop, Springer, 2006.
- Additional readings will be made available when appropriate.
- Other books:
  - Machine Learning, T. Mitchell, McGraw-Hill, 1997.
  - Reinforcement Learning: An Introduction, R. S. Sutton and A. G. Barto, MIT Press, 1998.
  - The Elements of Statistical Learning, T. Hastie, R. Tibshirani, and J. Friedman, second edition, Springer, 2009.
  - Machine Learning: A Probabilistic Perspective, K. Murphy, MIT Press, 2012.

Marking Scheme
- Midterm exam: 25%
- Final exam: 35%
- Project: 10%
- Homework (written and programming): 20%
- Mini-exams: 10%

Machine Learning (ML) and Artificial Intelligence (AI)
- ML first appeared as a branch of AI.
- ML is now also a preferred approach in other subareas of AI:
  - Perception (computer vision, speech recognition, ...)
  - Robotics
  - Natural language processing (NLP)
- ML is a strong driver in computer vision and NLP.

ML Definition
- Tom Mitchell (1997): well-posed learning problem
  - "A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E."
- Using the observed data to make better decisions
- Generalizing from the observed data

ML Definition: Example
Consider an email program that learns to filter spam better according to the emails you do or do not mark as spam.
- T: classifying emails as spam or not spam.
- E: watching you label emails as spam or not spam.
- P: the number (or fraction) of emails correctly classified as spam/not spam.
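The performance measure P in the spam example can be made concrete with a few lines of Python. This is only an illustrative sketch: the function name `accuracy` and the two label lists are hypothetical and do not come from the slides.

```python
def accuracy(predicted, actual):
    """Performance measure P: fraction of emails classified correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical data: True means "spam", False means "not spam".
user_labels = [True, False, False, True, False]    # labels observed from the user (E)
filter_output = [True, False, True, True, False]   # the filter's decisions on task T

p = accuracy(filter_output, user_labels)
print(p)  # fraction of the five emails that the filter got right
```

In Mitchell's terms, the program "learns" if this fraction tends to go up as more user labels are observed.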

Some Learning Applications
- Face, speech, and handwritten character recognition
- Document classification and ranking
- Self-customizing programs (e.g., recommender systems)
- Database mining (e.g., medical records)
- Market prediction (e.g., stock/house prices)
- Computational biology (e.g., annotation of biological sequences)
- Autonomous vehicles

Handwritten Digit Recognition
[Figure: handwritten digits 0 1 2 3 4 5 6 7 8 9]

ML in Computer Science
Why are ML applications growing?
- Improved machine learning algorithms
- Availability of data (increased data capture, networking, etc.)
- Demand for self-customization to the user or environment
- Software too complex to write by hand

Paradigms of ML
- Supervised learning (regression, classification): predicting a target variable for which we get to see examples.
- Unsupervised learning: revealing structure in the observed data.
- Reinforcement learning: partial (indirect) feedback, no explicit guidance.
  - Given rewards for a sequence of moves, learn a policy and utility functions.
- Other paradigms: semi-supervised learning, active learning, etc.

Experience (E) in ML
The different paradigms of ML (previous slide) obtain experience in different ways.

Assumption
- Data are usually considered as vectors in a d-dimensional space.
- For now, we make this assumption for illustrative purposes.
- We will see that it is not necessary.

Supervised Learning
- Given: a training set, i.e., a labeled set of N input-output pairs $D = \{(x^{(i)}, y^{(i)})\}_{i=1}^{N}$
- Goal: learn a mapping from x to y

Supervised Learning: Example
[Figure: scatter plot of the training points in the (x_1, x_2) plane, with a query point marked "?"]

x_1    x_2    y
0.9    2.3    1
3.5    2.6    1
2.6    3.3    1
2.7    4.1    1
1.8    3.9    1
6.5    6.8   -1
7.2    7.5   -1
7.9    8.3   -1
6.9    8.3   -1
8.8    7.9   -1
9.1    6.2   -1
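To illustrate what "learning a mapping from x to y" could look like on the eleven training points of this example, here is a minimal nearest-class-mean classifier. The slides do not prescribe this method; it is chosen here only because it is short. The query points are hypothetical, since the location of the "?" point in the figure is not recoverable.

```python
# Training set from the example: (x_1, x_2, y)
data = [
    (0.9, 2.3, 1), (3.5, 2.6, 1), (2.6, 3.3, 1), (2.7, 4.1, 1), (1.8, 3.9, 1),
    (6.5, 6.8, -1), (7.2, 7.5, -1), (7.9, 8.3, -1), (6.9, 8.3, -1),
    (8.8, 7.9, -1), (9.1, 6.2, -1),
]


def class_means(samples):
    """Return the mean point of each class as {label: (mean_x1, mean_x2)}."""
    sums = {}
    for x1, x2, y in samples:
        s = sums.setdefault(y, [0.0, 0.0, 0])
        s[0] += x1
        s[1] += x2
        s[2] += 1
    return {y: (s[0] / s[2], s[1] / s[2]) for y, s in sums.items()}


def predict(means, x1, x2):
    """Assign the label whose class mean is closest in squared Euclidean distance."""
    return min(
        means,
        key=lambda y: (x1 - means[y][0]) ** 2 + (x2 - means[y][1]) ** 2,
    )


means = class_means(data)
print(predict(means, 3.0, 3.0))  # hypothetical query near the y = 1 group
print(predict(means, 8.0, 8.0))  # hypothetical query near the y = -1 group
```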

Sample Data in Supervised Learning
Supervised learning: the right answers (targets) are known for the training samples.
- Columns: features/attributes/dimensions
- Rows: data/instances/samples
- y column: target/label

              x_1   x_2   ...   x_d   y (target)
Sample 1
Sample 2
...
Sample n-1
Sample n

Evaluation: test data (target: ?)

Unsupervised Learning
- Given: a training set $\{x^{(i)}\}_{i=1}^{N}$
- Goal: find groups or structures in the data

Unsupervised Learning: Example
Clustering
[Figure: unlabeled points in the (x_1, x_2) plane grouped into clusters]
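The slide names clustering but no particular algorithm. As one common choice, the sketch below runs a few iterations of k-means on a small set of unlabeled 2-D points. The points and the initial centers are hypothetical and are picked only to make the two groups obvious.

```python
def kmeans(points, centers, iterations=10):
    """Plain k-means on 2-D points; returns (centers, clusters)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)),
                key=lambda k: (p[0] - centers[k][0]) ** 2
                + (p[1] - centers[k][1]) ** 2,
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its points.
        # An empty cluster keeps its previous center.
        centers = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centers[k]
            for k, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data and initial centers.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
print(clusters)
```

Unlike the supervised case, no target y appears anywhere: the groups are found from the inputs alone.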

Sample Data in Unsupervised Learning
- Columns: features/attributes/dimensions
- Rows: data/instances/samples

              x_1   x_2   ...   x_d
Sample 1
Sample 2
...
Sample n-1
Sample n

Unsupervised Learning: Example Applications
- Clustering documents based on their similarities
- Market segmentation: grouping customers into different market segments given a database of customer data
- Finding semantic relations between ontological concepts in the molecular biology domain

Supervised Learning: Regression vs. Classification
- Regression: predict a continuous target variable, e.g., $y \in [0, 1]$
- Classification: predict a discrete target variable, e.g., $y \in \{1, 2, \ldots, C\}$
- A core objective of learning is to generalize from experience.
  - Generalization: the ability of a learning algorithm to perform accurately on new, unseen examples after having experienced the training data.

Regression: Example
Housing price prediction
[Figure: scatter plot of price ($, in 1000s; axis from 0 to 400) against size (feet^2; axis from 0 to 2500)]
Figure adapted from slides of Andrew Ng
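A straight-line fit is the simplest way to sketch the housing example. The code below uses the closed-form least-squares solution for one input variable. The sizes and prices are hypothetical values invented to lie within the axis ranges of the figure; they are not the data behind the original plot.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: size in square feet, price in $1000s.
sizes = [500, 1000, 1500, 2000, 2500]
prices = [100, 150, 200, 250, 300]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 1750 + intercept  # prediction for an unseen size
print(slope, intercept, predicted_price)
```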

Classification: Example
Classifying an animal as cat or dog from its weight
[Figure: class label, 1 (Dog) or 0 (Cat), plotted against weight]
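One very simple classifier for this one-feature problem is a threshold on weight. The sketch below picks the threshold with the fewest training mistakes. It assumes that dogs are the heavier class, and both the weights (in kg) and the function names are hypothetical.

```python
def fit_threshold(weights, labels):
    """Choose the cut point with the fewest training errors.

    The rule is: predict 1 (dog) if weight > threshold, else 0 (cat).
    Candidate thresholds are midpoints between consecutive distinct weights.
    """
    distinct = sorted(set(weights))
    if len(distinct) < 2:
        raise ValueError("need at least two distinct weights")
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

    def errors(t):
        return sum((w > t) != (y == 1) for w, y in zip(weights, labels))

    return min(candidates, key=errors)


def classify(weight, threshold):
    """Return 1 (dog) above the threshold, 0 (cat) otherwise."""
    return 1 if weight > threshold else 0


# Hypothetical training data: weight in kg, label 0 = cat, 1 = dog.
weights = [3.0, 4.0, 4.5, 5.0, 12.0, 20.0, 25.0, 30.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

threshold = fit_threshold(weights, labels)
print(threshold, classify(4.2, threshold), classify(18.0, threshold))
```

Unlike the regression case, the output is one of a finite set of labels rather than a real number.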

Main Steps of (Supervised) Learning Tasks
- Hypothesis class or model specification
  - Which class of models (mappings) should we use for our data?
- Learning: find a mapping f (from the hypothesis class) based on the training set of examples
  - Which notion of error should we use? (loss functions)
  - Optimization of the loss function to find the mapping f
- Evaluation: how well f generalizes to yet unseen examples
  - How do we ensure that the error on future data is minimized? (generalization)
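The three steps above can be sketched end to end in a few lines. The choices here are assumptions made for illustration, not taken from the slides: the hypothesis class is the set of lines f(x) = w*x + b, the loss is mean squared error, the optimizer is plain gradient descent, and all data values are hypothetical.

```python
def predict(w, b, x):
    """Hypothesis class: linear functions f(x) = w * x + b."""
    return w * x + b


def mean_squared_error(w, b, xs, ys):
    """Loss: average squared difference between prediction and target."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.05, steps=2000):
    """Optimization: gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data that roughly follows y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [1.1, 2.9, 5.2, 6.8, 9.1]
test_x = [1.5, 2.5, 3.5]   # held out, never used for training
test_y = [4.0, 6.1, 7.9]

w, b = train(train_x, train_y)
print("training error:", mean_squared_error(w, b, train_x, train_y))
print("test error:", mean_squared_error(w, b, test_x, test_y))  # evaluation step
```

The test error is the quantity that speaks to generalization, because it is measured on examples the optimizer never saw.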

Main Topics of the Course
- Supervised learning
  - Regression
  - Classification (we focus on this topic and introduce many classification methods)
- Model evaluation and selection
- Learning theory
- Ensemble learning
- Unsupervised learning
  - Density estimation, unsupervised dimensionality reduction, and clustering
- Reinforcement learning
- Some advanced topics and applications