SB2b Statistical Machine Learning Hilary Term 2017


SB2b Statistical Machine Learning, Hilary Term 2017
Mihaela van der Schaar and Seth Flaxman
Guest lecturer: Yee Whye Teh
Department of Statistics, Oxford
Slides and other materials available at: http://www.oxford-man.ox.ac.uk/~mvanderschaar/home_page/course_ml.html

Administrative details Course Structure
MMath Part B & MSc in Applied Statistics
Lectures: Wednesdays 12:00-13:00, LG.01; Thursdays 16:00-17:00, LG.01.
MSc: 4 problem sheets, discussed at the classes: weeks 2, 4, 6, 7 (check website).
Part C: 4 problem sheets.
Class Tutors: Lloyd Elliott, Kevin Sharp, and Hyunjik Kim.
Please sign up for the classes on the sign-up sheet!

Administrative details Course Aims
1. Understand the statistical fundamentals of machine learning, with a focus on supervised learning (classification and regression) and empirical risk minimisation.
2. Understand the difference between generative and discriminative learning frameworks.
3. Learn to identify and use appropriate methods and models for given data and task.
4. Learn to use the relevant R or Python packages to analyse data, interpret results, and evaluate methods.

Administrative details Syllabus I
Part I: Introduction to supervised learning (4 lectures)
- Empirical risk minimization
- Bias/variance, Generalization, Overfitting, Cross validation
- Regularization
- Logistic regression
- Neural networks
Part II: Classification and regression (3 lectures)
- Generative vs. Discriminative models
- K-nearest neighbours, Maximum Likelihood Estimation, Mixture models
- Naive Bayes, Decision trees, CART
- Support Vector Machines
- Random forest, Bootstrap Aggregation (Bagging), Ensemble learning
- Expectation Maximization

Administrative details Syllabus II
Part III: Theoretical frameworks
- Statistical learning theory
- Decision theory
Part IV: Further topics
- Optimisation
- Hidden Markov Models
- Backward-forward algorithms
- Reinforcement learning

Overview Statistical Machine Learning What is Machine Learning?
http://gureckislab.org

Overview Statistical Machine Learning What is Machine Learning?
Arthur Samuel, 1959: "Field of study that gives computers the ability to learn without being explicitly programmed."
Tom Mitchell, 1997: "Any computer program that improves its performance at some task through experience."
Kevin Murphy, 2012: "To develop methods that can automatically detect patterns in data, and then to use the uncovered patterns to predict future data or other outcomes of interest."

Overview Statistical Machine Learning What is Machine Learning?
[Diagram: data leads to Information, Structure, Prediction, Decisions, Actions]
Larry Page about DeepMind's ML systems that can learn to play video games like humans

Overview Statistical Machine Learning What is Machine Learning?
[Diagram: Machine Learning at the centre, surrounded by related fields: statistics, business, finance, computer science, biology, genetics, cognitive science, psychology, physics, engineering, operations research, mathematics]

Overview Statistical Machine Learning What is Data Science?
Early years
John Tukey, The Future of Data Analysis, 1962:
"For a long time I have thought I was a statistician, interested in inferences from the particular to the general. But as I have watched mathematical statistics evolve, I have had cause to wonder and to doubt. ... All in all I have come to feel that my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data."
Four driving forces, according to Tukey:
- The formal theories of statistics
- Accelerating developments in computers...
- The challenge, in many fields, of more and ever larger bodies of data
- The emphasis on quantification in an ever wider variety of disciplines

Overview Statistical Machine Learning What is Data Science?
Bin Yu, Let us own Data Science, IMS Presidential Address, 2014:
- Statistics
- Domain/science knowledge
- Computing
- Collaboration/teamwork
- Communication to outsiders
David Donoho, 50 years of Data Science, 2015. "Greater Data Science":
- Data Exploration and Preparation
- Data Representation and Transformation
- Computing with Data
- Data Modeling
- Data Visualization and Presentation
- Science about Data Science

Overview Statistical Machine Learning Statistics vs Machine Learning
Traditional Problems in Applied Statistics
- Well formulated question that we would like to answer.
- Expensive data gathering and/or expensive computation.
- Create specially designed experiments to collect high quality data.
Information Revolution
- Improvements in data processing and data storage.
- Powerful, cheap, easy data capturing.
- Lots of (low quality) data with potentially valuable information inside.
CS and Stats forced back together: unified framework of data, inferences, procedures, algorithms
- statistics taking computation seriously
- computing taking statistical risk seriously
Michael I. Jordan: On the Computational and Statistical Interface and "Big Data"
Max Welling: Are Machine Learning and Statistics Complementary?

Overview Types of Machine Learning
Unsupervised learning
- Extract key features of the unlabelled data
- clustering, signal separation, density estimation
- Goal: representation, hypothesis generation, visualization
Supervised learning
- Data contains "labels": every example is an input-output pair
- classification, regression
- Goal: prediction on new examples

Overview Types of Machine Learning
Semi-supervised Learning: A database of examples, only a small subset of which are labelled.
Multi-task Learning: A database of examples, each of which has multiple labels corresponding to different prediction tasks.
Reinforcement Learning: An agent acting in an environment, given rewards for performing appropriate actions, learns to maximize its reward.

Overview Supervised Learning
Unsupervised learning: to extract structure and postulate hypotheses about the data-generating process from unlabelled observations $x_1, \ldots, x_n$. Visualize, summarize and compress data.
Supervised learning: in addition to the observations of $X$, we have access to their response variables / labels $Y \in \mathcal{Y}$: we observe $\{(x_i, y_i)\}_{i=1}^n$.
Types of supervised learning:
- Classification: discrete responses, e.g. $\mathcal{Y} = \{+1, -1\}$ or $\{1, \ldots, K\}$.
- Regression: a numerical value is observed and $\mathcal{Y} = \mathbb{R}$.
The goal is to accurately predict the response $Y$ on new observations of $X$, i.e., to learn a function $f : \mathbb{R}^p \to \mathcal{Y}$ such that $f(X)$ will be close to the true response $Y$.
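To make the idea of learning a function f from input-output pairs concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the toy data are invented, 1-nearest-neighbour stands in for classification and simple linear regression (p = 1) stands in for regression, and the helper names `fit_1nn` and `fit_simple_linear_regression` are hypothetical, not taken from the course.

```python
import math

def fit_1nn(xs, ys):
    """Classification: return f that predicts the label of the closest training point."""
    def f(x):
        nearest = min(range(len(xs)), key=lambda i: math.dist(xs[i], x))
        return ys[nearest]
    return f

def fit_simple_linear_regression(xs, ys):
    """Regression with p = 1: least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    def f(x):
        return intercept + slope * x
    return f

# Toy classification data (assumed): two points per class in R^2, labels in {+1, -1}.
X_cls = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0), (4.0, 3.0)]
y_cls = [-1, -1, +1, +1]
classifier = fit_1nn(X_cls, y_cls)

# Toy regression data (assumed): generated exactly from y = 1 + 2x.
X_reg = [0.0, 1.0, 2.0, 3.0]
y_reg = [1.0, 3.0, 5.0, 7.0]
regressor = fit_simple_linear_regression(X_reg, y_reg)

print(classifier((0.2, 0.1)), classifier((3.5, 3.2)), regressor(4.0))
```

In both cases the training pairs go in and a callable f comes out, which is then applied to new observations of X.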

Overview Supervised Learning Applications of Machine Learning
- spam filtering
- recommendation systems
- fraud detection
- self-driving cars
- image recognition
- stock market analysis
ImageNet: Computer Eyesight Gets a Lot More Accurate, Krizhevsky et al, 2012
New applications of ML: Machine Learning is Eating the World

Machine learning in practice Spam detection
- Observations $X$ are text documents.
- Labels $Y$ are spam $= +1$ and not spam $= -1$.
- How do we encode documents of different lengths as a vector $X \in \mathbb{R}^p$?
- Given a set of labelled documents $\{(x_i, y_i)\}_{i=1}^n$, how do we learn a function $f : \mathbb{R}^p \to \mathcal{Y}$?
Many answers to both questions will be covered in this course: logistic regression, naive Bayes, neural networks, Support Vector Machines, etc.
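One common answer to the encoding question is a bag-of-words representation, sketched below in plain Python. This is an assumption chosen for illustration rather than the encoding the course prescribes; the example documents, the whitespace tokenisation, and the helper names `build_vocabulary` and `bag_of_words` are all hypothetical.

```python
from collections import Counter

def build_vocabulary(documents):
    """Assign each distinct lowercase token a fixed column index 0..p-1."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: j for j, tok in enumerate(tokens)}

def bag_of_words(document, vocabulary):
    """Encode a document of any length as a length-p vector of token counts."""
    counts = Counter(document.lower().split())
    vector = [0] * len(vocabulary)
    for tok, count in counts.items():
        j = vocabulary.get(tok)
        if j is not None:  # tokens outside the vocabulary are ignored
            vector[j] = count
    return vector

# Toy labelled documents (assumed): +1 = spam, -1 = not spam.
docs = ["win money now", "meeting at noon", "win win prizes now"]
labels = [+1, -1, +1]

vocab = build_vocabulary(docs)
X = [bag_of_words(d, vocab) for d in docs]

for x, y in zip(X, labels):
    print(y, x)
```

Every document, whatever its length, becomes a vector with p = len(vocab) entries, so the pairs (X[i], labels[i]) can be passed to any classifier that expects fixed-length inputs.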

Machine learning in practice Image classification
- Observations $X$ are images.
- Labels $Y \in \{0, 1, \ldots, 9\}$.
- Learn a function $f : \mathbb{R}^p \to \mathcal{Y}$.

Machine learning in practice Face recognition
- Observations $X$ are images.
- Labels $Y$ are a very large set of people: {Queen Elizabeth, Bill Gates, Justin Trudeau, Leonardo DiCaprio, etc.}
- How do we encode images as vectors $X \in \mathbb{R}^p$?
- Given a set of labelled images $\{(x_i, y_i)\}_{i=1}^n$, how do we learn a function $f : \mathbb{R}^p \to \mathcal{Y}$?
- Is this fundamentally harder than, or different from, image classification?
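The simplest possible encoding of an image as a vector is to flatten its pixel grid, as in the sketch below. This is an illustrative assumption only: practical systems typically use learned or hand-designed features rather than raw pixels, and the tiny greyscale image and the helper name `image_to_vector` are hypothetical.

```python
def image_to_vector(image):
    """Flatten an h-by-w grid of grey levels (0..255) row by row
    into a vector of length p = h * w, scaled to [0, 1]."""
    return [pixel / 255.0 for row in image for pixel in row]

# A made-up 2 x 3 greyscale image.
image = [
    [0, 255, 0],
    [255, 255, 255],
]

x = image_to_vector(image)
print(len(x), x)
```

With this encoding p equals the number of pixels, so all images must first be resized to a common height and width.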

Machine learning in practice Face detection
[Figure with a colour scale from 0.0 to 0.9: Farfade, Saberian, and Li 2015, https://arxiv.org/pdf/1502.02766v3.pdf]
- Observations $X$ are images.
- What are the labels $Y$?
- How should our function $f$ work?
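One possible way to frame face detection, offered here only as a hedged sketch and not as the method of the cited paper, is to slide a fixed-size window over the image and apply a binary face / not-face scorer to each window. In the code below the scoring function `mean_brightness` is a hypothetical stand-in for a learned classifier, and the image, window size, stride and threshold are invented.

```python
def sliding_windows(height, width, size, stride):
    """Yield the (top, left) corner of every size-by-size window that fits."""
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield top, left

def detect(image, score_window, size, stride, threshold):
    """Score every window and keep those whose score reaches the threshold."""
    height, width = len(image), len(image[0])
    detections = []
    for top, left in sliding_windows(height, width, size, stride):
        window = [row[left:left + size] for row in image[top:top + size]]
        score = score_window(window)
        if score >= threshold:
            detections.append((top, left, score))
    return detections

def mean_brightness(window):
    """Hypothetical scorer: a real system would use a trained classifier here."""
    values = [p for row in window for p in row]
    return sum(values) / len(values)

# Made-up 4 x 4 image with a bright 2 x 2 patch in the middle.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

hits = detect(image, mean_brightness, size=2, stride=1, threshold=0.9)
print(hits)
```

Under this framing each window is one observation with a label in {face, not face}, and the per-window scores can be displayed as a heat map over the image.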

Machine learning in practice Machine translation
[Figure: Kyunghyun Cho, https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-gpus-part-3/]
- Observations $X$ are sentences in language A.
- Labels $Y$ are sentences in language B.
- How should we encode $X$ and $Y$ numerically?
- Is this regression or classification?
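A common starting point for encoding sentences numerically is to map each word to an integer index, so that a sentence becomes a variable-length sequence of indices. The sketch below assumes whitespace tokenisation and invented example sentences; the special tokens `<unk>` and `<eos>` and the helper names `build_index` and `encode` are illustrative choices, not taken from the slides.

```python
def build_index(sentences):
    """Map each distinct token to an integer; 0 and 1 are reserved."""
    index = {"<unk>": 0, "<eos>": 1}
    for sentence in sentences:
        for tok in sentence.lower().split():
            index.setdefault(tok, len(index))
    return index

def encode(sentence, index):
    """Turn a sentence into a list of token indices, ending with <eos>."""
    ids = [index.get(tok, index["<unk>"]) for tok in sentence.lower().split()]
    return ids + [index["<eos>"]]

# Made-up parallel sentences in language A and language B.
source = ["the cat sleeps", "the dog runs"]
target = ["le chat dort", "le chien court"]

src_index = build_index(source)
tgt_index = build_index(target)
pairs = [(encode(s, src_index), encode(t, tgt_index))
         for s, t in zip(source, target)]

print(pairs[0])
print(encode("the bird runs", src_index))  # "bird" was never seen
```

With this encoding, predicting each output word can be viewed as a classification problem over the target vocabulary, which is one way of approaching the regression-or-classification question.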

Machine learning in practice Speech recognition
[Figure: Dahl et al. 2012]

Machine learning in practice Self-driving cars
[Figure: network with 27 million connections and 250 thousand parameters, devblogs.nvidia.com/parallelforall/deep-learning-self-driving-cars/]

Machine learning in practice Product recommendation
- Fully observe all user interactions on a website (what pages they view, what items they buy, what reviews they leave, etc.)
- What products should be recommended to them? On which websites?
- How can you phrase this as supervised learning?
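One way to phrase recommendation as supervised learning, sketched here under assumptions, is to treat every (user, item) pair as an example whose features come from the interaction log and whose label records whether the user bought the item. The event log, the single view-count feature and the helper name `build_examples` are invented for illustration.

```python
def build_examples(events):
    """Turn (user, item, action) events into (features, label) pairs.
    Features: (user, item, number of views). Label: 1 if bought, else 0."""
    views = {}
    bought = set()
    for user, item, action in events:
        key = (user, item)
        views.setdefault(key, 0)
        if action == "view":
            views[key] += 1
        elif action == "buy":
            bought.add(key)
    return [((user, item, n_views), 1 if (user, item) in bought else 0)
            for (user, item), n_views in sorted(views.items())]

# Made-up interaction log.
events = [
    ("alice", "book", "view"),
    ("alice", "book", "buy"),
    ("alice", "lamp", "view"),
    ("bob", "lamp", "view"),
    ("bob", "lamp", "view"),
]

examples = build_examples(events)
for features, label in examples:
    print(features, label)
```

A classifier trained on such pairs can then score unseen (user, item) pairs, and the highest-scoring items become the recommendations. In a real system the features would need to be computed only from events that precede the purchase being predicted.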

Machine learning in practice Software
- R
- Python: scikit-learn, mlpy, Theano
- Weka, mlpack, Torch, Shogun, TensorFlow...
- Matlab/Octave

Machine learning in practice Software Machine learning advances in 2016 and challenges ahead
2016:
- Free/open source software for deep learning: TensorFlow (Google), CNTK (Microsoft), PaddlePaddle (Baidu), MXNet (Amazon)
- Audio generation
- Go
- Advances in machine translation (Google Translate)
2017 and beyond:
- Increasing concern about, and regulation of, algorithms
- Transparency / explainability in machine learning
- Effect of increasing automation of work on society
- Medical advances?