Introduction to Machine Learning for NLP I


1 Introduction to Machine Learning for NLP I Benjamin Roth CIS LMU München Benjamin Roth (CIS LMU München) Introduction to Machine Learning for NLP I 1 / 49
2 Outline: 1. This Course; 2. Overview; 3. Machine Learning (Definition, Data (Experience), Tasks, Performance Measures); 4. Linear Regression: Overview and Cost Function; 5. Summary
3 Course Overview
- Foundations of machine learning: loss functions, linear regression, logistic regression, gradient-based optimization, neural networks and backpropagation
- Deep learning tools in Python: Numpy, Theano, Keras, (some) Tensorflow?, (some) Pytorch?
- Applications: Word Embeddings, Sentiment Analysis, Relation Extraction, (some) Machine Translation?
- Practical projects (NLP related, to be agreed on during the course)

4 Lecture Times, Tutorials
- Course homepage: dlnlp.github.io
- The 9-11 slot is nominally for lectures and the other slot for tutorials, but we will not stick to that allocation.
- We will sometimes have longer Q&A-style/interactive tutorial sessions, sometimes more lectures (see next slide).
- Tutor: Simon Schäfer. He will discuss exercise sheets in the tutorials and help you with the projects.

5 Plan
Date | 9-11 slot | tutorial slot | Ex. sheet
10/18 | Overview / ML Intro I | ML Intro I | Linear algebra chapter
10/25 | Linear algebra Q&A / ML II | ML II | Probability chapter
11/1 | public holiday
11/8 | Probability Q&A / ML III | Numpy | Numpy
11/15 | ML IV / Theano Intro | Convolution | Theano I
11/22 | Embeddings / CNNs & RNNs for NLP | Numpy Q&A | Read LSTM/RNN
11/29 | LSTM (reading group) | Theano I Q&A | Theano II
12/6 | Keras | Keras | Keras
12/13 | DL for Relation Prediction | Theano II Q&A | Relation Prediction
12/20 | Word Vectors | Project Topics | Project Assignments
1/10 | Keras Q&A, Rel.Extr. Q&A | Tensorflow
1/17 | optimization methods / pytorch | Help with projects
1/24 | Other Work at CIS / LMU, Neural MT | Help with projects
1/31 | Project presentations | presentations
2/7 | Project presentations | presentations

6 Formalities
- This class is graded by a project.
- The project grade is the average of: the grade of the code written for the project, the grade of the project documentation / mini-report, and the grade of the presentation about your project.
- You have to pass all three elements in order to pass the course.
- Bonus points: the grade can be improved by up to 0.5 absolute grades through the exercise sheets before New Year.
- Formula:
  $g_{\text{project}} = \frac{g_{\text{project-code}} + g_{\text{project-report}} + g_{\text{project-presentation}}}{3}$
  $g_{\text{final}} = \text{round}(g_{\text{project}} - 0.5\,x)$
  where $x$ is the fraction of points reached in the exercises (between 0 and 1), and round selects the closest value of 1; 1.3; 1.7; 2; ...; 3.7; 4.

7 Exercise sheets, Projects, Presentations
- 6 ECTS, 14 weeks: average workload 13 hrs / week (3 in class, 10 at home).
- In the first weeks, spend enough time to read and prepare so that you are not lost later.
- From mid-November to mid-December: programming assignments. Coding takes time, and can be frustrating (but rewarding)!
- Exercise sheets: work on non-programming exercise sheets individually; for exercise sheets that contain programming parts, submit in teams of 2 or 3.
- Projects: a list of topics will be proposed by me (implement a deep learning technique applied to information extraction or another NLP task). Own ideas are also possible but need to be discussed with me. Work in groups of two or three.
- Project report: 3 pages / team member.

8 Good project code ...
- ... shows that you master the techniques taught in the lectures and exercises.
- ... shows that you can make your own decisions, e.g. adapt model / task / training data etc. if necessary.
- ... is well-structured and easy to understand (telling variable names, meaningful modularization; avoid code duplication and dead code).
- ... is correct (especially: train/dev/test splits, evaluation).
- ... is within the scope of this lecture (time-wise should not exceed 5 10h).

9 A good project presentation ...
- ... is short (10 min. p.p min. Q&A per team).
- ... similar to the report, contains the problem statement, motivation, model, and results.
- ... is targeted to your fellow students, who do not know details beforehand.
- ... contains interesting stuff: unexpected observations? conclusions / recommendations? did you deviate from some common practice?
- ... demonstrates that all team members worked together on the project.
- Possible outline: Background / Motivation; Formal characterization of techniques used; Technical Approach and Difficulties; Experiments, Results and Interpretation.

10 A good project report ...
- ... is concise (3 pages / person) and clear.
- ... motivates and describes the model that you have implemented and the results that you have obtained.
- ... shows that you can correctly describe the concepts taught in this class.
- ... contains interesting stuff: unexpected observations? conclusions / recommendations? did you deviate from some common practice?

12 Machine Learning
- Machine learning for natural language processing: Why? Advantages and disadvantages compared to alternatives?
- Criteria: accuracy; coverage; resources required (data, expertise, human labour); reliability/robustness; explainability.
- [Figure: grammar rules such as VP → V NP and NP → Det NN]

13 Deep Learning
- Learn complex functions that are (recursively) composed of simpler functions.
- Many parameters have to be estimated.

14 Deep Learning
- Main advantage: feature learning.
- Models learn to capture the most essential properties of the data (according to some performance measure) as intermediate representations.
- No need to handcraft feature extraction algorithms.

15 Neural Networks
- First training methods for deep nonlinear NNs appeared in the 1960s (Ivakhnenko and others).
- Increasing interest in NN technology (again) since around 5 years ago ("Neural Network Renaissance"): orders of magnitude more data and faster computers now.
- Many successes: image recognition and captioning; speech recognition; NLP and machine translation (demo of Bahdanau / Cho / Bengio system); game playing (AlphaGo); ...

16 Machine Learning
- Deep Learning builds on general Machine Learning concepts: $\operatorname{argmin}_{\theta \in H} \sum_{i=1}^{m} L(f(x_i; \theta), y_i)$
- Fitting data vs. generalizing from data.
- [Figure: three plots of prediction against feature]

19 A Definition
"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E." (Mitchell 1997)
- Learning: attaining the ability to perform a task.
- A set of examples ("experience") represents a more general task.
- Examples are described by features: sets of numerical properties that can be represented as vectors $x \in \mathbb{R}^n$.

21 Data
"A computer program is said to learn from experience E [...], if its performance [...] improves with experience E."
- Dataset: collection of examples.
- Design matrix $X \in \mathbb{R}^{n \times m}$, where n is the number of examples and m is the number of features.
- Example: $X_{i,j}$ is the count of feature j (e.g. a stem form) in document i.
- Unsupervised learning: model X, or find interesting properties of X. Training data: only X.
- Supervised learning: predict specific additional properties from X. Training data: label vector $y \in \mathbb{R}^n$ together with X.

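The design matrix described on this slide can be sketched in a few lines of plain Python. The toy documents, the helper name `build_design_matrix`, and the use of a sorted vocabulary as the feature order are illustrative assumptions, not part of the lecture.

```python
from collections import Counter


def build_design_matrix(documents):
    """Return (X, vocabulary) where X[i][j] counts vocabulary[j] in documents[i]."""
    # Assumption: one feature (column) per distinct token, in sorted order.
    vocabulary = sorted({token for doc in documents for token in doc})
    X = []
    for doc in documents:
        counts = Counter(doc)
        # Counter returns 0 for tokens that do not occur in this document.
        X.append([counts[token] for token in vocabulary])
    return X, vocabulary


# Hypothetical toy corpus (already tokenized and stemmed).
docs = [["cheap", "pill", "cheap"], ["meet", "tomorrow"], ["cheap", "meet"]]
X, vocab = build_design_matrix(docs)
```

Each row of `X` is one example and each column one feature, so `X` here has 3 rows and 4 columns.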
22 Data
- Low training error does not mean good generalization. The algorithm may overfit.
- [Figure: two plots of prediction against feature]

23 Data Splits
- Best practice: split data into training, cross-validation and test set. ("Cross-validation set" = "development set".)
- Optimize low-level parameters (feature weights...) on the training set.
- Select models and hyperparameters on the cross-validation set (type of machine learning model, number of features, regularization, priors).
- It is possible to overfit both in the training and in the model selection stage!
- Report the final score on the test set only after the model has been selected!
- Don't report the error on the training or cross-validation set as your model performance!

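A minimal sketch of such a split, assuming a simple random shuffle with a fixed seed and 80/10/10 proportions. The function name `split_data` and the fractions are illustrative choices, not prescribed by the lecture.

```python
import random


def split_data(examples, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle once, then cut into disjoint train / dev / test lists."""
    shuffled = list(examples)
    # A fixed seed makes the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_dev = int(len(shuffled) * dev_fraction)
    test = shuffled[:n_test]
    dev = shuffled[n_test:n_test + n_dev]
    train = shuffled[n_test + n_dev:]
    return train, dev, test


# Hypothetical data set of 100 example ids.
train, dev, test = split_data(range(100))
```

Parameters would then be fitted on `train`, models and hyperparameters chosen on `dev`, and `test` touched only once for the final score.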
25 Machine Learning Tasks
"A computer program is said to learn [...] with respect to some class of tasks T [...] if its performance at tasks in T [...] improves [...]"
Types of tasks: classification, regression, structured prediction, anomaly detection, synthesis and sampling, imputation of missing values, denoising, clustering, reinforcement learning, ...

26 Machine Learning Tasks: Typical Examples & Examples from Recent NLP Research
What are the most important conferences relevant to the intersection of ML and NLP?

27 Task: Classification
- Which of k classes does an example belong to? $f: \mathbb{R}^n \to \{1, \dots, k\}$
- Typical example: categorize image patches. Feature vector: color intensities for each pixel; derived features. Output categories: predefined set of labels.
- Typical example: spam classification. Feature vector: high-dimensional, sparse vector; each dimension indicates the occurrence of a particular word, or other specific information. Output categories: spam vs. ham.

28 Task: Classification
EMNLP 2017: Given a person name in a sentence that contains keywords related to police ("officer", "police", ...) and to killing ("killed", "shot"), was the person a civilian killed by police?

29 Task: Regression
- Predict a numerical value given some input. $f: \mathbb{R}^n \to \mathbb{R}$
- Typical examples: predict the risk of an insurance customer; predict the value of a stock.

30 Task: Regression
ACL 2017: Given a response in a multi-turn dialogue, predict how natural the response is (on a scale from 1 to 5).

31 Task: Structured Prediction
- Predict a multi-valued output with special inter-dependencies and constraints.
- Typical examples: part-of-speech tagging, syntactic parsing, protein folding.
- Often involves search and problem-specific algorithms.

32 Task: Structured Prediction
ACL 2017: Jointly find all relations of interest in a sentence by tagging arguments and combining them.

33 Task: Reinforcement Learning
- In reinforcement learning, the model (also called agent) needs to select a series of actions, but only observes the outcome (reward) at the end.
- The goal is to predict actions that will maximize the outcome.
- EMNLP 2017: The computer negotiates with humans in natural language in order to maximize its points in a game.

34 Task: Anomaly Detection
- Detect atypical items or events.
- Common approach: estimate density and identify items that have low probability.
- Examples: quality assurance; detection of criminal activity.
- Often items categorized as outliers are sent to humans for further scrutiny.

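The "estimate density, flag low-probability items" approach can be sketched as follows. This assumes a single numeric feature and a Gaussian density, which is only one of many possible density models; the measurements, the threshold value, and the function names are hypothetical.

```python
import math
import statistics


def gaussian_log_density(x, mean, stdev):
    """Log of the univariate normal density at x."""
    z = (x - mean) / stdev
    return -0.5 * z * z - math.log(stdev * math.sqrt(2 * math.pi))


def find_anomalies(reference, candidates, threshold):
    """Return candidates whose log density under the fitted Gaussian is below threshold."""
    # Density estimation step: fit mean and standard deviation on reference data.
    mean = statistics.mean(reference)
    stdev = statistics.stdev(reference)
    return [x for x in candidates if gaussian_log_density(x, mean, stdev) < threshold]


# Hypothetical measurements from a production line (arbitrary units).
normal_items = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3]
new_items = [10.1, 9.9, 12.5]
flagged = find_anomalies(normal_items, new_items, threshold=-5.0)
```

In practice the flagged items would be passed on for human inspection, and the threshold would be tuned on held-out data.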
35 Task: Anomaly Detection
ACL 2017: Schizophrenia patients can be detected by their non-standard use of metaphors, and more extreme sentiment expressions.

36 Supervised and Unsupervised Learning
- Unsupervised learning: learn interesting properties, such as the probability distribution $p(x)$.
- Supervised learning: learn a mapping from $x$ to $y$, typically by estimating $p(y \mid x)$.
- Supervised learning in an unsupervised way: $p(y \mid x) = \frac{p(x, y)}{\sum_{y'} p(x, y')}$

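As a small illustration of the last formula, the sketch below derives $p(y \mid x)$ from a joint table $p(x, y)$ by normalizing over all labels. The joint probabilities and the function name `conditional_from_joint` are made up for the example.

```python
def conditional_from_joint(joint, x):
    """Compute p(y | x) = p(x, y) / sum over y' of p(x, y') from a joint table."""
    # joint maps (x, y) pairs to probabilities p(x, y).
    normalizer = sum(p for (xi, _), p in joint.items() if xi == x)
    return {y: p / normalizer for (xi, y), p in joint.items() if xi == x}


# Hypothetical joint distribution over a word feature x and a label y.
joint = {
    ("offer", "spam"): 0.30,
    ("offer", "ham"): 0.10,
    ("meeting", "spam"): 0.05,
    ("meeting", "ham"): 0.55,
}
posterior = conditional_from_joint(joint, "offer")
```

Here `posterior["spam"]` is 0.30 / (0.30 + 0.10), up to floating-point rounding.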
38 Performance Measures
"A computer program is said to learn [...] with respect to some [...] performance measure P, if its performance [...] as measured by P, improves [...]"
- Quantitative measure of algorithm performance.
- Task-specific.

42 Discrete Loss Functions
- Can be used to measure classification performance.
- Not applicable to measure density estimation or regression performance.
- Accuracy: proportion of examples for which the model produces the correct output.
- 0-1 loss = error rate = 1 - accuracy.
- Accuracy may be inappropriate for skewed label distributions, where the relevant category is rare.
- $F_1\text{-score} = \frac{2 \cdot \text{Prec} \cdot \text{Rec}}{\text{Prec} + \text{Rec}}$

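The following sketch contrasts accuracy and F1-score on a skewed label distribution. The data (95 negative and 5 positive examples) and the function names are illustrative assumptions; returning 0.0 when there are no true positives is a common convention, not something the slides specify.

```python
def accuracy(gold, predicted):
    """Proportion of examples for which the prediction equals the gold label."""
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def f1_score(gold, predicted, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        # Convention (assumption): F1 is 0 when nothing is correctly retrieved.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical skewed data: the relevant class (1) is rare.
gold = [0] * 95 + [1] * 5
always_negative = [0] * 100
some_hits = [0] * 94 + [1] + [1, 1, 1, 0, 0]
```

A classifier that always predicts the majority class reaches high accuracy on this data while its F1-score for the rare class is zero, which is the problem the slide points at.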
43 Discrete vs. Continuous Loss Functions
- Discrete loss functions cannot indicate how wrong a wrong decision for one example is.
- Continuous loss functions ...
  - ... are more widely applicable.
  - ... are often easier to optimize (differentiable).
  - ... can also be applied to discrete tasks (classification).
- Sometimes algorithms are optimized using one loss (e.g. hinge loss) and evaluated using another (e.g. F1-score).

44 Examples for Continuous Loss Functions
- Density estimation: log probability of example.
- Regression: squared error.
- Classification: the loss $L(y_i \cdot f(x_i))$ is a function of label × prediction, with label $y_i \in \{-1, 1\}$ and prediction $f(x_i) \in \mathbb{R}$.
  - Correct prediction: $y_i \cdot f(x_i) > 0$
  - Wrong prediction: $y_i \cdot f(x_i) \le 0$
  - zero-one loss, hinge loss, logistic loss, ...
- Loss on the data set is the sum of per-example losses.

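The classification losses named on this slide can all be written as functions of the margin $y_i \cdot f(x_i)$. The sketch below uses the standard textbook definitions of hinge and logistic loss; the labels, scores, and function names are illustrative assumptions.

```python
import math


def zero_one_loss(margin):
    """1 for a wrong prediction (margin <= 0), 0 for a correct one."""
    return 0.0 if margin > 0 else 1.0


def hinge_loss(margin):
    """Zero once the margin reaches 1, growing linearly below that."""
    return max(0.0, 1.0 - margin)


def logistic_loss(margin):
    """Smooth, differentiable loss that decreases with the margin."""
    return math.log(1.0 + math.exp(-margin))


def dataset_loss(loss, labels, scores):
    """Sum of per-example losses, with labels in {-1, 1} and real-valued scores."""
    return sum(loss(y * s) for y, s in zip(labels, scores))


# Hypothetical labels and model scores f(x_i).
labels = [1, -1, 1]
scores = [2.0, 0.5, 0.3]
```

With these values the margins are 2.0, -0.5 and 0.3: the zero-one loss only registers the one wrong decision, while hinge and logistic loss also penalize the correct but low-margin third example.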
46 Linear Regression
For one instance:
- Input: vector $x \in \mathbb{R}^n$
- Output: scalar $y \in \mathbb{R}$ (actual output: $y$; predicted output: $\hat{y}$)
- Linear function: $\hat{y} = w^T x = \sum_{j=1}^{n} w_j x_j$

47 Linear Regression
- Linear function: $\hat{y} = w^T x = \sum_{j=1}^{n} w_j x_j$
- Parameter vector $w \in \mathbb{R}^n$
- Weight $w_j$ decides if the value of feature j increases or decreases the prediction $\hat{y}$.

48 Linear Regression
- For the whole data set: use matrix $X$ and vector $y$ to stack instances on top of each other. Typically the first column contains all 1s for the intercept (bias, shift) term:
  $X = \begin{pmatrix} 1 & x_{12} & x_{13} & \dots & x_{1n} \\ 1 & x_{22} & x_{23} & \dots & x_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_{m2} & x_{m3} & \dots & x_{mn} \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_m \end{pmatrix}$
- For the entire data set, predictions are stacked on top of each other: $\hat{y} = Xw$
- Estimate parameters using $X^{(\text{train})}$ and $y^{(\text{train})}$.
- Make high-level decisions (which features...) using $X^{(\text{dev})}$ and $y^{(\text{dev})}$.
- Evaluate the resulting model using $X^{(\text{test})}$ and $y^{(\text{test})}$.

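A minimal sketch of the stacked prediction $\hat{y} = Xw$ in plain Python, with a leading column of ones for the intercept. The feature values, the weights, and the helper names `add_bias_column` and `predict` are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def add_bias_column(features):
    """Prepend a constant 1 to every row so that w[0] acts as the intercept."""
    return [[1.0] + list(row) for row in features]


def predict(X, w):
    """Compute y_hat = X w, one dot product per row of X."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, row)) for row in X]


# Hypothetical data: one feature per instance, and hand-picked weights.
X = add_bias_column([[50.0], [80.0], [120.0]])
w = [100.0, 5.0]
y_hat = predict(X, w)
```

Each prediction is `w[0] + w[1] * feature`, so the first instance gets 100 + 5 * 50 = 350.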
49 Simple Example: Housing Prices
- Predict Munich property prices (in 1K Euros) from just one feature: square meters of property.
- $X$ holds a column of ones and the square meters of each property; $y$ holds the prices (numeric entries not recoverable).
- Prediction is $\hat{y} = Xw$, i.e. $\hat{y}_i = w_1 + w_2 x_{i2}$ for each property i.
- $w_1$ will contain costs incurred in any property acquisition.
- $w_2$ will contain the remaining average price per square meter.
- Optimal parameters $w$ and the resulting $\hat{y}$ for the above case: (numeric entries not recoverable).

50 Linear Regression: Mean Squared Error
- The mean squared error of a training (or test) data set is the average of the squared differences between the predictions and labels of all m instances:
  $\text{MSE}^{(\text{train})} = \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}_i^{(\text{train})} - y_i^{(\text{train})}\right)^2$
- In matrix notation:
  $\text{MSE}^{(\text{train})} = \frac{1}{m} \left\| \hat{y}^{(\text{train})} - y^{(\text{train})} \right\|_2^2 = \frac{1}{m} \left\| X^{(\text{train})} w - y^{(\text{train})} \right\|_2^2$

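The mean squared error can be computed directly from its definition. In the sketch below the data and both weight vectors are hypothetical, and `mean_squared_error` is an illustrative helper name rather than a library function.

```python
def mean_squared_error(X, w, y):
    """Average squared difference between predictions X w and labels y."""
    m = len(y)
    residuals = [
        sum(w_j * x_j for w_j, x_j in zip(w, row)) - y_i
        for row, y_i in zip(X, y)
    ]
    return sum(r * r for r in residuals) / m


# Hypothetical data: bias column plus one feature, and prices as labels.
X_train = [[1.0, 50.0], [1.0, 80.0], [1.0, 120.0]]
y_train = [340.0, 510.0, 700.0]

mse_guess = mean_squared_error(X_train, [100.0, 5.0], y_train)
mse_zero = mean_squared_error(X_train, [0.0, 0.0], y_train)
```

With weights `[100.0, 5.0]` the residuals are 10, -10 and 0, so the MSE is 200 / 3. Comparing such values across candidate weight vectors is what parameter estimation will minimize.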
52 Summary
- Deep learning: many successes in recent years; feature learning instead of feature engineering; builds on general machine learning concepts.
- Machine learning definition: data, task, cost function.
- Machine learning tasks: classification, regression, ...
- Linear regression: output depends linearly on input; cost function: mean squared error.
- Next up: estimating the parameters.
More informationDynamic Memory Networks for Question Answering
Dynamic Memory Networks for Question Answering Arushi Raghuvanshi Department of Computer Science Stanford University arushi@stanford.edu Patrick Chase Department of Computer Science Stanford University
More informationJeff Howbert Introduction to Machine Learning Winter
Classification Ensemble e Methods 1 Jeff Howbert Introduction to Machine Learning Winter 2012 1 Ensemble methods Basic idea of ensemble methods: Combining predictions from competing models often gives
More informationExploration vs. Exploitation. CS 473: Artificial Intelligence Reinforcement Learning II. How to Explore? Exploration Functions
CS 473: Artificial Intelligence Reinforcement Learning II Exploration vs. Exploitation Dieter Fox / University of Washington [Most slides were taken from Dan Klein and Pieter Abbeel / CS188 Intro to AI
More informationPrinciples of Machine Learning
Principles of Machine Learning Lab 5  OptimizationBased Machine Learning Models Overview In this lab you will explore the use of optimizationbased machine learning models. Optimizationbased models
More informationKnowledge Representation and Reasoning with Deep Neural Networks. Arvind Neelakantan
Knowledge Representation and Reasoning with Deep Neural Networks Arvind Neelakantan UMass Amherst: David Belanger, Rajarshi Das, Andrew McCallum and Benjamin Roth Google Brain: Martin Abadi, Dario Amodei,
More informationTwo Ideas For Structured Data: Reward augmented maximum likelihood Order matters. Samy Bengio, and the Brain team
Two Ideas For Structured Data: Reward augmented maximum likelihood Order matters Samy Bengio, and the Brain team Reward augmented maximum likelihood for neural structured prediction Mohammad Norouzi, Samy
More informationCOMS 4771 Introduction to Machine Learning. Nakul Verma
COMS 4771 Introduction to Machine Learning Nakul Verma Machine learning: what? Study of making machines learn a concept without having to explicitly program it. Constructing algorithms that can: learn
More informationCOMP 551 Applied Machine Learning Lecture 12: Ensemble learning
COMP 551 Applied Machine Learning Lecture 12: Ensemble learning Associate Instructor: Herke van Hoof (herke.vanhoof@mcgill.ca) Slides mostly by: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551
More informationModelling Time Series Data with Theano. Charles Killam, LP.D. Certified Instructor, NVIDIA Deep Learning Institute NVIDIA Corporation
Modelling Time Series Data with Theano Charles Killam, LP.D. Certified Instructor, NVIDIA Deep Learning Institute NVIDIA Corporation 1 DEEP LEARNING INSTITUTE DLI Mission Helping people solve challenging
More informationIntroduction to Machine Learning
Introduction to Machine Learning Hamed Pirsiavash CMSC 678 http://www.csee.umbc.edu/~hpirsiav/courses/ml_fall17 The slides are closely adapted from Subhransu Maji s slides Course background What is the
More informationLecture 6: Course Project Introduction and Deep Learning Preliminaries
CS 224S / LINGUIST 285 Spoken Language Processing Andrew Maas Stanford University Spring 2017 Lecture 6: Course Project Introduction and Deep Learning Preliminaries Outline for Today Course projects What
More informationAn Introduction to Deep Learning. Labeeb Khan
An Introduction to Deep Learning Labeeb Khan Special Thanks: Lukas Masuch @lukasmasuch +lukasmasuch Lead Software Engineer: Machine Intelligence, SAP The Big Players Companies The Big Players Startups
More informationCOMP 527: Data Mining and Visualization. Danushka Bollegala
COMP 527: Data Mining and Visualization Danushka Bollegala Introductions Lecturer: Danushka Bollegala Office: 2.24 Ashton Building (Second Floor) Email: danushka@liverpool.ac.uk Personal web: http://danushka.net/
More informationSanta Monica College Spring 2016 Department of Mathematics MATH 54(#2730) Elementary Statistics Friday, 8:00am 12:05pm, Room MC74
Santa Monica College Spring 2016 Department of Mathematics MATH 54(#2730) Elementary Statistics Friday, 8:00am 12:05pm, Room MC74 Instructor: Melanie Xie Office Hours: Friday, 7:00 am 7:55am, Room MC74
More informationCS 2750: Machine Learning. Other Topics. Prof. Adriana Kovashka University of Pittsburgh April 13, 2017
CS 2750: Machine Learning Other Topics Prof. Adriana Kovashka University of Pittsburgh April 13, 2017 Plan for last lecture Overview of other topics and applications Reinforcement learning Active learning
More informationStay Alert!: Creating a Classifier to Predict Driver Alertness in Realtime
Stay Alert!: Creating a Classifier to Predict Driver Alertness in Realtime Aditya Sarkar, Julien KawawaBeaudan, Quentin Perrot Friday, December 11, 2014 1 Problem Definition Driving while drowsy inevitably
More informationCS Machine Learning
CS 478  Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationAzure Machine Learning. Designing Iris MultiClass Classifier
Media Partners Azure Machine Learning Designing Iris MultiClass Classifier Marcin Szeliga 20 years of experience with SQL Server Trainer & data platform architect Books & articles writer Speaker at numerous
More information10707 Deep Learning. Russ Salakhutdinov. Language Modeling. h0p://www.cs.cmu.edu/~rsalakhu/10707/ Machine Learning Department
10707 Deep Learning Russ Salakhutdinov Machine Learning Department rsalakhu@cs.cmu.edu h0p://www.cs.cmu.edu/~rsalakhu/10707/ Language Modeling Neural Networks Online Course Disclaimer: Some of the material
More informationSurvey Analysis of Machine Learning Methods for Natural Language Processing for MBTI Personality Type Prediction
Survey Analysis of Machine Learning Methods for Natural Language Processing for MBTI Personality Type Prediction Brandon Cui (bcui19@stanford.edu) 1 Calvin Qi (calvinqi@stanford.edu) 2 Abstract We studied
More informationDeep Learning Fun with TensorFlow. Martin Andrews Red Cat Labs
Deep Learning Fun with TensorFlow Martin Andrews Red Cat Labs Outline About me + Singapore community + Workshops Something inthenews : Actual talk content Including lots of code (show of hands?) Deep
More informationE9 205 Machine Learning for Signal Processing
E9 205 Machine Learning for Signal Processing Introduction to Machine Learning of Sensory Signals 14082017 Instructor  Sriram Ganapathy (sriram@ee.iisc.ernet.in) Teaching Assistant  Aravind Illa (aravindece77@gmail.com).
More informationEra of AI (Deep Learning) and harnessing its true potential
Era of AI (Deep Learning) and harnessing its true potential Artificial Intelligence (AI) AI Augments our brain with infallible memories and infallible calculators Humans and Computers have become a tightly
More informationArtificial Neural Networks
Artificial Neural Networks Outline Introduction to Neural Network Introduction to Artificial Neural Network Properties of Artificial Neural Network Applications of Artificial Neural Network Demo Neural
More informationDeep Learning and its application to CV and NLP. Fei Yan University of Surrey June 29, 2016 Edinburgh
Deep Learning and its application to CV and NLP Fei Yan University of Surrey June 29, 2016 Edinburgh Overview Machine learning Motivation: why go deep Feedforward networks: CNN Recurrent networks: LSTM
More informationMachine Learning for NLP
Natural Language Processing SoSe 2014 Machine Learning for NLP Dr. Mariana Neves April 30th, 2014 (based on the slides of Dr. Saeedeh Momtazi) Introduction Field of study that gives computers the ability
More informationMachine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler
Machine Learning and Data Mining Ensembles of Learners Prof. Alexander Ihler Ensemble methods Why learn one classifier when you can learn many? Ensemble: combine many predictors (Weighted) combina
More informationMachine Learning Tom M. Mitchell Machine Learning Department Carnegie Mellon University. January 11, 2011
Machine Learning 10701 Tom M. Mitchell Machine Learning Department Carnegie Mellon University January 11, 2011 Today: What is machine learning? Decision tree learning Course logistics Readings: The Discipline
More informationNeural Text Summarization
Neural Text Summarization Urvashi Khandelwal Department of Computer Science Stanford University urvashik@stanford.edu Abstract Generation based text summarization is a hard task and recent deep learning
More informationImproved Word and Symbol Embedding for PartofSpeech Tagging
Improved Word and Symbol Embedding for PartofSpeech Tagging Nicholas Altieri, Sherdil Niyaz, Samee Ibraheem, and John DeNero {naltieri,sniyaz,sibraheem,denero}@berkeley.edu Abstract Stateoftheart
More informationLecture 1. Introduction Bastian Leibe Visual Computing Institute RWTH Aachen University
Advanced Machine Learning Lecture 1 Introduction 20.10.2015 Bastian Leibe Visual Computing Institute RWTH Aachen University http://www.vision.rwthaachen.de/ leibe@vision.rwthaachen.de Organization Lecturer
More informationOverview COEN 296 Topics in Computer Engineering Introduction to Pattern Recognition and Data Mining Course Goals Syllabus
Overview COEN 296 Topics in Computer Engineering to Pattern Recognition and Data Mining Instructor: Dr. Giovanni Seni G.Seni@ieee.org Department of Computer Engineering Santa Clara University Course Goals
More informationMachine Learning : Hinge Loss
Machine Learning Hinge Loss 16/01/2014 Machine Learning : Hinge Loss Recap tasks considered before Let a training dataset be given with (i) data and (ii) classes The goal is to find a hyper plane that
More informationArtificial Neural Networks. Andreas Robinson 12/19/2012
Artificial Neural Networks Andreas Robinson 12/19/2012 Introduction Artificial Neural Networks Machine learning technique Learning from past experience/data Predicting/classifying novel data Biologically
More informationLet the data speak: Machine Learning methods for data editing and imputation. Paper by: Felibel Zabala Presented by: Amanda Hughes
Let the data speak: Machine Learning methods for data editing and imputation Paper by: Felibel Zabala Presented by: Amanda Hughes September 2015 Objective Machine Learning (ML) methods can be used to help
More informationArticle from. Predictive Analytics and Futurism December 2015 Issue 12
Article from Predictive Analytics and Futurism December 2015 Issue 12 The Third Generation of Neural Networks By Jeff Heaton Neural networks are the phoenix of artificial intelligence. Right now neural
More informationHomework III Using Logistic Regression for Spam Filtering
Homework III Using Logistic Regression for Spam Filtering Introduction to Machine Learning  CMPS 242 By Bruno Astuto Arouche Nunes February 14 th 2008 1. Introduction In this work we study batch learning
More informationDecision Tree for Playing Tennis
Decision Tree Decision Tree for Playing Tennis (outlook=sunny, wind=strong, humidity=normal,? ) DT for prediction Csection risks Characteristics of Decision Trees Decision trees have many appealing properties
More informationModelling Sentence Pair Similarity with MultiPerspective Convolutional Neural Networks ZHUCHENG TU CS 898 SPRING 2017 JULY 17, 2017
Modelling Sentence Pair Similarity with MultiPerspective Convolutional Neural Networks ZHUCHENG TU CS 898 SPRING 2017 JULY 17, 2017 1 Outline Motivation Why do we want to model sentence similarity? Challenges
More informationINTRODUCTION TO MACHINE LEARNING. Machine Learning: What s The Challenge?
INTRODUCTION TO MACHINE LEARNING Machine Learning: What s The Challenge? Goals of the course Identify a machine learning problem Use basic machine learning techniques Think about your data/results What
More informationPG DIPLOMA IN MACHINE LEARNING & AI 11 MONTHS ONLINE
& PG DIPLOMA IN MACHINE LEARNING & AI 11 MONTHS ONLINE UpGrad is an online education platform to help individuals develop their professional potential in the most engaging learning environment. Online
More informationMocking the Draft Predicting NFL Draft Picks and Career Success
Mocking the Draft Predicting NFL Draft Picks and Career Success Wesley Olmsted [wolmsted], Jeff Garnier [jeff1731], Tarek Abdelghany [tabdel] 1 Introduction We started off wanting to make some kind of
More informationIntroduction: Convolutional Neural Networks for Visual Recognition.
Introduction: Convolutional Neural Networks for Visual Recognition boris.ginzburg@intel.com 1 Acknowledgments This presentation is heavily based on: http://cs.nyu.edu/~fergus/pmwiki/pmwiki.php http://deeplearning.net/readinglist/tutorials/
More informationDeep Learning for Amazon Food Review Sentiment Analysis
000 001 002 003 004 005 006 007 008 009 010 011 012 013 014 015 016 017 018 019 020 021 022 023 024 025 026 027 028 029 030 031 032 033 034 035 036 037 038 039 040 041 042 043 044 045 046 047 048 049 050
More informationProgramming Assignment2: Neural Networks
Programming Assignment2: Neural Networks Problem :. In this homework assignment, your task is to implement one of the common machine learning algorithms: Neural Networks. You will train and test a neural
More informationWINGNUS at CLSciSumm 2017: Learning from Syntactic and Semantic Similarity for Citation Contextualization
WINGNUS at CLSciSumm 2017: Learning from Syntactic and Semantic Similarity for Citation Contextualization Animesh Prasad School of Computing, National University of Singapore, Singapore a0123877@u.nus.edu
More informationA Review on Classification Techniques in Machine Learning
A Review on Classification Techniques in Machine Learning R. Vijaya Kumar Reddy 1, Dr. U. Ravi Babu 2 1 Research Scholar, Dept. of. CSE, Acharya Nagarjuna University, Guntur, (India) 2 Principal, DRK College
More informationHot Topics in Machine Learning
Hot Topics in Machine Learning Winter Term 2016 / 2017 Prof. Marius Kloft, Florian Wenzel October 19, 2016 Organization Organization The seminar is organized by Prof. Marius Kloft and Florian Wenzel (PhD
More information