Session 4: Regularization (Chapter 7)


Slide 1: Session 4: Regularization (Chapter 7)
Tapani Raiko, Aalto University, 30 September 2015
Slide 2: Table of Contents
- Background
- Regularization methods
- Exercises
Slide 3: Goal of Regularization
- Neural networks are very powerful (universal approximators).
- It is easy to perform great on the training set (overfitting).
- Regularization improves generalization to new data at the expense of increased training error.
- Use held-out validation data to choose hyperparameters (e.g. regularization strength).
- Use held-out test data to evaluate performance.
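To make this protocol concrete, here is a small runnable sketch that selects a regularization strength on a validation set, using ridge regression in plain numpy as a stand-in for a neural network (the data and helper names are ours, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: linear targets plus noise, split into train/validation/test.
X = rng.normal(size=(300, 20))
w_true = rng.normal(size=20)
y = X @ w_true + rng.normal(scale=0.5, size=300)
X_tr, y_tr = X[:100], y[:100]
X_va, y_va = X[100:200], y[100:200]
X_te, y_te = X[200:], y[200:]

def ridge_fit(X, y, lam):
    # Closed-form L2-regularized least squares: (X'X + lam*I)^{-1} X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def mse(w, X, y):
    return np.mean((X @ w - y) ** 2)

# Choose the regularization strength on the held-out validation set...
candidates = [0.0, 0.01, 0.1, 1.0, 10.0]
best_lam = min(candidates,
               key=lambda lam: mse(ridge_fit(X_tr, y_tr, lam), X_va, y_va))

# ...and evaluate the chosen model exactly once on the test set.
w = ridge_fit(X_tr, y_tr, best_lam)
print("chosen lambda:", best_lam, " test MSE:", mse(w, X_te, y_te))
```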
Slide 4: Example
- Without regularization, training error goes to zero and learning stops.
- With noise regularization, test error keeps dropping.
Slide 5: Expressivity demo: Training the first layer only
- No regularization, training $W^{(1)}$ and $b^{(1)}$ only.
- 0.2% error on the training set, 2% error on the test set.
Slide 6: What is overfitting?
Probability theory states how we should make predictions (of y_test) using a model with unknowns θ and data X = {x_train, y_train, x_test}:

$$P(y_\text{test} \mid X) = \int P(y_\text{test}, \theta \mid X)\, d\theta = \int P(y_\text{test} \mid \theta, X)\, P(\theta \mid X)\, d\theta.$$

- The probability of observing y_test is obtained by summing or integrating over all different explanations θ.
- The term $P(y_\text{test} \mid \theta, X)$ is the probability of y_test given a particular explanation θ, and it is weighted by the probability of that explanation, $P(\theta \mid X)$.
- However, such a computation is intractable. If we want to choose a single θ to represent all the probability mass, it is better not to overfit to the highest probability peak, but to find a good representative of the mass.
- Posterior probability mass matters: the center of gravity, not the maximum.
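One way to read the integral computationally is as an average of predictions over plausible explanations θ. A toy Monte Carlo sketch follows; it assumes posterior samples of θ are already available, which is exactly the intractable step in practice:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D model: y = theta * x. Pretend we have samples from P(theta | X)
# (obtaining them is the hard part that makes the integral intractable).
theta_samples = rng.normal(loc=2.0, scale=0.3, size=1000)

x_test = 1.5
# Monte Carlo estimate of the predictive mean:
#   E[y_test | X] ~ (1/S) * sum_s f(x_test; theta_s)
y_pred = np.mean(theta_samples * x_test)
print("posterior-averaged prediction:", y_pred)
```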
Slide 7: Table of Contents
- Background
- Regularization methods
- Exercises
Slide 8: Regularization methods
- Limited size of network
- Early stopping
- Weight decay
- Data augmentation
- Injecting noise
- Parameter sharing (e.g. convolutional)
- Sparse representations
- Ensemble methods
- Auxiliary tasks (e.g. unsupervised)
- Probabilistic treatment (e.g. variational methods)
- Adversarial training, ...
Slide 9: Limited size of network
- Rule of thumb: when #parameters is ten times less than #outputs × #examples, overfitting will not be severe.
- Reducing input dimensionality (e.g. by PCA) helps in reducing the number of parameters.
- Easy, with low computational complexity, but other methods give better accuracy.
- Data augmentation increases #examples.
- Parameter sharing decreases #parameters.
- Auxiliary tasks increase #outputs.
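A minimal numpy sketch of PCA-based input reduction (the helper name pca_reduce is ours, not from the slides):

```python
import numpy as np

def pca_reduce(X, k):
    """Project rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)              # center each input dimension
    # SVD of the centered data: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                 # reduced (n, k) representation

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 784))          # e.g. flattened 28x28 images
X_small = pca_reduce(X, k=50)            # first-layer weights shrink from 784*h to 50*h
print(X_small.shape)                     # (500, 50)
```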
Slide 10: Early stopping
- Monitor validation performance during training.
- Stop when it starts to deteriorate.
- With other regularization, it might never start to deteriorate.
- Keeps the solution close to the initialization.
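A common implementation uses a patience counter. A sketch follows, where train_one_epoch and validation_error are hypothetical helpers standing in for your training loop and validation pass:

```python
import copy

def train_with_early_stopping(model, patience=10, max_epochs=500):
    # Hypothetical helpers (not defined here): train_one_epoch updates the
    # model in place; validation_error scores it on held-out data.
    best_err, best_model, epochs_without_improvement = float("inf"), None, 0
    for epoch in range(max_epochs):
        train_one_epoch(model)
        err = validation_error(model)
        if err < best_err:
            best_err, best_model = err, copy.deepcopy(model)
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break            # validation error stopped improving
    return best_model            # parameters from the best epoch, near init
```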
Slide 11: Weight decay (Tikhonov, 1943)
- Add a penalty term to the training cost: $C = C_\text{data} + \Omega(\theta)$.
  Note: $\Omega$ is a function of the parameters θ only, not of the data.
- L2 regularization: $\Omega(\theta) = \frac{\lambda}{2}\|\theta\|^2$, with hyperparameter λ for the strength.
  Gradient: $\partial\Omega(\theta)/\partial\theta_i = \lambda\theta_i$.
- L1 regularization: $\Omega(\theta) = \lambda\|\theta\|_1$.
  Gradient: $\partial\Omega(\theta)/\partial\theta_i = \lambda\,\mathrm{sign}(\theta_i)$.
  Induces sparsity: often many parameters become exactly zero.
- Max-norm: constrain the row vectors $w_i$ of the weight matrices to $\|w_i\|_2 \le c$.
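A sketch of how these penalties enter a plain gradient step (numpy; the function names are ours):

```python
import numpy as np

def sgd_step(W, grad_data, lr=0.1, lam=1e-3, penalty="l2"):
    """One gradient step on the data cost plus a weight-decay penalty."""
    if penalty == "l2":
        grad_penalty = lam * W              # d/dW of (lam/2) * ||W||^2
    elif penalty == "l1":
        grad_penalty = lam * np.sign(W)     # d/dW of lam * ||W||_1
    else:
        grad_penalty = 0.0
    return W - lr * (grad_data + grad_penalty)

def maxnorm_project(W, c=3.0):
    """Clip each row of W so that ||w_i||_2 <= c (applied after the step)."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W * np.minimum(1.0, c / np.maximum(norms, 1e-12))
```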
Slide 12: Weight decay
- Figure: L2 (left) and L1 (right) penalties; w* is the unregularized solution, w̃ the regularized solution.
Slide 13: Weight decay as a Bayesian prior
Consider the maximum a posteriori (MAP) solution.
- Bayes' rule: $P(\theta \mid X) \propto P(X \mid \theta)\, P(\theta)$.
- Written on a log scale: $C = -\log P(X \mid \theta) - \log P(\theta)$.
- Assuming a Gaussian prior $P(\theta) = \mathcal{N}(0, \lambda^{-1} I)$, we get

$$\Omega(\theta) = -\sum_i \log \exp\left(-\frac{\lambda}{2}\theta_i^2\right) = \frac{\lambda}{2}\|\theta\|^2.$$

- L2 regularization ↔ Gaussian prior
- L1 regularization ↔ Laplace prior
- Max-norm regularization ↔ uniform prior with finite support
- $\Omega = 0$ ↔ maximum likelihood
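For completeness, the analogous derivation for the Laplace prior, sketched in LaTeX (additive constants are dropped since they do not affect the optimum):

```latex
\begin{align}
P(\theta_i) &= \tfrac{\lambda}{2} \exp(-\lambda |\theta_i|)
  && \text{(Laplace prior on each parameter)} \\
\Omega(\theta) = -\sum_i \log P(\theta_i)
  &= \lambda \sum_i |\theta_i| + \text{const}
   = \lambda \|\theta\|_1 + \text{const},
\end{align}
% which is exactly the L1 penalty with strength lambda.
```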
Slide 14: Data augmentation
- Image from (Dosovitskiy et al., 2014): augmenting data with image-specific transformations.
- E.g. cropping away just 2 pixels gets you 9 times the data!
- Infinite MNIST:
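The 2-pixel cropping arithmetic, as a numpy sketch: trimming a 2-pixel margin allows 3 × 3 = 9 shifted crops per image:

```python
import numpy as np

def crops(image, margin=2):
    """All crops obtained by trimming `margin` pixels total per axis:
    (margin + 1)**2 shifted views, so margin=2 gives 9 crops."""
    h, w = image.shape
    return np.stack([image[dy:h - margin + dy, dx:w - margin + dx]
                     for dy in range(margin + 1)
                     for dx in range(margin + 1)])

image = np.arange(28 * 28, dtype=float).reshape(28, 28)   # stand-in for MNIST
augmented = crops(image)
print(augmented.shape)   # (9, 26, 26): 9x the examples from one image
```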
Slide 15: Injecting noise (Sietsma and Dow, 1991)
- Inject random noise during training, drawn separately in each epoch.
- Can be applied to the input data, to hidden activations, or to the weights.
- Can be seen as data augmentation.
- Simple and effective.
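A sketch of injecting fresh input noise on every pass over the data (numpy; the generator name is ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_batches(X, y, batch_size=64, sigma=0.1):
    """Yield minibatches with fresh additive Gaussian noise on the inputs.
    Regenerating the noise every pass means no two epochs see identical data."""
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch] + rng.normal(scale=sigma, size=X[batch].shape), y[batch]

# Toy usage: one epoch over random stand-in data.
X, y = rng.normal(size=(200, 784)), rng.integers(0, 10, size=200)
for xb, yb in noisy_batches(X, y):
    pass   # feed (xb, yb) to the training step
```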
Slide 16: Injecting noise to inputs (analysis)
- Inject small additive Gaussian noise at the inputs.
- Assume a least-squares error at the output y.
- A Taylor series expansion around x shows this corresponds to penalizing the squared norm of the Jacobian, $\|J\|^2$, where

$$J = \frac{\partial y}{\partial x} = \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \cdots & \frac{\partial y_1}{\partial x_d} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_c}{\partial x_1} & \cdots & \frac{\partial y_c}{\partial x_d} \end{pmatrix}.$$

- For linear networks, this reduces to an L2 penalty.
- Rifai et al. (2011) penalize the Jacobian directly.
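The expansion step behind this claim, sketched in LaTeX for a network $f$ with Jacobian $J$ at $x$ and isotropic noise $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$:

```latex
\begin{align}
f(x + \epsilon) &\approx f(x) + J\epsilon \\
\mathbb{E}_\epsilon\left[\|f(x+\epsilon) - y\|^2\right]
  &\approx \|f(x) - y\|^2 + \sigma^2 \|J\|_F^2
\end{align}
% The cross term vanishes because E[eps] = 0, and
% E[eps^T J^T J eps] = sigma^2 tr(J^T J) = sigma^2 ||J||_F^2.
```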
Slide 17: Parameter sharing
- Force sets of parameters to be equal.
- Reduces the number of (unique) parameters.
- Important in convolutional networks (next week).
- Autoencoders sometimes share weights between the encoder and the decoder (Oct 28 session).
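A tiny sketch of the tied-weights autoencoder case: encoder and decoder reuse one matrix, halving the unique parameters (numpy; illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(784, 100))   # single shared weight matrix

def encode(x):
    return np.tanh(x @ W)      # encoder uses W

def decode(h):
    return h @ W.T             # decoder reuses W transposed (tied weights)

x = rng.normal(size=784)
x_hat = decode(encode(x))      # both directions use the same parameters
```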
Slide 18: Sparse representations
- Penalize the representation h using Ω(h) to make it sparse.
- An L1 penalty on the weights makes W sparse.
- Similarly, an L1 penalty can make h sparse.
- It is also possible to set a desired sparsity level.
- Sparse coding is common in image processing.
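A minimal sketch of an L1 penalty on the activations rather than the weights (numpy; the names are ours):

```python
import numpy as np

def sparsity_penalty(h, lam=1e-3):
    """L1 penalty on the representation h: lam * sum_i |h_i|."""
    return lam * np.abs(h).sum()

# Toy usage: a ReLU hidden layer; the penalty is added to the data cost.
rng = np.random.default_rng(0)
x, W, b = rng.normal(size=16), rng.normal(size=(16, 8)), np.zeros(8)
h = np.maximum(0.0, x @ W + b)
total_cost = 0.0 + sparsity_penalty(h)   # 0.0 stands in for the data cost
print(total_cost)
```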
Slide 19: Ensemble methods
- Train several models and take the average of their outputs.
- Also known as bagging or model averaging.
- It helps to make the individual models different by:
  - varying the models or algorithms
  - varying the hyperparameters
  - varying the data (dropping examples or dimensions)
  - varying the random seed
- It is possible to train a single final model to mimic the performance of the ensemble, for test-time computational efficiency (Hinton et al., 2015).
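The averaging itself is a one-liner; a toy sketch:

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the (e.g. class-probability) outputs of several models."""
    return np.mean([m(x) for m in models], axis=0)

# Toy ensemble: three 'models' that disagree slightly.
models = [lambda x, s=s: np.array([0.6 + 0.05 * s, 0.4 - 0.05 * s])
          for s in (-1, 0, 1)]
print(ensemble_predict(models, x=None))   # -> [0.6, 0.4]
```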
Slide 20: Dropout (Hinton et al., 2012)
- Each time we present a data example x, randomly delete each hidden node with probability 0.5.
- Can be seen as injecting noise, or as an ensemble:
  - multiplicative binary noise
  - training an ensemble of 2^h networks with weight sharing
- At test time, use all nodes but divide the weights by 2.
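A sketch of a dropout layer with this train/test asymmetry (numpy; scaling the activations by 1 − p at test time is equivalent to the weight halving described above):

```python
import numpy as np

rng = np.random.default_rng(0)

def hidden_layer(x, W, b, train=True, p_drop=0.5):
    """ReLU hidden layer with dropout on its output units."""
    h = np.maximum(0.0, x @ W + b)
    if train:
        mask = rng.random(h.shape) >= p_drop   # drop each unit w.p. p_drop
        return h * mask                        # multiplicative binary noise
    # Test time: keep all units; scaling by (1 - p_drop) is equivalent
    # to dividing the outgoing weights by 2 when p_drop = 0.5.
    return h * (1.0 - p_drop)
```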
Slide 21: Dropout training [figure]
Slide 22: Dropout as bagging [figure]
Slide 23: Auxiliary tasks
- Multitask learning: parameter sharing between multiple tasks.
- E.g. speech recognition and speaker identification could share low-level representations.
- Layer-wise pretraining (Hinton and Salakhutdinov, 2006) can be seen as using unsupervised learning as an auxiliary task (Nov 4 session).
Slide 24: Probabilistic treatment
- Variational methods are starting to appear in deep learning research.
- See the course T-61.5140 Machine Learning: Advanced Probabilistic Methods.
- Jyri Kivinen might discuss these in the Nov 11 session.
Slide 25: Adversarial training (Szegedy et al., 2014)
- Search for an input x̃ near a datapoint x that would have a very different output ỹ from y.
- Adversaries can be found surprisingly close!
- Miyato et al. (2015) build a very effective regularizer from this.
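To make "surprisingly close" concrete, here is a toy linear-model sketch: for a linear score, the worst-case small perturbation is simply a signed step (this is the fast-gradient-sign view from later work by Goodfellow et al. (2014), used here only for illustration, not a method from these slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear 'network': score = w . x, class = sign(score).
d = 1000
w = rng.normal(size=d) / np.sqrt(d)
x = rng.normal(size=d)

eps = 0.05                                       # tiny per-dimension change
x_adv = x - eps * np.sign(w) * np.sign(w @ x)    # push the score past zero

print("original score:   ", w @ x)
print("adversarial score:", w @ x_adv)  # shifted by eps * ||w||_1 in one step
```

With many input dimensions, a per-dimension change of only 0.05 shifts the score by eps times the L1 norm of w, which is typically enough to flip the predicted class.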
Slide 26: Table of Contents
- Background
- Regularization methods
- Exercises
Slide 27: Exercises
- Read Chapter 7 (Regularization) and Chapter 9 (Convolutional Networks).
- Read the Theano tutorial on regularization:
- Extend your MNIST classifier to include regularization. Consider at least L2 weight decay and additive Gaussian noise injected at the inputs. Choose a good regularization strength using a held-out validation set.