Session 4: Regularization (Chapter 7)
Slide 1: Session 4: Regularization (Chapter 7)
Tapani Raiko (Aalto University), 30 September 2015, 27 slides

Slide 2: Table of Contents
- Background
- Regularization methods
- Exercises

Slide 3: Goal of Regularization
- Neural networks are very powerful (universal approximators).
- It is easy to perform great on the training set (overfitting).
- Regularization improves generalization to new data at the expense of increased training error.
- Use held-out validation data to choose hyperparameters (e.g. regularization strength).
- Use held-out test data to evaluate performance.

Slide 4: Example
- Without regularization, training error goes to zero and learning stops.
- With noise regularization, test error keeps dropping.

Slide 5: Expressivity demo: training the first layer only
- No regularization, training W^(1) and b^(1) only.
- 0.2% error on the training set, 2% error on the test set.

Slide 6: What is overfitting?
- Probability theory states how we should make predictions (of y_test) using a model with unknowns θ and data X = {x_train, y_train, x_test}:

      P(y_test | X) = ∫ P(y_test, θ | X) dθ = ∫ P(y_test | θ, X) P(θ | X) dθ.

- The probability of observing y_test is obtained by summing or integrating over all different explanations θ.
- The term P(y_test | θ, X) is the probability of y_test given a particular explanation θ, and it is weighted by the probability of that explanation, P(θ | X).
- However, such a computation is intractable.
- If we want to choose a single θ to represent all the probability mass, it is better not to overfit to the highest probability peak, but to find a good representative of the mass.
- Posterior probability mass matters: the center of gravity, not the maximum.

Slide 7: Table of Contents
- Background
- Regularization methods
- Exercises

Slide 8: Regularization methods
- Limited size of network
- Early stopping
- Weight decay
- Data augmentation
- Injecting noise
- Parameter sharing (e.g. convolutional)
- Sparse representations
- Ensemble methods
- Auxiliary tasks (e.g. unsupervised)
- Probabilistic treatment (e.g. variational methods)
- Adversarial training, ...

Slide 9: Limited size of network
- Rule of thumb: when #parameters is ten times less than #outputs × #examples, overfitting will not be severe.
- Reducing input dimensionality (e.g. by PCA) helps in reducing parameters.
- Easy, with low computational complexity, but other methods give better accuracy:
  - Data augmentation increases #examples.
  - Parameter sharing decreases #parameters.
  - Auxiliary tasks increase #outputs.

Slide 10: Early stopping
- Monitor validation performance during training.
- Stop when it starts to deteriorate.
- With other regularization, it might never start to deteriorate.
- Keeps the solution close to the initialization.

Slide 11: Weight decay (Tikhonov, 1943)
- Add a penalty term to the training cost: C = L + Ω(θ). Note: Ω is a function of the parameters θ only, not of the data.
- L2 regularization: Ω(θ) = (λ/2) ||θ||², with hyperparameter λ for strength. Gradient: ∂Ω(θ)/∂θ_i = λ θ_i.
- L1 regularization: Ω(θ) = λ ||θ||_1. Gradient: ∂Ω(θ)/∂θ_i = λ sign(θ_i). Induces sparsity: often many parameters become exactly zero.
- Max-norm: constrain the row vectors w_i of the weight matrices to ||w_i||_2 ≤ c.

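The penalties and gradients above can be sketched in a few lines of NumPy (function names here are illustrative, not from the lecture; max-norm is applied as a projection of the weight rows back onto the constraint):

```python
import numpy as np

def l2_penalty(theta, lam):
    """Omega(theta) = (lam/2) * ||theta||^2, gradient lam * theta."""
    return 0.5 * lam * np.sum(theta ** 2), lam * theta

def l1_penalty(theta, lam):
    """Omega(theta) = lam * ||theta||_1, gradient lam * sign(theta)."""
    return lam * np.sum(np.abs(theta)), lam * np.sign(theta)

def maxnorm_project(W, c):
    """Rescale any row of W whose L2 norm exceeds c back onto the constraint."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    return W * np.minimum(1.0, c / np.maximum(norms, 1e-12))

theta = np.array([3.0, -4.0])
val2, grad2 = l2_penalty(theta, lam=0.1)   # 0.5 * 0.1 * 25 = 1.25
val1, grad1 = l1_penalty(theta, lam=0.1)   # 0.1 * 7 = 0.7
```

In a training loop, the penalty gradient is simply added to the data-loss gradient before the parameter update.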
Slide 12: Weight decay
- Figure: effect of L2 (left) and L1 (right) penalties; w* marks the unregularized solution and w̃ the regularized solution.

Slide 13: Weight decay as a Bayesian prior
- Consider the maximum a posteriori (MAP) solution.
- Bayes' rule: P(θ | X) ∝ P(X | θ) P(θ); written on a log scale: C = −log P(X | θ) − log P(θ).
- Assuming a Gaussian prior P(θ) = N(0, λ⁻¹ I), we get Ω(θ) = −Σ_i log exp(−λ θ_i² / 2) = (λ/2) ||θ||².
- L2 regularization ↔ Gaussian prior.
- L1 regularization ↔ Laplace prior.
- Max-norm regularization ↔ uniform prior with finite support.
- Ω = 0 ↔ maximum likelihood.

Slide 14: Data augmentation
- Image from (Dosovitskiy et al., 2014): augmenting data by image-specific transformations.
- E.g. cropping just 2 pixels gets you 9 times the data!
- Infinite MNIST:

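The "2 pixels → 9×" claim follows because a 2-pixel margin allows 3 offsets (0, 1, 2) along each axis, giving 3 × 3 = 9 shifted crops. A minimal NumPy sketch (the 28 × 28 image size is illustrative, matching MNIST):

```python
import numpy as np

def crop_augment(img, margin=2):
    """All shifted crops of size (h - margin, w - margin).

    With margin pixels to spare there are (margin + 1) offsets per axis,
    so (margin + 1) ** 2 crops in total: 9 for margin = 2.
    """
    h, w = img.shape
    ch, cw = h - margin, w - margin
    return np.stack([img[i:i + ch, j:j + cw]
                     for i in range(margin + 1)
                     for j in range(margin + 1)])

img = np.arange(28 * 28, dtype=float).reshape(28, 28)
crops = crop_augment(img)   # shape (9, 26, 26): nine times the data
```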
Slide 15: Injecting noise (Sietsma and Dow, 1991)
- Inject random noise during training, drawn separately in each epoch.
- Can be applied to the input data, to hidden activations, or to the weights.
- Can be seen as data augmentation.
- Simple and effective.

Slide 16: Injecting noise to inputs (analysis)
- Inject small additive Gaussian noise at the inputs.
- Assume a least-squares error at the output y.
- A Taylor series expansion around x shows that this corresponds to penalizing the squared norm of the Jacobian, ||J||²:

      J = dy/dx = [ ∂y_1/∂x_1 ... ∂y_1/∂x_d ]
                  [    ...    ...    ...    ]
                  [ ∂y_c/∂x_1 ... ∂y_c/∂x_d ]

- For linear networks, this reduces to an L2 penalty on the weights.
- Rifai et al. (2011) penalize the Jacobian directly.

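The linear-network case can be checked numerically: for y = Wx with squared error, the expected extra cost from input noise ε ~ N(0, σ²I) is σ²||W||², an L2 penalty on the weights. A sketch (function and variable names are illustrative):

```python
import numpy as np

def inject_noise(x, sigma, training, rng):
    """Fresh additive Gaussian noise at the inputs, training time only."""
    if not training:
        return x
    return x + sigma * rng.standard_normal(x.shape)

# Monte Carlo check for a linear network y = W x with squared error
# against a target t: the expected extra cost from input noise
# N(0, sigma^2 I) is sigma^2 * ||W||_F^2.
rng = np.random.default_rng(0)
W = rng.standard_normal((3, 4))
x, t = np.ones(4), np.zeros(3)
sigma = 0.1

clean = np.sum((W @ x - t) ** 2)
xs = np.stack([inject_noise(x, sigma, True, rng) for _ in range(50_000)])
noisy = np.mean(np.sum((xs @ W.T - t) ** 2, axis=1))
gap, predicted = noisy - clean, sigma ** 2 * np.sum(W ** 2)
```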
Slide 17: Parameter sharing
- Force sets of parameters to be equal.
- Reduces the number of (unique) parameters.
- Important in convolutional networks (next week).
- Autoencoders sometimes share weights between encoder and decoder (Oct 28 session).

Slide 18: Sparse representations
- Penalize the representation h using Ω(h) to make it sparse.
- An L1 penalty on the weights makes W sparse; similarly, an L1 penalty can make h sparse.
- It is also possible to set a desired sparsity level.
- Sparse coding is common in image processing.

Slide 19: Ensemble methods
- Train several models and take the average of their outputs.
- Also known as bagging or model averaging.
- It helps to make the individual models different by:
  - varying models or algorithms
  - varying hyperparameters
  - varying data (dropping examples or dimensions)
  - varying the random seed
- It is possible to train a single final model to mimic the performance of the ensemble, for test-time computational efficiency (Hinton et al., 2015).

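Output averaging can be sketched as follows; the "models" here are toy random linear classifiers that differ only by random seed, purely to keep the example self-contained:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_predict(models, x):
    """Average the class probabilities of the individual models."""
    probs = np.stack([softmax(W @ x) for W in models])
    return probs.mean(axis=0)

rng = np.random.default_rng(0)
models = [rng.standard_normal((3, 5)) for _ in range(7)]  # differ by seed
x = rng.standard_normal(5)
p = ensemble_predict(models, x)   # still a valid distribution over 3 classes
```

Averaging probabilities (rather than hard labels) keeps the ensemble output a proper distribution, which is also what the distilled single model of Hinton et al. (2015) is trained to match.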
Slide 20: Dropout (Hinton et al., 2012)
- Each time we present a data example x, randomly delete each hidden node with probability 0.5.
- Can be seen as injecting noise or as an ensemble:
  - multiplicative binary noise
  - training an ensemble of 2^h networks with weight sharing
- At test time, use all nodes but divide the weights by 2.

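The train/test rule on this slide (drop with probability 0.5 at training time; halve at test time, rather than the now-common "inverted dropout" that rescales during training) can be sketched for one layer of activations:

```python
import numpy as np

def dropout(h, p_drop=0.5, training=True, rng=None):
    """Slide's convention: training multiplies by a fresh binary mask
    (each unit dropped with probability p_drop); test time keeps all
    units but scales by (1 - p_drop), matching the mask's expectation."""
    if training:
        mask = (rng.random(h.shape) >= p_drop).astype(h.dtype)
        return h * mask
    return h * (1.0 - p_drop)

rng = np.random.default_rng(0)
h = np.ones(10)
h_train = dropout(h, 0.5, training=True, rng=rng)   # some units zeroed
h_test = dropout(h, 0.5, training=False)            # all units, halved
```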
Slide 21: Dropout training (figure)

Slide 22: Dropout as bagging (figure)

Slide 23: Auxiliary tasks
- Multitask learning: parameter sharing between multiple tasks.
- E.g. speech recognition and speaker identification could share low-level representations.
- Layer-wise pretraining (Hinton and Salakhutdinov, 2006) can be seen as using unsupervised learning as an auxiliary task (Nov 4 session).

Slide 24: Probabilistic treatment
- Variational methods are starting to appear in deep learning research.
- See the course Machine Learning: Advanced Probabilistic Methods.
- Jyri Kivinen might discuss these in the Nov 11 session.

Slide 25: Adversarial training (Szegedy et al., 2014)
- Search for an input x' near a data point x that would have a very different output y' from y.
- Adversaries can be found surprisingly close!
- Miyato et al. (2015) build a very effective regularizer on this idea.

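One cheap way to search for such a nearby input is a single gradient-sign step (the fast gradient sign method of Goodfellow et al., not the exact box-constrained search of Szegedy et al.), sketched here for a linear classifier with an illustrative loss, the negative score of the true class:

```python
import numpy as np

def fgsm_linear(W, x, y, eps):
    """One gradient-sign step that increases the (illustrative) loss
    -(W x)[y] of a linear classifier, staying within an L-infinity
    ball of radius eps around x."""
    grad = -W[y]                     # d(loss)/dx for loss = -(W x)[y]
    return x + eps * np.sign(grad)

rng = np.random.default_rng(0)
W = rng.standard_normal((3, 8))
x, y = rng.standard_normal(8), 1
x_adv = fgsm_linear(W, x, y, eps=0.1)
score_drop = (W @ x)[y] - (W @ x_adv)[y]   # true-class score decreases
```

For this linear model the score drop is exactly eps * ||W[y]||_1, so even a tiny eps per dimension can move the output a lot in high dimensions, which is why adversaries lie "surprisingly close".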
Slide 26: Table of Contents
- Background
- Regularization methods
- Exercises

Slide 27: Exercises
- Read Chapter 7 (Regularization) and Chapter 9 (Convolutional Networks).
- Read the Theano tutorial on regularization:
- Extend your MNIST classifier to include regularization. Consider at least L2 weight decay and additive Gaussian noise injected into the inputs. Choose a good regularization strength using a held-out validation set.
More informationDudon Wai Georgia Institute of Technology CS 7641: Machine Learning Atlanta, GA
Adult Income and Letter Recognition  Supervised Learning Report An objective look at classifier performance for predicting adult income and Letter Recognition Dudon Wai Georgia Institute of Technology
More information10601 Machine Learning Assignment 3: Logistic Regression
060 Machine Learning Assignment 3: Logistic Regression Due: Sept. 25th, 23:59 EST, via AutoLab Late submission due: Sept. 27th, 23:59 EST with 50% discount of credits TAsincharge: William Wang, Pengtao
More informationTraining Neural Networks, Part 2. FeiFei Li & Justin Johnson & Serena Yeung. Lecture 71
Lecture 7: Training Neural Networks, Part 2 Lecture 71 Administrative  Assignment 1 is being graded, stay tuned  Project proposals due today by 11:59pm  Assignment 2 is out, due Thursday May 4 at 11:59pm
More informationActive Learning for High Dimensional Inputs using Bayesian Convolutional Neural Networks
Active Learning for High Dimensional Inputs using Bayesian Convolutional Neural Networks Riashat Islam Department of Engineering University of Cambridge M.Phil in Machine Learning, Speech and Language
More informationLinear Models Continued: Perceptron & Logistic Regression
Linear Models Continued: Perceptron & Logistic Regression CMSC 723 / LING 723 / INST 725 Marine Carpuat Slides credit: Graham Neubig, Jacob Eisenstein Linear Models for Classification Feature function
More informationA study of the NIPS feature selection challenge
A study of the NIPS feature selection challenge Nicholas Johnson November 29, 2009 Abstract The 2003 Nips Feature extraction challenge was dominated by Bayesian approaches developed by the team of Radford
More informationDeep neural networks III
Deep neural networks III June 5 th, 2018 Yong Jae Lee UC Davis Many slides from Rob Fergus, Svetlana Lazebnik, JiaBin Huang, Derek Hoiem, Adriana Kovashka, Announcements PS due 6/ (Thurs), 11:59 pm Review
More informationCSC412/2506 Probabilistic Learning and Reasoning
CSC412/2506 Probabilistic Learning and Reasoning Introduction Jesse Bettencourt Today Course information Overview of ML with examples Ungraded, anonymous background quiz Thursday: Basics of ML vocabulary
More informationUnsupervised Learning: Clustering
Unsupervised Learning: Clustering Vibhav Gogate The University of Texas at Dallas Slides adapted from Carlos Guestrin, Dan Klein & Luke Zettlemoyer Machine Learning Supervised Learning Unsupervised Learning
More information15 : Case Study: Topic Models
10708: Probabilistic Graphical Models, Spring 2015 15 : Case Study: Topic Models Lecturer: Eric P. Xing Scribes: Xinyu Miao,Yun Ni 1 Task Humans cannot afford to deal with a huge number of text documents
More informationBackpropagation in recurrent MLP
Backpropagation in recurrent MLP Christopher Bishop, Pattern Recognition and Machine Learning, Springer, 2006 Chapter 5.3 Training and design issues in MLP Christopher Bishop, Pattern Recognition and Machine
More information