Deep Learning in Computational Chemistry


What is a Neuron?
A neuron is a computational unit in the neural network that exchanges messages with other neurons. Possible activation functions: the step (threshold) function and the sigmoid function (a.k.a. the logistic function).
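To make this concrete, here is a minimal Python/NumPy sketch of a single neuron with the two activation functions mentioned; the function names and the threshold value are illustrative choices, not from the slides.

import numpy as np

def step(x, threshold=0.0):
    # Step / threshold function: 1 when the weighted input exceeds
    # the threshold, 0 otherwise.
    return np.where(x > threshold, 1.0, 0.0)

def sigmoid(x):
    # Sigmoid (logistic) function: squashes any real value into (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

def neuron(inputs, weights, activation=sigmoid):
    # A single neuron: weighted sum of inputs followed by an activation.
    return activation(np.dot(weights, inputs))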

Feed Forward & Backpropagation Feed forward algorithm: AcEvate the neurons from the left to the right. BackpropagaEon: Randomly iniealize the parameters Calculate total error at the right, "6(%) Then calculate contribueons to error, &', at each step going backwards.

Worked example: a neuron with weights -0.06, -2.5 and 1.4 receives inputs 2.7, -8.6 and 0.002. Its weighted sum is x = (-0.06)(2.7) + (-2.5)(-8.6) + (1.4)(0.002) = 21.34, which is then passed through the activation function f(x).
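A quick check of that arithmetic; the pairing of weights with inputs is read off the slide figure.

import numpy as np

weights = np.array([-0.06, -2.5, 1.4])
inputs  = np.array([ 2.7,  -8.6, 0.002])
x = np.dot(weights, inputs)
print(round(x, 2))   # 21.34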

Training data (fields -> class):
1.4 2.7 1.9 -> 0
3.8 3.4 3.2 -> 0
6.4 2.8 1.7 -> 1
4.1 0.1 0.2 -> 0
etc.

Initialise with random weights.

Present a training pattern: 1.4 2.7 1.9. Feed it through to get an output: 0.8. Compare with the target output, 0: error = 0.8. Adjust the weights based on the error.

Present another training pattern: 6.4 2.8 1.7. Feed it through to get an output: 0.9. Compare with the target output, 1: error = -0.1. Adjust the weights based on the error.

And so on. Repeat this thousands, maybe millions of times, each time taking a random training instance and making slight weight adjustments. Algorithms for weight adjustment are designed to make changes that will reduce the error.
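A minimal sketch of this loop for a single sigmoid neuron trained on the four rows above; the learning rate and iteration count are illustrative choices, not values from the slides.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Training data: three input fields and a class label per row.
X = np.array([[1.4, 2.7, 1.9],
              [3.8, 3.4, 3.2],
              [6.4, 2.8, 1.7],
              [4.1, 0.1, 0.2]])
y = np.array([0, 0, 1, 0])

rng = np.random.default_rng(0)
w = rng.normal(size=3)      # initialise with random weights
b = 0.0
lr = 0.05                   # illustrative learning rate

for _ in range(100_000):    # thousands of tiny adjustments
    i = rng.integers(len(X))                    # pick a random training instance
    out = sigmoid(w @ X[i] + b)                 # feed it through to get output
    error = out - y[i]                          # compare with target output
    grad = error * out * (1 - out)              # squared-error gradient factor
    w -= lr * grad * X[i]                       # adjust weights to reduce the error
    b -= lr * grad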

The Main Points to Remember
Weight-learning algorithms for NNs are simple: they work by making thousands and thousands of tiny adjustments, each making the network do better on the most recent pattern, but perhaps a little worse on many others. But, by luck, eventually this tends to be good enough to learn effective classifiers for many real applications.

The Decision Boundary Perspective
Start from the initial random weights. Present a training instance and adjust the weights; repeat, instance after instance, with the decision boundary shifting slightly each time. Eventually...

If f(x) is linear, the NN can only draw straight decision boundaries (even if there are many layers of units)
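One way to see this: with a linear f(x), stacking layers just multiplies the weight matrices together, so the whole network is still a single linear map. A small sketch (the sizes are arbitrary):

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(5, 3))   # first "layer"
W2 = rng.normal(size=(2, 5))   # second "layer"
x = rng.normal(size=3)

# With a linear f(x), two layers are equivalent to the single layer W2 @ W1,
# so no amount of stacking adds expressive power.
print(np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x))   # True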

NNs use a nonlinear f(x) so they can draw complex boundaries, but they keep the data unchanged. SVMs only draw straight lines, but they transform the data first in a way that makes that OK.

Limitations of Neural Networks
Random initialization plus densely connected networks lead to:
High cost. Each neuron in the network can be considered as a logistic regression; training the entire neural network means training all of these interconnected logistic regressions.
Difficulty in training as the number of hidden layers increases. Recall that logistic regression is trained by gradient descent. In backpropagation, the gradient becomes progressively more dilute: below the top layers, the correction signal (the backpropagated delta) is minimal.
Getting stuck in local optima. The objective function of the neural network is usually not convex, and random initialization does not guarantee starting in the proximity of the global optimum.
Solution: deep learning, i.e., learning multiple levels of representation.
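A small numerical illustration of the "dilute gradient" point (a sketch that ignores the weight factors in the full chain rule): the sigmoid's derivative never exceeds 0.25, so the backpropagated correction signal shrinks roughly geometrically with depth.

import numpy as np

def sigmoid_grad(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)          # never exceeds 0.25

rng = np.random.default_rng(0)
signal = 1.0
for layer in range(20):
    z = rng.normal()              # a typical pre-activation
    signal *= sigmoid_grad(z)     # each layer multiplies in a factor <= 0.25
    if layer in (4, 9, 19):
        print(f"after {layer + 1} layers: {signal:.2e}")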

What exactly is deep learning? Why is it generally better than other methods on image, speech and certain other types of data?
The short answers: deep learning means using a neural network with several layers of nodes between input and output. The series of layers between input and output do feature identification and processing in a series of stages, just as our brains seem to.

Multi-layer neural networks have been around for about 25 years. What's actually new? We have always had good algorithms for learning the weights in networks with one hidden layer, but these algorithms are not good at learning the weights for networks with more hidden layers. What's new is: algorithms for training many-layer networks.

How to Train a Multi-Layer Network
Train the layers one at a time: the first layer first, then the second, then the third, and so on, finishing with the final layer.

EACH of the (non-output) layers is trained to be an auto-encoder. Basically, it is forced to learn good features that describe what comes from the previous layer.
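One way to express this greedy layer-wise autoencoder pre-training with Keras; the layer sizes, data, and training settings below are placeholders, and this is a sketch of the idea rather than the exact recipe from the papers cited on the next slide.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

X = np.random.rand(1000, 64).astype("float32")   # placeholder unlabeled data

layer_sizes = [32, 16, 8]
pretrained = []
current_input = X

# Train each hidden layer as an autoencoder on the output of the previous one.
for size in layer_sizes:
    inp = keras.Input(shape=(current_input.shape[1],))
    encoded = layers.Dense(size, activation="sigmoid")(inp)
    decoded = layers.Dense(current_input.shape[1], activation="sigmoid")(encoded)
    autoencoder = keras.Model(inp, decoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(current_input, current_input, epochs=5, verbose=0)

    encoder = keras.Model(inp, encoded)
    pretrained.append(encoder.layers[-1])        # keep the trained Dense layer
    current_input = encoder.predict(current_input, verbose=0)

# Stack the pretrained layers; a supervised output layer would then be
# added and the whole network fine-tuned with backpropagation on labels.
model = keras.Sequential([keras.Input(shape=(64,))] + pretrained
                         + [layers.Dense(1, activation="sigmoid")])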

Networks for Deep Learning
Deep Belief Networks and Autoencoders employ layer-wise unsupervised learning to initialize each layer and capture multiple levels of representation simultaneously.
Hinton, G. E., Osindero, S., and Teh, Y. W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18:1527-1554.
Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H. (2007). Greedy Layer-Wise Training of Deep Networks. Advances in Neural Information Processing Systems 19.
Convolutional Neural Networks organize neurons based on the animal visual cortex, which allows for learning patterns at both the local level and the global level.
Y. LeCun, L. Bottou, Y. Bengio and P. Haffner (1998). Gradient-Based Learning Applied to Document Recognition. Proceedings of the IEEE, 86(11):2278-2324, November 1998.

Deep Belief Networks
A deep belief network (DBN) is a probabilistic, generative model made up of multiple layers of hidden units; each layer is a composition of simple learning modules. A DBN can be used to generatively pre-train a DNN by using the learned DBN weights as the initial DNN weights. Backpropagation or other discriminative algorithms can then be applied to fine-tune these weights.
Advantages: particularly helpful when limited training data are available; the pre-trained weights are closer to the optimal weights than randomly chosen initial weights are.
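For the generative pre-training itself, each pair of adjacent DBN layers is a restricted Boltzmann machine trained with contrastive divergence. Below is a rough sketch of a single CD-1 weight update (toy sizes, bias terms omitted for brevity); it illustrates the idea rather than reproducing Hinton et al.'s full procedure.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
n_visible, n_hidden, lr = 6, 3, 0.1
W = rng.normal(scale=0.01, size=(n_visible, n_hidden))

v0 = rng.integers(0, 2, size=n_visible).astype(float)   # one binary data vector

# Up: sample hidden units given the data.
p_h0 = sigmoid(v0 @ W)
h0 = (rng.random(n_hidden) < p_h0).astype(float)

# Down and up again: one step of reconstruction.
p_v1 = sigmoid(h0 @ W.T)
v1 = (rng.random(n_visible) < p_v1).astype(float)
p_h1 = sigmoid(v1 @ W)

# Move the weights toward the data statistics and away from the model's.
W += lr * (np.outer(v0, p_h0) - np.outer(v1, p_h1))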

Convolutional Neural Networks
Convolutional neural networks are inspired by the mammalian visual cortex. The visual cortex contains a complex arrangement of cells, which are sensitive to small sub-regions of the visual field, called receptive fields. These cells act as local filters over the input space and are well suited to exploit the strong spatially local correlation present in natural images. There are two basic cell types: simple cells respond maximally to specific edge-like patterns within their receptive field, while complex cells have larger receptive fields and are locally invariant to the exact position of the pattern.

Yann LeCun (56, born in Paris, now lives in NYC): LeNet image recognition; inventor of backpropagation methods for training and of convolutional neural nets; current director of Artificial Intelligence at Facebook.

Convolutional Neural Network for Image Classification

Representation of an Image as Pixels

Image Filter

The ReLU (Rectified Linear Unit) Operation

The Max Pooling Operation

Pooling Applied to Rectified Feature Maps

Pooling Applied to Rectified Feature Maps
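The slides above show these operations on images; as a stand-in for the lost figures, here is a small NumPy sketch of the same pipeline on a toy array. The filter, image size, and pooling window are illustrative choices.

import numpy as np

def convolve2d(image, kernel):
    # "Valid" 2-D convolution (strictly, cross-correlation, as in most CNN code).
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    # ReLU replaces every negative value in the feature map with zero.
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    # Non-overlapping max pooling: keep the largest value in each block.
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

image = np.random.rand(8, 8)                     # toy grayscale image
edge_filter = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])          # a simple vertical-edge filter

feature_map = max_pool(relu(convolve2d(image, edge_filter)))
print(feature_map.shape)                         # (3, 3)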

Training of a Convolutional Neural Net
Step 1: Initialize all filters and parameters/weights with random values.
Step 2: The network takes a training image as input, goes through the forward propagation step (convolution, ReLU and pooling operations, along with forward propagation in the fully connected layer) and finds the output probabilities for each class. Let's say the output probabilities for the boat image above are [0.2, 0.4, 0.1, 0.3]. Since the weights are randomly assigned for the first training example, the output probabilities are also random.
Step 3: Calculate the total error at the output layer, summed over all 4 classes: Total Error = sum of 1/2 (target probability - output probability)^2.
Step 4: Use backpropagation to calculate the gradients of the error with respect to all weights in the network, and use gradient descent to update all filter values/weights and parameter values to minimize the output error. The weights are adjusted in proportion to their contribution to the total error. When the same image is input again, the output probabilities might now be [0.1, 0.1, 0.7, 0.1], which is closer to the target vector [0, 0, 1, 0]. This means the network has learnt to classify this particular image correctly by adjusting its weights/filters so that the output error is reduced. Parameters such as the number of filters, the filter sizes, and the architecture of the network are all fixed before Step 1 and do not change during training; only the values of the filter matrices and connection weights get updated.
Step 5: Repeat Steps 2-4 with all images in the training set.
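One way to express Steps 1-5 with Keras. This is a sketch: the four-class setup echoes the boat example, but the architecture, data, and hyperparameters below are placeholder choices, and squared error is used only to mirror the slide's error formula.

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder data: 32x32 RGB images, each assigned one of 4 classes.
x_train = np.random.rand(100, 32, 32, 3).astype("float32")
y_train = keras.utils.to_categorical(np.random.randint(0, 4, 100), 4)

# Step 1: filters and weights are initialized randomly when the layers are built.
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(8, 3, activation="relu"),   # convolution + ReLU
    layers.MaxPooling2D(2),                   # pooling
    layers.Conv2D(16, 3, activation="relu"),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(32, activation="relu"),      # fully connected layer
    layers.Dense(4, activation="softmax"),    # output probabilities per class
])

# Steps 2-4: forward pass, error at the output, backpropagation of gradients,
# and gradient-descent weight updates are handled by compile()/fit().
model.compile(optimizer="sgd", loss="mse")    # squared error, as on the slide
# Step 5: repeat over all images in the training set (here, for a few epochs).
model.fit(x_train, y_train, epochs=3, verbose=0)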

Convolutional Neural Nets: Putting It All Together

TensorFlow is an open-source library for machine learning tasks developed by Google and first released in November 2015. It is a second-generation system for machine learning, based on deep learning neural networks. RankBrain now handles a large number of Google searches and is powered by TensorFlow. TensorFlow calculations are generally expressed as stateful dataflow graphs; the name TensorFlow refers to the multidimensional data arrays (tensors) that flow through these graphs.
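A minimal illustration of expressing a calculation as a graph of operations on tensors; in current TensorFlow the graph is built by tracing Python code with tf.function, and the numbers below simply reuse the earlier neuron example.

import tensorflow as tf

@tf.function            # traces the Python code into a TensorFlow dataflow graph
def weighted_sum(x, w, b):
    return tf.tensordot(w, x, axes=1) + b

x = tf.constant([2.7, -8.6, 0.002])
w = tf.Variable([-0.06, -2.5, 1.4])   # stateful: variables persist between calls
b = tf.Variable(0.0)

print(weighted_sum(x, w, b).numpy())  # ~21.34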

DeepDream - Convolutional Neural Network Original Image After 10 Iterations of DeepDream

Three Men in a Pool (DeepDream)

In Nature, 27 January 2016: DeepMind's program AlphaGo beat Fan Hui, the European Go champion, five times out of five in tournament conditions... AlphaGo was not preprogrammed to play Go: rather, it learned using a general-purpose algorithm that allowed it to interpret the game's patterns. The AlphaGo program applied deep learning in neural networks (convolutional NNs): brain-inspired programs in which connections between layers of simulated neurons are strengthened through examples and experience.

Predicted C7H10O2 Isomerization Enthalpies JCTC, 11, 2087-2096 (2015)

JCTC, 11, 3225-3233 (2015)

JCTC, Vol 13, (2017)

Solvation via FEP and MDFP+ Machine Learning JCTC, Vol 13 (2017)

Atomic Forces from Machine Learning

Newton-in-a-Box
Molecular system -> MD (F = ma) -> trajectory. Deep Learning??