Forget Catastrophic Forgetting: AI That Learns After Deployment
Anatoly Gorshechnikov, CTO, Neurala
Neurala at a Glance
- Programming neural networks on GPUs since circa 2 B.C.
- Founded in 2006, expecting that there would be a programmable GPU in every cell phone
- Concentrating on edge-based, cloud-indifferent AI
- Actively growing
- Experienced in building and commercializing AI
Example: Inspections with Drones (built with the Brains for Bots SDK)
- Use case: inspection of telecommunication towers, solar panels, roofs, and power lines
- Problem: what if tomorrow we need to add new data to the set?
Theory of Catastrophic Forgetting at a Glance
- Distributed representation: very many neurons are involved in the correct classification of every object
- Very many weights are important for each classification
- Learning new objects perturbs all of those weights, hence the forgetting (a toy demonstration follows)
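To make the mechanism concrete, here is a toy sketch (assuming nothing beyond NumPy; all names are illustrative) that trains a single logistic unit on task A, then on task B, and watches task A performance collapse because both tasks must share the same weights:

import numpy as np

rng = np.random.default_rng(0)

def make_task(axis, n=400):
    """Toy binary task: the label depends only on the sign of one coordinate."""
    X = rng.normal(size=(n, 2))
    y = (X[:, axis] > 0).astype(float)
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(w, X, y, lr=0.5, epochs=300):
    """Plain gradient descent on the logistic loss, updating the shared weights."""
    for _ in range(epochs):
        w -= lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

def accuracy(w, X, y):
    return float(((sigmoid(X @ w) > 0.5) == (y > 0.5)).mean())

Xa, ya = make_task(axis=0)  # task A: label = sign of x0
Xb, yb = make_task(axis=1)  # task B: label = sign of x1

w = np.zeros(2)
w = train(w, Xa, ya)
print("task A accuracy after learning A:", accuracy(w, Xa, ya))  # near 1.0
w = train(w, Xb, yb)
print("task A accuracy after learning B:", accuracy(w, Xa, ya))  # falls toward chance

Gradient descent on task B overwrites exactly the weights that mattered for task A; in a deep network the same thing happens across millions of shared weights.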
Brute-Force Solution
- Combine the old and new datasets and retrain the network
- General issues: requires a powerful server, and takes a lot of time
Many More Practical Client-Specific Issues
- Clients don't like to share their data
- Clients definitely do not want us to keep their data after training is done
- How can we combine old and new datasets under these conditions?
- A client wants their toy to recognize their other toys: easy, handled by factory pretraining
- The client also wants the toy to recognize its owner: need to train after deployment
- Privacy laws do not allow uploading a kid's images to the cloud
How Can We
- Add new data to existing DNN knowledge without forgetting the old data?
- Do this without powerful servers and cloud access?
- Do this within seconds rather than hours?
Ignorance Is Bliss
- We age, advance in our career paths, and quit reviewing papers
- The new generation does not pay attention to papers from 20 years ago
- Many ways to alleviate catastrophic forgetting were already discussed in the mid-90s
How Psychology Took Insight from Computer Science
- The brain has multiple neural networks with different properties and functions
- What is needed is an interacting system of short-term and long-term memory
Deeper into Neurobiology
- Hasselmo (1999), "Neuromodulation: acetylcholine and memory consolidation"
- The brain switches between learning and recall modes by regulating ACh levels
- High levels of ACh suppress feedback and enhance feedforward processing
- Low levels do the opposite
One of the Most Recent Solutions
- Kirkpatrick et al. (2017), "Overcoming catastrophic forgetting in neural networks"
- Protects individual network parameters, such as synaptic weights, by evaluating their importance for prior learning
- Supported by neurophysiology (see Hasselmo, 2017 for references)
- Can be complementary to our solution (the core idea is sketched below)
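The core of that solution, elastic weight consolidation (EWC), fits in a few lines. Below is a minimal sketch, assuming a saved copy of the task-A weights (theta_A), a diagonal Fisher importance estimate, and a penalty strength lam; the function names are illustrative, not from the paper's code:

import numpy as np

def fisher_diag_estimate(per_sample_grads):
    """Diagonal Fisher information, estimated as the mean squared per-sample
    gradient of the log-likelihood on task-A data after training on A."""
    return np.mean(np.square(per_sample_grads), axis=0)

def ewc_penalty_grad(theta, theta_A, fisher_diag, lam):
    """Gradient of the EWC penalty (lam / 2) * sum_i F_i * (theta_i - theta_A_i)**2:
    weights that were important for task A (large F_i) are pulled back toward
    their task-A values, while unimportant weights remain free to move."""
    return lam * fisher_diag * (theta - theta_A)

# Inside the task-B training loop one would then use:
#   theta -= lr * (grad_task_B(theta) + ewc_penalty_grad(theta, theta_A, F, lam))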
Flashback from GTC 2016
[Figure: computational complexity of neurons on GPU for network sizes from 1188 to 42768 neurons]
[Figure: amount of communication between neurons on GPU for the same network sizes]
No Need to Restrict Ourselves to Simple Models
Design a fast-learning system based on what we know about the hippocampus (a minimal sketch follows this list):
- High degree of recurrent projections (auto- or heteroassociative NN)
- Hebbian-like learning
- Alternation between learning and recall modes
- High learning rates in learning mode
- Local inhibition in learning mode, removed during recall mode
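A minimal sketch of such a fast learner, assuming only NumPy: Hebbian outer-product learning between a feature vector and a class unit, a one-shot learning rate, a winner-take-all readout standing in for local inhibition, and an explicit mode switch standing in for the ACh modulation above. This illustrates the recipe, not Neurala's actual code:

import numpy as np

class FastAssociativeMemory:
    """Heteroassociative NN: class units are driven by feature inputs through
    weights W, learned Hebbian-style in a single exposure."""

    def __init__(self, feat_dim, n_classes, lr=1.0):
        self.W = np.zeros((n_classes, feat_dim))
        self.lr = lr            # high learning rate: one exposure is enough
        self.mode = "recall"    # low-ACh-like default: plasticity off

    def learn(self, feat, class_id):
        """Learning mode (high-ACh analogue): plasticity on, and only one class
        unit is active at a time, playing the local-inhibition role."""
        self.mode = "learning"
        f = feat / (np.linalg.norm(feat) + 1e-8)
        self.W[class_id] += self.lr * f   # Hebbian update: pre * post
        self.mode = "recall"

    def recall(self, feat):
        """Recall mode: pure feedforward pass plus winner-take-all readout."""
        f = feat / (np.linalg.norm(feat) + 1e-8)
        return int(np.argmax(self.W @ f))

Because learning a new object touches only that object's row of W, new knowledge cannot perturb the weights that encode old knowledge.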
Neurala's Solution for Fast Learning (Recognition)
- For recognition learning, we have had a solution for some time
- Take an existing recognition network (AlexNet trained on ImageNet)
- Surgically insert our "hippocampus": a fast-learning associative NN (see the sketch after this list)
- The resulting architecture learns fast and can run even without GPUs
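How the pieces could fit together for recognition, reusing the FastAssociativeMemory sketch above: the pretrained network is frozen and serves purely as a feature extractor, and only the associative head learns after deployment. Here backbone is a hypothetical stand-in for the truncated AlexNet (e.g., its 4096-dimensional fc7 features), and the class count is arbitrary; this is a sketch, not Neurala's implementation:

# 'backbone' is a hypothetical frozen feature extractor (truncated AlexNet).
mem = FastAssociativeMemory(feat_dim=4096, n_classes=100)

def add_object(image, class_id):
    # Seconds per object: one forward pass plus one Hebbian update.
    # No backprop, no server, no access to the old training data.
    mem.learn(backbone(image), class_id)

def recognize(image):
    return mem.recall(backbone(image))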
Neurala's Solution for Fast Learning (Detection)
- For detection, we show our solution here for the first time
- Take an existing detection network (YOLO)
- Surgically insert an extended version of the "hippocampus": it needs to detect location, not just class (see the sketch after this list)
- Add a tracker for the learning process
- Needs a GPU to run smoothly (dev code on TX1, ~5 fps)
- Needs about 10-20 s of training per new object
- Currently distance-sensitive
- Shown at booth 522
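One plausible reading of the detection variant, hedged the same way (a sketch under assumptions, not the architecture shown in the booth): keep the detector's class-agnostic localization and route per-box features through the same fast associative head, so adding an object class costs seconds instead of a full YOLO retrain. yolo_boxes_and_features is a hypothetical helper:

def detect(image):
    # Hypothetical helper: returns class-agnostic boxes plus one feature
    # vector per box, taken from the detector's intermediate activations.
    boxes, box_feats = yolo_boxes_and_features(image)
    # Classify each box with the fast-learning head from the sketches above.
    return [(box, mem.recall(f)) for box, f in zip(boxes, box_feats)]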
Future Steps
- Integrate the Kirkpatrick et al. (2017) solution for sleep consolidation
- Switch from bounding boxes to segmentation
- Add the ability to selectively forget bad data
- Add the ability to share new knowledge between agents directly
Ultimate Goal
Questions?
Neurala Inc.
8 St. Mary's Street, Suite 613
Boston, MA 02215
info@neurala.com
sales@neurala.com
tel. +1.671.418.6161