Forget catastrophic forgetting: AI that learns after deployment

Forget catastrophic forgetting: AI that learns after deployment
Anatoly Gorshechnikov, CTO, Neurala

Neurala at a glance
Programming neural networks on GPUs since circa 2 B.C.
Founded in 2006, expecting that there would be a programmable GPU in every cell phone
Concentrating on edge-based, cloud-indifferent AI
Actively growing
Experience in building and commercializing AI

Built with Brains For Bots SDK
Example: inspections with drones
Use case: inspection of telecommunication towers, solar panels, roofs, power lines
Problem: what if tomorrow we need to add new data to the set?

Theory of Catastrophic Forgetting at a glance
Distributed representation: very many neurons are involved in the correct classification of every object
Very many weights are important for each classification
Learning new objects perturbs all of those weights, hence the forgetting
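To make the effect concrete, here is a minimal, hypothetical NumPy sketch (not Neurala's code): a linear map is trained to store one set of input-output associations ("old objects"), then trained only on a second set ("new objects"). Because both sets must share the same weights, the second phase visibly degrades recall of the first.

```python
# Toy demonstration of catastrophic forgetting in a linear associative map
# (hypothetical NumPy example, not Neurala's code).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_per_task = 20, 10, 15   # 30 associations, 20 input dims: weights must be shared

def make_task():
    """Random input patterns and the target patterns they should map to."""
    return rng.standard_normal((n_per_task, d_in)), rng.standard_normal((n_per_task, d_out))

def train(W, X, Y, epochs=3000, lr=0.2):
    """Full-batch gradient descent on mean squared recall error."""
    for _ in range(epochs):
        W -= lr * X.T @ (X @ W - Y) / len(X)
    return W

def recall_error(W, X, Y):
    return float(np.mean((X @ W - Y) ** 2))

XA, YA = make_task()                    # "old" objects
XB, YB = make_task()                    # "new" objects

W = np.zeros((d_in, d_out))
W = train(W, XA, YA)
print("task A error after learning A:", recall_error(W, XA, YA))   # near zero

W = train(W, XB, YB)                    # keep learning, but only on the new data
print("task A error after learning B:", recall_error(W, XA, YA))   # clearly larger: forgetting
print("task B error after learning B:", recall_error(W, XB, YB))   # near zero
```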

Brute Force Solution
Combine the old and the new datasets and retrain the network
General issues: requires a powerful server and takes a lot of time
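In the same toy setting as the sketch above, the brute-force fix is literally a concatenation followed by retraining from scratch: it recovers a compromise that serves both tasks, but for a real DNN it means keeping all accumulated data and paying for a full retraining run on a powerful server.

```python
# Brute-force fix in the same toy setting: pool old and new data, retrain everything
# (hypothetical NumPy example, not Neurala's code).
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n_per_task = 20, 10, 15

XA, YA = rng.standard_normal((n_per_task, d_in)), rng.standard_normal((n_per_task, d_out))
XB, YB = rng.standard_normal((n_per_task, d_in)), rng.standard_normal((n_per_task, d_out))

X_all, Y_all = np.concatenate([XA, XB]), np.concatenate([YA, YB])   # requires all of the old data

W = np.zeros((d_in, d_out))
for _ in range(3000):                                   # full retraining pass over everything
    W -= 0.2 * X_all.T @ (X_all @ W - Y_all) / len(X_all)

print("task A error:", float(np.mean((XA @ W - YA) ** 2)))
print("task B error:", float(np.mean((XB @ W - YB) ** 2)))
```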

Many More Practical, Client-Specific Issues
Clients don't like to share their data
Clients definitely do not want us to keep their data after training is done
How can we combine old and new data sets under these conditions?
Client wants their toy to recognize their other toys: easy, factory pretraining
Client also wants their toy to recognize its owner: need to train after deployment
Privacy laws do not allow uploading of kids' images to the cloud

How can we:
Add new data to existing DNN knowledge without forgetting the old data?
Do this without powerful servers and cloud access?
Do this within seconds rather than hours?

Ignorance Is Bliss
We age, advance in our career paths, and quit reviewing papers
The new generation does not pay attention to papers from 20 years ago
A lot of ways to alleviate catastrophic forgetting were already discussed in the mid-'90s

How Psych Took Insight from Computer Science
The brain has multiple neural networks with different properties and functions
Need an interacting system of short- and long-term memory

Deeper in Neurobiology
Hasselmo (1999), Neuromodulation: acetylcholine and memory consolidation
The brain switches between learning and recall modes by regulating ACh levels
High levels of ACh suppress feedback and enhance feedforward processing
Low levels do the opposite
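As a purely illustrative sketch of this account (my reading of Hasselmo's idea, not code from the talk), the ACh level can be modeled as a scalar gain that trades feedforward drive against recurrent feedback in a single layer update: high "ACh" lets the layer be driven by its inputs (good for encoding new patterns), low "ACh" lets stored recurrent associations dominate (good for pattern completion and recall).

```python
# Illustrative gating of feedforward vs. feedback drive by a scalar "ACh" level
# (hypothetical sketch of the Hasselmo-style idea, not code from the talk).
import numpy as np

def layer_update(ff_input, state, W_rec, ach):
    """One update step of a recurrent layer.

    ach in [0, 1]: high ACh emphasizes feedforward input (learning mode),
    low ACh emphasizes recurrent feedback (recall mode).
    """
    feedforward = ff_input
    feedback = W_rec @ state
    drive = ach * feedforward + (1.0 - ach) * feedback
    return np.tanh(drive)

rng = np.random.default_rng(0)
n = 50
W_rec = rng.standard_normal((n, n)) / np.sqrt(n)   # stand-in recurrent weights
x = rng.standard_normal(n)                          # stand-in sensory input
state = rng.standard_normal(n)                      # current activity (partly recalled pattern)

learning_state = layer_update(x, state, W_rec, ach=0.9)   # input-dominated
recall_state   = layer_update(x, state, W_rec, ach=0.1)   # feedback-dominated
```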

One of the Most Recent Solutions
Kirkpatrick et al. (2017), Overcoming catastrophic forgetting in neural networks
Protects individual network parameters, such as synaptic weights, by evaluating their importance for prior learning
Supported by neurophysiology (see Hasselmo, 2017 for references)
Can be complementary to our solution
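For orientation, here is a minimal sketch of the Kirkpatrick et al. (2017) idea (elastic weight consolidation) on a generic loss: after task A, each parameter's importance is estimated from its squared gradients (a diagonal Fisher approximation), and task B is then trained with a quadratic penalty that holds important parameters near their task-A values. The function names (grad_fn, data_a, data_b) are placeholders; this is a generic illustration, not Neurala's integration of the method.

```python
# Minimal sketch of elastic weight consolidation (EWC), Kirkpatrick et al. (2017).
# Generic NumPy illustration on a user-supplied per-example loss gradient.
import numpy as np

def diagonal_fisher(grad_fn, theta, data_a):
    """Per-parameter importance: mean squared gradient of the loss on task A."""
    F = np.zeros_like(theta)
    for example in data_a:
        g = grad_fn(theta, example)
        F += g * g
    return F / len(data_a)

def ewc_grad(grad_fn, theta, example_b, theta_a, fisher, lam):
    """Gradient of: task-B loss + (lam / 2) * sum_i F_i * (theta_i - theta_a_i)^2."""
    return grad_fn(theta, example_b) + lam * fisher * (theta - theta_a)

def train_task_b(grad_fn, theta_a, fisher, data_b, lam=100.0, lr=0.01, epochs=50):
    """SGD on task B while elastically anchoring important parameters to task A."""
    theta = theta_a.copy()
    for _ in range(epochs):
        for example in data_b:
            theta -= lr * ewc_grad(grad_fn, theta, example, theta_a, fisher, lam)
    return theta
```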

Flashback from GTC 2016
[Two charts: computational complexity of neurons on GPU for different network sizes, and amount of communication between neurons on GPU, for networks of 1,188 to 42,768 neurons]

No Need to Restrict Ourselves to Simple Models
Design a fast-learning system based on what we know about the hippocampus:
High degree of recurrent projections (auto- or heteroassociative NN)
Hebbian-like learning
Alternation between learning and recall modes
High learning rates in learning mode
Local inhibition present in learning mode is removed during recall mode
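Below is a minimal sketch in the spirit of that list, not the actual Neurala module: a Hebbian associative layer that switches between a learning mode (one-shot outer-product updates with a high learning rate) and a recall mode (feature vector in, best-matching stored class out).

```python
# Minimal Hebbian associative layer with explicit learning / recall modes
# (illustrative sketch of the hippocampus-style idea, not Neurala's module).
import numpy as np

class HebbianAssociativeLayer:
    def __init__(self, n_features, n_classes, lr=1.0):
        self.W = np.zeros((n_classes, n_features))   # class-by-feature associations
        self.lr = lr                                 # high rate: one-shot learning

    def learn(self, feature, label):
        """Learning mode: Hebbian outer-product update for one labeled example."""
        f = feature / (np.linalg.norm(feature) + 1e-8)
        self.W[label] += self.lr * f

    def recall(self, feature):
        """Recall mode: return the class whose stored association matches best."""
        f = feature / (np.linalg.norm(feature) + 1e-8)
        scores = self.W @ f
        return int(np.argmax(scores)), scores

# Usage with stand-in feature vectors (in practice these would come from a DNN):
rng = np.random.default_rng(0)
layer = HebbianAssociativeLayer(n_features=128, n_classes=10)
prototype = rng.standard_normal(128)
layer.learn(prototype, label=3)                                   # one example is enough
predicted, _ = layer.recall(prototype + 0.1 * rng.standard_normal(128))
print(predicted)                                                  # expected: 3
```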

Neurala's Solution for Fast Learning (Recognition)
For recognition learning we have had a solution for some time:
Take an existing recognition network (AlexNet trained on ImageNet)
Surgically insert our "hippocampus": a fast-learning associative NN
The resulting architecture learns fast and can run even without GPUs
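A rough approximation of that recipe, assuming torchvision's pretrained AlexNet (torchvision >= 0.13 weights API) and a simple prototype-style head standing in for the proprietary fast-learning network: the convolutional backbone stays frozen and only the lightweight associative head is updated, which is why new objects can be added in seconds on a CPU.

```python
# Sketch of "frozen DNN backbone + fast associative head" using torchvision's
# pretrained AlexNet. The head here is a simple Hebbian/prototype classifier;
# Neurala's actual fast-learning network is not public, so this is illustrative.
import torch
import torchvision

backbone = torchvision.models.alexnet(weights="IMAGENET1K_V1")
backbone.classifier = torch.nn.Identity()   # keep conv features, drop the FC head
backbone.eval()                             # frozen: no gradient-based retraining

@torch.no_grad()
def features(images):
    """images: float tensor of shape (N, 3, 224, 224), ImageNet-normalized."""
    f = backbone(images)                    # (N, 9216) feature vectors
    return torch.nn.functional.normalize(f, dim=1)

class FastHead:
    """One-shot associative head: one accumulated prototype row per class."""
    def __init__(self, n_classes, n_features=9216):
        self.W = torch.zeros(n_classes, n_features)

    @torch.no_grad()
    def learn(self, images, labels):        # seconds, CPU-friendly
        for vec, lab in zip(features(images), labels):
            self.W[lab] += vec              # Hebbian-style accumulation

    @torch.no_grad()
    def predict(self, images):
        return (features(images) @ self.W.t()).argmax(dim=1)
```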

Neurala's Solution for Fast Learning (Detection)
For detection we show our solution here for the first time:
Take an existing detection network (YOLO)
Surgically insert an extended version of the "hippocampus": it needs to detect location, not just class
Needs a GPU to run smoothly (dev code on TX1, ~5 fps)
Needs about 10-20 s of training per new object
Currently distance-sensitive
Shown in booth 522
Add a tracker for the learning process

Future Steps
Integrate with the Kirkpatrick et al. (2017) solution for sleep consolidation
Switch to segmentation from bounding boxes
Add the ability to selectively forget bad data
Add the ability to share new knowledge between agents directly

Ultimate Goal

Questions?
Neurala Inc.
8 St. Mary's Street, Suite 613
Boston, MA 02215
info@neurala.com
sales@neurala.com
tel. +1.671.418.6161