Modelling Time Series Data with Theano. Charles Killam, LP.D. Certified Instructor, NVIDIA Deep Learning Institute NVIDIA Corporation



DEEP LEARNING INSTITUTE
DLI Mission: Helping people solve challenging problems using AI and deep learning.
- Developers, data scientists and engineers
- Self-driving cars, healthcare and robotics
- Training, optimizing, and deploying deep neural networks

TOPICS
- Lab Perspective
- RNNs / LSTMs
- Keras / Theano
- Pandas / Numpy / Matplotlib
- Lab Discussion / Overview
- Launching the Lab Environment
- Lab Review

LAB PERSPECTIVE

PURPOSE / GOAL
- Predict severity of illness in patients based on information found in electronic health records (EHRs)
- Provide feedback to clinicians who are assessing the impact of treatment decisions, or raise early warning signs

WHAT THIS LAB IS
- Discussion of the tools, techniques and processes commonly used to build RNN / LSTM networks to evaluate EHRs
- Introduction to aspects of RNNs, LSTMs, Keras, Theano, Pandas, Numpy and Matplotlib
- Guided, hands-on exercise using the tools noted above to build an LSTM network to evaluate EHRs

WHAT THIS LAB IS NOT
- Introduction to machine learning from first principles
- Explanation of electronic health records
- Rigorous mathematical formalism of neural networks
- Survey of all the features and options of Keras / Theano

ASSUMPTIONS
You are familiar with:
- Concept of electronic health records
- Basics of neural networks
- Basics of Pandas, Numpy and Matplotlib
Helpful to have:
- Familiarity with recurrent neural networks (RNNs)

TAKEAWAYS
- Ability to set up your own recurrent neural network workflow using Keras / Theano and adapt it to your use case
- Know where to go for more info on RNNs, Keras and Theano
- Familiarity with the data preparation process using Pandas, Numpy and Keras

RNN / LSTM

RECURRENT NEURAL NETWORK
- RNN = Recurrent Neural Network
- Similar to a traditional feed-forward network
- RNNs also feed the previous output state back in as input
- Limited to looking back only a few steps due to the vanishing gradient
- Errors are backpropagated through time
- Inputs from previous time steps get exponentially down-weighted and are eventually driven to zero
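The exponential down-weighting described above can be illustrated with a toy example. This is only a sketch under stated assumptions: a scalar tanh RNN with a hypothetical recurrent weight `w` and random inputs, not the network used in the lab. By the chain rule, each step back in time multiplies the gradient by `w * (1 - h_t**2)`, so with `|w| < 1` the influence of early inputs shrinks geometrically.

```python
import numpy as np


def gradient_through_time(w, inputs):
    """Return |d h_t / d h_0| for t = 1..T in a scalar tanh RNN.

    The recurrence is h_t = tanh(w * h_{t-1} + x_t); by the chain rule each
    step multiplies the gradient by w * (1 - h_t**2).
    """
    h = 0.0
    grad = 1.0
    magnitudes = []
    for x in inputs:
        h = np.tanh(w * h + x)
        grad *= w * (1.0 - h ** 2)
        magnitudes.append(abs(grad))
    return np.array(magnitudes)


rng = np.random.default_rng(0)
inputs = rng.normal(size=50)  # hypothetical input sequence
magnitudes = gradient_through_time(w=0.5, inputs=inputs)

# Gradient magnitude after 1, 10 and 50 steps: it never grows, and after
# 50 steps it is bounded above by 0.5 ** 50.
print(magnitudes[[0, 9, 49]])
```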

RNN

LONG SHORT TERM MEMORY
- LSTM = Long Short Term Memory
- Variant of RNN
- Designed to avoid the vanishing gradient problem
- LSTMs can learn very deep tasks that require memories of events that happened thousands or even millions of discrete time steps ago
- At each time step a measurement is recorded and used as input into the LSTM to yield a probability of survival prediction
- Enables real-time monitoring of the patient's probability of survival and insight into the patient's trajectory
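To make the per-time-step prediction concrete, here is a minimal NumPy sketch of a standard LSTM cell followed by a sigmoid output. Everything in it is an assumption for illustration: the sizes, the random (untrained) weights, and the gate ordering are hypothetical, and the lab itself builds the network with Keras rather than by hand. The point is the additive cell-state update `c = f * c_prev + i * g`, which is what lets information persist across many steps.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4 * units, n_features), U: (4 * units, units), b: (4 * units,)
    Gate order in the stacked weights (an arbitrary choice made here):
    input, forget, candidate, output.
    """
    units = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0 * units:1 * units])   # input gate
    f = sigmoid(z[1 * units:2 * units])   # forget gate
    g = np.tanh(z[2 * units:3 * units])   # candidate cell values
    o = sigmoid(z[3 * units:4 * units])   # output gate
    c = f * c_prev + i * g                # additive cell-state update
    h = o * np.tanh(c)
    return h, c


rng = np.random.default_rng(0)
n_features, units, n_steps = 3, 4, 5      # hypothetical sizes
W = rng.normal(scale=0.1, size=(4 * units, n_features))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)
w_out, b_out = rng.normal(scale=0.1, size=units), 0.0

h, c = np.zeros(units), np.zeros(units)
predictions = []
for x in rng.normal(size=(n_steps, n_features)):  # one measurement per step
    h, c = lstm_step(x, h, c, W, U, b)
    predictions.append(sigmoid(w_out @ h + b_out))  # value in (0, 1)
predictions = np.array(predictions)
print(predictions)
```

With untrained weights the outputs are meaningless; the sketch only shows that one probability-like value is produced at every time step.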

KERAS RNN

KERAS / THEANO

KERAS
- Modular neural network library written in Python
- Runs on TensorFlow and Theano
- Theano excels at RNNs / LSTMs
- Keras library allows for easy and fast prototyping
- Runs on GPUs and CPUs
- Compatible with Python 2.7-3.5

THEANO
- Theano excels at RNNs in general and LSTMs in particular
- Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently (http://deeplearning.net/software/theano/)
- Runs on either GPU or CPU architectures

PANDAS / NUMPY / MATPLOTLIB

PANDAS
- Used in academia and commercial domains
- Open-source, BSD-licensed project
- Fast and efficient DataFrame object for data manipulation with integrated indexing
- Contains tools for reading and writing data between in-memory data structures and different formats such as:
  - CSV and text files
  - Microsoft Excel
  - SQL databases
  - HDF5
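As a small illustration of the read/write tooling and integrated indexing, the sketch below round-trips a DataFrame through CSV using an in-memory buffer. The column names and values are hypothetical toy data, not taken from the lab dataset; CSV is used here only because it needs no extra dependencies.

```python
import io

import pandas as pd

# Hypothetical toy measurements; not taken from the lab dataset.
frame = pd.DataFrame(
    {
        "encounter_id": [1, 1, 2],
        "heart_rate": [120.0, 118.0, 95.0],
        "glucose": [5.4, None, 6.1],
    }
)

# Write to an in-memory CSV buffer, then read it back.
buffer = io.StringIO()
frame.to_csv(buffer, index=False)
buffer.seek(0)
restored = pd.read_csv(buffer)

# Index by encounter to select all rows of one patient encounter.
indexed = restored.set_index("encounter_id")
print(indexed.loc[1])
```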

NUMPY
- NumPy is a Python scientific computing package
- Open-source software
- Includes:
  - Support for large, N-dimensional arrays and matrices
  - Collection of high-level mathematical functions

MATPLOTLIB
- Matplotlib is a Python 2D plotting library producing publication-quality figures
- Matplotlib can be used in:
  - Python scripts
  - Python and IPython shell
  - Jupyter notebook
  - Web application servers
- Supports Python versions 2.7-3.5

LAB DISCUSSION / OVERVIEW

DATA
- Electronic health records (EHRs)
- Contains medical treatments and histories of patients over time
- 15 years of data
- Data provided by PICU at Children's Hospital Los Angeles
- 76,693 observations across 5,000+ unique patient encounters
- Data is an irregular time series of measurements taken over the course of a patient's stay in the ICU

DATA
Measurements include:
- Statistics - gender, age, weight
- Vitals - heart rate, respiratory rate
- Labs - glucose, creatinine
- Interventions - intubation, O2
- Drugs - dopamine, epinephrine

DATA
- Not all measurements were taken for all patients
- Dependent variable: Alive = 1, Not alive = 0
- 1,113,529 rows containing 265 independent variables
- Mean observations per patient encounter = 223
- Median observations per patient encounter = 94
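A mean well above the median, as in the figures above, suggests that a minority of long stays contributes a large share of the observations. The sketch below shows how such per-encounter counts could be computed with Pandas; the `encounter_id` column name and the toy data are hypothetical.

```python
import pandas as pd

# Hypothetical encounter IDs: one long stay and three short ones.
observations = pd.DataFrame(
    {"encounter_id": [1] * 10 + [2] * 2 + [3] * 3 + [4] * 1}
)

# Number of observation rows recorded for each encounter.
per_encounter = observations.groupby("encounter_id").size()
mean_obs = per_encounter.mean()
median_obs = per_encounter.median()

# The single long stay pulls the mean above the median.
print(mean_obs, median_obs)
```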

DATA
Hierarchical Data Format (HDF) 5
- Stores and organizes large amounts of scientific data
- Designed by National Center for Supercomputing Applications
- API supports most languages
- Libraries compatible with Windows, OS X and Linux
- Binary format
- Not human readable
- Efficient in storage size
- Scales well to very large operational projects

LAB PROCESS
1. Setup
   a. Configure Theano options
   b. Import Numpy, Pandas and Matplotlib
   c. Define folders which contain training / testing datasets
   d. Load data using Pandas API
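The setup steps could look roughly like the sketch below. It is not the lab's code: the option values, folder and file names are hypothetical placeholders. It relies on two things that are standard: Theano reads its configuration from the `THEANO_FLAGS` environment variable, which has to be set before Theano is first imported, and `pandas.read_hdf` loads HDF5 files (it needs the PyTables package, so the call is left commented out here).

```python
import os

# Assumption: CPU execution with 32-bit floats. THEANO_FLAGS must be set
# before Theano is imported for the first time, or it has no effect.
os.environ["THEANO_FLAGS"] = "device=cpu,floatX=float32"

import numpy as np
import pandas as pd

# Hypothetical folder and file names; the lab's real paths are not given.
data_dir = os.path.join("data", "ehr")
train_path = os.path.join(data_dir, "train.h5")
test_path = os.path.join(data_dir, "test.h5")

# Loading would then be a single call per file (requires PyTables):
# train = pd.read_hdf(train_path)

print(np.__version__, pd.__version__)
print(train_path, test_path)
```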

LAB PROCESS
2. Data Preparation
   a. Data review
   b. Data normalization
   c. Filling data gaps
   d. Data sequencing

LAB PROCESS
3. Architect LSTM network using Keras and Theano
4. Build the model (feed data into network for training)
5. Evaluate model using validation (test) data
6. Visualize results
7. Compare baseline to PRISM3 and PIM2

LAB ENVIRONMENT

NAVIGATING TO QWIKLABS
1. Navigate to: https://nvlabs.qwiklab.com
2. Log in or create a new account

ACCESSING LAB ENVIRONMENT
- Click on "Modelling Complex Data Sequences with Theano"
- Then click on "Select"

ACCESSING LAB INSTRUCTIONS
1. Click "Start Lab" to create an instance of the lab environment
2. Once the lab environment starts, click here to access lab instructions (Jupyter notebook)

ACCESSING LAB INSTRUCTIONS
- You should see the Jupyter notebook
- Place the cursor in a code block and click the execute button

LAB REVIEW

LAB REVIEW
1. Setup
   a. Configure Theano options
   b. Import Numpy, Pandas and Matplotlib
   c. Define folders which contain training / testing datasets
   d. Load data using Pandas API

LAB REVIEW - IMPORT LIBRARIES #1B

LAB REVIEW - DEFINE PATHS #1C

LAB REVIEW - LOAD DATA #1D

LAB REVIEW
2. Data Preparation
   a. Data review
   b. Data normalization
   c. Filling data gaps
   d. Data sequencing

LAB REVIEW - DATA REVIEW #2A

LAB REVIEW - DATA NORMALIZATION #2B
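The slides do not say which normalization the lab applies. One common choice, assumed here purely for illustration, is z-score scaling: subtract each column's mean and divide by its standard deviation, with both statistics computed on the training set only and then reused for the test set. Column names and values below are hypothetical.

```python
import pandas as pd

# Hypothetical toy values; column names are illustrative only.
train = pd.DataFrame(
    {"heart_rate": [100.0, 120.0, 140.0], "glucose": [4.0, 5.0, 9.0]}
)
test = pd.DataFrame({"heart_rate": [110.0], "glucose": [6.0]})

# Statistics come from the training set only, then are reused for test data.
mean = train.mean()
std = train.std()  # pandas default: sample standard deviation (ddof=1)

train_norm = (train - mean) / std
test_norm = (test - mean) / std
print(train_norm)
print(test_norm)
```

Reusing the training statistics keeps information about the test set out of the preprocessing step.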

LAB REVIEW - DATA GAPS #2C
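Because not every measurement is taken at every observation, gaps have to be filled before the data reaches the network. The exact strategy used in the lab is not given in the slides; the sketch below assumes a common one: carry the last known value forward within each encounter, then fall back to a constant for values with no earlier measurement. The `encounter_id` column and the data are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical irregular measurements for two encounters.
records = pd.DataFrame(
    {
        "encounter_id": [1, 1, 1, 2, 2],
        "heart_rate": [120.0, np.nan, 115.0, np.nan, 90.0],
    }
)

# Forward-fill within each encounter so values never leak across patients.
filled = records.copy()
filled["heart_rate"] = records.groupby("encounter_id")["heart_rate"].ffill()

# Anything still missing had no earlier measurement; fall back to 0.0, which
# equals the column mean if the data were z-score normalized beforehand.
filled["heart_rate"] = filled["heart_rate"].fillna(0.0)
print(filled)
```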

LAB REVIEW - DATA SEQUENCING #2D
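Recurrent layers typically expect a 3-D array of shape (encounters, time steps, features), but encounters differ in length. A minimal NumPy sketch of one way to build such an array is shown below; the helper name `pad_encounters`, the padding value and the toy shapes are assumptions for illustration. Keras also provides a `pad_sequences` utility for a similar purpose.

```python
import numpy as np


def pad_encounters(encounters, max_len, pad_value=0.0):
    """Stack variable-length (steps, features) arrays into one 3-D array.

    Shorter encounters are padded at the end; longer ones keep only their
    first max_len steps. Also returns a boolean mask of real time steps.
    """
    n_features = encounters[0].shape[1]
    batch = np.full((len(encounters), max_len, n_features), pad_value)
    mask = np.zeros((len(encounters), max_len), dtype=bool)
    for row, seq in enumerate(encounters):
        steps = min(len(seq), max_len)
        batch[row, :steps] = seq[:steps]
        mask[row, :steps] = True
    return batch, mask


# Hypothetical encounters with 2, 4 and 1 observations of 3 features each.
encounters = [np.ones((2, 3)), np.ones((4, 3)), np.ones((1, 3))]
batch, mask = pad_encounters(encounters, max_len=3)
print(batch.shape, mask.sum(axis=1))
```

The mask records which time steps are real, so padded steps can be excluded from the loss or from evaluation.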

LAB REVIEW
3. Architect LSTM network using Keras and Theano
4. Build the model (feed data into network for training)
5. Evaluate model using validation (test) data
6. Visualize results
7. Compare baseline to PRISM3 and PIM2

LAB REVIEW - ARCHITECT LSTM #3

LAB REVIEW - BUILD / TRAIN MODEL #4

LAB REVIEW - EVALUATE MODEL #5
That is, we have 2690 patient encounters for testing, and at each of the observations the model predicts survivability. Let's plot some predictions!

LAB REVIEW - VISUALIZE RESULTS #6

LAB REVIEW - COMPARE BASELINE #7
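The slides do not state which metric is used for the comparison with PRISM3 and PIM2. A common way to compare risk scores against a binary outcome, assumed here only as an illustration, is the area under the ROC curve (AUC): the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The labels and both sets of scores below are hypothetical.

```python
import numpy as np


def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as one half.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    # Compare every positive score with every negative score.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))


# Hypothetical outcomes (1 = alive) and two sets of hypothetical scores.
labels = [0, 0, 1, 1]
model_scores = [0.1, 0.4, 0.35, 0.8]
baseline_scores = [0.3, 0.6, 0.2, 0.7]

print(roc_auc(labels, model_scores))     # 0.75
print(roc_auc(labels, baseline_scores))  # 0.5
```

The pairwise comparison is quadratic in the number of cases, which is fine for a sketch; a rank-based implementation would scale better.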

WHAT ELSE?
Many ways to explore and improve the model:
- Add a second and third LSTM layer to the network
- Change the number of layers and the number of neurons in those layers
- Change some of the meta-parameters in the network configuration, such as dropout or learning rate
- Try using a CNN. Does it outperform the RNN / LSTM model?

WHAT'S NEXT
- Use / practice what you learned
- Discuss with peers practical applications of DNN
- Reach out to NVIDIA and the Deep Learning Institute
- Attend local meetup groups
- Follow people like Andrej Karpathy and Andrew Ng

WHAT'S NEXT
- TAKE SURVEY: for the chance to win an NVIDIA SHIELD TV.
- ACCESS ONLINE LABS: Check your email for details to access more DLI training online. Check your email for a link.
- ATTEND WORKSHOP: Visit www.nvidia.com/dli for workshops in your area.
- JOIN DEVELOPER PROGRAM: Visit https://developer.nvidia.com/join for more.

www.nvidia.com/dli