Deep Learning and Storage

Keep Those GPUs Busy
Deep Learning and Storage
Igor Ostrovsky
igor@purestorage.com

THREE PILLARS OF DEEP LEARNING
EXPERTISE: TECHNIQUES & TOOLS
COMPUTE: FROM CPU TO GPU SERVERS
DATA: MASSIVE TRAINING SETS

DEEP LEARNING AND DATA

MODEL SIZE GROWTH
BEN TAYLOR, ZIFF
[Chart: model size (kB, MB, GB) by year: 1987, 1997, 2007, 2015]

BIG MODELS NEED BIG DATA SETS
[Figure labels: Dog; Cat; Dog or cat?]
Powerful models have many parameters.
Many parameters require big training sets.

DEEP LEARNING AND TRAINING SET
OBSERVATION BY ANDREW NG, AI LEADER
[Chart: performance vs. amount of data; curves labeled "Deep learning" and "Older learning algorithms"]

TRAINING SETS FOR DEEP LEARNING
[Chart: training set sizes on a scale of GBs, TBs, PBs; labels: 2011, flickr30k, 2012, Synthetic data]

"We don't have better algorithms. We just have more data."
PETER NORVIG, Google Research Director

DEEP LEARNING INFRASTRUCTURE FOR TRAINING

COMPLEXITIES OF AI IN PRODUCTION
INGEST: From sensors, machines, & user generated
CLEAN & TRANSFORM (CPU Servers): Label, anomaly detection, ETL, prep, stage
EXPLORE (GPU Server): Quickly iterate to converge on models
TRAIN (GPU Production Cluster): Run for hours to days in production cluster
Between consecutive stages: COPY & TRANSFORM

REAL WORLD PIPELINE IN AN AUTONOMOUS CAR COMPANY
INGEST
CLEAN, LABEL, RESIZE: CPU Servers
EXPLORE: GPU Server
TRAIN: GPU Production Cluster
INFERENCE IN VIRTUAL WORLD: GPU Production Cluster
10s OF PB COLD STORAGE

SOFTWARE PIPELINE EXAMPLE

WIDE RANGE OF NEEDS IN THE PIPELINE: SIGNIFICANT CHALLENGE
INGEST: Sensors, Synthetic
CLEAN & TRANSFORM: CPU Servers
EXPLORE: GPU Server
TRAIN & VALIDATE: GPU Cluster

WHAT IS FLASHBLADE?
17 TB or 52 TB blades
1.5M IOPS and 16 GB/s performance
ELASTIC FABRIC is FAST AND SIMPLE and scales linearly
NFS, S3, SMB, HTTP
1.6 PB (3:1)
N+2 redundancy
POWER: max 1850 W fully loaded

FROM THE EXPERTS
"Building and managing data pipelines is typically one of the most costly pieces of a complete machine learning solution."
Jeremy Hermann & Mike Del Balso, Uber Machine Learning Platform
https://eng.uber.com/michelangelo/
"If your boss asks you, tell them that I said [to] build a unified data warehouse."
Andrew Ng, Former head of Baidu AI/Google Brain
Nuts and Bolts of Applying Deep Learning

TRAINING BENCHMARKS

AI SYSTEMS DESIGN PATTERNS
HOW MUCH PERFORMANCE PENALTY DUE TO SHARED STORAGE?
FULL TRAINING WORKFLOW
Stages: I/O, CPU, GPU
Steps: decode, scale, evaluate, forward-propagation, update, back-propagation
BENCHMARK SETUP
Setup #1: DGX-1 with 4x Local SSDs
Setup #2: DGX-1 with 1x FlashBlade
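
The workflow on this slide spreads each training step across I/O, CPU, and GPU work. One common way to keep the GPU stage from waiting on the earlier stages is to overlap them through a bounded prefetch queue. The following is a minimal standard-library Python sketch of that idea, not the input pipeline used in the benchmark (the slides only state that TensorFlow was used). The functions `read_image`, `decode_and_scale`, and `train_step` are hypothetical stand-ins for the real stages.

```python
import queue
import threading


def read_image(index):
    """Hypothetical I/O stage: stand-in for reading one image from storage."""
    return bytes([index % 256]) * 8


def decode_and_scale(raw):
    """Hypothetical CPU stage: stand-in for decoding and scaling an image."""
    return [b / 255.0 for b in raw]


def train_step(batch):
    """Hypothetical GPU stage: stand-in for forward, backward, and update."""
    return sum(sum(example) for example in batch)


def producer(num_images, batch_size, out_queue):
    """Read and preprocess images in the background, emitting batches."""
    batch = []
    for i in range(num_images):
        batch.append(decode_and_scale(read_image(i)))
        if len(batch) == batch_size:
            out_queue.put(batch)  # blocks while the queue is full
            batch = []
    if batch:
        out_queue.put(batch)  # final partial batch
    out_queue.put(None)  # sentinel: no more data


def run_training(num_images=10, batch_size=4, prefetch_depth=2):
    """Consume batches while the producer thread prepares the next ones."""
    batches = queue.Queue(maxsize=prefetch_depth)
    worker = threading.Thread(
        target=producer, args=(num_images, batch_size, batches), daemon=True
    )
    worker.start()
    steps = 0
    images_seen = 0
    while True:
        batch = batches.get()
        if batch is None:
            break
        train_step(batch)
        steps += 1
        images_seen += len(batch)
    worker.join()
    return steps, images_seen


if __name__ == "__main__":
    print(run_training())
```

The bounded queue is the design choice that matters here: it lets the producer run ahead by at most `prefetch_depth` batches, so slow storage shows up as the consumer blocking on `get()`, while fast storage simply keeps the queue full.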

IMAGENET TRAINING
DGX-1, 8 x P100
Image classification challenge: 1.28M labeled images, 1000 categories
Model         Year  Top-5  DGX-1 images/s
AlexNet       2012  84.7%  9968
VGG-16        2014  92.5%  1093
Inception V3  2015  94.4%  1052
ResNet-50     2016  94.8%  1542
ResNet-152    2016  95.5%  673
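
The images/s figures above can be turned into a rough estimate of the read bandwidth storage must sustain to keep the GPUs fed. The sketch below is illustrative only. It assumes 200 kB per image (the size quoted on the time-to-results slides, applied here to every model), decimal units, and that each image is read from storage once per pass with no caching.

```python
IMAGE_SIZE_BYTES = 200_000  # assumption: 200 kB per image, decimal units

# DGX-1 (8 x P100) training throughput in images per second, from the slide.
IMAGES_PER_SECOND = {
    "AlexNet": 9968,
    "VGG-16": 1093,
    "Inception V3": 1052,
    "ResNet-50": 1542,
    "ResNet-152": 673,
}


def required_read_bandwidth_mb_s(images_per_second,
                                 image_size_bytes=IMAGE_SIZE_BYTES):
    """Return the sustained read rate in MB/s needed to match a training rate."""
    return images_per_second * image_size_bytes / 1e6


if __name__ == "__main__":
    for model, rate in IMAGES_PER_SECOND.items():
        print(f"{model:<13} {required_read_bandwidth_mb_s(rate):8.1f} MB/s")
```

Under these assumptions AlexNet is the most I/O-hungry model in the table, at roughly 2 GB/s, while ResNet-50 needs roughly 0.3 GB/s.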

TIME TO RESULTS
TensorFlow ResNet-50 Training, 200 kB Images
NVIDIA DGX-1 + Local SSDs: 0.9 hours to load 2 TB, then 1.8 hours to process 10M images
NVIDIA DGX-1 + Pure FlashBlade: 1.8 hours to process 10M images (33% faster)

TIME TO RESULTS
TIME TO FIRST RESULT
TensorFlow ResNet-50 Training, 200 kB Images
NVIDIA DGX-1 + Local SSDs: 0.9 hours to load 2 TB
NVIDIA DGX-1 + Pure FlashBlade: instant
"[M]achine learning is [...] an iterative process of running the learner, analyzing the results, modifying the data and/or the learner, and repeating."
Pedro Domingos, professor at University of Washington, author of The Master Algorithm
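
The numbers on the time-to-results slides are mutually consistent, which a few lines of arithmetic can confirm. The sketch below uses only figures from the slides. It assumes decimal units (200 kB = 200,000 bytes) and takes "33% faster" to mean a 33% reduction in total wall-clock time, which is one plausible reading of the slide.

```python
LOAD_HOURS = 0.9            # copy the data set to local SSDs (from the slide)
PROCESS_HOURS = 1.8         # train on 10M images (from the slide)
NUM_IMAGES = 10_000_000
IMAGE_SIZE_BYTES = 200_000  # assumption: 200 kB in decimal units

# Total wall-clock time for each setup.
local_ssd_total_hours = LOAD_HOURS + PROCESS_HOURS
shared_storage_total_hours = PROCESS_HOURS

# Fraction of wall-clock time saved by skipping the copy step.
time_saved_fraction = 1 - shared_storage_total_hours / local_ssd_total_hours

# Data set size implied by image count and image size.
dataset_tb = NUM_IMAGES * IMAGE_SIZE_BYTES / 1e12

# Training rate and copy rate implied by the durations.
images_per_second = NUM_IMAGES / (PROCESS_HOURS * 3600)
copy_rate_mb_s = NUM_IMAGES * IMAGE_SIZE_BYTES / 1e6 / (LOAD_HOURS * 3600)

if __name__ == "__main__":
    print(f"Local SSD total:   {local_ssd_total_hours:.1f} h")
    print(f"Shared storage:    {shared_storage_total_hours:.1f} h")
    print(f"Time saved:        {time_saved_fraction:.0%}")
    print(f"Data set size:     {dataset_tb:.1f} TB")
    print(f"Training rate:     {images_per_second:.0f} images/s")
    print(f"Implied copy rate: {copy_rate_mb_s:.0f} MB/s")
```

The implied data set size matches the 2 TB on the slide, and the implied training rate of about 1543 images/s agrees with the 1542 images/s reported for ResNet-50 in the ImageNet table.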

CASE STUDIES

MAKING AUTONOMOUS CARS POSSIBLE BY 2021
Zenuity, a joint venture of Volvo and Autoliv, chose NVIDIA DGX-1 and Pure FlashBlade systems for their deep learning infrastructure.

10X FASTER INVESTMENT DECISIONS WITH PURE FLASHBLADE
"Our quants want to test a model, get the results, and then test another one all day long."
Gary Collier, co-CTO, Man AHL

LESSONS FROM CUSTOMERS
HOW IS STORAGE IMPORTANT?
1. DATA COPY ELIMINATION: Dramatically improve time to results & time to first result
2. PERFORMANCE & SCALABILITY: Support evolving data pipeline with varying I/O patterns
3. SIMPLICITY: Focus more on AI, less on infrastructure

igor@purestorage.com