Memory Intensive Architectures

Memory Intensive Architectures. Shahar Kvatinsky, Viterbi Faculty of Electrical Engineering, Technion - Israel Institute of Technology. ICRI-CI, June 2017.

Memristors: Emerging Nonvolatile Memory Technologies. Resistive RAM (RRAM); STT-MRAM; Phase Change Memory (PCM).

Example: Intel 3D XPoint. 16 GB, PCIe 3.0; 241 mm^2; 20 nm process, 4F^2; 1S1R cells between M4 and M5; 91.4% memory efficiency (4.5X higher than DRAM).

Memristors Add New Capabilities to CMOS: a sea of memory above the logic. Dense, nonvolatile, fast, and CMOS compatible.

Memory Intensive Architectures: tight integration of memory and logic. Bringing memory to logic; in-memory computing. [Diagram: Input, CPU, Output, Memory]

MIA Research Projects: memory design; embedded memory; neuromorphic computing; Memristive Memory Processing Unit.

Memory Design and Methodologies: understanding the fundamental issues in resistive memory. Circuits for memory (Ramadan et al., submitted to TCAS-I); coding for RRAM (Cassuto et al., ISIT '13, '16, TIT '16); RRAM/PCM in the memory system (Nishil Talati).

On-Die Intensive Memory Circuits and Architectures: Multistate Register (TVLSI '15); Continuous Flow Multithreading (CAL '14); IoT RFIC (Wainstein et al., ISCAS '17, Memrisys '17).

Neuromorphic Computing: online gradient descent training (TNNLS '15, ISCAS '16); machine learning accelerators (Tzofnat Greenberg); configurable mixed-signal circuits (Loai Danial).

Agenda: Memristors and MIA; Memristive MPU (mMPU) architecture; Summary.

Processing In-Memory (PIM): Reducing Data Movement.
Prior art, 1990s: PIM machine; Active Pages; SA connected to SIMD pipeline.
Prior art, recent: Micron Automata Processor.
[Configuration diagram: CPU; Memory (DRAM); CMOS Processing Units (PUs)]
Data transfer is still required between the DRAM and the PUs.
M. Gokhale et al., "Processing in memory: the Terasys massively parallel PIM array," Computer, 1995.
M. Oskin et al., "Active pages: A computation model for intelligent memory," Comput. Archit. News, 1998.
D. Elliott et al., "Computational RAM: Implementing processors in memory," IEEE Des. Test, 1999.
P. Dlugosch et al., "An Efficient and Scalable Semiconductor Architecture for Parallel Automata Processing," IEEE TPDS, 2014.

Real Computing within the Memory: Beyond von Neumann Architecture. [Diagram: Input Device; CPU with Control Unit and Arithmetic/Logic Unit; Output Device; Memory; Memory Processing Unit (MPU)]

mMPU: Solving the von Neumann Bottleneck. Moving from DRAM to memristive memory. mMPU: performing computation USING the memristive memory cells. [Diagram: CPU connected to mMPUs through Clock, Address, Data, and Controls]

Logic within Memory: Logic Families. Unipolar logic (ICSEE '16, VLSI-SoC '16); IMPLY (ICCD '11, TVLSI '14); Akers array (MEJ '14); MAGIC (TCAS II '14, TNANO '16). [Circuit schematics for each logic family]
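As a small illustration of one of the families named above, the sketch below models IMPLY as a pure Boolean function and composes a NAND from it using the commonly described three-step sequence (reset a work cell, then two implications). The function names and the single work cell `s` are assumptions for this example; the slide itself gives no circuit or timing details.

```python
def imply(p: int, q: int) -> int:
    """Material implication: (NOT p) OR q.

    In memristive IMPLY logic the result overwrites the state of q;
    here that is modelled simply by returning the new value of q.
    """
    return int((not p) or q)


def nand_via_imply(p: int, q: int) -> int:
    """NAND built from one reset and two IMPLY steps (behavioral sketch)."""
    s = 0             # step 1: reset the work cell to logic 0
    s = imply(p, s)   # step 2: s = NOT p
    s = imply(q, s)   # step 3: s = (NOT q) OR (NOT p) = NAND(p, q)
    return s


if __name__ == "__main__":
    for p in (0, 1):
        for q in (0, 1):
            print(p, q, nand_via_imply(p, q))
```

Because NAND is functionally complete, any Boolean function can in principle be sequenced this way, at the cost of several steps per gate.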

Memristor = Memory Resistor: a resistor with varying resistance. [Figure: current-voltage characteristic; current in one direction decreases the resistance, current in the opposite direction increases it]
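The behavior described on this slide can be sketched with a toy model: a resistance that drifts with the direction of the applied current and saturates at two bounds. This is a deliberately simplified linear model, not a device model from the talk; the class name, the bounds, the `rate` constant, and the sign convention (positive current lowers the resistance) are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class ToyMemristor:
    r_on: float = 1e3         # lowest resistance in ohms (hypothetical)
    r_off: float = 1e6        # highest resistance in ohms (hypothetical)
    resistance: float = 1e6   # current state, starts at r_off
    rate: float = 1e9         # ohms per ampere-second (hypothetical)

    def apply_current(self, current: float, dt: float) -> float:
        """Drift the resistance for dt seconds at the given current.

        Assumed convention: positive current decreases the resistance,
        negative current increases it. The state is clamped to
        [r_on, r_off] and persists between calls (nonvolatile).
        """
        self.resistance -= self.rate * current * dt
        self.resistance = min(self.r_off, max(self.r_on, self.resistance))
        return self.resistance


if __name__ == "__main__":
    m = ToyMemristor()
    print(m.apply_current(1e-3, 0.5))    # resistance decreases
    print(m.apply_current(-1e-3, 2.0))   # resistance increases, saturates at r_off
```

The state survives between calls, which is the property that lets the same device act as a memory cell.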

MAGIC: Memristor Aided LoGIC. Example of MAGIC NOR.
R_ON = logic 1, R_OFF = logic 0, with R_OFF >> R_ON. Initialize OUT to R_ON.
[Figure labels: the voltage across OUT is << V_G or > V_G/2, depending on the inputs; in the latter case the resistance of OUT increases from R_ON to R_OFF]
IN1  IN2  NOR
0    0    1
0    1    0
1    0    0
1    1    0
S. Kvatinsky, D. Belousov, S. Liman, G. Satat, N. Wald, E. G. Friedman, A. Kolodny, and U. C. Weiser, "MAGIC - Memristor Aided LoGIC," IEEE TCAS II, Nov. 2014.
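The NOR behavior above can be checked with a static voltage-divider sketch: the two input memristors in parallel are in series with OUT, which starts at R_ON, and OUT switches to R_OFF when the voltage across it exceeds a threshold. The resistance values, V_G, and the threshold `V_T_OFF` are hypothetical numbers chosen for illustration, and switching dynamics are ignored.

```python
R_ON = 1e3      # ohms, logic 1 (hypothetical value)
R_OFF = 1e6     # ohms, logic 0 (hypothetical value)
V_G = 1.0       # gate voltage applied to the inputs (hypothetical value)
V_T_OFF = 0.4   # assumed switching threshold of OUT, chosen below V_G / 2


def parallel(*resistances: float) -> float:
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances)


def magic_nor(in1: int, in2: int) -> int:
    """Static sketch of a two-input MAGIC NOR gate."""
    r_in = parallel(*(R_ON if bit else R_OFF for bit in (in1, in2)))
    r_out = R_ON                           # OUT is initialized to logic 1
    v_out = V_G * r_out / (r_in + r_out)   # voltage divider across OUT
    if v_out > V_T_OFF:
        r_out = R_OFF                      # OUT switches to logic 0
    return 1 if r_out == R_ON else 0


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, magic_nor(a, b))
```

With these numbers, the voltage across OUT is about 0.002 V when both inputs are R_OFF and at least about 0.5 V when any input is R_ON, which is the separation the threshold relies on.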

Real MAGIC. B. C. Jung et al., "Zero-static-power nonvolatile logic-in-memory circuits for flexible electronics," Nano Research, April 2017.

MAGIC NOR in a Crossbar. [Figure: crossbar with V_G applied to IN1 and IN2, and the result in OUT]

MAGIC NOR in a Memristive Memory. [Figure: memory array with V_G applied to IN1 and IN2, the result in OUT, and V_Isolate applied to isolate the other cells]
N. Talati, S. Gupta, P. Mane, and S. Kvatinsky, "Logic Design within Memristive Memories Using MAGIC," IEEE Transactions on Nanotechnology, July 2016.

Hierarchy of Logical Functions. [Pyramid figure, listed from the top: Matrix multiplication, Convolution, MUL, POW, SQRT, DIV, ADD, NOR, AND, SUB, NOT, XOR, OR, COPY, NAND; base: MAGIC NOR]
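The idea of a hierarchy resting on NOR can be illustrated in software: since NOR is functionally complete, the other gates and a ripple-carry ADD can be composed from it alone. The sketch below is a plain Boolean illustration under that assumption; the function names and gate decompositions are choices made for this example and say nothing about the gate counts or mappings used in the actual design.

```python
def nor(a: int, b: int) -> int:
    """The only primitive used below."""
    return int(not (a or b))


def not_(a: int) -> int:
    return nor(a, a)


def or_(a: int, b: int) -> int:
    return not_(nor(a, b))


def and_(a: int, b: int) -> int:
    return nor(not_(a), not_(b))


def xor(a: int, b: int) -> int:
    # True when neither "both zero" nor "both one" holds.
    return nor(nor(a, b), and_(a, b))


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum_bit, carry_out), built only from NOR-derived gates."""
    partial = xor(a, b)
    return xor(partial, carry_in), or_(and_(a, b), and_(carry_in, partial))


def add(x: int, y: int, width: int = 8) -> int:
    """Ripple-carry addition modulo 2**width."""
    result, carry = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result


if __name__ == "__main__":
    print(add(5, 9))
    print(add(200, 100))
```

Higher levels of the pyramid, such as MUL or matrix multiplication, would be sequences of such additions and shifts in the same spirit.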

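Parallel Vector Operation within Memristive MPU.
$f^n : R^n \times R^n \to R^n$, with $f^n\big((a_0, a_1, a_2, \dots, a_n)^T, (b_0, b_1, b_2, \dots, b_n)^T\big) = (c_0, c_1, c_2, \dots, c_n)^T$
[Figure: Control block driving rows holding a_i, b_i, and c_i]
Latency of the vector operation is independent of the length of the vector.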
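The latency claim can be made concrete with a toy cycle-count model: each vector element occupies its own row, and one gate step is applied to every row at once, so the number of steps does not grow with the vector length. The class `ToyCrossbar`, its column layout, and the decision not to count the cycles needed to load the operands are assumptions of this sketch, not details from the talk.

```python
class ToyCrossbar:
    """Bit array in which one gate step acts on all rows simultaneously."""

    def __init__(self, rows: int, cols: int) -> None:
        self.cells = [[0] * cols for _ in range(rows)]
        self.cycles = 0

    def init_column(self, col: int) -> None:
        """One cycle: set the given column to logic 1 in every row."""
        for row in self.cells:
            row[col] = 1
        self.cycles += 1

    def nor_columns(self, in1: int, in2: int, out: int) -> None:
        """One cycle: row-wise NOR; the out column must already hold 1.

        The Python loop is sequential, but it models a single step that
        the hardware would apply to all rows in parallel.
        """
        for row in self.cells:
            if row[in1] or row[in2]:
                row[out] = 0
        self.cycles += 1


def vector_nor(a: list[int], b: list[int]) -> tuple[list[int], int]:
    """Return (c, cycles) with c[i] = NOR(a[i], b[i])."""
    xbar = ToyCrossbar(rows=len(a), cols=3)
    for row, (bit_a, bit_b) in zip(xbar.cells, zip(a, b)):
        row[0], row[1] = bit_a, bit_b   # operand loading is not counted
    xbar.init_column(2)
    xbar.nor_columns(0, 1, 2)
    return [row[2] for row in xbar.cells], xbar.cycles


if __name__ == "__main__":
    print(vector_nor([0, 0, 1, 1], [0, 1, 0, 1]))
    print(vector_nor([1] * 1000, [0] * 1000)[1])
```

Both calls report the same cycle count, which is the point of the slide: the work scales with the number of gate steps in the function, not with the number of elements.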

mMPU µarchitecture. [Figure: memristive memory holding a_0 b_0, a_1 b_1, a_2 b_2, ..., a_n b_n, with Column Control, Row Control, and the mMPU Controller]
R. Ben-Hur and S. Kvatinsky, "Memory Processing Unit for In-Memory Processing," Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, July 2016.

mMPU µarchitecture. [Figure: memristive memory holding a_0 ... a_n, b_0 ... b_n, and the results c_0 ... c_n, with Column Control, Row Control, and the mMPU Controller]

mMPU Systems: Accelerator or Main Memory? Memristive memory with processing capabilities. [Diagram: CPU; accelerators; mMPU connected through Clock, Address, Data, and Controls; mMPU; memristive memory; DRAM DIMM?]

Issues Involved in mMPU Architecture: memory design; mMPU controller design and optimization; periphery design; programming model; software; mMPU architecture; applications. [Diagram: CPU?; mMPU Controller; mMPU]

Agenda: Memristors and MIA; Memristive MPU (mMPU) architecture; Summary.

Memristors to the Rescue? New technologies enable memory intensive architectures: better processors (multithreading, low power); accelerators (machine learning); smart memories (memory processing unit).

Thanks!