Fast STA Prediction-based Gate-level Timing Simulation

Tariq B. Ahmad, ECE Department, University of Massachusetts, Amherst, MA, USA, tbashir@ecs.umass.edu
Maciej J. Ciesielski, ECE Department, University of Massachusetts, Amherst, MA, USA, ciesiel@ecs.umass.edu

978-3-9815370-2-4/DATE14/ ©2014 EDAA

Abstract: Traditional dynamic simulation with standard delay format (SDF) back-annotation cannot be reliably performed on large designs. The large size of SDF files makes event-driven timing simulation extremely slow, as it has to process an excessive number of events. To accelerate gate-level timing simulation, we propose an automated, fast, prediction-based gate-level timing simulation that combines static timing analysis (STA) at the block level with dynamic timing simulation at the I/O interfaces. We demonstrate that the proposed timing simulation can be done earlier in the design cycle, in parallel with synthesis.

Index Terms: Gate-level timing simulation, static timing analysis, dynamic timing simulation, ASIC, Opencores, Verilog

I. INTRODUCTION

A. Literature Survey on Verification

As design size and complexity increase, so does the need to verify designs quickly and reliably. This, combined with the reduced design cycle of 3-6 months, makes verification an extremely challenging task. Today, verification consumes over 70% of the design cycle time and, on average, the ratio of verification to design engineers is 3:1 [1][2]. Verification engineers use a wide variety of verification approaches, including constrained random simulation for datapath components, equivalence checking for pre- and post-synthesis netlists, and formal property verification for control and protocol checking. As the design gets refined into lower levels of abstraction, such as gate or layout level, in an application specific integrated circuit (ASIC) or field programmable gate array (FPGA) design flow, the performance of simulation drops significantly. This is due to the large size of gate-level and layout-level netlists, and to the gate and wire delays that are available only at these lower levels of abstraction.

Formal property verification still cannot cope with design complexity beyond the register transfer level (RTL) of abstraction. Equivalence checking can only compare the functionality (but not the timing) of two designs; it suffers from memory capacity limitations for large designs and may require defining structurally similar cut points as a basis of comparison. Other techniques, such as static timing analysis (STA), depend on manually imposed constraints and are therefore error-prone. A designer may inadvertently miss false or multi-cycle paths, or add such paths that should not have been constrained [3]. Furthermore, STA does not work for asynchronous interfaces [15], and there is no way to validate the results of STA except by simulation.

To accelerate simulation at lower levels of abstraction, hardware assisted simulation acceleration (based on FPGAs), emulation (such as the Cadence Palladium platform or the EVE/Synopsys Zebu platforms), and other techniques have been introduced. These techniques are expensive, quite complex to deal with, and may require redesigning the testbench or the design under verification (DUV) [4]. Despite this, traditional hardware description language (HDL) simulation remains the most popular method of design verification because of its ease of use, inexpensive computing platform, and 100% signal controllability and observability [5]. Figure 1 illustrates the use of simulation in a typical ASIC design flow.

[Figure 1. Simulation in ASIC/FPGA design flow. Flowchart ordered by level of abstraction: algorithm development in C/C++; translation into a hardware description language (HDL), with functional simulation; hardware in logic gates, with post-synthesis functional and timing simulation; layout, with post-layout functional and timing simulation.]

It is clear from the above description that simulation has its own special place in the design flow and is not going away in the foreseeable future. As the design gets refined into lower levels of abstraction, such as gate level and layout level, functional (zero-delay) and timing simulations can validate the results of synthesis, STA or equivalence checking. Moreover, neither STA nor equivalence checking can find bugs due to X (unknown) signal propagation. Even though RTL regression is run on a daily basis, industry insists on gate-level simulation before sign-off. Gate-level simulation is necessary to validate the results of RTL simulation and logic synthesis. At this stage, gate-level simulation
can be functional (zero-delay) or unit-delay, where all gate-level cells are assumed to have a delay value of 1 timescale unit. Later, gate-level timing simulation can be performed in the pre-layout or post-layout stage using standard delay format (SDF) back-annotation. Gate-level simulations are considered a must for verifying timing-critical paths of asynchronous designs, as such paths cannot be handled by STA tools. Furthermore, gate-level simulation is used to verify the constraints of static verification tools such as STA and equivalence checking. These constraints are added manually, and the quality of results obtained with static tools is only as good as the imposed constraints. Gate-level simulation is also used to verify the power-up, power-down and reset sequences of the full chip, and to estimate the dynamic power drawn by the chip. Finally, gate-level simulation is used after an engineering change order (ECO) to verify the applied changes [15].

B. Issues with Simulation

The dominant technique used for functional and timing simulation is event-driven HDL simulation [5]. However, event-driven simulation suffers from very low performance because of its inherently sequential nature and the heavy event activity in gate-level simulation. As the design gets refined into lower levels of abstraction, and as more debugging features are added to the design, simulation time increases significantly. Figure 2 shows the simulation performance of the Opencores [14] AES-128 design [23] at various levels of abstraction with debugging features enabled. As the level of abstraction goes down to gate or layout level and debugging features are enabled, simulation performance drops significantly. This is due to the large number of events at the gate level or layout level, timing checks, and disk access to dump simulation data.

[Figure 2. Drop in simulation performance with level of abstraction and debugging. Bar chart of the effect on simulation performance of AES-128; y-axis: simulation time (min), 0 to 350; x-axis: level of abstraction + debugging, with categories including gl_0, gl_timing, +assertion, +dump.]

C. Scope of this Work

This work addresses the issue of improving the performance of event-driven gate-level timing simulation using static timing analysis (STA) as a timing predictor at the block level. We propose an automated partitioning scheme that partitions the gate-level netlist into blocks for SDF annotation and STA. We also propose a new design/verification flow where timing simulation can be done early in the design cycle using cycle-accurate RTL. The next section briefly reviews literature on improving simulation performance using parallel simulation. Section 3 presents a new approach to accelerating gate-level timing simulation using STA. Section 4 describes the setup, experiments and results based on the new approach. Section 5 describes how to verify simulation results using the proposed flow. A new simulation flow based on early simulation is discussed in Section 6. The paper is concluded in Section 7, and references are listed in Section 8. Our contributions in this work span Sections 3 through 7.

II. PARALLEL GATE-LEVEL TIMING SIMULATION

A. Parallel Discrete Event-Driven Simulation (PDES)

To address the performance of event-driven gate-level simulation (both functional and timing), distributed parallel simulation [6][7] has been proposed. Unfortunately, it has not been very successful for the following reasons: i) difficulty in design partitioning and load balancing; ii) communication overhead; iii) synchronization overhead between design blocks imposed by the distributed environment; and iv) lack of concurrency in the original design. The area of parallel simulation is rich in literature, and most of the known work concerns traditional parallel simulation, based on physical partitioning of the design into modules distributed to individual simulators. PDES is not practical for gate-level timing simulation, because gate-level timing simulation involves processing a huge number of events across partitions, which severely degrades simulation performance. For this reason, the recent parallel multi-core simulators provided by major EDA vendors [20][21] do not handle gate-level timing simulation.

B. Time Parallel Simulation (TP)

In contrast to the parallel discrete event HDL simulation described above, which partitions the design in the spatial domain, there has been some interesting work on time-parallel discrete event HDL simulation [17]. This approach, called multi-level temporal parallel event-driven simulation (MULTES) [18], parallelizes simulation in the time domain by dividing it into independent time intervals (simulation slices). Each slice is then simulated on a different processor. The key requirement of this technique is finding the initial state of each slice, which is a challenging problem, especially for a design obtained by retiming and re-synthesis [18]. For functional gate-level simulation, RTL simulation is used to find the initial state of each slice; for gate-level timing simulation, functional gate-level simulation is used to find the initial state of each time slice. Limitations of this approach include space complexity (each simulation slice simulates the whole design) and limited applicability to multi-core architectures. A multi-core architecture is better suited to simulating design partitions than to running the entire design on every core. In general, the method does not scale well with the multi-core
architecture, cannot be fully automated and requires manual interaction. However, if the designer requires gate-level timing simulation with full SDF back-annotation, TP simulation can be used. This will need manual interaction and state matching.

III. HYBRID GATE-LEVEL TIMING SIMULATION

A. Basic Idea

We present a new approach to improve the performance of gate-level timing simulation. The basic idea is to use static timing analysis (STA) as a timing predictor at the block level. It uses the worst-case (critical path) delay, captured by STA, instead of the actual cell delays for annotating block-level timing during simulation. This idea is illustrated in Figures 3 and 4. Figure 3 shows gate-level timing simulation of a design consisting of two blocks, with timing simulation accomplished by SDF back-annotation applied to the entire design. However, for large designs, such full SDF back-annotation will negatively impact the performance of gate-level timing simulation.

[Figure 3. Gate-level timing simulation with full SDF annotation. A test bench drives Block 1 and Block 2, both gate-level with SDF; inputs in1, in2, in3, outputs out1, out2, combinational logic comb1, comb2, comb3.]

To improve the performance of gate-level timing simulation, we propose a hybrid approach, shown in Figure 4, where only gate-level block 2 is SDF back-annotated. Gate-level block 1 is analyzed by an STA tool, which reports the maximum delay inside the block. Only this value, the STA delay, is back-annotated during simulation at the output of block 1. This type of timing annotation is termed selective SDF annotation. Note that STA can be performed on gate-level block 1 as part of the whole design, or separately if I/O delays are modeled appropriately. Essentially, block 1 is simulated in functional (zero-delay) mode, i.e., without SDF back-annotation, while block 2 is simulated with SDF back-annotation. In the case of multiple blocks, the proposed STA-based timing prediction approach can be used for the majority of the blocks to speed up gate-level timing simulation. Designers typically know the timing-critical blocks in a design, where selective SDF back-annotation can be used to quickly verify timing.

B. Partitioning

Partitioning a gate-level netlist into blocks for SDF annotation and STA is a challenging problem, as the verification engineer may not have sufficient insight to identify timing-critical blocks. Furthermore, partitioning schemes are often manually driven, which causes problems when dealing with large gate-level netlists. Often the gate-level netlist is flattened and the hierarchy is not preserved. We propose a partitioning scheme based on STA that is fully automated and works for flat or hierarchical gate-level netlists. This is one of the most important contributions of this paper. Moreover, the partitioning does not have to be at the register boundary. For multi-clock designs, clock domain crossings (CDC) are always SDF back-annotated. Formal tools like Synopsys Formality can detect CDC paths in a design.

[Figure 4. Gate-level timing simulation with hybrid approach. A test bench drives Block 1, gate-level with 0-delay and the STA delay annotated at its output, and Block 2, gate-level with SDF.]

STA determines the slowest (critical) path in a design. One can also choose to report not only the most timing-critical path but also the next most timing-critical path, and so on. The STA report then lists these timing-critical paths and the associated module instances. Since these paths are timing-critical, one would always want to run SDF back-annotated timing simulation on these module instances to make sure that their timing conforms to the STA results. In brief, one can include all the module instances that are in the timing-critical paths for SDF back-annotation. This group of instances is shown as Block2 in Figure 4. All the other module instances can be considered not timing-critical and are simulated in functional (zero-delay) mode. This group of instances forms Block1. However, one needs to run STA on Block1 to find its worst-case delay, the STA delay shown in Figure 4. All of this can be automated in a flow as shown in Figure 5.

[Figure 5. Automated partitioning and simulation flow for hybrid gate-level timing simulation. The gate-level netlist and a constraint file (tfile) feed STA, which reports 1 or more critical paths. The list of module instances in the critical paths forms Block2; the list of module instances NOT in the critical paths forms Block1, with the STA delay applied at Block 1's boundary.]
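
The partitioning step described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' tool flow: it assumes the timing paths have already been parsed from an STA report into simple records, and the names TimingPath, partition_instances and block1_boundary_delay, as well as the instance names and delay values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class TimingPath:
    """One timing path as it might be parsed from an STA report (assumed format)."""
    delay_ns: float              # path delay reported by STA
    instances: Tuple[str, ...]   # module instances the path passes through


def partition_instances(
    all_instances: Iterable[str],
    paths: List[TimingPath],
    num_critical_paths: int = 1,
) -> Tuple[Set[str], Set[str]]:
    """Return (block1, block2).

    block2: instances on the N slowest paths, to be SDF back-annotated.
    block1: every other instance, to be simulated in zero-delay mode.
    """
    worst_first = sorted(paths, key=lambda p: p.delay_ns, reverse=True)
    critical = worst_first[:num_critical_paths]
    block2 = {inst for path in critical for inst in path.instances}
    block1 = set(all_instances) - block2
    return block1, block2


def block1_boundary_delay(paths: List[TimingPath], block1: Set[str]) -> float:
    """Worst-case delay over paths that lie entirely inside Block1.

    This stands in for re-running STA on Block1; the result is the single
    delay value annotated at the Block1 output.
    """
    inside = [p.delay_ns for p in paths if set(p.instances) <= block1]
    return max(inside, default=0.0)


# Hypothetical example data (not taken from the paper).
all_instances = ["u0", "us00", "us01", "us02"]
paths = [
    TimingPath(4.2, ("us00", "u0")),
    TimingPath(3.1, ("us01",)),
    TimingPath(2.5, ("us01", "us02")),
]

block1, block2 = partition_instances(all_instances, paths, num_critical_paths=1)
sta_delay = block1_boundary_delay(paths, block1)
print("SDF-annotated (Block2):", sorted(block2))
print("Zero-delay (Block1):", sorted(block1))
print("STA delay at Block1 boundary (ns):", sta_delay)
```

Raising num_critical_paths moves more instances into Block2, which corresponds to trading simulation speed for more SDF-annotated detail.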

C. Integration with the Existing Design Flow

The flow for this approach is shown in Figure 6. The key idea is to capture the peripheral timing of each block via static timing analysis or various estimates derived from time budgeting. As some (non-critical) design blocks are simulated in functional (zero-delay) mode, except at the block periphery, this should result in a significant speedup compared to simulation with full SDF back-annotation. To further improve the performance of gate-level timing simulation, the majority of gate-level blocks can be replaced by their cycle-accurate RTL blocks, with peripheral timing captured via STA, time budgeting or other estimates, as explained next.

[Figure 6. Proposed flow for hybrid gate-level timing simulation. Flow stages: STA + SDF generation (SDF contains gate delays); place & route; STA + SDF generation (SDF contains gate + wire delays). Traditional: full SDF annotated gate-level simulation (slow). This work: selective SDF annotated + STA gate-level simulation (fast).]

D. Early Gate-level Timing Simulation

The idea of early simulation is shown in Figure 7, where gate-level Block1 is replaced by equivalent RTL. RTL is now simulated instead of gate-level for Block1. The key idea is to perform timing simulation using estimated timing early in the design cycle, when the blocks have not yet been synthesized. The estimated timing can come from time budgeting or a tool like Synopsys DC Explorer [22]. This is in contrast to the conventional approach, where gate-level simulation is performed later in the design flow, after the synthesis or place & route step, with all the detailed delay data already available. Major simulator vendors have already embraced the idea of early timing simulation based on estimated delays, realizing that performing gate-level timing simulations late in the design cycle is prohibitively slow. Verification engineers get around this problem by performing gate-level timing simulation of only the timing-critical blocks with few test vectors. However, they are not able to perform full-chip timing simulation with a large number of test vectors, which often leaves certain timing bugs undetected. Synopsys [21] has recently announced a new product called DC Explorer [22] that is based on the same idea of early design exploration. It can do early synthesis, timing and other estimates with sufficient accuracy to start the simulation process early in the design flow. For this reason, Synopsys DC Explorer is rapidly gaining adoption in the industry.

[Figure 7. Early timing simulation using RTL with estimate of peripheral timing. A test bench drives Block 1, modeled in RTL (for example, assign c = A & B; always @(posedge clk) state <= next_state;) with estimated timing at Block 1's boundary, and Block 2, gate-level with SDF.]

IV. EXPERIMENTS

A. Setup

We tested the proposed approach by measuring the performance of gate-level timing simulation of several Opencores Verilog designs [14], namely the AES-128 [23], 3-DES [24], VGA controller [25] and JPEG encoder [26] designs. We used Cadence [20] Incisive Unified Simulator 13.1 on a quad-core Intel CPU with 8 GB RAM. The designs were synthesized with Synopsys Design Compiler using a TSMC 65nm standard cell library. All of these designs except the VGA controller are single-clock designs. Table 1 shows essential statistics for these designs.

Table 1. Gate-level design statistics

Implementation         Synthesized area (NAND2 equivalents)
AES-128 (iterative)    18400
3-DES                  96650
VGA                    144189
JPEG                   968788

B. Results

First, we show simulation results for the AES-128 design. We started with SDF annotation of the majority of blocks (to accommodate many timing-critical paths) and then gradually decreased the number of blocks in SDF annotation to one (to accommodate only the worst-case timing path). Table 2 shows that a significant speedup (up to 4.64x, roughly 5x) over full SDF annotated timing simulation can be obtained. The waveforms in Figure 8 illustrate the difference between full SDF annotation and selective SDF annotation. They show that the signal from selective SDF annotation is delayed more than the full SDF-annotated signal, due to the STA delay, but contains no glitches. This means fewer events to process during simulation and hence faster simulation. Both signals match at the clock cycle boundary (positive edge of the clock).

In the next set of experiments, all designs were divided into two gate-level blocks, Block1 and Block2, as in Figure 4. Block 2 contains module instances from the most timing
critical path. Here, the number of timing-critical paths considered is one. Table 3 reports the resulting speedup for each design. The proposed approach has the additional advantage that it validates the result of STA, which depends on manual constraint entry. If the simulation exhibits a timing failure, this helps debug the STA constraints. Once the constraints are corrected, STA is run again to provide a new STA delay value. This STA-simulation cycle is repeated until all timing failures are debugged and removed from the simulation.

Table 2. Gate-level timing simulation speedup of AES-128 for a variable number of blocks in SDF annotation
(T1: full SDF annotated simulation time; T2: selective SDF annotated simulation time)

Instances in SDF annotation (of 17)   Module instances in 0-delay                    T1 (min)   T2 (min)   Speedup (T1/T2)
16                                    test.u0.us00 (one Sbox)                        172        115        1.49
16                                    test.u0.u0 (key_expand)                        172        84         2.04
15                                    test.u0.us00 to test.u0.us01 (two Sboxes)      172        110        1.56
13                                    test.u0.us00 to test.u0.us03 (four Sboxes)     172        100        1.72
9                                     test.u0.us00 to test.u0.us13 (8 Sboxes)        172        77         2.23
7                                     test.u0.us00 to test.u0.us23 (12 Sboxes)       172        56         3.07
1                                     test.u0.us00 to test.u0.us33 (16 Sboxes)       172        37         4.64

[Figure 8. Full SDF annotation vs selective SDF annotation in waveform. Traces: clock; full SDF annotated signal; selective SDF annotated signal.]

Table 3. Speedup with hybrid gate-level timing simulation

Implementation     Full SDF annotated T1 (min)   Hybrid timing simulation T2 (min)   Speedup (T1/T2)
AES-128            172                           37                                  4.64
3-DES              196                           51                                  3.92
VGA Controller     812                           232                                 3.50
JPEG Controller    273                           79                                  3.45

V. VERIFICATION OF TIMING SIMULATION RESULTS

In order to verify the timing correctness of the approach, we propose the dumping-based flow shown in Figure 9. Note that this is an optional step, used only to verify the proposed simulation approach. In practice, the verification engineer can skip this step to reduce the verification time. While the testbench can verify the functional correctness of the two simulations, the proposed verification scheme helps in verifying their timing correctness. For both simulations to be timing correct, the monitored signals from the two simulations should match at the clock cycle boundary. Unfortunately, dumping, as shown in Figure 2, can drastically reduce simulation performance. Further, the amount of dumped data can quickly fill the disk. Therefore, it is recommended that dumping be done for a small time interval rather than for the entire simulation. We used small simulation intervals to verify the timing correctness of the output signals of the designs. The Cadence Comparescan tool was used to compare the dumped signals. The tool reported that the signals match at the clock cycle boundary. Table 4 shows a comparison between full SDF gate-level timing simulation and the proposed hybrid gate-level timing simulation for all the flip-flops/registers in the VGA and AES-128 designs. The fact that the values of the registers match at the clock cycle boundary during the entire simulation confirms the accuracy of our approach.

[Figure 9. Verification flow for hybrid gate-level timing simulation. Hybrid gate-level simulation with dumping and full SDF annotated gate-level simulation with dumping each produce simulation dump data; the signals are compared at the clock cycle boundary, giving a T/F result.]

Table 4. Accuracy of hybrid gate-level timing simulation at the register boundary
(A: total number of registers; B: number of registers that match between full SDF timing and selective SDF timing)

Design name       A       B       Lower bound on hybrid prediction accuracy, (B/A)*100
VGA controller    1611    1611    100 %
AES128            530     530     100 %

VI. NEW GATE-LEVEL TIMING SIMULATION FLOW

We also propose a design/verification flow in which gate-level timing simulation is performed early in the design cycle, using estimates from time budgeting and/or STA. Tools like Synopsys DC Explorer [22] can provide timing estimates for running gate-level timing simulation. As already mentioned, performing gate-level timing simulation late in the design cycle is prohibitively slow and may result in design changes back in the RTL or may require an ECO. Further, the idea of performing long full-chip timing simulation in a short amount of time is very welcome in the industry. Figures 10 and 11 show the traditional and new flow for simulation,
respectively. The obvious advantage of the new flow is rapid gate-level timing simulation early in the design cycle, so that timing checks are validated and bugs are caught early on with peripheral timing.

[Figure 10. Traditional simulation flow in ASIC design.]

[Figure 11. Proposed flow of early simulation in ASIC design. Stage labels include 0-delay simulation, 0-delay & timing simulation, and layout; the hybrid simulation stage is marked "Our work".]

VII. CONCLUSION AND FUTURE WORK

Today, system-on-chip (SoC) designs have become widespread. These designs integrate multiple hardware cores working at different frequencies. Timing simulation of such multi-clock domain designs is critical. Traditional dynamic simulation with SDF back-annotation cannot be done on such large designs. In addition, event-driven timing simulation is extremely slow, suffers from capacity issues because of large SDF files (exceeding 10 GB for small SoC designs) and is generally done late in the ASIC design cycle, after synthesis or layout. This paper provides a proof of concept of hybrid gate-level timing simulation that makes use of STA and selective SDF back-annotation to accelerate gate-level timing simulation. STA acts as a timing predictor for blocks that are run without SDF back-annotation. The approach also validates the result of STA, which depends on manual constraint entry. The proposed approach is applicable to multi-clock domain designs with clock domain crossings (CDC). We are actively working on such larger designs. Further, we propose a flow for early simulation in the ASIC/FPGA design flow that includes rapid hybrid gate-level timing simulation.

ACKNOWLEDGMENT

This work was supported in part by funding from the National Science Foundation, award No. CCF 1017530.

REFERENCES

[1] T. Anderson and R. Bhagat, "Tackling Functional Verification for Virtual Components," ISD Magazine, pp. 26, November 2000.
[2] P. Rashinkar and L. Singh, "New SoC Verification Techniques," Abstract for tutorial, IP/SOC 2001 Conference, March 19, 2001.
[3] "Symbolic Simulation Speeds Timing Closure" (http://www.techdesignforums.com/eda/eda-topics/verified-rtl-to-gates/symbolic-simulation-speeds-timing-closure)
[4] D. Kim, M. Ciesielski, and S. Yang, "A new Distributed Event-driven Gate-level HDL Simulation by Accurate Prediction," Design and Test Europe (DATE 2011), pp. 547-550, March 2011.
[5] W. K. Lam, Hardware Design Verification: Simulation and Formal Method-Based Approaches, Prentice Hall, 2005.
[6] SimCluster datasheet, Avery Design Automation (http://www.averydesign.com)
[7] MP-Sim datasheet, Axiom Design Automation (http://www.axiomda.com)
[8] R. M. Fujimoto, "Parallel Discrete Event Simulation," Communications of the ACM, Vol. 33, No. 10, pp. 30-53, Oct. 1990.
[9] A. Gafni, "Rollback Mechanisms for Optimistic Distributed Simulation Systems," SCS Multiconference on Distributed Simulation, vol. 3, pp. 61-67, July 1988.
[10] R. M. Fujimoto, "Time Warp on a Shared Memory Multiprocessor," Transactions of the Society for Computer Simulation, Vol. 6, No. 3, pp. 211-239, July 1989.
[11] D. M. Nicol, "Principles of Conservative Parallel Simulation," Proceedings of the 28th Winter Simulation Conference, pp. 128-135, 1996.
[12] D. Chatterjee, A. DeOrio, and V. Bertacco, "Event-driven Gate-level Simulation with General Purpose GPUs," Proceedings of Design Automation Conference (DAC09), pp. 557-562, 2009.
[13] Y. Zhu, B. Wang, and Y. Deng, "Massively Parallel Logic Simulation with GPUs," article 29, ACM Trans. Design Automation of Electronic Systems, June 2011.
[14] Opencores designs (www.opencores.org)
[15] VerificationBlog (http://whatisverification.blogspot.com/2011/06/gate-level-simulations-necessary-evil.html)
[16] L. Li and C. Tropper, "A design-driven Partitioning Algorithm for Distributed Verilog Simulation," in Proc. 20th International Workshop on Principles of Advanced and Distributed Simulation (PADS), pp. 211-218, 2007.
[17] D. Kim, M. Ciesielski, and S. Yang, "MULTES: MUlti-Level Temporal-parallel Event-driven Simulation," IEEE Trans. on CAD of Integrated Circuits and Systems, 32(6), pp. 845-857, 2013.
[18] D. Kim, "MULTES: Multi-level Temporal-parallel Event-driven Simulation," PhD Thesis, University of Massachusetts Amherst, 2011.
[19] F. Rodriguez-Henriques, N. Saqib, A. Perez, and C. Koc, Cryptographic Algorithms on Reconfigurable Hardware, Springer, 2006.
[20] Cadence (http://www.cadence.com)
[21] Synopsys (http://www.synopsys.com)
[22] Synopsys DC Explorer (http://www.synopsys.com/tools/implementation/rtlsynthesis/dcexplorer/pages/default.aspx)
[23] AES-128 Opencores design (http://opencores.org/project,aes_core)
[24] DES-3 Opencores design (http://opencores.org/project,des)
[25] VGA Controller Opencores design (http://opencores.org/project,vga_lcd)
[26] JPEG Encoder Opencores design (http://opencores.org/project,jpegencode)