Challenges of a Systematic Approach to Parallel Computing and Supercomputing Education

The First European Workshop on Parallel and Distributed Computing Education for Undergraduate Students (Euro-EDUPAR 2015)
Challenges of a Systematic Approach to Parallel Computing and Supercomputing Education
August 24th, 2015, Vienna

Why Parallel Computing?

Degree of parallelism

                        2005    2015    2025
Supercomputers          10^4    10^6    10^9
Servers                 2-4     12-64   10^4
PCs, Laptops            1       4-8     10^3
Tablets, Smartphones    1       1-4     10^2

Key notions: Parallel/Serial, Amdahl's law, Synchronization, Scheduling, Load imbalance, Parallel complexity, Critical path, Race condition, Critical resource, Critical section, Overheads, Communications, Waiting, Scalability, Locality, Large problems.

Why Supercomputing?

If we want to out-compete, we have to out-compute.

Why Supercomputing Education?

Supercomputing Education: What is the information structure of algorithms and programs? How many students know this notion and can use it?

Supercomputing Education: What is the scalability/efficiency of applications/computers? How many students know the root causes of scalability and efficiency degradation? How many students are able to analyze algorithms / codes / architectures for scalability and efficiency?
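One root cause of efficiency degradation is the serial part of a program, which Amdahl's law (named among the key notions of this talk) turns into a bound on speedup. The following is a minimal sketch, assuming a fixed problem size and a serial fraction that does not depend on the processor count; the function names and the 5% serial fraction are illustrative assumptions, not taken from the slides.

```python
def amdahl_speedup(serial_fraction, processors):
    """Upper bound on speedup for a fixed-size problem (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def parallel_efficiency(serial_fraction, processors):
    """Speedup divided by the number of processors used."""
    return amdahl_speedup(serial_fraction, processors) / processors


if __name__ == "__main__":
    # Illustrative assumption: 5% of the work cannot be parallelized.
    for p in (1, 10, 100, 1000):
        s = amdahl_speedup(0.05, p)
        e = parallel_efficiency(0.05, p)
        print(f"p={p:5d}  speedup={s:6.2f}  efficiency={e:6.1%}")
```

Under these assumptions the speedup can never exceed 1 / serial_fraction, so efficiency falls steadily as processors are added.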

Supercomputing Education: How many students know what data locality is, and why it is important to keep data locality high in applications on any computing platform? Examples: Random Access, FFT, Linpack.

Supercomputing Education: How many qualified university teachers of parallel computing are there in your university / country?

Why a Systematic Approach?

Do we need parallel computing education? Exascale (around 2020-2021): supercomputers with billions of cores, laptops with thousands of cores, mobile devices with dozens or hundreds of cores. Parallelism will be everywhere. And what does it mean? All software engineers need to be fluent in the concept of parallelism.

It is Time to Act! (Exascale is NOT far away.) Bachelor degree: 3(4) years; Master degree: 2 years; 2015 + 5(6) years at universities = 2020 (2021). If we start this activity now, we will get the first graduates at the "Exa" point (2020-2021). All our students will live in an extremely parallel computer world. It is really time to think seriously about Parallel Computing and Supercomputing education.

Simple questions? (Ask students from your faculties.)
- What is the complexity of a parallel algorithm?
- Why do we need to know the critical path of an informational graph (data dependency graph)?
- Is it possible to construct a communication-free algorithm for a particular method?
- How to detect and describe the potential parallelism of an algorithm?
- How to extract potential parallelism from codes or algorithms?
- What is co-design?
- What is data locality?
- How to estimate data locality in my application?
- How to estimate the scalability of an algorithm and/or application?
- How to improve the scalability of an application?
- How to express my problem in terms of the MapReduce model?
- What is the efficiency of a particular application?
- What parallel programming technology should I use for SMP/GPU/FPGA/vector/cluster/heterogeneous computers?
How many software developers will be able to use these notions easily?

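Informational Structure is a Key Notion (matrix multiplication as an example)

    do i = 1, n
      do j = 1, n
        A(i,j) = 0                           ! statement S1
        do k = 1, n
          A(i,j) = A(i,j) + B(i,k)*C(k,j)    ! statement S2
        end do
      end do
    end do

Information dependencies of S2:
(1) from S1: iteration (i, j, 1) of S2 uses the value set by S1 at iteration (i, j), for 1 ≤ i ≤ n, 1 ≤ j ≤ n;
(2) from S2: iteration (i, j, k) of S2 uses the value computed at iteration (i, j, k-1), for 1 ≤ i ≤ n, 1 ≤ j ≤ n, 2 ≤ k ≤ n.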
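A practical consequence of this informational structure is that the only dependency chain runs along k, so the (i, j) pairs can be processed in any order or in parallel. The sketch below checks that property on small integer matrices; the function name, the matrix size, and the random seed are illustrative assumptions.

```python
import random


def matmul_in_order(B, C, order):
    """Classical matrix product; (i, j) pairs are visited in the given order."""
    n = len(B)
    A = [[0] * n for _ in range(n)]
    for i, j in order:
        for k in range(n):  # the k loop is a serial accumulation chain
            A[i][j] += B[i][k] * C[k][j]
    return A


if __name__ == "__main__":
    n = 4
    rng = random.Random(0)
    B = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]
    C = [[rng.randint(-9, 9) for _ in range(n)] for _ in range(n)]

    natural = [(i, j) for i in range(n) for j in range(n)]
    shuffled = natural[:]
    rng.shuffle(shuffled)

    # No (i, j) pair reads or writes data produced by another pair,
    # so both visiting orders must produce the same matrix.
    print(matmul_in_order(B, C, natural) == matmul_in_order(B, C, shuffled))
```

Integer entries are used so that the comparison is exact and not affected by floating-point rounding.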

Gauss elimination method, back substitution (informational structure)

Serial only:

    do i = n, 1, -1
      s = 0
      do j = i+1, n
        s = s + A(i,j)*x(j)
      end do
      x(i) = (b(i) - s)/A(i,i)
    end do

Parallel execution:

    do i = n, 1, -1
      s = 0
      do j = n, i+1, -1
        s = s + A(i,j)*x(j)
      end do
      x(i) = (b(i) - s)/A(i,i)
    end do

Order of iterations is the only difference!
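Why the inner loop order matters can be seen by measuring the critical path of the dependency graph for both variants. The sketch below is a simplified model, not the author's analysis: it assumes unlimited processors, one time step per operation, and a free `s = 0` initialization; the function and variable names are hypothetical.

```python
def critical_path(n, ascending):
    """Longest dependency chain of back substitution for an n x n system.

    ascending=True  models  do j = i+1, n      (first variant on the slide)
    ascending=False models  do j = n, i+1, -1  (second variant on the slide)
    """
    x_ready = {}  # x_ready[i]: time step at which x(i) becomes available
    for i in range(n, 0, -1):
        js = range(i + 1, n + 1) if ascending else range(n, i, -1)
        s_ready = 0  # 's = 0' is treated as free in this model
        for j in js:
            # s = s + A(i,j)*x(j) needs the previous s and x(j)
            s_ready = max(s_ready, x_ready[j]) + 1
        x_ready[i] = s_ready + 1  # x(i) = (b(i) - s)/A(i,i)
    return x_ready[1]


if __name__ == "__main__":
    for n in (4, 100, 1000):
        print(n, critical_path(n, True), critical_path(n, False))
```

In this model the ascending order starts each sum with the most recently computed x(i+1), so the rows cannot overlap and the critical path grows quadratically with n. The descending order starts with x(n), which is available first, and the critical path grows linearly.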

Efficiency of Applications (variants of the TRIAD operation)

 1) A[i] = B[i]*X + C
 2) A[i] = B[i]*X[i] + C
 3) A[i] = B[i]*X + C[i]
 4) A[i] = B[i]*X[i] + C[i]

With ind1[i] = i:
 5) A[ind1[i]] = B[ind1[i]]*X + C
 6) A[ind1[i]] = B[ind1[i]]*X[ind1[i]] + C
 7) A[ind1[i]] = B[ind1[i]]*X + C[ind1[i]]
 8) A[ind1[i]] = B[ind1[i]]*X[ind1[i]] + C[ind1[i]]

With ind2[i] = random_access:
 9) A[ind2[i]] = B[ind2[i]]*X + C
10) A[ind2[i]] = B[ind2[i]]*X[ind2[i]] + C
11) A[ind2[i]] = B[ind2[i]]*X + C[ind2[i]]
12) A[ind2[i]] = B[ind2[i]]*X[ind2[i]] + C[ind2[i]]

[Figure: efficiency, %, of the TRIAD variants]

Courtesy of Vad. Voevodin, MSU
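The indexed variants compute the same values as the direct ones; only the memory access pattern changes. The sketch below implements the fully indexed form (variants 8 and 12) and a crude locality proxy, the mean distance between consecutively accessed elements. The proxy, the function names, and the array size are assumptions made for illustration; this is not the efficiency measurement behind the figure, and in pure Python the interpreter overhead largely hides cache effects.

```python
import random


def triad_indexed(B, X, C, ind):
    """A[ind[i]] = B[ind[i]]*X[ind[i]] + C[ind[i]]; ind must be a permutation."""
    A = [0.0] * len(ind)
    for i in range(len(ind)):
        k = ind[i]
        A[k] = B[k] * X[k] + C[k]
    return A


def mean_stride(ind):
    """Average distance between consecutively accessed elements."""
    return sum(abs(b - a) for a, b in zip(ind, ind[1:])) / (len(ind) - 1)


if __name__ == "__main__":
    n = 1000
    rng = random.Random(0)
    B = [rng.random() for _ in range(n)]
    X = [rng.random() for _ in range(n)]
    C = [rng.random() for _ in range(n)]

    ind1 = list(range(n))  # ind1[i] = i: sequential access
    ind2 = ind1[:]
    rng.shuffle(ind2)      # random access order

    same = triad_indexed(B, X, C, ind1) == triad_indexed(B, X, C, ind2)
    print("same result:", same)
    print("mean stride, sequential:", mean_stride(ind1))
    print("mean stride, random:    ", mean_stride(ind2))
```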

What could be a solution? Trainings or student schools?!

Trainings and Schools
- Trainings on Intel programming tools.
- Optimization and tuning of users' applications.
- Student summer schools on parallel programming technologies.
- Trainings on Accelrys Materials Studio.

GPU Technology Schools
Major topics at schools:
- Massively Parallel Processing
- GPGPU Evolution
- Architecture of NVIDIA GPUs
- CUDA Programming Model
- CPU-GPU Interaction
- CUDA Memory Types
- Standard Algorithms on GPU: Matrix Multiplication, Reduction
- CUDA application libraries: CURAND, CUBLAS, CUSPARSE, CUFFT, MAGMA, Thrust
- Program Profiling, Performance Analysis, Debugging and Optimization
- Asynchronous Execution and CUDA Streams
- Multi-GPU Systems: Programming and Debugging
- nvcc Compiler Driver, cuda-gdb Debugger
- Kernel Configuration and Parallelization of Loops
- OpenACC Directives
In collaboration with NVIDIA and Applied Parallel Computing.

What could be a solution? Not only trainings or student schools. We must think about education! No Supercomputing and Parallel Computing Education: No Exascale Future.

Is Parallel Computing & Supercomputing a strategically important area?
Informatics Europe: a survey on needs for Supercomputing education.
Respondents: 64, from 22 countries: Austria, Denmark, Estonia, France (3), Germany (5), Greece, India (3), Iran, Italy, Latvia, Norway, Pakistan, Portugal, Romania, Russia (12), Serbia, Spain (15), Sweden, Switzerland, Turkey (8), Ukraine, United Kingdom (3).

Why Challenges?

HPC Educational Infrastructure
- Interaction with government, ministries, funding agencies.
- Close contacts with leading IT companies and research institutes.
- Strong interuniversity collaboration.
- Body of knowledge on HPC & Parallel Computing.
- All target groups: researchers, students, teachers, schoolchildren...
- All forms: Bachelors, Masters, PhDs, schools, universities, online...
- Courses, textbooks, intensive practice, trainings on HPC.
- Individual research projects of students.
- Bank of exercises and tests on HPC & Parallel Computing.
- National scientific conferences and student schools.
- Research on advanced computing techniques, HW, SW, apps.
- HPC and Industry.
- Supercomputing and HPC resources.
- International collaboration.
- PR, mass-media, Internet resources on HPC.

Supercomputing Consortium of Russian Universities (http://hpc-russia.ru)

Project "Supercomputing Education"
Presidential Commission for Modernization and Technological Development of Russia's Economy.
Duration: 2010-2012.
Coordinator of the project: M.V. Lomonosov Moscow State University.
Wide collaboration of universities, all members of the Supercomputing Consortium of Russian Universities:
- Nizhny Novgorod State University
- Tomsk State University
- South Ural State University
- St. Petersburg State University of IT, Mechanics and Optics
- Southern Federal University
- Far Eastern Federal University
- Moscow Institute of Physics and Technology (State University)
More than 600 people from 75 universities were involved in the project.
Budget: 236.42 million rubles (about $8M).

National System of Research and Education Centers on Supercomputing Technologies in Federal Districts of Russia: 8 centers were established in 7 federal districts of Russia during 2010-2012.

Entry-Level Training on Supercomputing Technologies: 3269 people completed the training, from 60+ universities in 35 cities of Russia, covering all federal districts of Russia.

Series of Books "Supercomputing Education"
National Book Contest 2013, nomination "Textbooks of the 21st century": 1st Prize.
There are 21 books in the "Supercomputing Education" series. 31,500 copies of books in the series were delivered to 43 Russian universities.

Retraining Programs for Faculty Staff: 453 faculty members completed the training, from 50 organizations in 29 cities, under 10 education programs, covering all federal districts of Russia.

Education Courses on Supercomputing Technologies
Development of new courses and extension of existing ones: 50 courses covering all major parts of the Body of Knowledge in SC, including:
- "Parallel Computing"
- "High Performance Computing for Multiprocessing Multi-Core Systems"
- "Parallel Database Systems"
- "Practical Training on MPI and OpenMP"
- "Parallel Programming Tools for Shared Memory Systems"
- "Distributed Object Technologies"
- "Scientific Data Visualization on Supercomputers"
- "Natural Models of Parallel Computing"
- "Solution of Aero- and Hydrodynamic Problems by FlowVision"
- "Algorithms and Complexity Analysis"
- "History and Methodology of Parallel Programming"
- "Parallel Numerical Methods"
- "Parallel Computations in Tomography"
- "Finite-Element Modeling with Distributed Computations"
- "Parallel Computing on CUDA and OpenCL Technologies"
- "Biological System Modeling on GPU"
- "High Performance Computing System: Architecture and Software"

Intensive Trainings in Special Groups: 40 special groups of trainees were formed, 790 trainees successfully completed advanced training under 15 educational programs, covering all federal districts of Russia.

IT Companies + Research Institutes & Universities (special group of students on Parallel Software Development)
55 students of MSU (Math, Physics, Chemistry, Biology, ...).
Moscow State University in collaboration with:
- Intel
- T-Platforms
- NVIDIA
- TESIS
- IBM
- Center on Oil & Gas Research
- Keldysh Institute of Applied Mathematics, RAS
- Institute of Numerical Mathematics, RAS

Supercomputing Education Project (key results for 2010-2012)
- National system of research and education centers on supercomputing technologies: 8 centers in 7 federal districts of Russia.
- Body of Knowledge on parallel computing and supercomputing.
- Russian universities involved in supercomputing education: 75.
- Entry-level trainings: 3269 people, 60+ universities, 34 Russian cities.
- Intensive training in special groups: 790 people, 40 special groups.
- Retrained faculty staff on HPC technologies: 453 people, 50 organizations.
- New and modified curricula and lecture courses: 50.
- Use of distance learning technology: 731 people, 100 cities.
- Partners from science, education and industry: 120 Russian and 65 foreign organizations.
- Series of books and textbooks "Supercomputing Education": 21 books; 31,500 copies were delivered to 43 Russian universities.
- National system of scientific conferences and student schools on HPC.

Moscow State University (established in 1755): 41 faculties, 350+ departments, 5 major research institutes, 45,000+ students, 2500 full doctors (Dr.Sci.), 6000 PhDs, 1000+ full professors, 5000 researchers.

Computing Center, MSU, 1955

MSU Computing Center, 1956. Strela is the first Russian mass-production computer. Peak performance: 2000 instr/sec. Total area: 300 m². Power consumption: 150 kW.

MSU Computing Center, 1959. Setun computer: the first computer in the world based on ternary (-1/0/1) logic.

HPC Stages at MSU

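Computing Facilities of Moscow State University (from 1956 up to now)
[Figure: timeline of MSU computing facilities on a performance scale from 10^3 to 10^15. Machines shown: Strela, Setun, BESM-6, BlueGene/P, Chebyshev, Lomonosov. Years marked: 1956, 1959, 1967, 2000, 2003, 2008, 2008, 2009.]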

Brief Stats on MSU Supercomputer Center
- Users: 2511
- Projects: 1607
- Organizations: 302
- MSU faculties / institutes: 20+
Computational science is everywhere.
Lomonosov-2 supercomputer: 1 rack = 256 nodes (Intel + NVIDIA) = 515 Tflop/s; 5 racks = 2.5 Pflop/s.

Computing Facilities of Moscow State University (brief statistics)
Requests for trainings: OpenMP; parallel methods and algorithms; debugging of parallel applications; optimization and fine tuning of applications; GPU programming; MPI.

Computing Facilities of Moscow State University (brief statistics) Diversity of software in use

Computing Facilities of Moscow State University (brief statistics) Diversity of parallel programming technologies in use

Supercomputer Centers and Applications. Challenge of the Day: Extremely Low Efficiency (for free).

Efficiency of Supercomputing Centers (straightforward approach)
Peak performance of a core: 12 Gflops. Average performance of one core of the Chebyshev supercomputer over 3 days: 400 Mflops, which is 3.33% of the peak.

Efficiency of Supercomputing Centers: a 1 Pflop/s system
What do we expect? 1 Pflop/s * 60 sec * 60 min * 24 hours * 365 days = 31.5 ZettaFlop (31.5 * 10^21 operations) of useful work per year.
What is in reality? A small, small, small fraction.
Supercomputers and steam locomotives: which are more efficient?
Current trend: peculiarities of hardware, complicated job flows, poor data locality, a huge degree of parallelism in hardware, etc. decrease the efficiency of supercomputers dramatically.
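The arithmetic behind these two efficiency slides can be reproduced in a few lines. The sketch uses only the figures quoted on the slides; the last quantity, which applies the per-core efficiency to a whole 1 Pflop/s system, is a hypothetical extrapolation added for illustration and is not a result from the talk.

```python
PEAK_CORE_FLOPS = 12e9        # peak performance of one core (from the slide)
SUSTAINED_CORE_FLOPS = 400e6  # measured average per core (from the slide)

efficiency = SUSTAINED_CORE_FLOPS / PEAK_CORE_FLOPS

SECONDS_PER_YEAR = 60 * 60 * 24 * 365
PEAK_SYSTEM_FLOPS = 1e15      # a 1 Pflop/s system

ops_per_year_peak = PEAK_SYSTEM_FLOPS * SECONDS_PER_YEAR

# Hypothetical extrapolation: yearly output if the whole system ran at
# the per-core efficiency computed above.
ops_per_year_sustained = ops_per_year_peak * efficiency

if __name__ == "__main__":
    print(f"efficiency: {efficiency:.2%}")
    print(f"operations per year at peak: {ops_per_year_peak:.3e}")
    print(f"at the measured efficiency:  {ops_per_year_sustained:.3e}")
```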

One-semester course for Bachelors: "Supercomputers and Parallel Data Processing" at CMC MSU
Solving problems, main stages: Problem, Method, Algorithm, Programming Technology, Code, Compiler, Supercomputer (millions, billions ...).
The chain runs from the problem-specific side to the computer-specific side.
Areas covered by the course: Structure of Algorithms and Programs; Parallel Programming Technologies; Computer Architectures.

Supercomputers and Parallel Data Processing (one-semester course for Bachelors at CMC MSU)

Lectures  Topic
3         Introduction. Computers/supercomputers, parallel computing, large problems, history of parallelism in computer architecture, supercomputing in our life, Amdahl's law.
5         Architecture of parallel computing systems. Shared/distributed memory computers, SMP/NUMA/ccNUMA, multicore processors, clusters, distributed computing, vector-parallel computers, vector instructions, VLIW, superscalar architectures, GPU, exascale challenges.
2         Performance of parallel computing systems. Rpeak and Rmax, MIPS/Mflops, Linpack, STREAM, NPB, HPCC, APEX, general-purpose and special-purpose processors, root causes of performance degradation.
4         Parallel programming technologies. Parallel programs, SPMD, Masters/Workers; parallel programming technologies: efficiency, productivity, portability; MPI, OpenMP, Linda, Send/Recv/Put/Get, efficiency, scalability.
2         Introduction to the information structure of algorithms and programs. Graph-based models of programs, control/information graphs, graphs/histories, information (dependency) graph, information dependency, information independency, parallel processing, resource of parallelism, equivalent transformations of codes, critical path.

What is at the end?

What models can be used for development of parallel codes?
- SPMD: correct
- NUMA
- Master / Workers: correct
- VLIW
- Send / Receive: correct
- Put / Get: correct
Key words: models, parallel program, computer architectures, parallel processes, message passing, shared and distributed memory computers.

What is at the end?

A computer multiplies two square dense matrices (type real) by the classical method in 5 seconds at a performance of 50 Gflops. What is the matrix size?
- 500 * 500
- 1000 * 1000
- 2500 * 2500
- 4000 * 4000
- 5000 * 5000: correct
- There is no correct answer
Key words: complexity of algorithms, structure of algorithms, sustained performance, peak performance, serial and parallel computing.
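The answer to this question follows from the operation count of the classical method. The sketch below assumes that multiplying two n x n matrices costs 2 * n**3 floating-point operations (n**3 multiplications plus n**3 additions) and that the quoted 50 Gflops is the sustained rate; the function name is hypothetical.

```python
def matrix_size(seconds, flops_per_second):
    """Size n of square matrices multiplied in the given time.

    Assumes the classical method costs 2 * n**3 floating-point operations.
    """
    total_ops = seconds * flops_per_second
    return round((total_ops / 2) ** (1.0 / 3.0))


if __name__ == "__main__":
    n = matrix_size(5, 50e9)
    print(f"{n} * {n}")
```

With 5 seconds at 50 Gflops the machine performs 2.5 * 10^11 operations, so n**3 = 1.25 * 10^11 and n = 5000.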

Algorithms + Software + Architectures Is a Key Point of Supercomputing Education

National system of student schools
- February: Arkhangelsk
- April: Saint-Petersburg
- June: Moscow
- October: Nizhny Novgorod
- December: Tomsk

Summer Supercomputing Academy at Moscow State University, June 22 - July 3, 2015
- Plenary lectures by prominent scientists, academicians, CEOs/CTOs from Russia and abroad.
- 6 parallel educational tracks.
- Trainings on a variety of topics.
- Attendees: from students up to professors (about 120 attendees).
Supported by: Intel, IBM, NVIDIA, T-Platforms, HP, NICEVT.

Scientific workshop "Extreme Scale Scientific Computing"

Summer Supercomputing Academy at Moscow State University, June 22 - July 3, 2015: educational tracks
- MPI / OpenMP programming technologies
- NVIDIA GPU programming technologies
- Intel new architectures and software tools
- Industrial mathematics and computational hydrodynamics
- OpenFOAM/Salome/Paraview open software
- Parallel computing for school teachers of informatics
Supported by: Intel, IBM, NVIDIA, T-Platforms, HP, NICEVT.

Schoolchildren at MSU Supercomputing Center (600+ visitors per year)

Parallel Computing and Primary School? Easily!
Can you do it faster? Parallel execution, load balancing, synchronization, scheduling.
How to work in a team? Synchronization, critical resource.
Notions behind these questions: Parallel/Serial, Amdahl's law, Synchronization, Scheduling, Load imbalance, Parallel complexity, Critical path, Race condition, Critical resource, Critical section, Overheads, Communications, Waiting, Scalability, Locality, Large problems.
Courtesy of M.A. Plaksin, Perm, Russia

Who will live/work beyond Exascale?

                        2005    2015    2025
Supercomputers          10^4    10^6    10^9
Servers                 2-4     12-64   10^4
PCs, Laptops            1       4-8     10^3
Tablets, Smartphones    1       1-4     10^2

My deep gratitude to: Victor Gergel, NNSU; Nina Popova, MSU; our colleagues from the Supercomputing Consortium of Russian Universities.

Thank You!