HIGH PERFORMANCE COMPUTING

0522500136
COMPUTER SCIENCE
EQF7
COMPUTER SCIENCE
2023/2024

YEAR OF DIDACTIC SYSTEM 2016
SPRING SEMESTER
CFU  HOURS  ACTIVITY
6    48     LAB
Objectives
THIS COURSE AIMS TO BUILD PROGRAMMING SKILLS AND KNOWLEDGE OF ALGORITHMS, APPLICATIONS, AND COMPUTER ARCHITECTURES FOR MODERN HIGH PERFORMANCE COMPUTING.

KNOWLEDGE AND UNDERSTANDING: THE LEARNING OBJECTIVES OF THIS COURSE ARE TO ACQUIRE KNOWLEDGE OF HPC SYSTEMS AND OF THE TECHNIQUES USED TO PROGRAM THEM. STUDENTS WILL ACQUIRE KNOWLEDGE OF THE FOLLOWING TOPICS:
• HPC ARCHITECTURES
• PROGRAMMING MODELS FOR HPC AND PARALLEL PROGRAMMING PATTERNS
• PROGRAMMING ON SHARED MEMORY (OPENMP)
• HETEROGENEOUS AND GPU PROGRAMMING (CUDA/OPENCL/SYCL)
• DISTRIBUTED MEMORY PROGRAMMING (MPI)
• VECTORIZATION (INTRINSICS)
• PARALLEL PROGRAM OPTIMIZATION AND TUNING
• COMPILATION FOR HPC AND AUTOMATIC PARALLELIZATION
• HPC APPLICATIONS

APPLYING KNOWLEDGE AND UNDERSTANDING: STUDENTS WILL BE ABLE TO APPLY THEIR KNOWLEDGE TO:
• PROGRAM A MULTICORE APPLICATION WITH OPENMP
• PROGRAM A GPU/HETEROGENEOUS APPLICATION
• PROGRAM A DISTRIBUTED APPLICATION WITH MPI
• RECOGNIZE AND APPLY PARALLEL PROGRAMMING PATTERNS
• UNDERSTAND AND EXPLOIT COMPILATION TECHNIQUES IN HPC APPLICATIONS
• APPLY OPTIMIZATION TECHNIQUES TO PARALLEL AND DISTRIBUTED APPLICATIONS
Prerequisites
THE COURSE REQUIRES GOOD PROFICIENCY IN TECHNICAL ENGLISH AND GOOD KNOWLEDGE OF:
-C PROGRAMMING IN UNIX/LINUX
-ALGORITHMS AND DATA STRUCTURES
-COMPUTER ARCHITECTURE
Contents
THE COURSE COMPRISES FOUR MODULES, FOR A TOTAL OF 48 HOURS OF LABORATORY WORK.

MODULE ON MULTICORE PROGRAMMING, 12 HOURS:
-PARALLEL DESIGN PATTERNS
-OPENMP PROGRAMMING
-CACHE AND FALSE SHARING
-TASK PARALLELISM
-NESTED PARALLELISM AND NUMA ARCHITECTURE

MODULE ON SIMD PROGRAMMING, 12 HOURS:
-INTRINSICS PROGRAMMING WITH INTEL AVX
-OPENMP SIMD DIRECTIVES
-COMPILATION AND AUTOMATIC VECTORIZATION
-VLA PROGRAMMING WITH ARM SVE
-DATA REORGANIZATION PATTERNS

MODULE ON HETEROGENEOUS AND GPU PROGRAMMING, 12 HOURS:
-OPENCL AND SYCL PROGRAMMING
-GPU ARCHITECTURE: MEMORY MODEL, THREAD MODEL, OPTIMIZATION
-PARALLEL DESIGN PATTERNS OPTIMIZED FOR GPU
-BRIEF OVERVIEW OF CUDA AND HIP PROGRAMMING

MODULE ON ADVANCED TOPICS, 12 HOURS:
-ROOFLINE MODEL
-BENCHMARKING
-COMPILATION FOR HIGH PERFORMANCE
-AUTOTUNING
-ENERGY OPTIMIZATION AND APPROXIMATE COMPUTING
-ADVANCED MPI

Teaching Methods
THE COURSE COMPRISES 48 HOURS OF LABORATORY WORK.

EACH LECTURE INTRODUCES COMPUTER ARCHITECTURES AND PROGRAMMING MODELS FOR HIGH PERFORMANCE COMPUTING AND IS FOLLOWED BY A PROGRAMMING SESSION IN WHICH STUDENTS APPLY THE ACQUIRED KNOWLEDGE ON PARALLEL SYSTEMS.

THE TEACHING MATERIALS INCLUDE SLIDES, ADDITIONAL MATERIAL FROM SELECTED BOOKS, A LIST OF SCIENTIFIC PUBLICATIONS, AND REFERENCE SOURCE CODE PROVIDED IN THE LABS.

STUDENTS MUST SUBMIT A FINAL PROJECT, WHOSE EVALUATION CONTRIBUTES TO THE FINAL GRADE. ATTENDANCE IS HIGHLY RECOMMENDED.
Verification of learning
ASSESSMENT CONSISTS OF A WRITTEN EXAMINATION AND A PROJECT. THE FINAL GRADE IS A WEIGHTED AVERAGE OF THE TWO COMPONENTS.

THE WRITTEN EXAMINATION COMPRISES EXERCISES COVERING ALL FOUR COURSE MODULES. TO SIT THE WRITTEN EXAMINATION, STUDENTS MUST SUBMIT THE PROJECT BY THE DATE OF THE EXAM.

THE PROJECT, WHOSE TOPIC IS DEFINED BY THE TEACHER, FOCUSES ON A SPECIFIC ASPECT OF HIGH PERFORMANCE COMPUTING. EXAMPLES INCLUDE SOLVING A SPECIFIC PROBLEM ON A GIVEN PARALLEL ARCHITECTURE, EXPLORING A HARDWARE FEATURE, OR PERFORMING A QUANTITATIVE ANALYSIS OF AN EXISTING ALGORITHM. THE PROJECT DELIVERABLES ARE THE SOURCE CODE AND A SHORT REPORT.
Texts
-MICHAEL MCCOOL, ARCH D. ROBISON, JAMES REINDERS, STRUCTURED PARALLEL PROGRAMMING: PATTERNS FOR EFFICIENT COMPUTATION, MORGAN KAUFMANN, 2012, ISBN 0124159931
-TIMOTHY G. MATTSON, YUN (HELEN) HE, ALICE E. KONIGES, THE OPENMP COMMON CORE, MIT PRESS, 2019
-JASON SANDERS, EDWARD KANDROT, CUDA BY EXAMPLE: AN INTRODUCTION TO GENERAL-PURPOSE GPU PROGRAMMING, ADDISON-WESLEY, 2010, ISBN 978-0131387683
-PETER PACHECO, AN INTRODUCTION TO PARALLEL PROGRAMMING, MORGAN KAUFMANN, 2011, ISBN 978-0123742605
-J. REINDERS, B. ASHBAUGH, J. BRODMAN, M. KINSNER, J. PENNYCOOK, X. TIAN, DATA PARALLEL C++: MASTERING DPC++ FOR PROGRAMMING OF HETEROGENEOUS SYSTEMS USING C++ AND SYCL, 2021, ISBN 978-1-4842-5574-2
More Information
ON THE DEPARTMENT E-LEARNING PLATFORM, STUDENTS CAN FIND MATERIAL FOR EACH LECTURE, INCLUDING SLIDES, CODE EXAMPLES, REFERENCE PAPERS AND ARTICLES, AND OTHER SUPPORT MATERIAL.