
Master in High Performance Computing

Advanced Parallel Programming LABS


The labs will be performed on the Finis Terrae (FT2) supercomputer of the Galicia
Supercomputing Center (CESGA). The use of this system was covered in the subject
Parallel Programming in the first semester. A short guide to the FT2 architecture,
together with information about how to compile and execute code on the FT2, can be
found in aula.cesga.es (see aula.cesga.es → Advanced Parallel Programming course →
Documents → MPIatFT.pdf).
For each lab you will have to write a short report explaining what you have done in
each exercise, presenting the resulting codes, and analyzing the performance. The
performance analysis involves measuring the speedup with different numbers of threads
and drawing some conclusions. Note that, to perform a good analysis and obtain
reasonable conclusions, the computing time of the tests should be at least of the order
of seconds. The report can be written in English or Spanish. The deadline for each lab
will be communicated via Slack.

OpenMP: Vectorization and Hybrid Programming


LABS 1
Starting codes are in Lab1Codes.zip file.

Vectorization with OpenMP. We are going to use the gcc compiler, so module
load gcc[/6.4.0] is needed. For testing, an interactive session requested with
compute -c 4 can be enough.
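
As a hint (these exact flags are an assumption, not part of the lab statement), the
codes can be compiled with OpenMP support while asking gcc to report which loops it
managed to vectorize:

    gcc -fopenmp -O2 -fopt-info-vec multf.c -o multf

The -fopt-info-vec option prints a note for each vectorized loop, which helps to
verify that the simd directives took effect.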

1. The code multf.c performs the product of matrices

D = A × B^T

Parallelize and vectorize the product. Analyze the performance with different
numbers of threads (1, 2, and 4), both with and without vectorization (a sketch
of the parallel+SIMD pattern is given after this list).
2. In the code saxpy.c there are two different functions for the SAXPY operation.
Vectorize the loops of N iterations that call the saxpy and saxpyi functions. The
saxpy_no_simd and saxpyi_no_simd functions (not to be vectorized) are provided
only to compare the performance with and without vectorization.
3. Using the program from the previous point, parallelize the four loops of N
iterations. Analyze the performance.
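
A minimal sketch of the parallel+SIMD pattern for exercises 1 and 2 follows; the
function and variable names, the row-major layout, and the declare simd approach are
illustrative assumptions, not the actual contents of multf.c or saxpy.c:

    #include <omp.h>

    /* Exercise 1 (sketch): D = A x B^T for n x n row-major matrices.
       The outer loop is shared among threads; the reduction over k is
       vectorized with the simd directive. */
    void mult_bt(int n, const float *A, const float *B, float *D)
    {
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                float sum = 0.0f;
                #pragma omp simd reduction(+:sum)
                for (int k = 0; k < n; k++)
                    sum += A[i*n + k] * B[j*n + k]; /* row j of B = column j of B^T */
                D[i*n + j] = sum;
            }
    }

    /* Exercise 2 (sketch): declare simd makes the compiler emit a vector
       version of the function, so a loop that calls it can itself be
       vectorized with a simd directive. */
    #pragma omp declare simd
    static float saxpy_elem(float a, float x, float y)
    {
        return a * x + y;
    }

    void saxpy_loop(long n, float a, const float *x, float *y)
    {
        #pragma omp simd
        for (long i = 0; i < n; i++)
            y[i] = saxpy_elem(a, x[i], y[i]);
    }

For exercise 3, the same loops can additionally be distributed among threads, for
example with a combined #pragma omp parallel for simd.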

Hybrid Programming. We can use either the Intel MPI implementation (module load
intel impi) or the OpenMPI one (module load gcc openmpi). There may be some
differences between them. The exercises are based on codes that you parallelized
with MPI in Parallel Programming.
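
As an illustration (the exact wrapper names and launch syntax on the FT2 may differ),
a hybrid code built with the OpenMPI module can be compiled and run along these lines:

    module load gcc openmpi
    mpicc -fopenmp pi_integral.c -o pi_integral
    export OMP_NUM_THREADS=4
    mpirun -np 4 ./pi_integral

Here -np fixes the number of MPI processes and OMP_NUM_THREADS the number of OpenMP
threads per process, so this example corresponds to the (4, 4) configuration on
16 CPUs.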

1. The code pi_integral.c computes the value of π with numerical integration in
the interval [0, 1], using N intervals of the same size and adding their areas:

    \int_0^1 \frac{1}{1 + x^2}\, dx = \big[\arctan(x)\big]_0^1 = \arctan(1) - \arctan(0) = \frac{\pi}{4}
Parallelize it using MPI. After that, include OpenMP directives to parallelize
the loop (a hybrid sketch is given after this list). Analyze the speedup for
different numbers of processes and threads for the same total number of CPUs;
that is, for a given number of CPUs, analyze the different configurations of
processes and threads. For example, for 16 CPUs the (processes, threads)
configurations can be: (1, 16), which implies OpenMP only; (2, 8); (4, 4);
(8, 2); and (16, 1), which implies MPI only.
2. The program dotprod.c computes the dot product of two vectors. Parallelize
it using MPI. After that, include OpenMP directives to parallelize the loop.
Analyze the speedup for different numbers of processes and threads for the same
number of CPUs, that is, for a given number of CPUs, analyze the different
configurations of processes and threads.
3. The program mxvnm.c computes the matrix-vector product. Parallelize with MPI
the loop over the rows of the matrix (N). After that, include OpenMP directives
to parallelize the innermost loop (M). Analyze the speedup for different numbers
of processes and threads for the same number of CPUs, that is, for a given
number of CPUs, analyze the different configurations of processes and threads.
Try different values of N and M in order to find a situation in which hybrid
programming performs better than pure OpenMP or pure MPI.
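
A minimal hybrid sketch for exercise 1 follows, assuming the midpoint rule and a
cyclic distribution of intervals among processes; the real pi_integral.c may be
organized differently:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        const long N = 100000000;      /* number of intervals (assumed value) */
        const double h = 1.0 / N;      /* width of each interval */
        int rank, size;
        double local = 0.0, pi = 0.0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Each process takes the intervals i = rank, rank+size, ...;
           within the process, the loop is shared among OpenMP threads. */
        #pragma omp parallel for reduction(+:local)
        for (long i = rank; i < N; i += size) {
            double x = (i + 0.5) * h;          /* midpoint of interval i */
            local += 4.0 / (1.0 + x * x) * h;  /* area of its rectangle */
        }

        /* Combine the partial sums of all processes on rank 0. */
        MPI_Reduce(&local, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("pi ~= %.15f\n", pi);

        MPI_Finalize();
        return 0;
    }

dotprod.c and mxvnm.c can follow the same pattern: MPI distributes the data among
processes, OpenMP parallelizes each process's local loop, and a collective such as
MPI_Reduce combines the partial results.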
