Vectorization with OpenMP. We will use the gcc compiler, so load it first with
module load gcc[/6.4.0]. For testing, compute -c 4 should be enough.
D = A × B^T
Parallelize and vectorize the product. Analyze the performance with different
numbers of threads (1, 2, and 4), both with and without vectorization.
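As a starting point, the product could be sketched as follows. This is only a minimal sketch under assumptions: the function name matmul_bt, the square n x n row-major float layout, and the argument order are hypothetical and not taken from the lab code. Because B is transposed, each element of D is the dot product of a row of A and a row of B, so the inner loop reads both operands with unit stride.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the lab sources): D = A * B^T for
   row-major n x n float matrices. D[i][j] is the dot product of
   row i of A and row j of B. */
void matmul_bt(size_t n, const float *a, const float *b, float *d)
{
    assert(a != NULL && b != NULL && d != NULL);

    /* Rows of D are independent, so they can be shared among threads. */
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            /* The reduction clause allows partial sums to be kept in
               vector lanes and combined after the loop. */
            #pragma omp simd reduction(+:sum)
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[j * n + k];
            d[i * n + j] = sum;
        }
    }
}
```

Compiling with gcc -fopenmp enables both pragmas, and the thread count can be set through the OMP_NUM_THREADS environment variable (1, 2, 4). For the non-vectorized baseline, one option is to remove the simd pragma and compile with -fno-tree-vectorize.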
2. The code in saxpy.c contains two different functions for the SAXPY operation.
Vectorize the loops of N iterations that call the saxpy and saxpyi functions. The
saxpy_no_simd and saxpyi_no_simd functions (not to be vectorized) are there only
to compare performance with and without vectorization.
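One possible approach is sketched below. The signatures of saxpy and saxpyi are assumptions, since saxpy.c is not reproduced here: saxpy is taken to compute one element from scalar arguments, and saxpyi is taken to receive the arrays plus an index. The run_saxpy and run_saxpyi wrappers are hypothetical names for the loops of N iterations. A loop that calls a function can generally only be vectorized if a SIMD version of that function exists, which is what declare simd requests.

```c
#include <assert.h>

/* Assumed signature: one SAXPY element, a*x + y, from scalars. */
#pragma omp declare simd uniform(a) notinbranch
float saxpy(float a, float x, float y)
{
    return a * x + y;
}

/* Assumed signature: indexed variant. a, x and y are the same for all
   SIMD lanes (uniform); i grows by 1 from lane to lane (linear). */
#pragma omp declare simd uniform(a, x, y) linear(i:1) notinbranch
float saxpyi(float a, const float *x, const float *y, int i)
{
    return a * x[i] + y[i];
}

/* Hypothetical wrapper for the loop of N iterations calling saxpy. */
void run_saxpy(int n, float a, const float *x, const float *y, float *z)
{
    assert(n >= 0);
    #pragma omp simd
    for (int i = 0; i < n; i++)
        z[i] = saxpy(a, x[i], y[i]);
}

/* Hypothetical wrapper for the loop of N iterations calling saxpyi. */
void run_saxpyi(int n, float a, const float *x, const float *y, float *z)
{
    assert(n >= 0);
    #pragma omp simd
    for (int i = 0; i < n; i++)
        z[i] = saxpyi(a, x, y, i);
}
```

The uniform and linear clauses are optional, but they tell the compiler that the pointers are shared by all lanes and that consecutive lanes access consecutive elements, which usually avoids gather instructions.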
3. Starting from the program written for the previous point, parallelize the four
loops of N iterations. Analyze the performance.
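A minimal sketch of the two loop patterns this exercise involves, again under assumed signatures (the scalar form of saxpy and saxpy_no_simd, and the wrapper names, are hypothetical). The vectorized loops can use the combined parallel for simd construct, which first splits the iterations among threads and then vectorizes each thread's chunk; the reference loops use plain parallel for.

```c
#include <assert.h>

/* Assumed scalar kernel with a SIMD version requested. */
#pragma omp declare simd uniform(a) notinbranch
static float saxpy(float a, float x, float y)
{
    return a * x + y;
}

/* Assumed reference kernel, deliberately left without declare simd. */
static float saxpy_no_simd(float a, float x, float y)
{
    return a * x + y;
}

/* Threads plus vectorization: each thread gets a contiguous chunk of
   iterations (static schedule) and vectorizes it. */
void run_saxpy_parallel(int n, float a, const float *x, const float *y,
                        float *z)
{
    assert(n >= 0);
    #pragma omp parallel for simd schedule(static)
    for (int i = 0; i < n; i++)
        z[i] = saxpy(a, x[i], y[i]);
}

/* Threads only: the same loop without the simd directive. */
void run_saxpy_no_simd_parallel(int n, float a, const float *x,
                                const float *y, float *z)
{
    assert(n >= 0);
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; i++)
        z[i] = saxpy_no_simd(a, x[i], y[i]);
}
```

The same two patterns would apply to the saxpyi and saxpyi_no_simd loops. Since SAXPY performs very little arithmetic per element loaded, it may be worth checking whether the measured speedup stops growing with more threads because of memory bandwidth.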