Part 1: Caching
1. [25 pts] In class, code for computing the sum of all cells in a 10,000 x 10,000 matrix of double-precision numbers was presented, and the performance of row-wise summation vs. column-wise summation was compared. For the row-wise summation, the following output was produced when the program was executed under the performance profiler on Linux:
    duane@laptop:~/mat_sum$ perf stat -e cycles:u -e instructions:u \
        -e L1-dcache-loads:u -e L1-dcache-load-misses:u \
        -e L1-icache-load-misses:u ./matrix_sum_rowwise
    Sum: 49997291.929030

     Performance counter stats for './matrix_sum_rowwise':

         3,884,859,754 cycles:u                #    0.000 GHz
         8,800,430,880 instructions:u          #    2.27  insns per cycle
         2,900,762,202 L1-dcache-loads
            12,569,154 L1-dcache-misses        #    0.43% of all L1-dcache hits
                 8,246 L1-icache-misses

           1.304023634 seconds time elapsed
Based on the number of L1 data cache misses (slightly over 12,500,000) while computing the sum of the matrix in a row-wise fashion, how many bytes are stored in each L1 cache entry of the processor on which this program was executed? Describe how you derive your answer.
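One way to approach the derivation is sketched below. It assumes that every matrix element is read exactly once in memory order, that each L1 data cache miss brings in exactly one cache entry, and that entry sizes are powers of two. The function name `estimate_cache_line_bytes` is hypothetical and is not part of the class code.

```python
import math


def estimate_cache_line_bytes(rows, cols, element_bytes, l1_misses):
    """Estimate the L1 cache entry size from a sequential scan.

    Assumption: each miss loads one cache entry, so
    bytes scanned / misses approximates the entry size.
    """
    total_bytes = rows * cols * element_bytes   # bytes touched by the scan
    raw = total_bytes / l1_misses               # average bytes per miss
    # Assumption: entry sizes are powers of two, so snap to the nearest one.
    nearest_pow2 = 2 ** round(math.log2(raw))
    return raw, nearest_pow2


# Figures taken from the question: 10,000 x 10,000 doubles (8 bytes each)
# and the L1-dcache miss count reported by perf.
raw, line_bytes = estimate_cache_line_bytes(10_000, 10_000, 8, 12_569_154)
print(f"{raw:.2f} bytes per miss, nearest power of two: {line_bytes}")
```

The raw ratio falls slightly below a power of two because a small share of the misses come from data other than the matrix, such as the stack and program startup.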
2. [10 pts] The i7 processor has three levels of cache. Each processor has at least two cores (like CPUs within a CPU). Each core has its own small L1 cache and a larger, slightly slower L2 cache. An even larger L3 cache is shared by all cores on the processor and is connected to the memory system.
What is the advantage to the L3 cache being shared by all cores, instead of each core having its own large L3 cache?
2. [35 pts] Assuming the structure as provided in question #1, if a program is assigned 3G of memory:
a. What is the smallest number of pages assigned to this program?
b. What is the smallest number of page tables (pte_t page frames) required for this program?
c. What is the smallest number of pmd tables required for this program?
d. What is the smallest number of pgd tables required for this program?
e. Assuming the space overhead for memory management is the number of bytes required to store all the tables related to paging (i.e., your answers to b, c, and d), how much memory is consumed in memory management overhead for this program?
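The counting in parts a through e can be sketched as follows. The structure from question #1 is not reproduced here, so the default parameters are assumptions only: 4096-byte pages, 512 entries per table at every level, each table occupying one page frame, and "3G" meaning 3 x 2^30 bytes laid out contiguously and aligned. Substitute the values from question #1 before relying on the numbers. The function name `paging_overhead` is hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def paging_overhead(program_bytes, page_size=4096, entries_per_table=512):
    """Count the minimum tables in a three-level pgd -> pmd -> pte layout.

    Assumptions (not taken from the question): page_size and
    entries_per_table defaults, and one page frame per table.
    """
    pages = ceil_div(program_bytes, page_size)            # part a
    pte_tables = ceil_div(pages, entries_per_table)       # part b
    pmd_tables = ceil_div(pte_tables, entries_per_table)  # part c
    # Part d: at least one top-level table always exists.
    pgd_tables = max(1, ceil_div(pmd_tables, entries_per_table))
    # Part e: every table is assumed to fill exactly one page frame.
    overhead = (pte_tables + pmd_tables + pgd_tables) * page_size
    return {
        "pages": pages,
        "pte_tables": pte_tables,
        "pmd_tables": pmd_tables,
        "pgd_tables": pgd_tables,
        "overhead_bytes": overhead,
    }


result = paging_overhead(3 * 2**30)
for name, value in result.items():
    print(f"{name}: {value}")
```

Keeping the page size and entries per table as parameters makes it easy to rerun the count once the actual structure from question #1 is plugged in.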