
COMP 2213 X2: Computer Architecture and Organization

Assignment 3
Due: Wednesday, March 20, 2013 at midnight

Part 1: Caching
1. [25 pts] In class, code for computing the sum of all cells in a 10,000 x 10,000 matrix of double-precision numbers was presented, and the performance of row-wise summation was compared with that of column-wise summation. For the row-wise summation, the following output was produced when the program was executed under the perf performance profiler on Linux:

    duane@laptop:~/mat_sum$ perf stat -e cycles:u -e instructions:u \
        -e L1-dcache-loads:u -e L1-dcache-load-misses:u \
        -e L1-icache-load-misses:u ./matrix_sum_rowwise
    Sum: 49997291.929030

     Performance counter stats for './matrix_sum_rowwise':

         3,884,859,754 cycles:u             #    0.000 GHz
         8,800,430,880 instructions:u       #    2.27  insns per cycle
         2,900,762,202 L1-dcache-loads
            12,569,154 L1-dcache-misses     #    0.43% of all L1-dcache hits
                 8,246 L1-icache-misses

           1.304023634 seconds time elapsed

Based on the number of L1 data cache misses (slightly over 12,500,000) while computing the sum of the matrix in a row-wise fashion, how many bytes are stored in each L1 cache entry of the processor on which this program was executed? Describe how you derive your answer.
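
For reference, the two traversal orders compared in this question can be sketched as follows. This is a minimal illustration, not the code presented in class: the function names and the flat row-major layout (cell (i, j) stored at index i*n + j) are assumptions. Python is used only to show the access pattern; its interpreter overhead hides most cache effects, so timings from this sketch are not comparable to the perf output above.

```python
from array import array


def make_matrix(n):
    """Build a flat, row-major n x n matrix of doubles (hypothetical helper).

    Cell (i, j) lives at index i * n + j, so the cells of one row are
    adjacent in memory and the cells of one column are n elements apart.
    """
    return array("d", (float(i * n + j) for i in range(n) for j in range(n)))


def sum_rowwise(m, n):
    """Sum all cells, visiting each row from left to right.

    Consecutive accesses are one element (m.itemsize bytes) apart.
    """
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += m[i * n + j]
    return total


def sum_columnwise(m, n):
    """Sum all cells, visiting each column from top to bottom.

    Consecutive accesses are n elements (n * m.itemsize bytes) apart.
    """
    total = 0.0
    for j in range(n):
        for i in range(n):
            total += m[i * n + j]
    return total


if __name__ == "__main__":
    n = 4  # small size for illustration; the assignment uses 10,000
    m = make_matrix(n)
    print("row-wise sum:   ", sum_rowwise(m, n))
    print("column-wise sum:", sum_columnwise(m, n))
    print("row-wise stride (bytes):   ", m.itemsize)
    print("column-wise stride (bytes):", n * m.itemsize)
```

Both functions compute the same sum; only the distance in memory between consecutive accesses differs, which is the property the cache-miss count depends on.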

2. [10 pts] The i7 processor has three levels of cache. Each processor has at least two cores (like CPUs within a CPU). Each core has its own small L1 cache, and a larger and slightly slower L2 cache. Then, there is an L3 cache shared by the whole processor, and the even larger L3 cache is connected to the memory system.

What is the advantage of the L3 cache being shared by all cores, rather than each core having its own large L3 cache?

Part 2: Virtual Memory


1. [30 pts] Suppose that in an Intel system with a 3-level hierarchical page table, as described in section 3.1 of https://www.kernel.org/doc/gorman/html/understand/understand006.html , each entry in a process's pgd table and pmd tables consumes 8 bytes, and each entry in the page tables (described as pte_t page frames at the above URL) consumes 16 bytes. Also assume that each table consumes 4K of memory.
   a. How many pages can be assigned by the system to a single program?
   b. Assuming 4K page sizes, how much memory (in gigabytes) can be assigned to a single program?
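
As background for this part, the general relationship between table size, entry size, and the index fields of a virtual address in a multi-level page table can be sketched as below. This is an illustrative sketch only: the helper names are hypothetical, and the example parameters (4-byte entries at every level) are deliberately not the ones given in the question. It also assumes the number of entries per table is a power of two.

```python
def index_bits(table_bytes, entry_bytes):
    """Number of address bits needed to index one table.

    Assumes table_bytes / entry_bytes is a power of two.
    """
    entries = table_bytes // entry_bytes
    return entries.bit_length() - 1


def split_address(vaddr, level_bits, offset_bits):
    """Split a virtual address into per-level indices and a page offset.

    level_bits lists the index width of each level, top level first
    (for a 3-level scheme: pgd, pmd, pte).
    Returns (indices, offset) with indices in the same order.
    """
    offset = vaddr & ((1 << offset_bits) - 1)
    vaddr >>= offset_bits
    indices = []
    # Peel off the lowest level first, then reverse to top-level-first order.
    for bits in reversed(level_bits):
        indices.append(vaddr & ((1 << bits) - 1))
        vaddr >>= bits
    indices.reverse()
    return indices, offset


if __name__ == "__main__":
    # Hypothetical parameters: 4096-byte tables with 4-byte entries.
    bits = index_bits(4096, 4)
    levels = [bits, bits, bits]
    vaddr = (3 << 32) | (5 << 22) | (7 << 12) | 0x123
    print(bits, split_address(vaddr, levels, 12))
```

The number of entries in each table bounds how many lower-level tables (or pages) it can point to, so the reachable address space is the product of the entry counts across levels times the page size.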

2. [35 pts] Assuming the structure as provided in question #1, if a program is assigned 3 GB of memory:
   a. What is the smallest number of pages assigned to this program?
   b. What is the smallest number of page tables (pte_t page frames) required for this program?
   c. What is the smallest number of pmd tables required for this program?
   d. What is the smallest number of pgd tables required for this program?
   e. Assuming the space overhead for memory management is the number of bytes required to store all the tables related to paging (i.e. your answers to b, c, and d), how much memory is consumed in memory management overhead for this program?
