3. Assume a disk subsystem with the following components and MTTF:

   - 12 disks, each rated at 1,500,000-hour MTTF
   - 1 SCSI controller, 750,000-hour MTTF
   - 1 power supply, 200,000-hour MTTF
   - 1 fan, 300,000-hour MTTF
   - 1 SCSI cable, 2,000,000-hour MTTF

   Using the simplifying assumptions that component lifetimes are exponentially distributed (which means that the age of a component does not affect its probability of failure) and that failures are independent, compute the MTTF of the system as a whole.
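Under the stated assumptions (exponential lifetimes, independent failures), the failure rates of the components add, and the system MTTF is the reciprocal of the summed rate. The following is a minimal sketch of that arithmetic, not an official solution; the names `COMPONENTS` and `system_mttf` are illustrative and do not come from the problem statement.

```python
from fractions import Fraction

# (component, count, MTTF in hours), copied from the problem statement
COMPONENTS = [
    ("disk", 12, 1_500_000),
    ("SCSI controller", 1, 750_000),
    ("power supply", 1, 200_000),
    ("fan", 1, 300_000),
    ("SCSI cable", 1, 2_000_000),
]


def system_mttf(components):
    """Return the system MTTF in hours as an exact fraction.

    Assumes exponentially distributed lifetimes and independent failures,
    so the system failure rate is the sum of count / MTTF over all parts.
    """
    total_rate = sum(Fraction(count, mttf) for _, count, mttf in components)
    return 1 / total_rate


mttf = system_mttf(COMPONENTS)
print(f"Failure rate: {float(1 / mttf):.4e} failures per hour")
print(f"System MTTF:  {float(mttf):,.0f} hours")
```

With these inputs the summed failure rate is 109 / 6,000,000 per hour, so the sketch reports a system MTTF of roughly 55,000 hours.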
4. Suppose the following branch instructions have been executed.

   Label   Address   Branch   Taken/Not Taken
   1       101101    b1       NT
   2       101101    b1       NT
   3       101101    b1       T
   4       110011    b2       NT
   5       110011    b2       T
a. Show the prediction for each branch instruction using a tournament predictor with 2 entries. Also show the final contents of the Predictor 1 buffer and the Predictor 2 buffer. Predictor 1 and Predictor 2 are 2-bit saturating counters with 2 prediction entries. Note that Predictor 1 is a local predictor while Predictor 2 is global. Assume all table and buffer contents are initialized to zero.

   Instruction   1   2   3   4   5
   Prediction
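As a building block for part a, the behavior of a single 2-bit saturating counter can be sketched as follows. This is only an illustration of one counter in isolation: it does not model the tournament selector, the table indexing, or the global history, none of which are fully specified here. The function name `run_two_bit_counter` and the convention that counter values 2 and 3 predict taken are assumptions of this sketch.

```python
def run_two_bit_counter(outcomes, counter=0):
    """Simulate one 2-bit saturating counter over a sequence of outcomes.

    outcomes: iterable of booleans, True meaning the branch was taken.
    counter:  initial state in 0..3 (0 matches "initialized to zero").
    Returns (list of predictions, final counter value).
    Assumed convention: states 2 and 3 predict taken, 0 and 1 not taken.
    """
    predictions = []
    for taken in outcomes:
        predictions.append(counter >= 2)      # predict before seeing outcome
        if taken:
            counter = min(counter + 1, 3)     # saturate at 3
        else:
            counter = max(counter - 1, 0)     # saturate at 0
    return predictions, counter


# Outcomes of branch b1 from the table: NT, NT, T
b1_predictions, b1_final = run_two_bit_counter([False, False, True])
print(b1_predictions, b1_final)
```

Feeding each predictor the outcomes selected by its own indexing scheme, and then choosing between the two predictions per branch, would extend this sketch toward the full tournament predictor.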
5. The latencies of the pipeline function units are:

   Function Unit Type   Cycles in EX   Number of Reservation Stations
   Integer              1              5
   FP adder             5              2

   Assume the following:
   - There is no forwarding between function units; results are communicated by a CDB.
   - There are separate integer functional units for effective address calculation, for ALU operation, and for branch condition evaluation.
   - There are two FP adder units.
   - The EX stage does the effective address calculation only for loads and stores.
   - The issue (IS) and write result (WB) stages each take 1 clock cycle.
   - There are 5 load buffer slots and 5 store buffer slots. The load and store latencies are 3 cycles (1 for address calculation and 2 for memory access).
   - The BNE takes 1 clock cycle. Assume branches are single-issue but branch prediction is perfect.

   Fill out the timetable of a pipeline using the two-issue Tomasulo algorithm.
   Instruction              Issue   Write CDB   Comments
   L.D    F0, 0(R1)
   ADD.D  F4, F0, F2
   S.D    F4, 0(R1)
   DADDIU R1, R1, #-8
   BNE    R1, R2, LOOP
   L.D    F0, 0(R1)
   ADD.D  F4, F0, F2
   S.D    F4, 0(R1)
   DADDIU R1, R1, #-8
   BNE    R1, R2, LOOP
6. A pipelined microprocessor has separate integer and floating-point functional units. Use the instruction latencies in the following table:

   Inst producing result   Inst using result    Latency in clock cycles
   FP ALU op               Another FP ALU op    3
   FP ALU op               Store double         2
   Load double             Another FP ALU op    1
   Load double             Store double         0
   Integer ALU op          Any Int              0
Given the following source code

   for (i = 100; i > 1; i--)
       A[i] = x*B[i] + y*C[i];

and its translated MIPS code for the loop body:

   Loop: L.D    F4, 0(R1)       ; load B[i]
         MUL.D  F6, F4, F0      ; multiply x*B[i]
         L.D    F8, 0(R2)       ; load C[i]
         MUL.D  F10, F8, F2     ; multiply y*C[i]
         ADD.D  F12, F6, F10    ; add x*B[i] + y*C[i]
         S.D    F12, 0(R3)      ; store A[i]
         DADDUI R3, R3, #-8     ; decrement A index
         DADDUI R1, R1, #-8     ; decrement B index
         DADDUI R2, R2, #-8     ; decrement C index
         BNE    R3, R4, Loop    ; exit loop if done
a. If the processor uses in-order execution, identify the stalls in the above code. For each stall, give the number of stall cycles and the instruction that causes it. What is the execution time of the loop body (per iteration)?
b. With a single-issue pipeline, unroll the loop twice and schedule it so that there are no delays. Show the schedule after eliminating any redundant overhead instructions. What is the execution time of the loop body (per iteration) after unrolling and scheduling?
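One way to check a hand count of the stalls in part a is a small single-issue, in-order issue model driven by the latency table. The sketch below is built on several assumptions that the problem does not state: MUL.D is treated as an FP ALU op, producer/consumer pairs missing from the table get latency 0, no branch delay is modeled, and only one iteration is simulated. The names `LATENCY`, `LOOP`, and `schedule` are illustrative.

```python
# (producer kind, consumer kind) -> intervening stall cycles, from the table
LATENCY = {
    ("fp", "fp"): 3,
    ("fp", "store"): 2,
    ("load", "fp"): 1,
    ("load", "store"): 0,
    ("int", "int"): 0,
}

# (text, kind, destination register or None, source registers)
LOOP = [
    ("L.D F4, 0(R1)", "load", "F4", ["R1"]),
    ("MUL.D F6, F4, F0", "fp", "F6", ["F4", "F0"]),   # assumed FP ALU op
    ("L.D F8, 0(R2)", "load", "F8", ["R2"]),
    ("MUL.D F10, F8, F2", "fp", "F10", ["F8", "F2"]),
    ("ADD.D F12, F6, F10", "fp", "F12", ["F6", "F10"]),
    ("S.D F12, 0(R3)", "store", None, ["F12", "R3"]),
    ("DADDUI R3, R3, #-8", "int", "R3", ["R3"]),
    ("DADDUI R1, R1, #-8", "int", "R1", ["R1"]),
    ("DADDUI R2, R2, #-8", "int", "R2", ["R2"]),
    ("BNE R3, R4, Loop", "int", None, ["R3", "R4"]),
]


def schedule(instrs):
    """Return (issue cycle, stall cycles, text) for each instruction."""
    producer = {}  # register -> (issue cycle, kind) of its latest writer
    cycle = 0
    rows = []
    for text, kind, dest, srcs in instrs:
        earliest = cycle + 1  # in-order, one instruction per cycle
        for reg in srcs:
            if reg in producer:
                p_cycle, p_kind = producer[reg]
                lat = LATENCY.get((p_kind, kind), 0)  # unknown pairs: 0
                earliest = max(earliest, p_cycle + 1 + lat)
        rows.append((earliest, earliest - (cycle + 1), text))
        cycle = earliest
        if dest is not None:
            producer[dest] = (cycle, kind)
    return rows


rows = schedule(LOOP)
for issue, stalls, text in rows:
    print(f"cycle {issue:2d}  stalls {stalls}  {text}")
```

The same `schedule` function can be reused on a reordered or unrolled instruction list to check whether a proposed schedule for part b is free of stalls under these assumptions.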