
Performance

CSE141: Introduction to Computer Architecture

Performance

When would we compare the Performance of two computers?

Design
Purchase

What metrics do we use to compare the Performance of two computers?

Cost
Hard disk capacity
Weight
Battery
Throughput
Latency/Execution Time

Throughput vs Latency

Throughput
the total amount of work done in a given time
Latency (Response Time/Execution Time)
the time between the start and completion of a task

Latency / Execution Time - Shorter execution time means better performance

Exercise

Do the following changes to a computer system increase throughput, decrease response time, or both?

1. Replacing the processor in a computer with a faster version.

2. Adding additional processors to a system that uses multiple processors for separate tasks, for example, searching the web.

Solution

Decreasing response time almost always improves throughput. Hence, in case 1, both
response time and throughput are improved.

In case 2, no one task gets work done faster, so only throughput increases.
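
The two cases can be checked with a small model. This is only a sketch under simplifying assumptions that are not in the slides: all tasks are independent, take the same time, and each runs on exactly one processor. The function name `batch_metrics` and the task counts are hypothetical.

```python
import math

def batch_metrics(num_tasks, task_seconds, num_processors):
    """Return (latency, throughput) for a batch of equal, independent tasks.

    Assumption: a task cannot be split across processors, so its latency
    is the per-task time regardless of how many processors exist.
    """
    latency = task_seconds
    # Tasks run in waves of up to num_processors at a time.
    waves = math.ceil(num_tasks / num_processors)
    total_time = waves * task_seconds
    throughput = num_tasks / total_time  # tasks completed per second
    return latency, throughput

# Baseline: one processor, 1 second per task.
print(batch_metrics(num_tasks=8, task_seconds=1.0, num_processors=1))
# Case 1: a processor twice as fast (latency drops, throughput rises).
print(batch_metrics(num_tasks=8, task_seconds=0.5, num_processors=1))
# Case 2: four processors of the original speed (only throughput rises).
print(batch_metrics(num_tasks=8, task_seconds=1.0, num_processors=4))
```

In this model, case 1 changes both numbers, while case 2 leaves the latency at 1 second and raises only the throughput.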

Performance Equation

CPU execution time of a program = Instruction count of a program x Average CPI x Clock cycle time

ET = IC * CPI * CT
where

ET = Execution Time
IC = Instruction Count
CPI = Cycles per Instruction
CT = Clock Time
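
The performance equation ET = IC * CPI * CT translates directly into code. The sketch below assumes the clock time is given in seconds per cycle and derives it from a clock rate in Hz; the function names and the sample inputs are hypothetical, chosen only for illustration.

```python
def clock_time_from_rate(clock_rate_hz):
    """Clock time (seconds per cycle) is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz

def execution_time(instruction_count, cpi, clock_time):
    """ET = IC * CPI * CT, returned in seconds."""
    return instruction_count * cpi * clock_time

# Hypothetical program: 1,000,000 instructions, average CPI 1.5, 2 GHz clock.
et = execution_time(1_000_000, 1.5, clock_time_from_rate(2e9))
print(f"ET = {et * 1e3:.3f} ms")  # ET = 0.750 ms
```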

Speedup

Compare the relative performance of the baseline system and the improved system

Definition

Speedup = Execution Time of baseline system / Execution Time of improved system
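
As a quick illustration of the speedup definition, here is a minimal sketch. The function name `speedup` and the sample times are hypothetical; a result above 1 means the improved system is faster, a result below 1 means it is slower.

```python
def speedup(baseline_time, improved_time):
    """Speedup = execution time of baseline / execution time of improved."""
    if improved_time <= 0:
        raise ValueError("execution time must be positive")
    return baseline_time / improved_time

print(speedup(10.0, 8.0))   # 1.25, the improved system is faster
print(speedup(10.0, 12.5))  # 0.8, the "improved" system is actually slower
```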

Exercise

Assume that for a given program 70% of the executed instructions are arithmetic, 10% are
load/store, and 20% are branch. Given this instruction mix and the assumption that an arithmetic
instruction requires 2 cycles, a load/store instruction takes 6 cycles, and a branch instruction takes 3
cycles, find the average CPI.

Instruction Type    Frequency    CPI    Frequency x CPI
Arithmetic          0.70         2      1.4
Load/Store          0.10         6      0.6
Branch              0.20         3      0.6

Average CPI = 1.4 + 0.6 + 0.6 = 2.6
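
The average CPI is a frequency-weighted sum, which the following sketch reproduces for the instruction mix of this exercise. The function name `average_cpi` and the dictionary layout are assumptions made for illustration.

```python
def average_cpi(mix):
    """Weighted average CPI from (frequency, cpi) pairs.

    Frequencies are fractions of executed instructions and must sum to 1.
    """
    pairs = list(mix)
    total_frequency = sum(freq for freq, _ in pairs)
    if abs(total_frequency - 1.0) > 1e-9:
        raise ValueError("instruction frequencies must sum to 1")
    return sum(freq * cpi for freq, cpi in pairs)

# Instruction mix from the exercise: (frequency, cycles per instruction).
mix = {
    "arithmetic": (0.70, 2),
    "load/store": (0.10, 6),
    "branch": (0.20, 3),
}
print(round(average_cpi(mix.values()), 2))  # 2.6
```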

Exercise

Assume that we have an application with a total of 500000 instructions, of which 20% are load/store instructions with an average CPI of 6 cycles, and the rest are integer instructions with an average CPI of 1 cycle. If the processor runs at 1GHz, how long is the execution time?

Average CPI = 0.20 x 6 + 0.80 x 1 = 2

ET = (500000 x 2 x 1) / (1 x 10^9)
   = 1000000 / (1 x 10^9) s
   = 1,000,000 ns or 1 millisecond

Exercise (alternative solution)

The same execution time can be obtained by summing the cycles of each instruction class separately:

ET = (0.20 x 500,000 x 6) / (1 x 10^9) + (0.80 x 500,000 x 1) / (1 x 10^9)
   = 1,000,000 / (1 x 10^9) s
   = 1,000,000 ns or 1 millisecond
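
Both ways of solving this exercise (through the average CPI, or by summing the cycles of each instruction class) should give the same answer. The sketch below checks that with the numbers from the exercise; the function names are hypothetical.

```python
import math

IC = 500_000                # total instruction count
CLOCK_RATE_HZ = 1e9         # 1 GHz
MIX = [(0.20, 6), (0.80, 1)]  # (fraction, CPI): load/store, integer

def et_via_average_cpi(mix, instruction_count, clock_rate_hz):
    """ET = IC * average CPI / clock rate."""
    avg_cpi = sum(fraction * cpi for fraction, cpi in mix)
    return instruction_count * avg_cpi / clock_rate_hz

def et_via_per_class_cycles(mix, instruction_count, clock_rate_hz):
    """ET = sum over classes of (instructions in class * CPI) / clock rate."""
    return sum(fraction * instruction_count * cpi / clock_rate_hz
               for fraction, cpi in mix)

a = et_via_average_cpi(MIX, IC, CLOCK_RATE_HZ)
b = et_via_per_class_cycles(MIX, IC, CLOCK_RATE_HZ)
print(f"{a * 1e3:.3f} ms, {b * 1e3:.3f} ms")  # 1.000 ms, 1.000 ms
assert math.isclose(a, b)
```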

Exercise

Consider the same application: 500000 instructions, of which 20% are load/store instructions with an average CPI of 6 cycles and the rest are integer instructions with an average CPI of 1 cycle, running on a 1GHz processor (execution time 1 ms).

If we double the clock rate to 2GHz without improving the memory latency, the average CPI of load/store instructions becomes 12 cycles. What is the performance improvement after this change?

ET = IC x CPI x CT

ET = (0.20 x 500000 x 12 + 0.80 x 500000 x 1) / (2 x 10^9)
   = 1,600,000 / (2 x 10^9) s
   = 0.8 ms

Performance Change (Speedup) = 1 ms / 0.8 ms = 1.25
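
The effect of doubling the clock rate can be reproduced numerically. This is a sketch using the figures from the exercise; the function name `execution_time` and its argument layout are assumptions made for illustration.

```python
def execution_time(classes, instruction_count, clock_rate_hz):
    """Sum the cycles of each (fraction, CPI) class, then divide by clock rate."""
    cycles = sum(fraction * instruction_count * cpi for fraction, cpi in classes)
    return cycles / clock_rate_hz

IC = 500_000

# 1 GHz: load/store CPI is 6, integer CPI is 1.
baseline = execution_time([(0.20, 6), (0.80, 1)], IC, 1e9)
# 2 GHz with unchanged memory latency: load/store CPI doubles to 12.
improved = execution_time([(0.20, 12), (0.80, 1)], IC, 2e9)

print(f"baseline = {baseline * 1e3:.2f} ms")      # baseline = 1.00 ms
print(f"improved = {improved * 1e3:.2f} ms")      # improved = 0.80 ms
print(f"speedup  = {baseline / improved:.2f}")    # speedup  = 1.25
```

The speedup is 1.25 rather than 2 because the load/store instructions take the same wall-clock time at either clock rate; only the integer instructions get faster.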

x86
x86 is a prime example of CISC

Many instruction formats.
Variable length.
Many complex rules about which register can be used when, and which addressing modes are valid where.
Very complex instructions.
Combined memory/arithmetic.
Special-purpose registers.
Huge instruction set.

Differences between x86 and MIPS

x86 instructions can operate on memory or registers or both
x86 is a two-address ISA: both operands are sources, and one is also the destination
x86 has (lots of) special-purpose registers
x86 has variable-length instructions, between 1 and 15 bytes
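
The two-address point can be illustrated with a toy register model. This is not an emulator, only a sketch: the register files are plain dictionaries, and the helper names `add_two_address` and `add_three_address` are hypothetical. It contrasts an x86-style `add eax, ebx` with a MIPS-style `add $t0, $t1, $t2`.

```python
def add_two_address(regs, dst, src):
    """x86-style add: dst = dst + src, so dst is a source and the destination."""
    regs[dst] = regs[dst] + regs[src]

def add_three_address(regs, dst, src1, src2):
    """MIPS-style add: dst = src1 + src2, and neither source is overwritten."""
    regs[dst] = regs[src1] + regs[src2]

# x86-style: the original value of eax is lost after the add.
x86 = {"eax": 5, "ebx": 7}
add_two_address(x86, "eax", "ebx")
print(x86)   # {'eax': 12, 'ebx': 7}

# MIPS-style: both sources survive because the destination is separate.
mips = {"$t0": 0, "$t1": 5, "$t2": 7}
add_three_address(mips, "$t0", "$t1", "$t2")
print(mips)  # {'$t0': 12, '$t1': 5, '$t2': 7}
```

In the two-address form, keeping the old value of the destination requires copying it to another register first.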
