Power PC Architecture

ARKADIP RAY
Introduction
o PowerPC (Performance Optimization With Enhanced RISC –
Performance Computing) is a RISC architecture created in 1991 by
the Apple–IBM–Motorola (AIM) alliance.
o The original idea for the PowerPC architecture came from IBM's
POWER architecture (introduced in the RS/6000), and PowerPC
retains a high level of compatibility with it.
o The intention was to build a high-performance, superscalar,
low-cost processor.
History - POWER architecture
In February 1990, IBM introduced the RS/6000, based on the
POWER architecture and running a UNIX operating system. The
POWER architecture incorporated many RISC characteristics:
 fixed-length instructions (4 bytes), which allow a single decoding
mechanism,
 register-to-register architecture,
 simple addressing modes,
 large general register file,
 mostly single-cycle instruction execution,
 a small number of instructions.
PowerPC Architecture
o PowerPC was created in 1991 by the Apple-IBM-Motorola alliance. Originally
intended for personal computers, PowerPC CPUs have since become
popular in embedded and high-performance systems as well. It is
largely based on, and compatible with, the POWER architecture. Design features
of PowerPC are as follows:
 Broad range of implementations
 Simple processor design
 Superscalar architecture
 Multiprocessor features
 64-bit architecture
 Support for operation in both big-endian and little-endian modes. PowerPC can
switch from one mode to the other at run time.
 A separate set of floating-point instructions that operate on a separate set of
floating-point registers (FPRs).
PowerPC Generations
 PowerPC e200 – 32-bit Power Architecture microprocessor, with speeds up to
600 MHz; suited to embedded applications.
 PowerPC e300 – similar to the e200, with speeds up to 667 MHz.
 PowerPC e600 – speeds up to 2 GHz; suited to high-performance routing and
telecommunications applications.
 POWER5 – IBM dual-core microprocessor.
 POWER6 – IBM dual-core microprocessor. A notable difference from POWER5 is that the
POWER6 executes instructions in order instead of out of order.
 PowerPC G3 – used in Apple Macintosh computers such as the PowerBook G3, the
multicolored iMacs, iBooks, and several desktops, including both the Beige and the Blue and
White Power Macintosh G3s.
 PowerPC G4 – a designation used by Apple Computer for the fourth generation
of 32-bit PowerPC microprocessors.
 PowerPC G5 – 64-bit Power Architecture processors.
 Xenon – based on IBM's PowerPC ISA; used in the Xbox 360 game console.
 Broadway – based on IBM's PowerPC ISA; used in the Nintendo Wii game console.
 Blue Gene/L – dual-core PowerPC 440, 700 MHz, 2004.
 Blue Gene/P – quad-core PowerPC 450, 850 MHz, 2007.
Operating Systems
o Operating systems that run on the PowerPC architecture are
generally divided into those oriented toward general-purpose
PowerPC systems and those oriented toward embedded
PowerPC systems.
e.g. - Apple's Classic Mac OS, IBM i5/OS, Linux (Debian, Fedora, Mint,
Red Hat Enterprise Linux, Ubuntu), Solaris, Windows NT, CellOS for
PlayStation
 Companies that have produced or used 32/64-bit PowerPC processors
e.g. - Intel, Apple, Cisco Systems (for routers), Motorola, Samsung,
Xilinx, Microsoft, HCL, Sony, Toshiba
 Gaming consoles
e.g. - Bandai, Xbox, Nintendo, PlayStation
Overall design
o Instruction Unit
a) Instruction Queue
o Independent Execution Unit
a) Branch Processing Unit (BPU)
b) Integer Unit (IU)
c) Floating Point Unit (FPU)
o Memory Management Unit (MMU)
o Cache Unit
o Memory Unit
o System Interface
Overview
o The 601 is the first implementation of the PowerPC family of reduced
instruction set computer (RISC) microprocessors.
o The 601 implements the 32-bit portion of the PowerPC architecture, which
provides 32-bit effective (logical) addresses, integer data types of 8, 16, and
32 bits, and floating-point data types of 32 and 64 bits.
o For 64-bit PowerPC implementations, the PowerPC architecture provides
64-bit integer data types, 64-bit addressing, and other features required to
complete the 64-bit architecture.
o The 601 is a superscalar processor capable of issuing and retiring three
instructions per clock, one to each of three execution units.
o The 601 integrates three execution units—an integer unit (IU), a branch
processing unit (BPU), and a floating-point unit (FPU). The ability to execute
three instructions in parallel and the use of simple instructions with rapid
execution times yield high efficiency and throughput for 601-based systems.
o The 601 includes an on-chip, 32-Kbyte, eight-way set-associative, physically
addressed, unified instruction and data cache and an on-chip memory
management unit (MMU).
o The 601 has a 64-bit data bus and a 32-bit address bus.
Instruction Unit
 The 601 instruction unit, which contains an instruction queue and the
BPU, provides centralized control of instruction flow to the execution
units. The instruction unit determines the address of the next
instruction to be fetched based on information from a sequential
fetcher and the BPU. The instruction unit also enforces pipeline
interlocks and controls feed-forwarding.
Instruction Queue
 The instruction queue holds as many as eight instructions (a cache
block) and can be filled from the cache during a single cycle.
 The upper half of the instruction queue (Q4–Q7) provides buffering
to reduce the frequency of cache accesses.
 Integer and branch instructions are dispatched to their respective
execution units from Q0 through Q3. Q0 functions as the initial
decode stage for the IU.
Independent Execution Unit
Branch Processing Unit (BPU)
 The BPU performs condition register (CR) look-ahead operations on conditional
branches. The BPU looks through the bottom half of the instruction queue for a
conditional branch instruction and attempts to resolve it early, achieving the effect of
a zero-cycle branch in many cases.
 The BPU contains an adder to compute branch target addresses and three special-
purpose, user-control registers—the link register (LR), the count register (CTR),
and the CR.
Integer Unit (IU)

 The IU executes all integer instructions and executes floating-point memory accesses
in concert with the FPU.
 The IU executes one integer instruction at a time, performing computations with its
arithmetic logic unit (ALU), multiplier, divider, integer exception register (XER),
and the general-purpose register file. Most integer instructions are single-cycle
instructions.
Independent Execution Unit Continued …
Floating Point Unit (FPU)
 The FPU contains a single-precision multiply-add array, the floating-point status and
control register (FPSCR), and thirty-two 64-bit FPRs.
 The multiply-add array allows the 601 to efficiently implement floating-point
operations such as multiply, add, divide, and multiply-add.
 The FPU is pipelined so that most single-precision instructions and many double-
precision instructions can be issued back-to-back.
 The FPU contains two additional instruction queues. These queues allow floating-
point instructions to be issued from the instruction queue even if the FPU is busy,
making instructions available for issue to the other execution units.
Memory Management Unit (MMU)
 The 601's MMU supports up to 4 petabytes (2^52 bytes) of virtual memory and 4
gigabytes (2^32 bytes) of physical memory. The MMU also controls access privileges for
these spaces on a block and page basis.
 Referenced and changed status are maintained by the processor for each page to
assist implementation of a demand-paged virtual memory system.
 The instruction unit generates all instruction addresses; these addresses are both for
sequential instruction fetches and addresses that correspond to a change of program
flow.
 The integer unit generates addresses for data accesses (both for memory and the I/O
controller interface).
 After an address is generated, the upper order bits of the logical (effective) address
are translated by the MMU into physical address bits. Simultaneously, the lower
order address bits (that are untranslated and therefore considered both logical and
physical), are directed to the on-chip cache where they form the index into the
eight-way set-associative tag array.
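The sizes and the address split described above can be checked with a little arithmetic. The following is a minimal sketch, assuming a 4-Kbyte page size (not stated above); the helper name split_effective_address is hypothetical.

```python
# Address widths quoted above: 52-bit virtual, 32-bit physical.
VIRTUAL_BITS = 52
PHYSICAL_BITS = 32

# Assumption for this sketch: 4-Kbyte pages, so the low 12 bits are untranslated.
PAGE_OFFSET_BITS = 12


def split_effective_address(ea):
    """Split a 32-bit effective address into (upper_bits, page_offset)."""
    ea &= 0xFFFFFFFF
    upper_bits = ea >> PAGE_OFFSET_BITS                # translated by the MMU
    page_offset = ea & ((1 << PAGE_OFFSET_BITS) - 1)   # sent directly to the cache
    return upper_bits, page_offset


virtual_bytes = 2 ** VIRTUAL_BITS     # 4 petabytes (binary)
physical_bytes = 2 ** PHYSICAL_BITS   # 4 gigabytes (binary)

upper, offset = split_effective_address(0x12345678)
print(hex(upper), hex(offset))  # 0x12345 0x678
```

Because the low-order bits need no translation, the cache lookup can start while the MMU is still translating the upper bits.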
Cache Unit
 The PowerPC 601 microprocessor contains a 32-Kbyte, eight-way set associative,
unified (instruction and data) cache.
 The cache line size is 64 bytes, divided into two eight-word sectors, each of which
can be snooped, loaded, cast-out, or invalidated independently.
 The cache is designed to adhere to a write-back policy, but the 601 allows control of
cacheability, write policy, and memory coherency at the page and block level.
 The cache uses a least recently used (LRU) replacement policy.
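The cache geometry follows from the figures above: 32 Kbytes divided by (8 ways x 64 bytes per line) gives 64 sets. The sketch below works this out and models LRU replacement within one set. It is an illustration only, not a description of the 601's actual replacement hardware; the names LruSet and set_index are hypothetical.

```python
from collections import OrderedDict

CACHE_BYTES = 32 * 1024
WAYS = 8
LINE_BYTES = 64
SECTOR_BYTES = 8 * 4   # eight 32-bit words per sector

NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)   # 64 sets


def set_index(address):
    """Select the set for a physical address (line number modulo set count)."""
    return (address // LINE_BYTES) % NUM_SETS


class LruSet:
    """One eight-way set with least-recently-used replacement."""

    def __init__(self, ways=WAYS):
        self.ways = ways
        self.lines = OrderedDict()   # tag -> None, least recently used first

    def access(self, tag):
        """Return True on a hit; on a miss, insert the tag, evicting the LRU line if full."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.ways:
            self.lines.popitem(last=False)   # evict the least recently used tag
        self.lines[tag] = None
        return False
```

With 64 sets of 64-byte lines, the set index and byte offset together occupy the low 12 address bits.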
Memory Unit
 The 601’s memory unit contains read and write queues that buffer operations
between the external interface and the cache.
 These operations result from load and store instructions that miss in the
cache, and from read and write operations required to maintain
cache coherency, table search, and other operations.
 The memory unit also handles address-only operations and cache-inhibited loads and
stores.
System Interface
The 601’s System Interface supports
 Burst-read memory operations, followed by burst-write memory operations,
 I/O controller interface operations, and single-beat (noncacheable or write-
through) memory read and write operations.
 Additionally, address-only operations, variants of the burst and single-beat operations
(global memory operations and atomic memory operations) and address retry
activity.
 Memory accesses can occur in single-beat (1–8 bytes) and four-beat burst (32 bytes)
data transfers.
 The address and data buses are independent for memory accesses to support
pipelining and split transactions.
 The 601 can pipeline as many as two transactions and has limited support for out-of-
order split-bus transactions.
PowerPC Registers
PowerPC's application-level registers fall into three categories:
general-purpose, floating-point, and special-purpose registers.
o General-purpose registers (GPRs) - r0 to r31
A flat scheme of 32 general-purpose registers.
Source and destination for all integer operations.
Address source for all load/store operations.
They also provide access to SPRs.
All GPRs are available for use, with one exception: in certain instructions, GPR0 simply means the
value 0, and no lookup is done for GPR0's contents.
o Some of these registers have special tasks assigned to them:
• r0 Volatile register which may be modified during function linkage
• r1 Stack frame pointer, always valid
• r2 System-reserved register
• r3-r4 Volatile registers used for parameter passing and return values
• r5-r10 Volatile registers used for parameter passing
• r11-r12 Volatile registers which may be modified during function linkage
• r13 Small data area pointer register
• r14-r30 Registers used for local variables
• r31 Used for local variables or "environment pointers"
Floating Point Registers
o Floating-point registers (FPRs) - f0 to f31
 32 floating-point registers with 64-bit precision.
 Source and destination operands of all floating-point operations.
 Can contain 32-bit and 64-bit signed and unsigned integer values, as well as single-
precision and double-precision floating-point values.
 FPRs also provide access to the FPSCR (Floating-Point Status and Control Register).
 The FPSCR captures status and exceptions resulting from floating-point operations, and
also provides control bits for enabling specific exception types.
 Instructions that load and store double-precision floating-point numbers transfer
64 bits of data without conversion.
 Instructions that load single-precision floating-point numbers from memory convert them to
double-precision format before storing them in the register.
• f0 Volatile register
• f1 Volatile register used for parameter passing and return values
• f2-f8 Volatile registers used for parameter passing
• f9-f13 Volatile registers
• f14-f31 Registers used for local variables
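The load behaviour described above (single-precision values are widened to double precision, double-precision values are transferred unchanged) can be imitated in Python, whose floats are IEEE 754 doubles. This is a minimal sketch; load_single and load_double are hypothetical helper names, and big-endian byte order is assumed.

```python
import struct


def load_single(raw):
    """Model a single-precision load: 4 big-endian bytes, widened to a double."""
    (value,) = struct.unpack(">f", raw)
    return value   # held as a 64-bit double from here on


def load_double(raw):
    """Model a double-precision load: 8 big-endian bytes, no conversion."""
    (value,) = struct.unpack(">d", raw)
    return value


one = load_single(bytes.fromhex("3f800000"))     # 1.0 in single precision
tenth = load_single(bytes.fromhex("3dcccccd"))   # nearest single to 0.1

print(one)            # 1.0
print(tenth == 0.1)   # False: the widened single is not the nearest double to 0.1
```

Widening is exact: every single-precision value is representable as a double, so storing the register back as a single returns the original four bytes.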
Special-purpose registers (SPRs)
 The Fixed-Point Exception Register (XER): used for indicating
conditions for integer operations, such as carries and overflows.
 The Floating-Point Status and Control Register (FPSCR): 32-bit
register used to store the status and control of the floating-point
operations.
 The Count Register (CTR): used to hold a loop count that can be
decremented during the execution of branch instructions.
 The Condition Register (CR): 32-bit register grouped into eight
fields, where each field has 4 bits that signify the result of an instruction's
operation: Less Than (LT), Greater Than (GT), Equal (EQ), and
Summary Overflow (SO).
 The Link Register (LR): contains the address to return to at the end
of a function call.
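As an illustration of the CR layout, the sketch below builds a 4-bit field from a comparison and places it in one of the eight CR fields. Within a field the bits are ordered LT, GT, EQ, SO from most to least significant, and field 0 occupies the most significant four bits of the CR. The function names are hypothetical.

```python
# Bit order within one 4-bit CR field, most significant first.
LT, GT, EQ, SO = 0b1000, 0b0100, 0b0010, 0b0001


def compare_field(a, b, summary_overflow=False):
    """Return the 4-bit CR field value for a signed comparison of a and b."""
    if a < b:
        bits = LT
    elif a > b:
        bits = GT
    else:
        bits = EQ
    if summary_overflow:
        bits |= SO
    return bits


def set_cr_field(cr, field, bits):
    """Return a copy of the 32-bit CR with field 0..7 replaced (field 0 is most significant)."""
    shift = 4 * (7 - field)
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (bits << shift)


def get_cr_field(cr, field):
    """Read field 0..7 from the 32-bit CR."""
    return (cr >> (4 * (7 - field))) & 0xF


cr = set_cr_field(0, 0, compare_field(1, 2))    # 1 < 2 sets LT in field 0
cr = set_cr_field(cr, 7, compare_field(5, 5))   # 5 == 5 sets EQ in field 7
print(hex(cr))  # 0x80000002
```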
Data Types
 PowerPC can use either little-endian or big-endian byte ordering.
 Fixed-point data types include:
o Unsigned byte, 8 bits
o Unsigned halfword, 16 bits
o Signed halfword, 16 bits
o Unsigned word, 32 bits
o Signed word, 32 bits
o Unsigned doubleword, 64 bits
o Byte strings, from 0 to 128 bytes in length
 Two's complement is used for negative values.
 Floating-point data formats:
o single-precision, 32 bits long (23-bit fraction + 8-bit exponent + 1 sign bit)
o double-precision, 64 bits long (52-bit fraction + 11-bit exponent + 1 sign bit)
 Characters are stored using 8-bit ASCII codes.
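The two byte orders and the two's complement representation can be illustrated with Python's built-in integer conversions. This is a small sketch with arbitrary example values.

```python
word = 0x12345678

# The same 32-bit word laid out in memory under each byte order.
big = word.to_bytes(4, "big")         # most significant byte first
little = word.to_bytes(4, "little")   # least significant byte first

print(big.hex())      # 12345678
print(little.hex())   # 78563412

# Two's complement: the signed halfword -2 is stored as 0xFFFE.
neg = (-2).to_bytes(2, "big", signed=True)
back = int.from_bytes(neg, "big", signed=True)

print(neg.hex(), back)  # fffe -2
```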
Instruction Types
Instruction Format
 All instruction encodings are 32 bits in length.
 Bit numbering for PowerPC is the opposite of most other definitions: bit 0 is the
most significant bit, and bit 31 is the least significant bit.
 Instructions are first decoded by the upper 6 bits in a field, called the primary
opcode. The remaining 26 bits contain fields for operand specifiers, immediate
operands, and extended opcodes, and these may be reserved bits or fields.
 Common Instruction formats:
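Format    Fields (bit ranges; bit 0 is the most significant bit)
D-form    opcd (0-5), tgt/src (6-10), src/tgt (11-15), immediate (16-31)
X-form    opcd (0-5), tgt/src (6-10), src/tgt (11-15), src (16-20), extended opcd (21-30), Rc or reserved (31)
A-form    opcd (0-5), tgt/src (6-10), src/tgt (11-15), src (16-20), src (21-25), extended opcd (26-30), Rc (31)
BD-form   opcd (0-5), BO (6-10), BI (11-15), BD (16-29), AA (30), LK (31)
I-form    opcd (0-5), LI (6-29), AA (30), LK (31)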
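Because bit 0 is the most significant bit, extracting a field means shifting right by (31 - last bit). The sketch below decodes a D-form word this way. The helper names are hypothetical, and the immediate is sign-extended as it is for addi; some D-form instructions treat it as unsigned.

```python
def field(word, first, last):
    """Extract bits first..last (inclusive) using PowerPC numbering, bit 0 = MSB."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode_d_form(word):
    """Decode a D-form instruction word into its four fields."""
    return {
        "opcd": field(word, 0, 5),      # primary opcode
        "rt": field(word, 6, 10),       # target (or source) register
        "ra": field(word, 11, 15),      # source (or target) register
        "imm": sign_extend(field(word, 16, 31), 16),
    }


decoded = decode_d_form(0x38610008)   # addi r3, r1, 8
print(decoded)  # {'opcd': 14, 'rt': 3, 'ra': 1, 'imm': 8}
```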
 D-form: provides up to two registers as source operands, one immediate source, and up to two
registers as target operands. Some variations of this instruction format use portions of the target and
source register operand specifiers as immediate fields or as extended opcodes.
 X-form: provides up to two registers as source operands and up to two target operands. Some
variations of this instruction format use portions of the target and source operand specifiers as
immediate fields or as extended opcodes.
 A-form: provides up to three registers as source operands, and one target operand. Some variations
of this instruction format use portions of the target and source operand specifiers as immediate fields
or as extended opcodes.
 BD-form: used by conditional branch instructions. The BO field specifies the type of condition; the BI field
specifies which CR bit is used as the condition; the BD field is the branch displacement. The AA bit
specifies whether the branch is absolute or relative. The LK bit specifies whether the
address of the next sequential instruction is saved in the Link Register as a return address for a
subroutine call.
 I-form: used by the unconditional branch instruction. Because the branch is unconditional, the BO and BI fields of
the BD format are exchanged for additional branch displacement to form the LI field. This
instruction format also supports the AA and LK bits in the same fashion as the BD format.
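The branch fields can be turned into a target address as sketched below. The displacement field (24-bit LI or 14-bit BD) is extended with two low-order zero bits and sign-extended; AA selects absolute or relative addressing. The function names are hypothetical, and the primary opcodes 18 (b) and 16 (bc) are the only ones handled.

```python
def field(word, first, last):
    """Extract bits first..last (inclusive) using PowerPC numbering, bit 0 = MSB."""
    width = last - first + 1
    return (word >> (31 - last)) & ((1 << width) - 1)


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(word, cia):
    """Target of an I-form (b) or BD-form (bc) branch located at address cia."""
    opcd = field(word, 0, 5)
    aa = field(word, 30, 30)
    if opcd == 18:      # I-form: 24-bit LI field
        disp = sign_extend(field(word, 6, 29) << 2, 26)
    elif opcd == 16:    # BD-form: 14-bit BD field
        disp = sign_extend(field(word, 16, 29) << 2, 16)
    else:
        raise ValueError("not an I-form or BD-form branch")
    base = 0 if aa else cia   # AA = 1 means the displacement is an absolute address
    return (base + disp) & 0xFFFFFFFF


def saves_return_address(word):
    """True if LK = 1, meaning the next instruction address goes to the Link Register."""
    return bool(field(word, 31, 31))


print(hex(branch_target(0x48000008, 0x1000)))   # b  +8       -> 0x1008
print(hex(branch_target(0x4BFFFFFD, 0x1000)))   # bl -4       -> 0xffc
print(hex(branch_target(0x41820008, 0x2000)))   # bc 12,2,+8  -> 0x2008
```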
PowerPC Addressing Modes
 Load/store architecture
 Indirect
 Instruction includes a 16-bit displacement to be added to a base register (may be a GP
register)
 Can replace the base register content with the new address
 Indirect indexed
 Instruction references a base register and an index register (both may be GP)
 EA is the sum of their contents
 Branch address (target address, TA, calculation)
 Absolute: TA = actual address
 Relative: TA = current instruction address + signed displacement (the 24-bit LI or
14-bit BD field, extended with two low-order zero bits)
 Indirect: Link Register, TA = (LR); Count Register, TA = (CTR)
 Arithmetic
 Operands in registers or part of instruction
 Floating point is register only
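The load/store effective address (EA) calculations can be sketched as follows, including the rule that GPR0 as a base register means the value 0. The register file is modelled as a plain list, the function names are hypothetical, and a 32-bit implementation is assumed.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def ea_displacement(gpr, ra, d):
    """Indirect with displacement: EA = (rA or 0) + sign-extended 16-bit d."""
    base = 0 if ra == 0 else gpr[ra]   # base register 0 means the value 0
    return (base + sign_extend(d & 0xFFFF, 16)) & 0xFFFFFFFF


def ea_indexed(gpr, ra, rb):
    """Indirect indexed: EA = (rA or 0) + (rB)."""
    base = 0 if ra == 0 else gpr[ra]
    return (base + gpr[rb]) & 0xFFFFFFFF


def ea_with_update(gpr, ra, d):
    """Update form: compute the EA, then write it back into the base register."""
    if ra == 0:
        raise ValueError("update forms require a base register other than r0")
    ea = ea_displacement(gpr, ra, d)
    gpr[ra] = ea
    return ea


gpr = [0] * 32
gpr[0] = 0xDEAD    # ignored when r0 is named as the base register
gpr[1] = 0x1000
gpr[4] = 0x20

print(hex(ea_displacement(gpr, 1, 8)))     # 0x1008
print(hex(ea_displacement(gpr, 0, 0x10)))  # 0x10
print(hex(ea_indexed(gpr, 1, 4)))          # 0x1020
```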
