
Arithmetic, Logic, and ALU Introduction

The ability to perform arithmetic and logical computations is a critical task for a computer. The ALU is a common building block in most CPU-type functions. The device performs a variety of arithmetic and logical operations on two N-bit numbers and generates an N-bit output. Control inputs specify the operation to be performed.

Most ALUs today can perform the following operations.

Simple operations
   Basic addition, subtraction, multiplication, and division. In very simple devices, operations are limited to addition and subtraction.
   Bitwise logical operations: AND, OR, NOR, XOR.
   Bit shift operations: shifting or rotating a word left or right, with or without sign extension.
   Comparison.

Complex operations
   As the complexity of supported operations increases, cost, size, and power all increase. Complexity can branch in different directions.
   Speed: perform elementary arithmetic or logical operations in one or several clock cycles. A barrel shifter is a good example: it can shift a data word by a specified number of bits in a single clock cycle.
   Functionality: implement operations such as floating point math in hardware.

During the course of our studies we will examine the basic implementation of
   Four fundamental arithmetic functions: add, subtract, multiply, divide.
   Several essential logical operations: complement, shifting.
We will also examine alternate implementation methods that improve performance in terms of speed and cost.

Let's look at each of these. We'll begin with basic binary arithmetic. We'll see that this is no different from what we've been doing in base 10.

Binary Arithmetic Operations

We will begin with binary arithmetic using unsigned numbers. For now we will work only with integers. Floating point operations are merely an extension of the basic ideas.

Unsigned Numbers

Unsigned numbers are considered to be all positive. We began learning decimal arithmetic by studying operations on single digits. We use the same technique for binary.

Addition and Subtraction

Addition and subtraction are relatively straightforward operations, executed much as one would expect.

Addition

Basic binary addition proceeds as follows

     0      0      1      1     augend
    +0     +1     +0     +1     addend
   ---    ---    ---    ---
     0      1      1     10     carry, sum

Carries propagate to the left, as in familiar decimal addition. The name is often further qualified as carry out, indicating the carry out from one column to the next. As a further qualification, the carry out from the ith column becomes the carry in to the (i+1)th column. We see this in the next example. We can extend to multiple bits very easily.

Example

               111        11             carries
     1001      0111      1011      0101
   + 1010    + 1101    + 0001    + 1010
    -----     -----     -----     -----
    10011     10100      1100      1111

Observe
   Carry propagation: the carry out from one column is the carry in to the next.
   Overflow: in the first two cases we get a carry out from the most significant bit. This is called overflow. The result is too large to fit in the word or register designated to hold it. Here we are working with 4-bit words and have a 5-bit result.
   Overflow can be a serious problem. If it is not registered, the result of adding two large numbers can be interpreted as a small number.
The Full Adder

We can build a hardware circuit to implement such an adder in a rather straightforward way. Begin with a single-bit adder, called a full adder.

A full adder has
   Three inputs: augend bit, addend bit, carry in bit.
   Two outputs: sum bit, carry out bit.
If there is no carry in bit, the circuit is called a half adder.

The full adder block has inputs X, Y, and Ci and outputs Sum and Co. To implement the adder in hardware, begin with the truth table. We must consider the two inputs and carry in, and the sum output and carry out.

   X  Y  Ci | S  Co
   0  0  0  | 0  0
   0  0  1  | 1  0
   0  1  0  | 1  0
   0  1  1  | 0  1
   1  0  0  | 1  0
   1  0  1  | 0  1
   1  1  0  | 0  1
   1  1  1  | 1  1

We can now write two equations
   S  = !X!YCi + !XY!Ci + X!Y!Ci + XYCi
   Co = !XYCi + X!YCi + XY!Ci + XYCi
These reduce to
   S  = X ⊕ Y ⊕ Ci
   Co = Ci(X ⊕ Y) + XY
The two equations give what we call the full adder, with inputs X, Y, Ci and outputs S, Co.
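The reduced equations can be checked against the truth table with a few lines of Python. This is only an illustrative sketch; the function name `full_adder` is an assumption made here, not something from the notes.

```python
def full_adder(x: int, y: int, ci: int) -> tuple[int, int]:
    """One-bit full adder built from the reduced equations; returns (S, Co)."""
    s = x ^ y ^ ci                   # S  = X xor Y xor Ci
    co = (ci & (x ^ y)) | (x & y)    # Co = Ci(X xor Y) + XY
    return s, co


# Enumerate the inputs in truth-table order: X, Y, Ci = 000 .. 111.
rows = [(x, y, ci) for x in (0, 1) for y in (0, 1) for ci in (0, 1)]
s_column = [full_adder(*row)[0] for row in rows]
co_column = [full_adder(*row)[1] for row in rows]
```

The two lists reproduce the S and Co columns of the table, and for every row X + Y + Ci equals S + 2·Co.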
Ripple Carry Adder

Consider now that we are working with 4-bit words

     1011
   + 0110
    -----
    10001

Adding these two numbers generates a carry out of the MSB. The result is too large to fit into 4 bits. We have produced overflow and must be aware of it.

To compute a 4-bit sum in hardware we can use 4 full adders from above. We take the carry out from stage i and treat it as the carry in to stage i+1. Thus a 4-bit adder is a chain of four full adders, each with inputs X, Y, Ci and outputs Sum, Co, with each Co wired to the Ci of the next stage.

If we examine the process of producing the 4-bit sum, we observe that we cannot compute the sum in column i+1 until the carry from column i is available. If we define the carry propagation delay as t_carry, each additional column delays the availability of the final sum by t_carry. For a 32-bit word the total delay is 32 * t_carry, which can become significant.

Advantages: the design is low cost and simple to implement.
Disadvantages: slow, long carry delay path, asynchronous.
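A minimal sketch of the ripple carry chain, assuming the operands are passed as ordinary Python integers and processed one bit position at a time. The names `full_adder` and `ripple_carry_add` are chosen here for illustration only.

```python
def full_adder(x: int, y: int, ci: int) -> tuple[int, int]:
    """One-bit full adder; returns (sum bit, carry out)."""
    return x ^ y ^ ci, (ci & (x ^ y)) | (x & y)


def ripple_carry_add(a: int, b: int, n: int = 4) -> tuple[int, int]:
    """Add two n-bit unsigned values; returns (n-bit sum, carry out of MSB)."""
    carry = 0
    total = 0
    for i in range(n):
        # Stage i cannot produce its sum until the carry from stage i-1 exists.
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry
```

For the 4-bit example above, `ripple_carry_add(0b1011, 0b0110)` returns the sum bits 0001 with a carry out of 1, which is the overflow case.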
Carry Save Adder

The carry save adder is simple and low cost. Like the ripple carry adder, the basic algorithm replicates the pencil and paper method. One bit position is added per clock, and the carry is saved in a flip-flop between clocks. This arrangement is also commonly called a bit-serial adder.

Components: a single full adder, several shift registers, and control logic. The block diagram is given in the next diagram.

In the design, data are loaded into the X and Y registers in parallel and read from the Sum register in parallel. The X, Y, and Sum registers are n bits long.

Control algorithm
   Reset
   Load X
   Load Y
   repeat n times
      clock X, Y, Sum registers and D flip-flop
   end repeat
   Read Sum register and Co

Variations: the X and Y registers can be serially loaded or configured as circular registers, and the Sum register can be read serially.

Advantages: the design is low cost, simple to implement, and synchronous. The carry does not have a long ripple path delay.
Disadvantage: slow.
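The control algorithm can be simulated in software. The sketch below assumes the registers shift right with the LSB presented to the full adder first and the sum bit entering the Sum register at its MSB end; the notes do not state the shift direction, so that detail and the name `serial_add` are assumptions.

```python
def serial_add(x: int, y: int, n: int = 4) -> tuple[int, int]:
    """Bit-serial addition: one full adder, shift registers, and a carry flip-flop."""
    carry_ff = 0          # D flip-flop, cleared by Reset
    sum_reg = 0           # n-bit Sum shift register
    for _ in range(n):    # one clock per bit position
        xb, yb = x & 1, y & 1                       # bits at the register outputs
        s = xb ^ yb ^ carry_ff                      # full adder sum
        carry_ff = (carry_ff & (xb ^ yb)) | (xb & yb)  # carry saved for next clock
        x >>= 1                                     # clock the X register
        y >>= 1                                     # clock the Y register
        sum_reg = (sum_reg >> 1) | (s << (n - 1))   # clock the Sum register
    return sum_reg, carry_ff                        # Read Sum register and Co
```

After n clocks the first sum bit has travelled to the LSB of the Sum register, so the register can be read in parallel.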
Carry Look Ahead

To get around the carry delay problem, use a technique called carry look ahead. The idea amounts to computing the carry at the same time as the sum. Let's examine the carry out equation from above
   Co = Ci(X ⊕ Y) + XY
Examining the equation, Co is true under the following conditions.
   When the addend and augend bits are both 1. Such a condition generates a carry.
   When either bit is 1 and there is a carry in. The condition in which either bit is 1 cannot generate a carry by itself. However, it can combine with an incoming carry to propagate that carry to the next stage.
The term XY is called the generate term G. The term (X ⊕ Y) is called the propagate term P.

We can rewrite the equation as
   Co = Ci P + G
For stage 0 we have
   Co_0 = Ci_0 (X_0 ⊕ Y_0) + X_0 Y_0 = Ci_0 P_0 + G_0
For stage 1 we have
   Co_1 = Ci_1 (X_1 ⊕ Y_1) + X_1 Y_1 = Ci_1 P_1 + G_1
   Co_1 = Co_0 (X_1 ⊕ Y_1) + X_1 Y_1 = Co_0 P_1 + G_1
   Co_1 = (Ci_0 (X_0 ⊕ Y_0) + X_0 Y_0)(X_1 ⊕ Y_1) + X_1 Y_1 = (Ci_0 P_0 + G_0) P_1 + G_1
We can continue in the same way for each stage and write the equation in a much more compact form. Thus we have the general term
   Co_i = Ci_i P_i + G_i
or, expanded for stage n,
   Co_n = G_n + G_{n-1} P_n + G_{n-2} P_{n-1} P_n + ... + Ci_0 P_0 P_1 ... P_n

As is evident, building full carry look ahead even for a small adder becomes large very quickly. Doing so is not reasonable for a large system such as a 64-bit adder. Rather than full look ahead, build a hybrid, implemented as a multilevel system: build full look ahead across smaller blocks of adders and ripple the carries between such blocks.

Design for a 64-bit look ahead carry adder

Modules
   Four-bit adder module: comprises four full adders with look ahead carry across four bits. The module is illustrated in the accompanying diagram.


   Four-bit look ahead carry generator: provides look ahead carry across four bits. The module is illustrated in the accompanying diagram.

Can now implement system as illustrated in following diagram
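As a sketch of a single four-bit look ahead block, the code below computes every carry directly from the generate and propagate terms and the block carry in, with no carry depending on another carry. The function name `cla_add4` and the integer interface are assumptions for illustration; a real design would be gates, not a loop.

```python
def cla_add4(a: int, b: int, c0: int = 0) -> tuple[int, int]:
    """4-bit carry look ahead adder; returns (4-bit sum, carry out)."""
    x = [(a >> i) & 1 for i in range(4)]
    y = [(b >> i) & 1 for i in range(4)]
    g = [x[i] & y[i] for i in range(4)]   # generate terms  G_i = X_i Y_i
    p = [x[i] ^ y[i] for i in range(4)]   # propagate terms P_i = X_i xor Y_i

    # Each carry is a flat sum of products of G, P, and the block carry in.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))

    carries_in = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries_in[i]) << i for i in range(4))  # S_i = P_i xor C_i
    return total, c4
```

The growth of the carry expressions from `c1` to `c4` shows why full look ahead is kept to small blocks, with `c4` rippling into the next block in the hybrid design.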

Table Look Up

Table look up is a method to perform arithmetic operations very quickly: simply look up the answer. The operands provide the address into the table. The answers are precomputed and stored in the table at the location indicated by the operands.

Consider the simple case of two-bit operands. The concatenated operands form a 4-bit number, which gives 16 combinations. Each table entry stores a 4-bit word holding the carry and the two-bit sum.
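The two-bit case is small enough to write out directly. In this sketch the table is a Python list and each entry is a `(carry, sum)` pair rather than a packed word; both choices, and the names `ADD_TABLE` and `lookup_add`, are assumptions for illustration.

```python
# Address = operands concatenated: a in the upper two bits, b in the lower two.
# Each entry holds (carry, two-bit sum), precomputed once.
ADD_TABLE = [((a + b) >> 2, (a + b) & 0b11) for a in range(4) for b in range(4)]


def lookup_add(a: int, b: int) -> tuple[int, int]:
    """Add two 2-bit operands by table look up; returns (carry, 2-bit sum)."""
    return ADD_TABLE[(a << 2) | b]
```

At run time no addition is performed: `lookup_add(3, 3)` simply reads entry 15 of the table.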
Adder Symbol

In the best object-centered sense, the specific implementation of the adder is context dependent and based upon requirements. From the user perspective we draw the adder as a single block symbol. The number of bits in each input operand and in the result output is given by context.

Subtraction

The subtraction operation parallels addition. Basic binary subtraction proceeds as follows

     0      0      1      1     minuend
    -0     -1     -0     -1     subtrahend
   ---    ---    ---    ---
     0     11      1      0     borrow, difference

Borrows propagate to the left, as in familiar decimal subtraction. As discussed with addition, the borrow has several interpretations: the borrow out from the ith column becomes the borrow in to the (i+1)th column. As we did with the adder, we can extend to multiple bits.
Example

     1001      0111      1011      0101
   - 1010    - 1101    - 0001    - 1010
    -----     -----     -----     -----
     1111      1010      1010      1011
        1         1         0         1     borrow out of MSB

Observe
   Borrow propagation.
   Underflow: in several cases we get a borrow out from the most significant bit. This is called underflow. The result is too small to fit in the word designated to hold it. Here we are working with 4-bit words, and the borrow out acts as a fifth bit of the result.
   Underflow can be a serious problem. If it is not registered, the result of subtracting two numbers can be interpreted as a number larger than the original.
Full Subtractor

We can build a hardware circuit to implement such a subtractor in a rather straightforward way. Begin with a single-bit subtractor, called a full subtractor.

A full subtractor has
   Three inputs: minuend bit, subtrahend bit, borrow in bit.
   Two outputs: difference bit, borrow out bit.
If there is no borrow in bit, the circuit is called a half subtractor.

The full subtractor block has inputs X, Y, and Bi and outputs Difference and Bo. To implement the subtractor in hardware we must consider the two inputs and borrow in, and the difference output and borrow out.

   X  Y  Bi | D  Bo
   0  0  0  | 0  0
   0  0  1  | 1  1
   0  1  0  | 1  1
   0  1  1  | 0  1
   1  0  0  | 1  0
   1  0  1  | 0  0
   1  1  0  | 0  0
   1  1  1  | 1  1

We can now write two equations
   D  = !X!YBi + !XY!Bi + X!Y!Bi + XYBi
   Bo = !X!YBi + !XY!Bi + !XYBi + XYBi
These reduce to
   D  = X ⊕ Y ⊕ Bi
   Bo = Bi(!X + Y) + !XY
The two equations give what we call the full subtractor, with inputs X, Y, Bi and outputs D, Bo.

Consider now that we are working with 4-bit words

     0110
   - 1011
    -----
    11011

Subtracting these two numbers generates a borrow out of the MSB, shown as the leading 1. The result is too small to fit into 4 bits. We have produced underflow.
Ripple Borrow Subtractor

To compute a 4-bit difference in hardware we can use 4 full subtractors from above. We take the borrow out from stage i and treat it as the borrow in to stage i+1. Thus a 4-bit subtractor is a chain of four full subtractors, with each Bo wired to the Bi of the next stage.

If we examine the process of producing the 4-bit difference, we observe that we cannot compute the difference in column i+1 until the borrow from column i is available. If we define the borrow propagation delay as t_borrow, each additional column delays the availability of the final difference by t_borrow. For a 32-bit word the total delay is 32 * t_borrow, which can become significant.

Advantages: the design is low cost and simple to implement.
Disadvantages: slow, long borrow delay path, asynchronous.
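The full subtractor equations and the ripple borrow chain can be sketched the same way as the adder. The function names below are assumptions for illustration; `1 - x` is used as the complement of a single bit.

```python
def full_subtractor(x: int, y: int, bi: int) -> tuple[int, int]:
    """One-bit full subtractor; returns (D, Bo)."""
    not_x = 1 - x
    d = x ^ y ^ bi                           # D  = X xor Y xor Bi
    bo = (bi & (not_x | y)) | (not_x & y)    # Bo = Bi(!X + Y) + !XY
    return d, bo


def ripple_borrow_sub(a: int, b: int, n: int = 4) -> tuple[int, int]:
    """Compute a - b on n-bit words; returns (n-bit difference, borrow out of MSB)."""
    borrow = 0
    diff = 0
    for i in range(n):
        # Stage i needs the borrow out of stage i-1 before it can finish.
        d, borrow = full_subtractor((a >> i) & 1, (b >> i) & 1, borrow)
        diff |= d << i
    return diff, borrow
```

For the 4-bit example, `ripple_borrow_sub(0b0110, 0b1011)` gives difference bits 1011 with a borrow out of 1, the underflow case.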
Borrow Save Subtractor

The borrow save subtractor is a low cost and simple modification of the adder. Like the ripple borrow subtractor, the basic algorithm replicates the pencil and paper method.

Components: a single full subtractor, several shift registers, and control logic. The block diagram is given in the next diagram.

In the design, data are loaded into the X and Y registers in parallel and read from the Difference register in parallel. The X, Y, and Difference registers are n bits long.

Control algorithm
   Reset
   Load X
   Load Y
   repeat n times
      clock X, Y, Difference registers and D flip-flop
   end repeat
   Read Difference register and Bo

Variations: the X and Y registers can be serially loaded or configured as circular registers, and the Difference register can be read serially.

Advantages: the design is low cost, simple to implement, and synchronous. The borrow does not have a long ripple path delay.
Disadvantage: slow.

The ripple borrow design encounters the same propagation delay as the 4-bit adder. We can implement look ahead borrow in the same way as we did with the adder.

Look Ahead Borrow Subtractor

The look ahead borrow subtractor mirrors the look ahead carry design. It is rarely implemented.
Table Look Up

The table look up design similarly parallels the work done for the adder design.
Subtractor Symbol
Like the adder, the specific implementation is context dependent and based upon requirements. Generally subtraction is implemented using 2s complement. From the user perspective we draw the subtractor as a block labeled Sub with inputs Opr0 and Opr1 and output Difference. The number of bits in each input operand and in the result output is given by context.

Multiplication

Basic binary multiplication proceeds as follows

     0      0      1      1     multiplicand
    x0     x1     x0     x1     multiplier
   ---    ---    ---    ---
     0      0      0      1     product

Observe that the binary multiplication operation implements the AND operation. There are a variety of ways to implement multiplication in an ALU. We'll look at a couple.
Shift and Add

The simplest method duplicates the pencil and paper approach. Consider a 16-bit multiplier built from a 16-bit multiplicand register, a 16-bit multiplier register, and a 32-bit product register.

Operation proceeds as follows

Algorithm
   load multiplier and multiplicand registers
   clear product register
   repeat n times
      if multiplier LSB = 1
         product register ← product register + multiplicand    // added into the upper half
      end if
      shift product register right 1 bit                       // carry out of the add enters at the MSB
      shift multiplier register right 1 bit
   end repeat

Example

Multiply 1101 by 1011


   Step                     Multiplier   Carry   Product
   reset product register      1011        0     0000 0000
   multiply, add               1011        0     1101 0000
   shift                       0101        0     0110 1000
   multiply, add               0101        1     0011 1000
   shift                       0010        0     1001 1100
   shift                       0001        0     0100 1110
   multiply, add               0001        1     0001 1110
   shift                       0000        0     1000 1111

The product is 1000 1111.

Advantage: the shift and add algorithm is simple and easy to implement.
Disadvantage: slow.
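A short sketch of the shift and add loop, using a Python integer as the product register so that the carry out of each add is kept automatically. Adding the multiplicand into the upper half of the register is written as a left shift by n; the name `shift_add_multiply` is an assumption.

```python
def shift_add_multiply(multiplicand: int, multiplier: int, n: int = 4) -> int:
    """Multiply two n-bit unsigned values; returns the 2n-bit product."""
    product = 0                               # clear product register
    for _ in range(n):
        if multiplier & 1:                    # multiplier LSB = 1
            product += multiplicand << n      # add into the upper half
        product >>= 1                         # shift product (with carry) right
        multiplier >>= 1                      # shift multiplier right
    return product
```

With the operands of the worked example, `shift_add_multiply(0b1101, 0b1011)` passes through the same intermediate register values and ends at 1000 1111.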
Wallace Tree

The Wallace tree utilizes the fact that basic arithmetic is a combinational logic problem. Multiplication is a repeated add operation, and bit multiplication is the binary AND operation.

The Wallace tree builds a multilevel combinational logic array.
   Level 0: builds the partial product array, implemented as an array of AND operations. Shifting is inherent in the array interconnections.
   Levels 1..N: implemented as N levels of arrays of full adders. The full adders are viewed as three-input, two-output adders. The input is 3 binary bits, with carry in used as one of the bits. The output is the sum and carry out. Successive levels reduce the partial product array until two integer numbers remain.
   Level N+1: the two integer numbers produced from the partial product array are added in parallel to give the final product.
Example

Build a Wallace tree 4 x 4 multiplier. Both the multiplier and the multiplicand have 4 bits. In the diagram the individual bit values are not relevant and are shown only as placeholder marks.

Partial product array
   Built as an array of 2-input AND gates. Each gate ANDs one bit from the multiplicand with one bit from the multiplier.
   The AND gate outputs are connected to full or half adder inputs in the first full adder array, followed by a succession of levels of full adder arrays.
   The partial product array has N rows, where N is the number of bits in the multiplier or multiplicand.

Full adder arrays
   The first full adder array is implemented on the partial product array.
   Within each full adder array, each full or half adder in column i produces a sum bit into column i and a carry bit into column i+1 of the next full adder array.
   Each successive full adder array level reduces the array size until two rows remain.

The final product is produced by adding the final two rows.
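The reduction can be modelled in software by keeping a list of bits per column. The sketch below is a simplification made for illustration: it uses only full adders and passes leftover bits through to the next level, where a hardware design would also place half adders. The name `wallace_multiply` is an assumption.

```python
def wallace_multiply(a: int, b: int, n: int = 4) -> int:
    """Multiply two n-bit unsigned values by Wallace-style column reduction."""
    # Level 0: partial product array. cols[k] holds all bits of weight 2**k.
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append((a >> i) & (b >> j) & 1)   # one AND gate per bit pair

    # Levels 1..N: full adders turn 3 bits of a column into a sum bit in the
    # same column and a carry bit in the next, until two rows remain.
    while any(len(col) > 2 for col in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for k, col in enumerate(cols):
            while len(col) >= 3:
                x, y, z = col.pop(), col.pop(), col.pop()
                nxt[k].append(x ^ y ^ z)
                nxt[k + 1].append((x & y) | (x & z) | (y & z))
            nxt[k].extend(col)            # leftover bits pass to the next level
        cols = nxt

    # Level N+1: add the two remaining rows.
    row0 = sum(col[0] << k for k, col in enumerate(cols) if len(col) > 0)
    row1 = sum(col[1] << k for k, col in enumerate(cols) if len(col) > 1)
    return row0 + row1
```

Every full adder preserves the weighted total of its column, so the two final rows always sum to the product.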
Table Lookup

Multiplication can benefit significantly from the table lookup approach. As with addition and subtraction, the operands provide the address into a table of precomputed results. For multiplication-intensive applications, Fast Fourier Transforms for example, the speed improvement can be significant.

The multiplier can be implemented as a single large table. For 32-bit operands the table can be rather large. Alternately, we can divide the multiplier into 4 bytes or 8 nibbles, look up 4 or 8 partial products, and perform high-speed addition operations to combine the individual pieces.
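One way to picture the nibble variant is sketched below. It assumes the multiplicand is a fixed value (a constant coefficient, for example) so that a 16-entry table of its multiples can be precomputed; that assumption and the function names are illustrative and not taken from the notes.

```python
def make_nibble_table(multiplicand: int) -> list[int]:
    """Precompute multiplicand * d for every 4-bit digit d."""
    return [multiplicand * d for d in range(16)]


def lookup_multiply(table: list[int], multiplier: int, nibbles: int = 8) -> int:
    """Multiply by a 32-bit multiplier using 8 table look ups and shifted adds."""
    product = 0
    for k in range(nibbles):
        digit = (multiplier >> (4 * k)) & 0xF      # k-th nibble of the multiplier
        product += table[digit] << (4 * k)         # partial product, weighted by 16**k
    return product
```

For example, `lookup_multiply(make_nibble_table(12345), 0xDEADBEEF)` combines eight looked-up partial products into the full product.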
Division

Let's look at two alternative methods for implementing division. Both are variations on the pencil and paper method.
Restoring

To examine the first method, let's work with the following block diagram. The register initially holding the dividend is double length. It ultimately will hold the remainder and the quotient, and is identified as the RQ register.

Operation proceeds as follows

Algorithm
   divisor register ← divisor          // right aligned
   RQ register ← dividend              // right aligned
   repeat n times
      shift RQ left 1 bit position
      R ← R - divisor
      if R negative
         q0 ← 0
         R ← R + divisor               // restore
      else
         q0 ← 1
      end if
   end repeat
   if R negative
      R ← R + divisor
   end if
   dividend upper contains remainder
   dividend lower contains quotient
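A minimal sketch of the restoring algorithm, with R and Q held as separate Python integers instead of one double-length register. The function name `restoring_divide` and the default word size are assumptions.

```python
def restoring_divide(dividend: int, divisor: int, n: int = 4) -> tuple[int, int]:
    """Unsigned restoring division on n-bit words; returns (quotient, remainder)."""
    r, q = 0, dividend                   # RQ register: R upper half, Q lower half
    mask = (1 << n) - 1
    for _ in range(n):
        # Shift RQ left one position: the MSB of Q moves into the LSB of R.
        r = (r << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        r -= divisor                     # trial subtraction
        if r < 0:
            r += divisor                 # restore; q0 stays 0
        else:
            q |= 1                       # q0 = 1
    return q, r
```

Dividing 1000 by 0011 with this function, `restoring_divide(0b1000, 0b0011)`, ends with quotient 2 and remainder 2.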

Example

Divide 1000 by 0011

     0000 1000        dividend upper, lower
     0001 0000        shift left
   - 0011             subtract divisor
     ----
     1110 0000        negative, restore
   + 0011
     ----
     0001 0000        q0 = 0
     0010 0000        shift left
   - 0011             subtract
     ----
     1111 0000        negative, restore
   + 0011
     ----
     0010 0000        q0 = 0
     0100 0000        shift left
   - 0011             subtract
     ----
     0001 0001        positive, q0 = 1
     0010 0010        shift left
   - 0011             subtract
     ----
     1111 0010        negative, restore
   + 0011
     ----
     0010 0010        q0 = 0
                      remainder 2, quotient 2

Non Restoring

The second method uses the same block diagram. The register initially holding the dividend is double length. It ultimately will hold the remainder and the quotient, and is identified as the RQ register.

Operation proceeds as follows. Here "positive" includes zero.

Algorithm
   divisor register ← divisor          // right aligned
   RQ register ← dividend              // right aligned
   repeat n times
      shift RQ left 1 bit position
      if R positive
         R ← R - divisor
      else if R negative
         R ← R + divisor
      end if
      if R positive
         q0 ← 1
      else if R negative
         q0 ← 0
      end if
   end repeat
   if R negative
      R ← R + divisor
   end if
   dividend upper contains remainder
   dividend lower contains quotient
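The non restoring loop can be sketched as follows. R is kept as a signed Python integer, so its sign survives the left shift without needing an extra register bit; that choice, and the name `nonrestoring_divide`, are assumptions made for this illustration.

```python
def nonrestoring_divide(dividend: int, divisor: int, n: int = 4) -> tuple[int, int]:
    """Unsigned non restoring division on n-bit words; returns (quotient, remainder)."""
    r, q = 0, dividend
    mask = (1 << n) - 1
    for _ in range(n):
        # Shift RQ left one position; doubling keeps the sign of R.
        r = 2 * r + ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        if r >= 0:
            r -= divisor                 # R positive: subtract
        else:
            r += divisor                 # R negative: add instead of restoring
        if r >= 0:
            q |= 1                       # q0 = 1, otherwise q0 stays 0
    if r < 0:
        r += divisor                     # single final correction of the remainder
    return q, r
```

Unlike the restoring version, each iteration performs exactly one add or subtract, and only the final remainder may need correcting.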

Example

Divide 1000 by 0011

     0000 1000        dividend upper, lower
     0001 0000        shift left
   - 0011             subtract divisor
     ----
     1110 0000        negative, q0 = 0, add next time
     1100 0000        shift left
   + 0011             add
     ----
     1111 0000        negative, q0 = 0, add next time
     1110 0000        shift left
   + 0011             add
     ----
     0001 0000        positive, subtract next time
     0001 0001        q0 = 1
     0010 0010        shift left
   - 0011             subtract
     ----
     1111 0010        negative, q0 = 0
   + 0011             restore remainder
     ----
     0010 0010        remainder 2, quotient 2

Signed Numbers

Up to this point we have been working with operations on unsigned fixed point numbers. Signed numbers permit both positive and negative numbers. There are a variety of different methods for representing such numbers. We will examine 3
   Sign and Magnitude
   1s complement
   2s complement
Sign and Magnitude

As the name suggests, the number comprises
   Sign part: usually the MSB. 0 is positive, 1 is negative.
   Magnitude part: the remaining bits.

Observe that with such a scheme one bit is devoted to the sign. This reduces the expressive power by one. Why? (Hint: zero appears as both +0 and -0.)

Maximum and minimum values: -2^(n-1) + 1 to 2^(n-1) - 1

   Unsigned:             min = 0               max = 2^n - 1
   Sign and magnitude:   min = -2^(n-1) + 1    max = 2^(n-1) - 1    with +/- 0


1s Complement

Formally, the 1s complement of a number N is computed as 2^n - N - 1, where n is the number of bits in the number. The easiest way is to simply invert all the bits. 1s complement reduces the expressive power by one. Why?

Maximum and minimum values: -2^(n-1) + 1 to 2^(n-1) - 1

Example

   N = 1011, n = 4, so compute 2^4 - 1011 - 0001

     10000
   - 01011
    ------
     00101
   - 00001
    ------
     00100

   Inverting 1011 gives 0100.

2s Complement

Formally, the 2s complement of a number N is computed as 2^n - N, where n is the number of bits in the number. The easiest way is to simply invert all the bits and add 1. 2s complement does not affect the expressive power. Why?

Maximum and minimum values: -2^(n-1) to 2^(n-1) - 1
Example

   N = 1011, n = 4, so compute 2^4 - 1011

     10000
   - 01011
    ------
     00101

   Inverting 1011 gives 0100. Adding 1 gives 0100 + 1 = 0101.

Observe that the 2s complement of the 2s complement of a number is the original number.

Proof
   Select an arbitrary number N.
   Let the 2s complement of N be given by N1. Thus N1 = 2^n - N.
   Compute the 2s complement of N1
      2^n - N1 = 2^n - (2^n - N) = N
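Both definitions, and the shortcut of inverting the bits, can be compared directly. The sketch below assumes values are held as non-negative n-bit words; the function names are chosen here for illustration.

```python
def ones_complement(value: int, n: int) -> int:
    """1s complement by the formal definition 2**n - N - 1."""
    return (1 << n) - value - 1


def twos_complement(value: int, n: int) -> int:
    """2s complement by the formal definition 2**n - N, kept to n bits."""
    return ((1 << n) - value) & ((1 << n) - 1)


def invert_bits(value: int, n: int) -> int:
    """Invert all n bits of the word."""
    return ~value & ((1 << n) - 1)
```

For N = 1011 with n = 4, `ones_complement` gives 0100 and `twos_complement` gives 0101, matching the two worked examples, and applying `twos_complement` twice returns the original word.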
Addition and Subtraction Using Signed Numbers

We perform addition and subtraction using signed numbers by performing addition of the numbers, either as positive numbers or expressed in complementary form. Observe that N1 - N2 can be converted to N1 + (-N2), with -N2 expressed in either 1s or 2s complement form.

Let's develop a generalized notion of such arithmetic. As a first step, all negative numbers are expressed in complemented form, so we don't need to worry about the sign bit. We must consider 2 cases: numbers with the same sign and numbers with opposite signs. We only need to worry about addition at this point. Why?
Same Sign

Again two cases: both positive and both negative.

Both Positive

Simply add the numbers
   Sum = N1 + N2
Example

     +13         00001101
   +   9       + 00001001
    ----        ---------
     +22         00010110

Observe that the possibility of overflow exists and must be handled.

Both Negative

Simply add the complemented numbers. Here N1 and N2 denote the magnitudes.
   Sum = N1c + N2c = (2^n - N1) + (2^n - N2) = 2^n - (N1 + N2) + 2^n
Observe
   The result is in 2s complement form.
   The trailing 2^n is ignored.
   The possibility of underflow exists and must be handled.
Example

     -13          11110011
   -   9        + 11110111
    ----         ---------
     -22        1 11101010
                ^ ignore

The 2s complement of 11101010 is 00010110.



Opposite Signs

Note that we cannot have overflow or underflow.

Again two cases
   1. N1 >= 0, N2 < 0
   2. N2 >= 0, N1 < 0

Observe. In the formulas N1 and N2 denote the magnitudes.

Case 1.
   If |N1| >= |N2| the result must be positive
      Sum = N1 + N2c = N1 + (2^n - N2) = (N1 - N2) + 2^n
   Ignore the last carry out.
   If |N2| > |N1| the result must be negative
      Sum = N1 + N2c = N1 + (2^n - N2) = 2^n + (N1 - N2) = 2^n - (N2 - N1)
Example

      13          00001101
   -   9        + 11110111
    ----         ---------
       4        1 00000100
                ^ ignore

Example

       9          00001001
   -  13        + 11110011
    ----         ---------
      -4          11111100

Case 2.
   If |N2| >= |N1| the result must be positive
      Sum = N1c + N2 = (2^n - N1) + N2 = (N2 - N1) + 2^n
   Ignore the last carry out.
   If |N1| > |N2| the result must be negative
      Sum = N1c + N2 = (2^n - N1) + N2 = 2^n - (N1 - N2)

Example

      -9          11110111
   +  13        + 00001101
    ----         ---------
       4        1 00000100
                ^ ignore

Example

     -13          11110011
   +   9        + 00001001
    ----         ---------
      -4          11111100

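The same-sign and opposite-sign cases can all be exercised with one small routine. This is a sketch under stated assumptions: 8-bit words, Python's `&` used to form the 2s complement pattern of a negative integer, and the standard sign rule for detecting overflow (operands share a sign that the result does not), which the notes mention only as something that "must be handled". The function names are illustrative.

```python
def signed_add(a: int, b: int, n: int = 8) -> tuple[int, bool]:
    """Add two signed values as n-bit 2s complement words.

    Returns (n-bit result word, overflow flag).
    """
    mask = (1 << n) - 1
    wa, wb = a & mask, b & mask          # negative values become 2**n - |value|
    word = (wa + wb) & mask              # carry out of the MSB is ignored
    sa, sb, sr = wa >> (n - 1), wb >> (n - 1), word >> (n - 1)
    overflow = (sa == sb) and (sr != sa) # only possible when the signs match
    return word, overflow


def to_signed(word: int, n: int = 8) -> int:
    """Interpret an n-bit word as a 2s complement signed value."""
    return word - (1 << n) if word >> (n - 1) else word
```

Running the worked examples through `signed_add` reproduces the bit patterns shown: 13 + (-9) gives 00000100, 9 + (-13) gives 11111100, and (-13) + (-9) gives 11101010.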

Sign Extension

When working with signed numbers
   Express positive numbers in natural form. The MSB is the sign bit, with value 0 indicating positive.
   Express negative numbers in 2s complement form. The MSB is the sign bit, with value 1 indicating negative.

We will find occasions when we must work with numeric values with a number of bits less than the full word size. Such operations are required in two common cases
   Short jumps in branching or looping operations. We must jump forwards or backwards by a few instruction addresses, typically implemented by adding (algebraically) a value to the PC.
   Working with immediate mode constants as part of an instruction.

Arithmetic with such shortened values proves interesting. When we must put a shortened value into a full word, where do the missing most significant bits come from? Consider 2 cases of adding an 8-bit quantity to a 16-bit word

     1001 0011 1010 0110          1001 0011 1010 0110
   +           0110 1110        +           1110 1110

When the second operand is placed into a register, what goes into the upper 8 bits?

In the first case we can simply fill with 0s. Following the operation the resulting sum will be correct.

In the second case, since the MSB is 1, the number is negative (2s complement). Filling with 0s will not give the correct answer. Now we want the upper 8 bits to be 1s to preserve the 2s complement.

We see then that to ensure correct results
   A positive number must be filled with 0s.
   A negative (2s complement) number must be filled with 1s.
Such operations are called sign extension: extending the sign bit to fill the MSB positions.
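A small sketch of sign extension from 8 to 16 bits, using the two operands from the example above. The function name `sign_extend` and its default widths are assumptions for illustration.

```python
def sign_extend(value: int, from_bits: int = 8, to_bits: int = 16) -> int:
    """Extend a from_bits-wide 2s complement word to to_bits by copying the sign bit."""
    sign = (value >> (from_bits - 1)) & 1
    if sign:
        # Negative: fill every upper position with 1s.
        value |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
    return value                      # positive: upper positions stay 0


word = 0b1001_0011_1010_0110
sum_positive = (word + sign_extend(0b0110_1110)) & 0xFFFF   # adds +110
sum_negative = (word + sign_extend(0b1110_1110)) & 0xFFFF   # adds -18
```

The second operand, 1110 1110, becomes 1111 1111 1110 1110 after extension, so the 16-bit sum is the original word minus 18, as intended.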
Logical Operations

In addition to arithmetic operations, the ALU must be able to perform various logical-type operations. Operations include
   Logic: AND, OR, XOR, complement
   Shifting: left, right; circular (rotate), arithmetic, logical
   Comparison: >, >=, <, <=

Logic
   AND, OR, and XOR can easily be implemented using standard logic gates.

Complement
   Useful in arithmetic operations. It can be implemented simply by inverting the Q output from a register or by reading the ~Q output from the register.

Shift
   Several different kinds of shifts are defined.

Bit shift or logical shift
   All bits in the register are shifted left or right. 0s are entered into the register on the right for a left shift and on the left for a right shift.
   Often used when the operand is interpreted as a set of bits rather than as a (signed) number.
   Logical shifts are illustrated in the next diagram.

[Figure: logical left shift by 1 bit]


[Figure: logical right shift by 1 bit]

Arithmetic shift
   Also known as signed shift. Similar to the logical shift, with one simple difference.
   All bits in the register are shifted left or right.
   Right shift: the MSB is continually entered into the register on the left. This is called sign extension and is used for signed operands.
   Left shift: 0s are entered into the register on the right, the same as a logical shift left.
   Arithmetic shifts are illustrated in the next diagram.

[Figure: arithmetic right shift by 1 bit; arithmetic left shift by 1 bit]

Circular shift
   Circular shifts appear in cryptographic applications, where they are used to permute bit sequences.
   Similar to the logical and arithmetic shifts, with one simple difference.
   All bits in the register are shifted left or right.
   Right shift: the LSB is continually entered into the register on the left.
   Left shift: the MSB is entered into the register on the right.
   Circular shifts are illustrated in the next diagram.

[Figure: circular right shift by 1 bit]

[Figure: circular left shift by 1 bit]
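The three kinds of one-bit shift differ only in which bit is entered at the vacated end. The sketch below assumes an 8-bit register held as a non-negative Python integer; the function names are chosen for illustration.

```python
N = 8                      # register width assumed for this sketch
MASK = (1 << N) - 1


def logical_left(w: int) -> int:
    return (w << 1) & MASK                      # 0 enters on the right


def logical_right(w: int) -> int:
    return w >> 1                               # 0 enters on the left


def arithmetic_right(w: int) -> int:
    return (w >> 1) | (w & (1 << (N - 1)))      # MSB (sign) re-enters on the left


def rotate_left(w: int) -> int:
    return ((w << 1) & MASK) | (w >> (N - 1))   # MSB enters on the right


def rotate_right(w: int) -> int:
    return (w >> 1) | ((w & 1) << (N - 1))      # LSB enters on the left
```

Applied to 1001 0110, the logical right shift gives 0100 1011 while the arithmetic right shift gives 1100 1011, preserving the sign.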

Comparison
   The device accepts two N-bit binary numbers. Depending upon the design, it will produce
   Single output: indicates that the two input numbers are equal.
   Two outputs: N1 is larger than N2, and N2 is larger than N1. If neither is asserted, the numbers are equal.
   Three outputs: N1 is larger than N2, N2 is larger than N1, and N1 is equal to N2.

   More complex designs include inputs signaling less than, greater than, and equal. Using such inputs we can cascade a series of comparators to compare two M-digit numbers. A high level diagram is given in the accompanying figure.

[Figure: cascadable comparator with inputs N1, N2, EqualI, GreaterI, LessI and outputs EqualO, GreaterO, LessO]
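Cascading can be sketched as below. The wiring convention is an assumption made here: each stage compares one digit, and when its digits tie it passes along the result from the less significant stage through the GreaterI, LessI, and EqualI inputs. The notes do not say which direction the cascade runs, and the function names are illustrative.

```python
def compare_stage(n1: int, n2: int,
                  greater_i: int = 0, less_i: int = 0, equal_i: int = 1
                  ) -> tuple[int, int, int]:
    """One comparator stage; returns (GreaterO, LessO, EqualO)."""
    if n1 > n2:
        return 1, 0, 0
    if n1 < n2:
        return 0, 1, 0
    return greater_i, less_i, equal_i       # tie: defer to the cascade inputs


def compare_cascade(a_digits: list[int], b_digits: list[int]) -> tuple[int, int, int]:
    """Compare two M-digit numbers given most significant digit first."""
    g, l, e = 0, 0, 1                       # least significant stage sees "equal"
    for n1, n2 in zip(reversed(a_digits), reversed(b_digits)):
        g, l, e = compare_stage(n1, n2, g, l, e)
    return g, l, e                          # outputs of the most significant stage
```

With this arrangement a difference in a more significant digit always overrides whatever the lower stages reported.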

Summary

We have introduced, studied, and developed a number of fundamental algorithms for performing basic arithmetic and logical computations. Such computations are building blocks commonly found in most ALUs.

