Arhitektura

University of Maryland
Electrical and Computer Engineering Department
College Park, MD 20742-3285

Glenn L. Martin Institute of Technology
A. James Clark School of Engineering
Dr. Charles B. Silio, Jr.
Telephone 301-405-3668
Fax 301-314-9281
silio@eng.umd.edu
The Microarchitecture/Microprogramming Level
These notes are based on and extend material in Chapter 4 of A. S. Tanenbaum, Structured
Computer Organization, 3rd Edition, Prentice Hall, 1990. The accumulator-based machine whose
instruction set architecture is called the Mac-1 has its data path microarchitecture and its
microprogrammed implementation (called the Mic-1) presented here. This presentation differs from the
stack-oriented IJVM and corresponding Mic-1 in Tanenbaum's 5th Edition textbook.
One of the differences between the Mic-1 (microprogrammed computer) presented here and the
one in the 5th Edition textbook is that all registers in this Mic-1 are constructed from clocked (or
gated) D-latches, as shown in Fig. 1, whereas registers in the 5th Edition text use edge-triggered
flip-flops. Fig. 2 shows how an 8-bit register is built using clocked D-latches and three-state (i.e.,
tri-state) buffers for connection to two output buses.
Figure 1: Clocked D-latch (inputs D and CLK; output Q)
Figure 2: Eight-bit register and bus connections (D-latches b7 through b0 loaded from the C-Bus under Load; tri-state buffers enabled by OE-A and OE-B drive the A-Bus and B-Bus)
Registers: A register is a device capable of storing information. Conceptually, registers are
the same as main memory, the difference being that the registers are located physically within
the processor itself, so they can be read from and stored into faster than words in main memory,
which is usually off-chip. Larger and more expensive machines usually have more registers than
smaller and cheaper ones, which must use main memory for storing intermediate results. On some
computers a set of registers numbered 0, 1, 2, ..., n - 1 is available at the microprogramming
level and is called local storage or scratchpad storage.
A register can be characterized by a single number: namely, how many bits it can hold (e.g.,
Fig. 2 is an 8-bit register). The bits (binary digits) in an n-bit register could be numbered from
left to right or from right to left. The numbering convention assumed in these notes for the bits in
an n-bit register is right to left from 0 to n - 1 in the natural powers-of-two order of a positional
number system for integers. In other words, bit 0 is stored in the rightmost D-latch in Fig. 2 and
bit 7 is stored in the leftmost D-latch (which corresponds to bit n - 1 when n = 8).
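The numbering convention can be illustrated with a short Python sketch. The helper name `bit` and the sample value are hypothetical and chosen only for illustration; the point is that bit i carries weight 2^i, with bit 0 on the right.

```python
def bit(value: int, i: int) -> int:
    """Return bit i of value, where bit 0 is the rightmost (least significant) bit."""
    return (value >> i) & 1


reg = 0b10010110  # sample 8-bit register contents; the leftmost digit is bit 7

# Read the bits in the order they are drawn in Fig. 2 (b7 on the left, b0 on the right).
bits_left_to_right = [bit(reg, i) for i in range(7, -1, -1)]

# Rebuild the integer from its bits: bit i contributes 2**i.
rebuilt = sum(bit(reg, i) << i for i in range(8))
```

Here `bits_left_to_right` lists b7 down to b0, and `rebuilt` equals the original value because each bit is weighted by its power of two.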
Information placed in a register remains there until some other information replaces it. The
process of reading information out of a register does not affect the contents of the register. In other
words, when a register is read, a copy is made of its contents and the original is left undisturbed in
the register. Similarly, when information is moved from one register to another, a copy is loaded
into the destination register and the contents of the source register remain undisturbed.
Buses: A bus is a collection of wires used to transmit signals in parallel. For example, buses are
used to allow the contents of one register to be copied to another one. A bus may be unidirectional or
bidirectional. A unidirectional bus can transfer data only in one direction, whereas a bidirectional
bus can transfer data in either direction but not both simultaneously. Unidirectional buses are
typically used to connect two registers, one of which is always the source and the other of which is
always the destination. Bidirectional buses are typically used when any of a collection of registers
can be the source and any other one can be the destination.
Many devices have the ability to connect and disconnect themselves electrically from the buses
to which they are physically attached. These connections can be made or broken in nanoseconds.
A bus whose devices have this property is called a tri-state (or three-state) bus (the term tri-state
being a registered trademark of National Semiconductor Corp.). A tri-state buffer amplifier is used
to make the connections. These tri-state buffer amplifiers are shown in Fig. 2 as triangular shapes
whose inputs come from the output of the D-latch to which each is connected and each of whose
outputs is connected to a single bus wire. The other input to the buffer amplifier (labeled either
OE-A or OE-B) is a control (or enable) input. If this control input is in the logic zero state, then
the output of its buffer amplifier is in the high-impedance state (i.e., disconnected from the bus
wire to which it is attached). If the control input is in the logic one state (also called active-high)
then the buffer amplifier's output value equals its input value (either logic 0 or logic 1), and the
D-latch's output state is connected to the corresponding bus wire.
In most microarchitectures, some registers are connected to one or more input buses and to
one or more output buses. Fig. 2 depicts an 8-bit register connected to one input bus and to two
output buses. The register has three control inputs: namely, Load, OE-A, and OE-B, where OE
stands for "output enable." When "Load" is in the logic zero state, the contents of the register
are not affected by the signals on the C-Bus wires. When "Load" is raised to the logic 1 state the
values on the C-Bus wires are copied into their corresponding D-latches in parallel. After the new
values are latched "Load" can be returned to its logic zero state, and the register remembers the
binary value last loaded into it.
When "OE-A" is at the logic zero level, the register is disconnected from the A-Bus (and
similarly for "OE-B" with respect to the B-Bus). When "OE-A" is raised to the logic 1 level, the
register is connected to the A-Bus wires (and similarly for "OE-B" with respect to the B-Bus).
In order to transfer data from this register to another register R using the A and C buses, the
input to register R must be connected to the C-Bus, and "OE-A" for this register must be raised to
the logic 1 level in order to place the register's contents on the A-Bus. Other circuitry such as an
arithmetic and logic unit (ALU), which is not shown here, must then be used to connect the A-Bus
wires to the C-Bus wires. After a short time to allow the signals on the buses to settle down and
become stable, the Load signal connected to register R is raised to the logic 1 level and the
information transfer is accomplished.
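The register-to-register transfer just described can be sketched behaviorally in Python. This is a minimal model under stated assumptions, not a circuit description: the class name `Register`, its methods, and the use of `None` to stand for the high-impedance state are all hypothetical, and the ALU is replaced by a straight-through copy from the A-Bus to the C-Bus.

```python
class Register:
    """Sketch of an n-bit latch register with one input bus and two tri-state outputs."""

    def __init__(self, width: int = 8, value: int = 0):
        self.mask = (1 << width) - 1
        self.value = value & self.mask

    def drive(self, oe: int):
        """Return the value driven onto a bus, or None for high impedance (OE = 0)."""
        return self.value if oe else None

    def clock(self, load: int, c_bus: int) -> None:
        """Copy the C-Bus value into the latches only while Load = 1."""
        if load:
            self.value = c_bus & self.mask


src, dst = Register(value=0x5A), Register(value=0x00)

a_bus = src.drive(oe=1)          # OE-A = 1 connects src to the A-Bus
b_bus = src.drive(oe=0)          # OE-B = 0 leaves src disconnected from the B-Bus
c_bus = a_bus                    # stand-in for the ALU passing A straight through
dst.clock(load=0, c_bus=c_bus)   # Load = 0: dst ignores the C-Bus
before = dst.value
dst.clock(load=1, c_bus=c_bus)   # Load = 1: dst latches the C-Bus value
```

After the last step `dst` holds a copy of the source value while `src` is unchanged, which matches the copy semantics described for register reads.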
Because drawing all of the wires and latches shown in Fig. 2 requires too much space, a
shorthand schematic such as that shown in Fig. 3 is used instead. Fig. 3 depicts a 16-bit register
that would be constructed internally in the same fashion as the 8-bit register shown in Fig. 2 but
with 8 more latches, 16 more buffer amplifiers, and 24 more bus wires.
Figure 3: Sixteen-bit register schematic (16-bit input from C-Bus; 16-bit outputs to A-Bus and B-Bus; control inputs Load Clk, OE-A, and OE-B)

Decoders and Multiplexers: Circuits that have one or more input lines and compute one or
more output values that are uniquely determined by the present inputs are called combinational
circuits. Two important combinational circuits are decoders and multiplexers. A decoder has
n input lines and 2^n output lines numbered 0 to 2^n - 1. If the binary number on the input lines
has decimal value k, then output line number k takes the value 1 and all other output lines take
the value 0. A decoder always has exactly one output line whose value is set to 1, with all the rest
set to 0. A multiplexer has 2^n data inputs (either individual lines or buses), one data output of
the same width as the inputs, and an n-bit control input that is internally decoded to select one of
the inputs and route it to the output. The structure of a 2 to 1 multiplexer is shown in Fig. 4. If
instead of a single input line one wishes to switch the contents of one of two n-bit input buses to
an n-bit output bus, then one must use n copies of the 2 to 1 multiplexer (one per output bit line),
all selected by the same selection input value S.
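Both circuits are easy to model as pure functions, since their outputs depend only on the present inputs. The following Python sketch uses hypothetical function names (`decoder`, `mux2`, `bus_mux2`); it mirrors the description above rather than any particular gate-level design.

```python
def decoder(k: int, n: int) -> list:
    """n-to-2**n decoder: output line k is 1 and all other lines are 0."""
    if not 0 <= k < 2 ** n:
        raise ValueError("input value does not fit in n bits")
    return [1 if line == k else 0 for line in range(2 ** n)]


def mux2(i0: int, i1: int, s: int) -> int:
    """One-bit 2 to 1 multiplexer: Z = I0 when S = 0, Z = I1 when S = 1."""
    return (i1 & s) | (i0 & (1 - s))


def bus_mux2(bus0: int, bus1: int, s: int, n: int) -> int:
    """Switch one of two n-bit buses using n one-bit multiplexers that share S."""
    out = 0
    for i in range(n):
        z = mux2((bus0 >> i) & 1, (bus1 >> i) & 1, s)
        out |= z << i
    return out
```

For example, `decoder(2, 2)` raises only output line 2 of four, and `bus_mux2(0x1234, 0xABCD, 1, 16)` routes the second 16-bit bus to the output.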
Figure 4: 2 to 1 Multiplexer (one for each output bit when used with registers); data inputs I1 and I0, select input S, output Z
An Example Microarchitecture
The data path of our example microarchitecture is shown in Fig. 5. The data path is that
part of the central processing unit (CPU) that contains the arithmetic and logic unit (ALU) and
its inputs and outputs. In this case it contains 16 identical 16-bit registers, labeled PC, AC, SP,
and so on, that form a scratchpad memory accessible only to the microprogramming level.
The registers labeled 0, +1, and -1 will be used to hold the indicated constants (with -1 in two's
complement form). The meaning of the other register names will be explained later. Each register
can output its contents onto one or both of two internal buses, the A-Bus and the B-Bus, and each
can be loaded from a third internal bus, the C-Bus, as shown in the figure.
Figure 5: The data path for example microarchitecture (Mic-1/Mac-1)
(Blocks shown: the 16 scratchpad registers, including PC, AC, SP, IR, TIR, +1, -1, AMASK, and SMASK; the A-Bus, B-Bus, and C-Bus decoders; the A-Latch and B-Latch; AMUX; ALU; shifter; MAR; MBR; main memory; the 4-phase clock generator; and the micro sequencing logic.)
The A and B buses respectively feed the left and right inputs of a 16-bit wide ALU that can
perform four functions: addition (A + B), bitwise logical AND (A.AND.B), left input
straight-through (A), and bitwise logical complement (i.e., 1's complement) of the content of the left input
(NOT A). The function to be performed is specified by the two ALU control lines F1 and F0. The
ALU generates two status bits based on the current ALU output: N, which takes the value 1 when
the ALU output is negative, and Z, which takes the value 1 when the ALU output is zero. The
N bit is just a copy of the high-order (bit position 15) output bit. The Z bit is the NOR of all the
ALU output bits (namely, bits 0 through 15).
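A behavioral sketch of this ALU in Python follows. The function name `alu` and the constant `MASK16` are hypothetical; the F1 F0 encoding (0 = A + B, 1 = A AND B, 2 = A, 3 = NOT A) is the one given in the microinstruction format of Fig. 9, and discarding the carry out of bit 15 is an assumption of this model.

```python
MASK16 = 0xFFFF  # the data path is 16 bits wide


def alu(a: int, b: int, f1: int, f0: int):
    """Return (result, N, Z) for the ALU function selected by F1 F0."""
    f = (f1 << 1) | f0
    if f == 0:
        result = a + b   # addition; any carry out of bit 15 is discarded below
    elif f == 1:
        result = a & b   # bitwise AND
    elif f == 2:
        result = a       # left input straight through
    else:
        result = ~a      # one's complement of the left input
    result &= MASK16
    n = (result >> 15) & 1         # N is a copy of output bit 15
    z = 1 if result == 0 else 0    # Z is the NOR of all 16 output bits
    return result, n, z
```

For instance, adding 1 to 0xFFFF (the constant -1 in two's complement) gives a zero result with Z = 1, which is how a microprogram can test a value against a constant.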
The 16-bit ALU output goes into a shifter, which is a combinational circuit that can logically
shift its input 1 bit left or right, or not at all, and gate the result to its 16-bit output. The function
to be performed by the shifter is specified by the two shifter control lines S1 and S0. It is possible
to perform a 2-bit left shift of a register, R, by using the ALU to compute R + R (which is a 1-bit
left shift) and then shifting this sum another bit left using the shifter.
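The 2-bit left shift trick can be checked with a small Python sketch. The function name `shifter` is hypothetical; the S1 S0 encoding (0 = no shift, 1 = right, 2 = left) is taken from Fig. 9, and treating the unused code 3 as "no shift" is an assumption made only so the function is total.

```python
MASK16 = 0xFFFF


def shifter(x: int, s1: int, s0: int) -> int:
    """Logical shifter: S1 S0 = 00 no shift, 01 right 1 bit, 10 left 1 bit."""
    s = (s1 << 1) | s0
    if s == 1:
        return (x & MASK16) >> 1
    if s == 2:
        return (x << 1) & MASK16
    return x & MASK16  # code 0; code 3 is unused and treated as no shift (assumption)


r = 0x0123
alu_sum = (r + r) & MASK16          # ALU computes R + R, a 1-bit left shift
shifted = shifter(alu_sum, 1, 0)    # shifter adds a second 1-bit left shift
```

The value of `shifted` is R shifted left by two bit positions, obtained in a single pass through the ALU and shifter.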
The A-Bus decoder is used to decode a 4-bit register designator (A-field) that selects one of
the 16 scratchpad registers to be gated onto the A-Bus. The outputs of the decoder are 16 output
enable (OE-A) signals (one for each register) and one and only one of the OE-A signals takes the
value 1. The B-Bus decoder is used to decode a 4-bit register designator (B-field) that selects one
of the 16 scratchpad registers to be gated onto the B-Bus. The outputs of the decoder are 16
output enable (OE-B) signals (one for each register) and one and only one of the OE-B signals
takes the value 1. The C-Bus decoder is used to decode a 4-bit C-field register designator that
selects the scratchpad register to be loaded from the C-Bus. The outputs of the C-Bus decoder
are 16 load clock signals (one for each register). Because all 16 possible C-field values are assigned
to the 16 registers, an additional control input is needed to prevent loading any of the registers.
This additional control input is ENC (for enable-C). If ENC = 0, then all 16 of the decoder's load
outputs remain at logic level zero, and none of the registers is overwritten. If ENC = 1, then
one and only one of the destination registers sees a load clock line = 1 at the appropriate time
determined by yet another control input called T4.
Neither the A-Bus nor the B-Bus feeds the ALU directly. Instead, each one feeds a latch (i.e.,
a register) that in turn feeds the ALU. The latches are needed because the ALU is a combinational
circuit: it continuously computes the output for the current input and function code. Feeding
the left and right ALU inputs directly from the A and B buses (without the additional latches)
can cause race problems. For example, consider assigning to the destination register A the sum
of the contents of registers A and B, denoted A := A + B. As A is being written into, the value
on the A-Bus begins to change, which causes the ALU output and thus the contents of the C-Bus
to change as well. Consequently, the wrong value may be stored into A. In other words, in the
assignment A := A + B, the old A on the right-hand side is the original A value, not some bit-by-bit
mixture of the old and new values. By inserting latches (namely, the A-latch and B-latch) into the
A and B buses, we can freeze the original A and B values there early in the cycle, so that the ALU
is shielded from changes on the buses as the new value is being stored into the scratchpad.
One can think of the A-latch and the B-latch as shared slave latches for the correspondingly
selected source master latches in the scratchpad. This avoids the slave latch that each scratchpad
register would need if the registers were built from master-slave flip-flops, but it complicates the
timing somewhat. The A-latch and B-latch are loaded by timing control signal T2 that is generated
by a 4-phase clock generator circuit shown in Fig. 6.
Computer circuits are normally driven by a clock, a device that emits a periodic sequence of
pulses. These pulses define machine cycles. During each machine cycle, some activity occurs, such
as the execution of a microinstruction. It is often useful to divide a cycle into subcycles so different
parts of the microinstruction can be performed in a well-defined order. For example, the inputs to
the ALU must be made available and allowed to become stable before the output can be stored.
Figure 6: Four Phase Clock Cycle (the clock generator, a finite state machine with Master Clock, Run/Stop, and Reset inputs, produces pulses T1 through T4, one per subcycle; the CPU cycle repeats while Run = True)

Main Memory: Processors need to be able to read data from memory and write data to
memory. Most computers have an address bus, a data bus, and a control bus for communication
between the CPU and memory. To read from memory, the CPU puts a memory address on the
address bus and sets the control signals appropriately, for example by asserting "Rd" (READ). The
memory then puts the requested item on the data bus. In some computers memory read/write is
synchronous; that is, the memory must respond within a fixed time. This is what we assume for
our microarchitecture; namely, the memory must respond within four clock (subcycle) ticks. On
other computers, the memory may take as long as it wants, signaling the presence of data using a
control line (e.g., READY or memory function complete) when it is finished.
Writes to memory are done similarly. The CPU puts the data to be written on the data bus and
the address to be stored into on the address bus and then it asserts "Wr" (WRITE). (An alternative
to having "Rd" and "Wr" is to have MREQ, which indicates that a memory request is desired, and
R/W, which distinguishes read from write. In either case two control lines are required.)
On most machines (except for our example) a memory access is nearly always considerably longer
than the time required to execute a single microinstruction. Consequently the microprogram must
keep the correct values on the address and data buses for several microinstructions (i.e., machine
cycles). To simplify this task, it is often convenient to have two registers, the MAR (Memory
Address Register) and the MBR (Memory Buffer Register), that drive the address and data buses,
respectively. Both registers sit between the CPU and the system's memory bus. The address bus
is unidirectional on both sides and is loaded from the CPU side when the "Mar" control line is
asserted. The output to the system address lines is always enabled (as is the case here) [or possibly
only during reads and writes, which requires an output enable line driven by the OR of "Rd" and
"Wr" (not shown)]. Because the main memory in our example microarchitecture has only 4096
16-bit words, the MAR is a 12-bit register. In our example microarchitecture the MAR is connected
to the B-Bus (rather than to the C-Bus) so that both the MAR and the MBR can be loaded in
the same machine cycle (i.e., loaded by the same microinstruction). Because the MAR is connected
to the B-Bus (rather than to the C-Bus) and because of the way the microprogram controlling
this machine is written, it is restricted in size to 12 bits. To allow for a 16-bit MAR connected
in this way would require two machine cycles (i.e., two microinstructions) and the use of another
scratchpad register to properly load all 16 bits into the MAR. This is a consequence of the design
choices made by others; namely, the author of the text from which this example is derived. A more
detailed schematic of the MAR and its connections is shown in Fig. 7.
Figure 7: Memory Address Register (MAR) (12-bit register fed by the low-order 12 of the 16 B-Bus lines; Load derived from T3 and the MAR control bit; output always enabled, OE = 1, to the memory address decoder)

As shown in Fig. 8, the "Mbr" control line causes the MBR to be loaded from the C-Bus on
the CPU side. The MBR output is always enabled on the CPU side and is presented to a 2 to 1
multiplexer (the AMUX) that switches the input to the left ALU input between the A-latch and
the MBR under control of a signal called Amux. If Amux = 0, the left input of the ALU sees the
contents of the A-latch and if Amux = 1, it sees the contents of the MBR. The system's memory
data bus is bidirectional, and the "Rd" and "Wr" control signals are used to determine its direction
between memory and the MBR (to memory on write and from memory on read).
Figure 8: Memory Buffer Register (MBR) (16-bit register fed through a 2:1 multiplexer (x16) from either the C-Bus (from the shifter) or memory; Load MBR at T4; output always enabled, OE = 1, to the AMUX; RD and WR control the memory side)
To control the data path of our microarchitecture in Fig. 5 requires 60 signals that belong to
the following nine functional groupings:
16 signals to control loading the A-Bus from the scratchpad
16 signals to control loading the B-Bus from the scratchpad
16 signals to control loading the scratchpad from the C-Bus
1 signal to control loading the A and B latches
2 signals to control the ALU function
2 signals to control the shifter
4 signals to control the MAR and MBR
2 signals to indicate memory read or memory write
1 signal to control the AMUX
Given the values of the 60 signals, we can perform one cycle of the data path. A cycle consists of
gating values onto the A and B buses, latching them in the two bus latches, running the values
through the ALU and shifter, and finally storing the results in the scratchpad and/or the MBR. In
addition, the MAR can also be loaded, and a memory cycle initiated. As a first approximation we
could have a 60-bit control register, with one bit for each control signal. A 1 bit means that the
signal is asserted and a 0 means that it is not asserted (i.e., negated). We also need a multiphase
clock generator circuit to control when things happen during the cycle.
However, at the price of a small increase in circuitry, we can greatly reduce the number of bits
needed to control the data path. Using all 16 bits to control the A-Bus would allow 2^16 combinations
of signal values, only 16 of which are valid because the scratchpad has only 16 registers. Therefore,
we can encode the A-Bus control information in a 4-bit field and use a decoder to generate the 16
control signals. The same holds true for the B-Bus.
The situation is slightly different for the C-Bus. In principle, multiple simultaneous stores
into the scratchpad are feasible, but in practice this feature is only infrequently useful, and most
hardware designs do not provide for it. Therefore, we will also encode the C-Bus control into a 4-bit
field. Having encoded some of the control signals into fields and in turn supplied corresponding
decoder circuits, we have saved 3 × 12 = 36 bits. We now need only 24 bits to control the data path.
Because the A and B latches are always loaded at a certain point in time, we can supply a multiphase
clock generator circuit and use one of the clock phases (say T2) as this control input, leaving 23
control bits needed. After the values in the A and B latches settle down the MAR can be clocked
(at subcycle time T3) to copy the content of the B latch if the "Mar" control bit is set to 1. More
time is needed, however, for the data signals to propagate through the ALU and shifter circuitry
before they have settled down and can be copied from the C-Bus into their destination(s) in the
scratchpad or MBR at subcycle time T4. One additional signal that is not strictly required, but is
often useful, is one to enable/disable storing the C-Bus into the scratchpad. In some situations one
merely wishes to perform an ALU operation to generate the N and Z signals, but does not wish to
store the result. With this extra bit, which we will call ENC (ENable C), we can indicate that the
C-Bus contents are to be stored (ENC = 1) or not (ENC = 0).
With ENC included we can control the data path with a 24-bit number. Now we note that "Rd"
and "Wr" can be used to control the latching of the MBR from the system's memory data bus and
the enabling of the MBR onto it, respectively (as shown in Fig. 8). This observation reduces the
number of independent control signals needed from 24 down to 22.
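The bookkeeping in the preceding paragraphs (60, then 24, 23, 24, and finally 22 control bits) can be retraced with a few lines of Python. The dictionary keys and variable names below are hypothetical labels for the nine groupings listed earlier; the numbers themselves come from the text.

```python
groups = {
    "A-Bus select": 16, "B-Bus select": 16, "C-Bus select": 16,
    "A/B latch load": 1, "ALU function": 2, "shifter": 2,
    "MAR/MBR control": 4, "Rd/Wr": 2, "AMUX": 1,
}

raw = sum(groups.values())         # one control bit per signal
encoded = raw - 3 * (16 - 4)       # A, B, and C each shrink from 16 lines to a 4-bit field
clocked = encoded - 1              # latch load is driven by clock phase T2 instead
with_enc = clocked + 1             # add the ENC bit that gates scratchpad stores
final = with_enc - 2               # Rd and Wr double as the MBR's memory-side controls
```

Evaluating these in order reproduces the counts quoted in the text: 60, 24, 23, 24, and 22.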
The next step in the design of the microarchitecture is to invent a microinstruction format
containing 22 bits. Fig. 9 shows such a format with two additional fields COND and ADDR, which
will be described shortly. The microinstruction contains 13 fields, 11 of which are as follows:
AMUX - controls left ALU input: 0 = A-latch, 1 = MBR
ALU - ALU function: 0 = A + B, 1 = A.AND.B, 2 = A, 3 = NOT A
SHFT - shifter function: 0 = no shift, 1 = right shift, 2 = left shift
MBR - loads MBR from shifter: 0 = don't load MBR, 1 = load MBR
MAR - loads MAR from B-latch: 0 = don't load MAR, 1 = load MAR
RD - requests memory read: 0 = no read, 1 = load MBR from memory
WR - requests memory write: 0 = no write, 1 = write MBR to memory
ENC - controls storing into scratchpad: 0 = don't store, 1 = store
C - selects register for storing into if ENC = 1: 0 = PC, 1 = AC, etc.
B - selects B-Bus source: 0 = PC, 1 = AC, 2 = SP, 3 = IR, etc.
A - selects A-Bus source: 0 = PC, 1 = AC, 2 = SP, 3 = IR, etc.
Microinstruction Format (32-bit word)
Field:           AMUX  COND  ALU  SHFT  MBR  MAR  RD  WR  ENC  C  B  A  ADDR
Number of bits:  1     2     2    2     1    1    1   1   1    4  4  4  8
AMUX: 0 = A-latch, 1 = MBR
COND (C1 C0): 00 = No jump, 01 = Jump if N = 1, 10 = Jump if Z = 1, 11 = Jump always
ALU (F1 F0): 00 = A + B, 01 = A and B, 10 = A, 11 = NOT A
SHFT (S1 S0): 00 = No shift, 01 = Shift right 1 bit, 10 = Shift left 1 bit, 11 = (not used)
MBR, MAR, RD, WR, ENC: 0 = No, 1 = Yes
Figure 9: Microinstruction Format (32 bits) for Mic-1/Mac-1 microarchitecture
The ordering of the fields is completely arbitrary. This ordering has been chosen to minimize line
crossings in a subsequent figure. (Actually, this criterion is not as crazy as it sounds; line crossings
in figures usually correspond to wire crossings on printed circuit boards or on integrated circuit
chips, which cause difficulties in two-dimensional designs.)
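To make the format concrete, here is a minimal Python sketch that packs and unpacks a 32-bit microinstruction word. The names `FIELDS`, `pack`, and `unpack` are hypothetical. The field widths come from Fig. 9 and the text; placing AMUX in the most significant bit and ADDR in the low-order eight bits follows the left-to-right field order of the figure and is an assumption about exact bit positions.

```python
# (name, width) pairs in left-to-right order; the widths sum to 32.
FIELDS = [("AMUX", 1), ("COND", 2), ("ALU", 2), ("SHFT", 2), ("MBR", 1), ("MAR", 1),
          ("RD", 1), ("WR", 1), ("ENC", 1), ("C", 4), ("B", 4), ("A", 4), ("ADDR", 8)]


def pack(**values) -> int:
    """Build a 32-bit word; fields that are not named default to 0."""
    word = 0
    for name, width in FIELDS:
        v = values.get(name, 0)
        if not 0 <= v < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | v
    return word


def unpack(word: int) -> dict:
    """Split a 32-bit word back into its named fields."""
    out, shift = {}, 32
    for name, width in FIELDS:
        shift -= width
        out[name] = (word >> shift) & ((1 << width) - 1)
    return out


# AC := AC + AC, using register number 1 for AC as in the field list.
double_ac = pack(ALU=0, ENC=1, C=1, B=1, A=1)
# Unconditional jump to control store address 0x2A (COND = 3).
jump = pack(COND=3, ADDR=0x2A)
```

Under this layout `double_ac` sets ENC and selects register 1 on all three buses, and `unpack` recovers the same field values from the word.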
Microinstruction Timing: Although our discussion of how a microinstruction can control
the data path during one cycle is almost complete, we have mostly neglected one issue up until
now: timing. A basic ALU cycle consists of setting up the A and B latches, giving the ALU and
shifter time to do their work, and storing the results. It is obvious that these events must happen
in that sequence. If we try to store the C-Bus contents into the scratchpad before the A and B
latches have been loaded, garbage will be stored instead of useful data. To achieve the correct event
sequencing, we use a four-phase clock, that is, a clock with four subcycles, as shown in Fig. 6. The
key events during each of the four subcycles are as follows:
1. Load the next microinstruction to be executed into a register called MIR, the MicroInstruction Register.
2. Gate selected scratchpad registers onto the A and B buses and capture them in the A and B
latches.
3. Now that the inputs are stable, give the ALU and shifter time to produce a stable output
and load the MAR if required.
4. Now that the shifter output is stable, store the C-Bus contents into the scratchpad and load
the MBR, if either is required.
Fig. 10 presents a detailed block diagram of the complete microarchitecture of our example
machine. It may look imposing initially, but it is worth studying carefully. When you fully understand
every box and every line on it, you will be well on your way to understanding the
microprogramming level. The block diagram has two parts, the data path on the left, which we have already
discussed in detail, and the control section on the right, which we will now examine.
The largest and most important item in the control portion of the machine is the control store.
This special, high-speed memory is where the microinstructions are kept. On some machines it is
read-only memory (ROM); on others it is read/write memory. In our example, microinstructions
are 32 bits wide and the microinstruction address space consists of 256 words, so the control store
occupies a maximum of 256 × 32 = 8192 bits. By comparison, the Digital Equipment Corporation
(DEC) PDP-11/40 was a popular and commercially successful microprogrammed minicomputer in
the mid-1970s that also had a 256-word control store, but its microinstructions were 56 bits wide.
Like any other memory, the control store needs an MAR and an MBR. In this case we will
call the MAR the MPC (MicroProgram Counter) because its only function is to point to the next
microinstruction to be fetched from the memory for execution. The MBR is just the MIR as
mentioned above. In this microarchitecture the control store and the main memory are different
entities; the control store holds the microprogram and the main memory holds the conventional
machine language program.
From Fig. 10 it is clear that the control store continuously tries to copy the microinstruction
addressed by the MPC into the MIR. However, the MIR is loaded only during subcycle 1, as
indicated by the dashed line from clock output T1 to it. During the other three subcycles of the
clock, it is not affected, no matter what happens to the MPC.
During subcycle 2 (which lasts between the rising edge of T1 and the rising edge of T2) the
MIR becomes stable, and the various fields begin controlling the data path. In particular the A
and B fields select the scratchpad registers to be gated onto the A and B buses, respectively. The
A and B decoder boxes provide for the 4-to-16 decoding of each field needed to drive the OE-A
and OE-B lines at the scratchpad registers (see Fig. 3). Clock signal T2 loads the A and B latches,
which, after their outputs settle, provide stable ALU inputs for all remaining subcycles during the
rest of the cycle. While data are being gated onto the A and B buses, the increment unit in the
control section of the machine computes MPC + 1, in preparation for loading the next sequential
microinstruction during the next cycle. By overlapping these two operations, instruction execution
can be sped up.
In subcycle 3, the ALU and shifter are given time to produce valid results. The AMUX
microinstruction field determines the left input to the ALU; the right input always comes from the
B-latch. Although the ALU is a combinational circuit, the time it takes to compute the sum is
determined by the carry-propagation time, not the normal gate delay. The carry-propagation time
is proportional to the number of bits in the word. While the ALU and shifter are computing, the
MAR is loaded from the output of the B-latch at T3 if the MAR field in the microinstruction is 1.
Figure 10: The complete block diagram for example microarchitecture (Mic-1/Mac-1)
(Data path on the left: the 16 CPU registers with their A-Bus, B-Bus, and C-Bus decoders, A-Latch, B-Latch, AMUX, ALU, shifter, MAR, and MBR. Control section on the right: the 4-phase clock generator, MPC, increment unit computing MPC + 1, MMUX, micro sequencing logic, the 256-word by 32-bit control store, and the MIR with its fields.)
During the fourth and final subcycle, the C-Bus may be stored back into the scratchpad and
MBR, depending on ENC and MBR. The box labeled "C decoder" takes ENC, T4, and the C field
from the microinstruction as inputs and generates the one (or none) of the 16 register load signals.
Internally it performs a 4-to-16 decode of the C field and then ANDs each of these 16 signals with
a signal derived from ANDing subcycle 4 line T4 with ENC. Thus, a scratchpad register is loaded
only if three conditions prevail:
1. ENC = 1.
2. It is subcycle 4 with T4 = 1.
3. The register has been selected by the C field.
The MBR is also loaded during subcycle 4 if MBR = 1.
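The three load conditions can be expressed directly as gating logic. In this Python sketch the function name `c_decoder` is hypothetical; it performs the 4-to-16 decode of the C field and ANDs every decoded line with (ENC AND T4), as described above.

```python
def c_decoder(c_field: int, enc: int, t4: int) -> list:
    """Return the 16 register load signals produced by the C-Bus decoder."""
    if not 0 <= c_field < 16:
        raise ValueError("C field must be a 4-bit value")
    gate = enc & t4                                    # ENC ANDed with clock phase T4
    decoded = [1 if reg == c_field else 0 for reg in range(16)]
    return [line & gate for line in decoded]


loads = c_decoder(c_field=1, enc=1, t4=1)   # register 1 (AC) sees its load line go to 1
```

With ENC = 0 or T4 = 0 every load line stays at 0, so no scratchpad register is overwritten.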
Microinstruction Sequencing: The only remaining issue is how the next microinstruction is
chosen. Although some of the time it is sufficient just to fetch the next microinstruction in sequence,
some mechanism is needed to allow conditional jumps in the microprogram in order to enable it to
make decisions. For this reason two fields are provided in each microinstruction; namely, ADDR,
which is the 8-bit address of a potential successor to the current microinstruction, and COND,
which determines whether the next microinstruction is fetched from the control store address that
is one greater than the contents of the current MPC (i.e., MPC + 1) or from the location specified
by the ADDR field. Every microinstruction potentially contains a conditional jump. The decision
to allow for this in the microinstruction format was made because conditional jumps are very
common in microprograms, and allowing every microinstruction to have two possible successors
makes them run faster than the alternative of setting up some condition in one microinstruction
and then testing it in the next.
The choice of address from which the next microinstruction will be fetched is determined by
the box labeled "Micro Sequencing Logic" during subcycle 4, when the ALU output signals N and
Z are valid. The output of this box controls the M multiplexer (MMUX), which routes either MPC
+ 1 or ADDR to the MPC (loaded by clock signal T4) where it will direct the fetching of the next
microinstruction. The desired choice is indicated by the setting of the COND field as follows:
0 = Do not jump: next microinstruction is taken from MPC + 1
1 = Jump to ADDR if N = 1
2 = Jump to ADDR if Z = 1
3 = Jump to ADDR unconditionally
The Micro Sequencing Logic combines the two ALU bits, N and Z, and the two COND bits C1
and C0 to generate an output that is then used as the selection input to the MMUX. The Boolean
expression for generating the selection signal (Mmux) is:
$$\mathrm{Mmux} = \overline{C_1}\, C_0\, N \lor C_1\, \overline{C_0}\, Z \lor C_1\, C_0 = C_0\, N \lor C_1\, Z \lor C_1\, C_0$$
where $\lor$ means logical OR. In words, the selection control signal to the MMUX is 1 (routing
ADDR to MPC) if C1 C0 is $01_2$ and N = 1, or C1 C0 is $10_2$ and Z = 1, or C1 C0 is $11_2$. Otherwise,
it is 0 and the next microinstruction in sequence is fetched.
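The equality of the two forms of the selection expression can be confirmed by exhaustive enumeration, since there are only 16 input combinations. The Python sketch below uses hypothetical function names; wrapping MPC + 1 to eight bits is an assumption based on the 256-word control store.

```python
from itertools import product


def mmux_full(c1: int, c0: int, n: int, z: int) -> int:
    """Unsimplified form: one product term per jumping COND code."""
    return ((1 - c1) & c0 & n) | (c1 & (1 - c0) & z) | (c1 & c0)


def mmux_simplified(c1: int, c0: int, n: int, z: int) -> int:
    """Simplified form: C0 N OR C1 Z OR C1 C0."""
    return (c0 & n) | (c1 & z) | (c1 & c0)


def next_mpc(mpc: int, addr: int, c1: int, c0: int, n: int, z: int) -> int:
    """MMUX: route ADDR to the MPC when Mmux = 1, otherwise MPC + 1."""
    if mmux_simplified(c1, c0, n, z):
        return addr
    return (mpc + 1) & 0xFF  # 8-bit control store address (assumed to wrap)


# Compare the two forms on all 16 combinations of C1, C0, N, Z.
all_equal = all(mmux_full(*v) == mmux_simplified(*v) for v in product((0, 1), repeat=4))
```

The simplification works because the always-jump term C1 C0 absorbs the complemented literals in the other two terms.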
Because the MAR is loaded at time T3, the memory control unit will not have enough time to
decode the address specified and either read from or write to it when clock pulse T4 comes along.
In fact, during a memory read the MBR will be loaded with garbage by the first T4 clock pulse
following the loading of the MAR at T3. Hence, if a microinstruction starts a main memory read,
by setting "Rd" to 1, it must also have Rd = 1 in the next microinstruction executed (which may or
may not be located at the next control store address). In other words, "Rd" must be set to 1 in two
consecutive microinstructions in order for the MBR to be loaded with correct data (returning from
main memory) by the second T4 clock pulse following the loading of the MAR at T3. A full four
clock ticks (corresponding to a full microinstruction cycle time) are needed for the main memory to
respond with valid data. Thus, the data become available two microinstructions after the read was
initiated. If the microprogram has nothing else useful to do in the microinstruction following the
one that initiated a memory read (or write), that microinstruction's only task is then to keep Rd =
1 (or for writes Wr = 1). In the same way, a memory write also takes two microinstruction times to
complete. In the microinstruction initiating the write the MAR is typically loaded with the address
into which data will be written at clock pulse T3, and the data to be written are loaded into the
MBR at clock pulse T4. The main memory again needs four clock ticks to decode the address and
complete the write. Thus, "Wr" must be set equal to 1 in two consecutive microinstructions (the
one initiating the write and the one following it in time).
An Example Macroarchitecture, the Mac-1
We now consider the instruction set architecture of the conventional machine level to be supported
by the microprogrammed interpreter running on the machine of Fig. 10. For convenience,
we will call the architecture of the level 2 or 3 machine the macroarchitecture to contrast it with
level 1, the microarchitecture. (We will basically ignore level 3 at this point because its instructions
are largely those of level 2 and the differences are not important here.) Similarly, we will call the
level 2 instructions macroinstructions. Thus, the normal ADD, MOVE, and other instructions
of the conventional machine level will be called macroinstructions. (The point of repeating this
remark is that some assemblers have a facility to define assembly-time "macros" that are in no way
related to what we mean by macroinstructions.) We will sometimes refer to our example level 1
machine as Mic-1 and the level 2 machine as Mac-1.
Stacks: A modern macroarchitecture should be designed with the needs of high-level languages
in mind. One of the most important design issues is addressing. A mechanism must be provided
for saving a current address pointer when a procedure (or function) is called and then returning
back to where it came from in the calling program when exiting the procedure. In some high-level
languages these called procedures are called subprograms, subroutines, or functions, and we will use
these terms interchangeably. A way of passing parameters to the called procedure, where the called
procedure will know to look for them, also must be made available. The called procedure itself may
need to allocate some memory space for local temporary variables in order to do its work and then be
able to release the allocated space when returning to the calling program. Furthermore, a hardware
mechanism that will conveniently support recursive calls (i.e., procedures calling themselves) is also
desirable. Block structured languages (like Pascal and others) are normally implemented in such a
way that when a procedure is exited, the storage it has been using for local variables is released.
The easiest way to achieve this goal is by using a data structure called a stack.
A stack is a contiguous block of memory containing some data that operates on a last-in
first-out basis much like a stack of cafeteria trays on a spring loaded base. A pointer (usually
implemented by a CPU register) called the stack pointer (SP) is used to point to the current
top of stack location in the region of main memory where the stack is located. Just like with the
cafeteria trays, when a new tray is placed on the stack, its weight pushes down on the spring in the
supporting base. Thus, stacks are sometimes called push-down stacks, and the machine instruction
used to place a new data item or address on the stack is usually called a PUSH instruction. On
the other hand, the instruction used to remove the top item from a stack (and place it elsewhere)
is variously called by different manufacturers a POP instruction or a PULL instruction. With the
cafeteria tray analogy POP likely refers to the spring in the base popping up a notch when the
weight of the top tray is removed. In other contexts PULL is obviously the opposite of PUSH. In
the macroarchitecture described here we will include the instructions PUSH and POP for putting
data items on the stack or getting them off the stack. The register file in Fig. 5 already contains a
register called SP that we can use as the stack pointer register to point to the current top of stack
location in memory. It also has a PC register that we can use as a program counter to point to
where the next machine instruction will be found in memory. The instruction CALL will first push
the content of the PC register onto the stack before jumping off to the called procedure. The jump
to the called procedure is accomplished by overwriting the PC register with a new value, called
the target address (or the entry point of the procedure) and then letting the computer fetch its
next instruction for execution from there. By first saving the PC register contents on the stack
before overwriting the PC with a new target address, the called procedure will be able to return
to the calling program where it left off. The instruction RETURN, when executed by the called
procedure, will simply pop the top of stack entry into the PC register, thus pointing the program
counter back to a location (the return point) in the calling program, and will in effect cause a
jump back to the calling program. The CALL and RETURN instructions then provide a means
for saving and then restoring the contents of the PC register using the stack when entering and
exiting from called procedures.
Although one could name any register in the PUSH and POP instructions as the source of the
data for a PUSH and the destination for the data from a POP, our example machine will implicitly
use only the AC register as the source of data for a PUSH and the destination for a POP. Now
a PUSH must advance the stack pointer by one memory location before writing the contents of
the AC register into the memory location at the top of the stack. One could choose either of the
following options for how to advance the stack pointer: (1) allow the stack to grow upward from
low memory addresses to high memory addresses by incrementing SP on a PUSH; or (2) allow the
stack to grow downward from high memory addresses to low memory addresses by decrementing
SP on a PUSH. Intel has chosen option (2) for the 80X86 architectures and so will we. Because
the stack pointer points to the current top of stack location, a PUSH must first decrement (the
contents of) SP and then copy the contents of the AC to the memory location whose address is in
the SP register. A POP will first copy the contents of the top of stack location into the AC register
and then increment (the content of) the SP register.
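The ordering of the two steps in each stack operation is easy to get backwards, so a small model may help. The Python sketch below is only an illustration of the level 2 behavior described above: the class name Mac1Stack, the starting SP value of 4000, and the omission of any bounds checking are our assumptions, not part of the Mac-1 definition.

```python
class Mac1Stack:
    """AC, PC, and SP plus a 4096-word memory, seen from level 2 only."""

    def __init__(self, sp: int = 4000) -> None:   # starting SP is hypothetical
        self.mem = [0] * 4096
        self.ac = 0
        self.pc = 0
        self.sp = sp

    def push(self) -> None:
        self.sp -= 1                    # decrement SP first ...
        self.mem[self.sp] = self.ac     # ... then store AC at the new top of stack

    def pop(self) -> None:
        self.ac = self.mem[self.sp]     # copy the top of stack into AC first ...
        self.sp += 1                    # ... then increment SP

    def call(self, target: int) -> None:
        self.sp -= 1
        self.mem[self.sp] = self.pc     # save the return address
        self.pc = target                # jump to the procedure's entry point

    def retn(self) -> None:
        self.pc = self.mem[self.sp]     # restore the return address
        self.sp += 1


m = Mac1Stack()
m.ac = 7
m.push()                # save a data item
m.pc = 100              # pretend the PC already points past the CALL
m.call(500)
inside = (m.pc, m.sp)   # state while the called procedure runs
m.retn()
m.ac = 0
m.pop()                 # recover the saved data item
```

Because the stack grows downward, SP always holds the address of the most recently pushed word, and a matching sequence of pushes and pops leaves SP where it started.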
In order to permit programs to reserve (or delete) space on the stack for temporary local variables,
instructions are needed for incrementing (or decrementing) the contents of the SP register by
variable amounts. Hence, the instruction set will have instructions for incrementing SP (INSP) and
decrementing SP (DESP) which allow the level 2 programmer to specify the variable amount with
an 8-bit constant. Furthermore, instructions for getting at local variables or incoming parameters
on the stack relative to where the SP (or some other register) currently points are also useful;
thus, instructions providing a form of stack relative indexed addressing are also needed so that one
doesn't have to keep moving the stack pointer to get at these items. In other words, the Mac-1
needs an addressing mode that fetches or stores a word at a known distance relative to the stack
pointer (or some equivalent addressing mode). In the Mac-1 these stack pointer relative indexed
addressing mode instructions will be known as load local (LODL), store local (STOL), add local
(ADDL) and subtract local (SUBL); they will allow the level 2 programmer to specify a 12-bit
offset (or base) value and, hence, they will have a memory reference format.
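As a rough illustration of how DESP, INSP, and the local instructions cooperate, the following Python sketch reserves two local words on the stack, uses them, and releases them. It is a sketch under stated assumptions: the Machine class, the starting SP of 4000, and the 16-bit wraparound applied to AC are ours, and only the register transfers named in the text are modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    mem: list = field(default_factory=lambda: [0] * 4096)
    ac: int = 0
    sp: int = 4000            # hypothetical starting value


def desp(m: Machine, y: int) -> None:
    """sp := sp - y, reserving y words (y is an 8-bit constant)."""
    assert 0 <= y <= 255
    m.sp -= y


def insp(m: Machine, y: int) -> None:
    """sp := sp + y, releasing y words."""
    assert 0 <= y <= 255
    m.sp += y


def stol(m: Machine, x: int) -> None:
    m.mem[m.sp + x] = m.ac                        # m[x+sp] := ac


def lodl(m: Machine, x: int) -> None:
    m.ac = m.mem[m.sp + x]                        # ac := m[x+sp]


def subl(m: Machine, x: int) -> None:
    m.ac = (m.ac - m.mem[m.sp + x]) & 0xFFFF      # ac := ac - m[x+sp], 16-bit AC assumed


m = Machine()
desp(m, 2)      # reserve two locals at offsets 0 and 1
m.ac = 5
stol(m, 0)      # local 0 := 5
m.ac = 12
stol(m, 1)      # local 1 := 12
lodl(m, 1)
subl(m, 0)      # ac := local 1 - local 0
result = m.ac
insp(m, 2)      # release the locals
```

The offsets stay fixed for as long as SP does, which is why the procedure never has to move the stack pointer just to reach a local variable.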
The Macroinstruction Set: The instruction set (or repertoire) is the set of all instructions
that the Mac-1 is capable of executing. The Mac-1's architecture consists of a memory with 4096
16-bit words and three registers visible to the level 2 programmer. The registers are the program
counter (PC), the stack pointer (SP), and the accumulator (AC) which is used for moving data
around, for arithmetic, and for other purposes. Three addressing modes are provided: direct,
indirect, and local. Instructions using direct addressing contain a 12-bit absolute memory address
in their low-order 12 bits; and instructions using this format are usually called "memory reference
instructions". Indirect addressing allows the programmer to compute a memory address, put it in
the AC, and then read or write the word pointed at by the contents of the AC register; this mode
is sometimes called register indirect addressing. Local addressing specifies an offset from where
the SP points, and is used (among other things) to access local variables. Together, these three
addressing modes provide a simple but adequate addressing system.
MAC-1 Instruction Repertoire
OpCode            OpCode  Assembly                       Meaning
Binary            Hex     Mnemonic  Instruction          or Action
0000xxxxxxxxxxxx  0xxx    lodd      Load direct          ac:=m[x]
0001xxxxxxxxxxxx  1xxx    stod      Store direct         m[x]:=ac
0010xxxxxxxxxxxx  2xxx    addd      Add direct           ac:=ac+m[x]
0011xxxxxxxxxxxx  3xxx    subd      Subtract direct      ac:=ac-m[x]
0100xxxxxxxxxxxx  4xxx    jpos      Jump if positive     if ac≥0 then pc:=x
0101xxxxxxxxxxxx  5xxx    jzer      Jump if zero         if ac=0 then pc:=x
0110xxxxxxxxxxxx  6xxx    jump      Jump                 pc:=x
0111xxxxxxxxxxxx  7xxx    loco      Load constant        ac:=x (0≤x≤4095)
1000xxxxxxxxxxxx  8xxx    lodl      Load local           ac:=m[x+sp]
1001xxxxxxxxxxxx  9xxx    stol      Store local          m[x+sp]:=ac
1010xxxxxxxxxxxx  axxx    addl      Add local            ac:=ac+m[x+sp]
1011xxxxxxxxxxxx  bxxx    subl      Subtract local       ac:=ac-m[x+sp]
1100xxxxxxxxxxxx  cxxx    jneg      Jump if negative     if ac<0 then pc:=x
1101xxxxxxxxxxxx  dxxx    jnze      Jump if nonzero      if ac≠0 then pc:=x
1110xxxxxxxxxxxx  exxx    call      Call procedure       sp:=sp-1;m[sp]:=pc;pc:=x
1111000000000000  f000    pshi      Push indirect        sp:=sp-1;m[sp]:=m[ac]
1111001000000000  f200    popi      Pop indirect         m[ac]:=m[sp];sp:=sp+1
1111010000000000  f400    push      Push onto stack      sp:=sp-1;m[sp]:=ac
1111011000000000  f600    pop       Pop from stack       ac:=m[sp];sp:=sp+1
1111100000000000  f800    retn      Return               pc:=m[sp];sp:=sp+1
1111101000000000  fa00    swap      Swap ac, sp          tmp:=ac;ac:=sp;sp:=tmp
11111100yyyyyyyy  fcyy    insp      Increment sp         sp:=sp+y (0≤y≤255)
11111110yyyyyyyy  feyy    desp      Decrement sp         sp:=sp-y (0≤y≤255)
1111111111111111  ffff    halt      Halt machine         stops fetching instructions
xxxxxxxxxxxx is a 12-bit machine address (or constant); in column 2 it is called xxx and in column 5 it is
called x.
yyyyyyyy is an 8-bit constant; in column 2 it is called yy and in column 5 it is called y.
Figure 11: Table of Mac-1 Instructions

The Mac-1 instruction set is shown in Fig. 11. Each instruction contains an operation code
(opcode) and sometimes a memory address or constant. The opcode specifies the operation to be
performed and is shown in binary in the first column of the table. The 12 x's in the instructions
having a memory reference format reserve a 12-bit field for a memory address (or in the case
of LOCO a constant) to be specified by the level 2 programmer. The same is true of the 8 y's
in the INSP and DESP instructions that reserve an 8-bit constant field to be specified by the
level 2 programmer. Column two gives the instruction encoding in hexadecimal shorthand, and
column three specifies the assembly language mnemonic for each instruction's opcode. Although the
assembler program for this instruction set is case sensitive and wants to see the machine instruction
mnemonics in all lower-case letters, we will use upper-case in this text for emphasis when talking
about specific instructions. Column four gives a short description of what the instruction does
and column five specifies the action performed in a register transfer language notation. In column
five, if there is more than one action occurring, then each part of the action sequence is separated
from the next by a semicolon, and the sequence of actions occurs in left to right order. Column
five specifies the register transfers and actions using a pseudo-Pascal language fragment. In these
fragments, "m[x]" refers to memory word "x."
LODD loads the accumulator (AC register) from the memory word specified in its low-order
12 bits. LODD thus specifies direct addressing; whereas, LODL loads the accumulator from the
word at a distance "x" from where the SP register points and thus specifies indexed addressing
with the SP register acting as an index register. LODD, STOD, ADDD, and SUBD perform four
basic functions using direct addressing, and LODL, STOL, ADDL, and SUBL perform the same
functions using indexed (or local relative to the SP) addressing.
Five jump instructions are provided, one unconditional jump (JUMP) and four conditional
ones (JPOS, JZER, JNEG, and JNZE). JUMP always copies its low-order 12 bits into the program
counter (PC); whereas, the other four do so only if the specified condition is met.
LOCO loads a 12-bit constant in the range 0 to 4095 (inclusive) into the AC. PSHI pushes onto
the stack the word whose address is present in the AC register. The inverse operation is POPI,
which pops a word from the stack and stores it in the memory word whose address is in the AC
register. PSHI and POPI thus specify register indirect addressing using the implicit AC register
as the holder of the indirect address. PUSH and POP are useful for manipulating the stack in a
variety of ways. SWAP exchanges the contents of AC and SP, which provides a way of loading the
SP register with a new value. It is also useful for initializing SP at the start of execution. INSP
and DESP are used to change SP by amounts known at compile time. Because the number of
instructions to be encoded is more than a 16-bit word with a 12-bit address field will allow, it has
been necessary to trade off bits in the address field with bits in the opcode field and use "expanding
opcodes" to encode all of the instructions. The offsets for INSP and DESP are limited to 8 bits
in the (inclusive) range of 0 to 255. Finally, CALL calls a procedure, saving the return address on
the stack, and RETN returns from a procedure by popping the return address and putting it in
the PC register.
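The expanding-opcode scheme can be made concrete with a small decoder. The Python sketch below follows the encodings listed in Fig. 11; the function name decode and the choice to reject any 1111-prefixed word that is not listed in the table are our assumptions (the microprogram itself examines only some of the low-order bits, so real hardware may accept patterns that this sketch refuses).

```python
# Opcodes 0000 through 1110 carry a 12-bit address or constant.
MEMORY_REFERENCE = [
    "lodd", "stod", "addd", "subd", "jpos", "jzer", "jump", "loco",
    "lodl", "stol", "addl", "subl", "jneg", "jnze", "call",
]

# 1111-prefixed encodings that take no operand (full 16-bit patterns from Fig. 11).
NO_OPERAND = {
    0xF000: "pshi", 0xF200: "popi", 0xF400: "push", 0xF600: "pop",
    0xF800: "retn", 0xFA00: "swap", 0xFFFF: "halt",
}


def decode(word: int):
    """Return (mnemonic, operand) for a 16-bit Mac-1 instruction word."""
    word &= 0xFFFF
    top4 = word >> 12
    if top4 != 0xF:
        return MEMORY_REFERENCE[top4], word & 0x0FFF   # 4-bit opcode, 12-bit field
    if word in NO_OPERAND:
        return NO_OPERAND[word], None
    if word >> 8 == 0xFC:
        return "insp", word & 0xFF                     # 8-bit opcode, 8-bit constant
    if word >> 8 == 0xFE:
        return "desp", word & 0xFF
    raise ValueError(f"not listed in Fig. 11: {word:04x}")


print(decode(0x7FFF))   # a LOCO with the largest 12-bit constant
print(decode(0xFC05))   # an INSP with y = 5
```

The pattern 1111 is never used as a 4-bit opcode; it acts as an escape that tells the decoder to look at further bits, which is how 24 instructions fit where a fixed 4-bit opcode field would allow only 16.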
Input/Output: The Mac-1 does not have any explicit input or output instructions. Instead,
it uses memory-mapped I/O. A read from address 4092 will yield a 16-bit word with the next
ASCII character from the standard input device in the low-order 7 bits and zeros in the high-order
9 bits of the AC register. When a character is available in the data register whose address is 4092,
the standard input device will set to 1 the high-order bit of the input status register at memory
address 4093. The action of loading the content of the input data register at memory address
4092 into the AC register clears (i.e., sets to zero) the content of flip-flops in the status register at
memory address 4093. The input routine will normally sit in a tight loop waiting for the content
of 4093 to go negative. When it does, the input routine will load the AC from 4092 and return.
Output is accomplished using a similar scheme. A write (i.e., store) to the output data register
at memory address 4094 copies the low-order 7 bits in the AC register to the standard output
device and at the same time clears (i.e., sets to 0) the high-order bit of the output status register
at memory address 4095. The high-order bit in the output status register at memory address 4095
is later set to 1 by the standard output device when it is again ready to accept another character
in its data register. Standard input and output may be a terminal keyboard and visual display,
or a card reader and printer, or some other combination. (Unfortunately, the simulators used to
execute level 2 programs on this macroarchitecture have not as yet implemented the input/output
data and status registers; so input and output are not simulated.)
An Example Microprogram
Having specified both the microarchitecture and the macroarchitecture in detail, the remaining
issue is the implementation: What does a program running on the former and interpreting the latter
look like, and how does it work? Here we will examine how the hardware components are controlled
by the microprogram and how the microprogram interprets the conventional machine level. Early
computers were not microprogrammed at all and had instructions for arithmetic, Boolean operations,
shifting, comparing, looping, and so on, that were all directly executed by the hardware. Modern
day reduced instruction set computers (RISC) do likewise, but their level 2 machine instructions are
merely highly encoded microinstructions; so in this case compilers translate the high level language
statements into sequences of microinstructions that are easy to decode and directly control the
microarchitecture's data path. Microprogrammed machines, on the other hand, interpret the level
2 machine instructions using a microprogram stored in control memory. The microprogram is
written by a microprogrammer (an individual who writes microprograms and not merely a small
programmer). The compilers for microprogrammed machines usually translate high-level languages
into sequences of level 2 machine language statements that are in turn fetched and decoded by the
microprogram that directly controls the data path's microarchitecture.
We could write the microprogram to fetch, decode and execute the level 2 machine instructions
by directly specifying the sequences of 32-bit binary numbers (to be stored in control memory)
that each directly control the hardware for one machine cycle comprising the four clock ticks of
the four-phase cycle. This tedious task is what ultimately must be done, but having a higher level
symbolic language notation that is then translated into the 32-bit numbers will make the task
easier.
The Micro Assembly Language (MAL): One possible notation is to have the microprogrammer
specify one microinstruction per line, naming each nonzero field and its value. For example,
to add (the contents of the) AC to (the contents of the) A register and store the result in the
AC register, we could write
ENC = 1, C = 1, B = 1, A = 10
Many microprogramming languages look like this; however, this notation is awful.
A much better idea is to use a high-level language notation, while retaining the basic concept of
one source line per microinstruction. Conceivably, one could write microprograms in an ordinary
high-level language, but because efficiency is crucial in microprograms, we will stick to assembly
language, which we define as a symbolic language that has a one-to-one mapping onto machine
instructions. Our high-level Micro Assembly Language will be called "MAL," the French word
for "sick." In MAL, stores into the 16 scratchpad registers or MAR and MBR are denoted by
assignment statements. Thus, the above example in MAL becomes: ac:=ac + a. (Because the
intention is to make MAL Pascal-like, we adopt the usual Pascal convention of lower-case names
for identifiers.)
To indicate the use of the ALU functions 0, 1, 2, and 3, we can write, for example,
ac:=a + ac, a:=band(ir,smask), ac:=a, and a:=inv(a),
respectively, where "band" stands for "Boolean AND" and "inv" stands for "invert" (i.e., bitwise
logical complement). Shifts can be denoted by the functions "lshift" for left shifts and "rshift" for
right shifts, as in
tir:=lshift(tir + tir)
which puts the contents of the TIR register on both the A and B buses, causes the ALU to perform
an addition, and shifts the sum 1 bit left before storing it back into the TIR register.
Unconditional jumps can be handled with goto statements; conditional jumps can test ALU
outputs N and Z; for example,
if n then goto 27
Assignments and jumps can be combined on the same line. However, a slight problem arises if
we wish to test a register but not make a store. How do we specify which register is to be tested?
To solve this problem, we introduce the pseudo variable "alu," which can be used in the language to
form a valid assignment statement but which in reality has no destination farther than the ALU's
output. (Recall that the ALU is made of only combinational logic components and contains no
registers or other memory devices.) For example,
alu:=tir; if n then goto 27
means that the content of the TIR register is to be run through the ALU unchanged on the A-bus
(ALU code = 2) so its high-order bit can be tested. Note that this use of "alu" means that ENC
= 0.
To indicate memory reads and writes, we will just put "rd" and "wr" in the source program.
The order of the various parts of the source statement is, in principle, arbitrary but to enhance
readability we will try to arrange them in the order that they are carried out. Fig. 12 gives a few
examples of MAL statements along with the translated fields of the corresponding microinstructions
(shown in decimal shorthand for each field).
Statement                                 AMUX COND ALU SH MBR MAR RD WR ENC  C  B  A ADDR
mar:=pc; rd                                  0    0   2  0   0   1  1  0   0  0  0  0   00
rd                                           0    0   2  0   0   0  1  0   0  0  0  0   00
ir:=mbr                                      1    0   2  0   0   0  0  0   1  3  0  0   00
pc:=pc + 1                                   0    0   0  0   0   0  0  0   1  0  6  0   00
mar:=ir; mbr:=ac; wr                         0    0   2  0   1   1  0  1   0  0  3  1   00
alu:=tir; if n then goto 15                  0    1   2  0   0   0  0  0   0  0  0  4   15
ac:=inv(mbr)                                 1    0   3  0   0   0  0  0   1  1  0  0   00
tir:=lshift(tir); if n then goto 25          0    1   2  2   0   0  0  0   1  4  0  4   25
alu:=ac; if z then goto 22                   0    2   2  0   0   0  0  0   0  0  0  1   22
ac:=band(ir, amask); goto 0                  0    3   1  0   0   0  0  0   1  1  8  3   00
sp:=sp + (-1); rd                            0    0   0  0   0   0  1  0   1  2  2  7   00
tir:=lshift(ir + ir); if n then goto 69      0    1   0  2   0   0  0  0   1  4  3  3   69
Figure 12: Some MAL statements and their corresponding microinstructions.
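To see how one row of Fig. 12 turns into a 32-bit control store word, the fields can be packed left to right in the order of the figure's columns. The Python sketch below is a minimal illustration under stated assumptions: the field widths (1, 2, 2, 2, 1, 1, 1, 1, 1, 4, 4, 4, 8 bits, which sum to 32) and the placement of AMUX in the most significant bit are assumed from the column order, and the names FIELDS, pack, and unpack are ours.

```python
# (name, width in bits), most significant field first; this layout is assumed.
FIELDS = [
    ("amux", 1), ("cond", 2), ("alu", 2), ("sh", 2),
    ("mbr", 1), ("mar", 1), ("rd", 1), ("wr", 1), ("enc", 1),
    ("c", 4), ("b", 4), ("a", 4), ("addr", 8),
]
FIELD_NAMES = {name for name, _ in FIELDS}


def pack(**values: int) -> int:
    """Pack named fields into one 32-bit word; unnamed fields are zero."""
    unknown = set(values) - FIELD_NAMES
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word: int) -> dict:
    """Split a 32-bit word back into its named fields."""
    fields, shift = {}, 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


# "mar:=pc; rd" from the first row of Fig. 12.
fetch = pack(alu=2, mar=1, rd=1)
# "tir:=lshift(ir + ir); if n then goto 69" from the last row.
decode_step = pack(cond=1, alu=0, sh=2, enc=1, c=4, b=3, a=3, addr=69)
print(f"{fetch:08x}")
print(unpack(decode_step))
```

A MAL assembler does essentially this for every source line, after first translating register names such as tir or amask into their register numbers.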
The Example Microprogram: We have finally reached the point where we can put all the
pieces together. Fig. 13 is the microprogram that runs on the Mic-1 and interprets the Mac-1. It
is a surprisingly short program: only 81 lines. By now the choice of names for the scratchpad
registers in Fig. 5 is obvious: PC, AC, and SP are used to hold the three Mac-1 registers. IR is the
instruction register and holds the macroinstruction currently being executed. TIR is a temporary
copy of the IR, used for decoding the opcode. The next three registers hold the indicated constants.
AMASK is the address mask, 0FFF (hexadecimal), and is used to separate out opcode and address
bits. SMASK is the stack mask, 00FF (hexadecimal), and is used in the INSP and DESP instructions
to isolate the 8-bit offset value. The remaining six registers have no assigned function and can be
used as scratch registers for whatever the microprogrammer wishes.
Like all interpreters, the microprogram in Fig. 13 has a main loop that fetches, decodes, and
executes instructions from the program being interpreted, in this case level 2 instructions. Its
main loop begins on line 0, where it begins fetching the macroinstruction whose memory address
is in the PC register. While waiting for this instruction to arrive, the microprogram increments
the content of the PC and continues to assert the "Rd" bus signal. When it arrives, in line 2, it is
stored in the IR register and simultaneously the high-order bit (bit 15) is tested. If bit 15 is a 1,
decoding proceeds to line 28; otherwise, it continues on line 3. Assuming for the moment that the
instruction is a LODD, bit 14 is tested on line 3, and the TIR register is loaded with the original
instruction shifted left 2 bit positions, one shift using the adder and one using the shifter. Note
that the ALU status bit N is determined by the ALU output in which bit 14 is the high-order bit,
because IR + IR shifts the IR contents left 1 bit position. The shifter output does not affect the
ALU status bit.
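The interplay between the adder, the shifter, and the N bit on lines 3 and 4 can be traced with 16-bit arithmetic. In the Python sketch below the function names line3 and line4 are ours and simply mirror the two microinstructions; the model assumes, as stated above, that N is taken from the ALU output before the shifter acts.

```python
MASK16 = 0xFFFF


def n_flag(value: int) -> int:
    """N is the high-order bit (bit 15) of a 16-bit ALU output."""
    return (value >> 15) & 1


def line3(ir: int):
    """tir:=lshift(ir + ir); returns (N, new TIR)."""
    alu_out = (ir + ir) & MASK16        # adder: IR shifted left by 1
    tir = (alu_out << 1) & MASK16       # shifter: one more position
    return n_flag(alu_out), tir         # N reflects bit 14 of the original IR


def line4(tir: int):
    """tir:=lshift(tir); returns (N, new TIR)."""
    alu_out = tir                       # ALU passes TIR through unchanged
    return n_flag(alu_out), (alu_out << 1) & MASK16


ir = 0x2123                             # ADDD 0x123: opcode bits 0010
n14, tir = line3(ir)
n13, tir_next = line4(tir)
print(n14, n13, hex(tir), hex(tir_next))
```

Each decoding step therefore exposes one more opcode bit in the sign position, which is why the microprogram needs one microinstruction per bit examined.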
All instructions having 00 in their two high-order bits eventually come to line 4 to have bit 13
tested, with the instructions beginning with 000 going to line 5 and those beginning with 001 going
to line 11. Line 5 is an example of a microinstruction with ENC = 0; it just tests the content of the
TIR register, but does not change it. Depending on the outcome of this test, the code for LODD
or STOD is selected.
For LODD, the microcode must first fetch the word directly addressed by loading the low-order
12 bits of the IR into the MAR. In this case, the high-order 4 bits are all zero, but for STOD and
other instructions they are not. However, because the MAR is only 12 bits wide and connected to
only the low-order 12 bits on the B-bus, the opcode bits do not affect the choice of the word to be
read. In line 7, the microprogram has nothing to do, so it just waits. When the word arrives, it
is copied into the AC register and the microprogram jumps back to the top of the loop where the
instruction fetch cycle begins. STOD, ADDD, and SUBD are similar. The only noteworthy point
concerning them is how subtraction is done.
Recall that in radix r the radix complement (RC) of a number x is defined to be $RC(x) = r^n - x$.
Similarly, the diminished radix complement (DRC) of x (also called the (r - 1)'s complement) is
defined to be $DRC(x) = r^n - r^{-m} - x$. When m = 0 so that we are dealing only with n-bit registers
containing integers, then the 1's complement of x is $1\text{'s}(x) = 2^n - 2^0 - x = 2^n - 1 - x$. The 2's
complement of x is then $2\text{'s}(x) = 2^n - x = 1\text{'s}(x) + 1$, where the 1's complement of x is the same
as the bitwise logical complement of the n-bit number x. Thus, SUBD makes use of the fact that
$x - y = x + (-y) = x + (\overline{y} + 1) = x + 1 + \overline{y}$
in two's complement. The addition of 1 to the content of the AC is done on line 16 (using the
commutativity of addition); otherwise line 16 would be wasted like line 13.
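The identity can be checked numerically. The following Python sketch mirrors the three arithmetic steps that the text attributes to SUBD (add 1 to AC, complement the operand, add), assuming 16-bit registers; the function names inv and subd are ours, and the comments give the microprogram lines of Fig. 13 on which each step occurs.

```python
WORD = 0xFFFF


def inv(x: int) -> int:
    """Bitwise logical complement of a 16-bit word (the 1's complement)."""
    return ~x & WORD


def subd(ac: int, operand: int) -> int:
    """Compute ac - operand as ac + 1 + inv(operand), modulo 2**16."""
    ac = (ac + 1) & WORD        # line 16: ac:=ac + 1 (done while waiting on memory)
    a = inv(operand)            # line 17: a:=inv(mbr)
    return (ac + a) & WORD      # line 18: ac:=ac + a


print(subd(10, 3))
print(hex(subd(3, 10)))         # a negative result appears in two's complement form
```

Folding the "+ 1" into the memory wait state costs nothing, because that microinstruction would otherwise do no useful work.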
The microcode for JPOS begins on line 21. If the content of the AC < 0, the branch fails
and JPOS is terminated immediately by jumping back to the main loop and fetching the next
instruction in sequence. If, however, the content of the AC ≥ 0, the low-order 12 bits of the IR are
extracted by ANDing them with the 0FFF (hexadecimal) mask in the AMASK register and storing
the result in the PC register. It does not cost anything extra to remove the opcode bits here, so we
might as well do it. If it had cost an extra microinstruction, however, we would have had to look very
carefully to see if having garbage in the high-order 4 bits of the PC could cause trouble later.
In a certain sense, JZER (line 23) works the opposite of JPOS. With JPOS, if the test condition
is met, the jump fails and control returns to the main loop. With JZER, if the test condition is met,
the jump is taken. Because the code for performing the jump is the same for all jump instructions,
we can save microcode by just going to line 22 whenever feasible. This style of programming
generally would be considered uncouth in an application program, but in a microprogram no holds
are barred. Performance is everything.
Microprogram to fetch, decode, and execute Mac-1 instructions
Adr: Microinstruction                          Comment
 0: mar:=pc; rd;                               fetch instr
 1: pc:=pc + 1; rd;                            increment pc
 2: ir:=mbr; if n then goto 28;                decode ir15
 3: tir:=lshift(ir + ir); if n then goto 19;   decode ir14
 4: tir:=lshift(tir); if n then goto 11;       decode ir13
 5:                                            decode ir12
 6: mar:=ir; rd;                               0000 = LODD
 7: rd;
 8: ac:=mbr; goto 0;
 9: mar:=ir; mbr:=ac; wr;                      0001 = STOD
10: wr; goto 0;
11:                                            decode ir12
12: mar:=ir; rd;                               0010 = ADDD
13: rd;
14: ac:=mbr + ac; goto 0;
15: mar:=ir; rd;                               0011 = SUBD
16: ac:=ac + 1; rd;
17: a:=inv(mbr);
18: ac:=ac + a; goto 0;
19:                                            decode ir13
20:                                            decode ir12
21: alu:=ac; if n then goto 0;                 0100 = JPOS
22: pc:=band(ir,amask); goto 0;                perform jump
23: alu:=ac; if z then goto 22;                0101 = JZER
24: goto 0;                                    else don't jump
25:                                            decode ir12
26: pc:=band(ir,amask); goto 0;                0110 = JUMP
27: ac:=band(ir,amask); goto 0;                0111 = LOCO
28: tir:=lshift(ir + ir); if n then goto 40;   decode ir14
29:                                            decode ir13
30:                                            decode ir12
31: a:=ir + sp;                                1000 = LODL
32: mar:=a; rd; goto 7;
33: a:=ir + sp;                                1001 = STOL
34: mar:=a; mbr:=ac; wr; goto 10;
35:                                            decode ir12
36: a:=ir + sp;                                1010 = ADDL
37: mar:=a; rd; goto 13;
38: a:=ir + sp;                                1011 = SUBL
39: mar:=a; rd; goto 16;
40:                                            decode ir13
41: alu:=tir; if n then goto 44;               decode ir12
42: alu:=ac; if n then goto 22;                1100 = JNEG
43: goto 0;
44: alu:=ac; if z then goto 0;                 1101 = JNZE
45: pc:=band(ir,amask); goto 0;
46:                                            decode ir12
47: sp:=sp + (-1);                             1110 = CALL
48: mar:=sp; mbr:=pc; wr;
49: pc:=band(ir,amask); wr; goto 0;
50:                                            decode ir11
51:                                            decode ir10
52:                                            decode ir9
53: mar:=ac; rd;                               1111-0000 = PSHI
54: sp:=sp + (-1); rd;
55: mar:=sp; wr; goto 10;
56: mar:=sp; sp:=sp + 1; rd;                   1111-0010 = POPI
57: rd;
58: mar:=ac; wr; goto 10;
59:                                            decode ir9
60: sp:=sp + (-1);                             1111-0100 = PUSH
61: mar:=sp; mbr:=ac; wr; goto 10;
62:                                            1111-0110 = POP
63: rd;
64: ac:=mbr; goto 0;
65:                                            decode ir10
66:                                            decode ir9
67:                                            1111-1000 = RETN
68: rd;
69: pc:=mbr; goto 0;
70: a:=ac;                                     1111-1010 = SWAP
71: ac:=sp;
72: sp:=a; goto 0;
73:                                            decode ir9
74: a:=band(ir,smask);                         1111-1100 = INSP
75: sp:=sp + a; goto 0;
76:                                            decode ir8
77: a:=band(ir,smask);                         1111-1110 = DESP
78: a:=inv(a);
79: a:=a + 1; goto 75;
80: halt; goto 80;                             1111-1111 = HALT
The execution cycle for each decoded MAC-1 instruction begins at the control store address whose line
is labeled with a comment showing the assembly language mnemonic for the corresponding instruction
(capitalized for emphasis). "Adr:" is the control store address. The instruction fetch cycle begins at control
store address zero.
Figure 13: Microinstructions to fetch, decode, and execute Mac-1 instructions on the example Mic-1
microarchitecture
JUMP and LOCO are straightforward, so the next interesting execution routine is for LODL.
First the absolute memory address to be referenced is computed by adding the offset contained in
the instruction to the content of the SP register. Then the memory read is initiated. Because the
rest of the code is the same for LODL and LODD, we might as well use lines 7 and 8 for both
of them. Not only does this save control store space with no loss of execution speed but it also
means fewer routines to debug. Analogous code is used for STOL, ADDL, and SUBL. The code for
JNEG and JNZE is similar to JZER and JPOS, respectively (not the other way around). CALL
first decrements the content of the SP register, then pushes the return address (which is the current
content of the PC register) onto the stack, and finally jumps to the called procedure. Line 49 is
almost identical to line 22; if it had been exactly the same, we could have eliminated line 49 by
putting an unconditional jump to 22 in 48. Unfortunately, we must continue to assert "Wr" for
another microinstruction.
The rest of the macroinstructions all have 1111 as their high-order 4 bits, so decoding of (at
least some of) the low-order 12 bits in these instructions is required to tell them apart. The actual
execution routines are straightforward so we will not comment on them further.
A few more points are worth making. In Fig. 13 we increment the content of the PC register
in line 1. It could equally well have been done in line 0, thus freeing line 1 for something else while
waiting for memory to respond. In this machine there is nothing else to do, but in a real machine
the microprogram might use this opportunity to check for I/O devices awaiting service, refresh
dynamic RAM, or something else.
If we leave line 1 the way it is, however, we could speed up the machine by modifying line 8 to
read
mar:=pc; ac:=mbr; rd; goto 1;
In other words, we can start fetching the next instruction before we have really finished with the
current one. This capability provides a primitive form of instruction pipelining. The same trick
can be applied to other execution routines as well.
It is clear that a substantial amount of the execution time of each macroinstruction is devoted
to decoding it bit by bit. This observation suggests that it might be useful to be able to load
the MPC register under microprogram control. On many existing computers the microarchitecture
has hardware support for extracting macroinstruction opcodes and stuffing them directly into the
MPC to effect a multiway branch. If, for example, we could shift the IR 9 bits to the right and
put the resulting number into the MPC, we would have a 128-way branch to locations 0 through
127. Each of these words would contain the first microinstruction in the execution sequence for
the corresponding macroinstruction. Although this approach wastes control store space, it greatly
speeds up the machine, so something like it is nearly always used in practice.
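
The arithmetic behind the 128-way branch is shown in the sketch below, which assumes a 16-bit IR. The function names are hypothetical; the second one illustrates why control store is wasted: an instruction with a 4-bit opcode and a 12-bit address can land on any of eight dispatch slots, all of which must lead to the same routine.

```python
def dispatch_address(ir):
    """Control store address obtained by shifting a 16-bit IR right by 9 bits."""
    return (ir & 0xFFFF) >> 9   # keeps the 7 high-order bits: 0..127

def entries_for_opcode(op4):
    """Dispatch slots reachable by instructions sharing one 4-bit opcode."""
    first = op4 << 3            # the 3 bits below the opcode are address bits
    return list(range(first, first + 8))

# Usage: every possible instruction word maps into locations 0 through 127.
targets = {dispatch_address(word) for word in range(0x10000)}
```

A single shift replaces the chain of shift-and-test microinstructions, at the price of duplicating (or jumping from) several entries per opcode.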
By using memory-mapped I/O, the CPU is not aware of the difference between true memory
addresses and I/O device registers. The microprogram handles reads and writes to the top four
words of the address space the same way it handles any other reads and writes.
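
A minimal sketch of this arrangement follows, assuming a 4096-word address space (12-bit addresses) so that the top four words are 4092 through 4095. The class name `Bus` and its fields are hypothetical, and the device registers are modeled as plain storage rather than as real devices.

```python
ADDRESS_SPACE = 4096             # assumed: 12-bit addresses
IO_BASE = ADDRESS_SPACE - 4      # top four words are device registers

class Bus:
    """Routes an address to RAM or to a device register; the caller cannot tell which."""

    def __init__(self):
        self.ram = [0] * IO_BASE
        self.io = [0] * 4        # stand-ins for the I/O device registers

    def read(self, addr):
        addr &= ADDRESS_SPACE - 1
        if addr >= IO_BASE:
            return self.io[addr - IO_BASE]
        return self.ram[addr]

    def write(self, addr, value):
        addr &= ADDRESS_SPACE - 1
        if addr >= IO_BASE:
            self.io[addr - IO_BASE] = value & 0xFFFF
        else:
            self.ram[addr] = value & 0xFFFF

# Usage: the same calls serve ordinary memory and a device register.
bus = Bus()
bus.write(10, 5)
bus.write(4094, 65)
```

The address decoding lives entirely outside the CPU model, which is the point of the paragraph above: the microprogram needs no special I/O microinstructions.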
Designing a machine as a series of levels is done for efficiency and simplicity because each level
deals only with another level of abstraction. The level 0 designer worries about how to squeeze the
last few nanoseconds out of the ALU by using some means to reduce carry-propagation time. The
microprogrammer worries about how to get the most mileage out of each microinstruction, typically
by exploiting as much of the hardware's inherent parallelism as possible. The macroinstruction set
designer worries about how to provide an interface that both the compiler writer and
microprogrammer can learn to love, and be efficient at the same time. Clearly, each level has
different goals, problems, techniques, and in general, a different way of looking at the machine.
By splitting the total machine design problem into several subproblems, we can attempt to master
the inherent complexity in designing a modern computer.