University of Maryland
Telephone 301-405-3668
Fax 301-314-9281
silioeng.umd.edu
These notes are based on and extend material in Chapter 4 of A. S. Tanenbaum, Structured Computer Organization, 3rd Edition, Prentice Hall, 1990. The accumulator based machine whose instruction set architecture is called the Mac-1 has its data path microarchitecture and its microprogrammed implementation (called the Mic-1) presented here. This presentation differs from the stack oriented IJVM and corresponding Mic-1 in Tanenbaum's 5th Edition textbook.
One of the differences between the Mic-1 (microprogrammed computer) presented here and the one in the 5th Edition textbook is that all registers in this Mic-1 are constructed from clocked (or gated) D-latches, as shown in Fig. 1; whereas, registers in the 5th Edition text use edge-triggered flip-flops. Fig. 2 shows how an 8-bit register is built using clocked D-latches and three-state (i.e., tri-state) buffers for connection to two output buses.
[Figure 1: A clocked (gated) D-latch with inputs D and CLK and output Q.]

[Figure 2: An 8-bit register built from clocked D-latches (bits b7 through b0), loaded from the C-Bus under control of Load, with tri-state buffers enabled by OE-A and OE-B driving the A-Bus and the B-Bus.]
Registers: A register is a device capable of storing information. Conceptually, registers are the same as main memory, the difference being that the registers are located physically within the processor itself, so they can be read from and stored into faster than words in main memory, which is usually off-chip. Larger and more expensive machines usually have more registers than smaller and cheaper ones, which must use main memory for storing intermediate results. On some computers a set of registers numbered 0, 1, 2, ..., n - 1 is available at the microprogramming level and is called local storage or scratchpad storage.
A register can be characterized by a single number: namely, how many bits it can hold (e.g., Fig. 2 is an 8-bit register). The bits (binary digits) in an n-bit register could be numbered from left to right or from right to left. The numbering convention assumed in these notes for the bits in an n-bit register is right to left from 0 to n - 1 in the natural powers of two order of a positional number system for integers. In other words, bit 0 is stored in the rightmost D-latch in Fig. 2 and bit 7 is stored in the leftmost D-latch (which corresponds to bit n - 1 when n = 8).
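The numbering convention can be illustrated with a short sketch. The helper names `bit` and `register_value` are hypothetical and are not part of the notes; the sketch only shows that bit i carries weight 2^i, with bit 0 written rightmost.

```python
def bit(value, i):
    """Return bit i of value, where bit 0 is the rightmost (least significant) bit."""
    return (value >> i) & 1


def register_value(bits_left_to_right):
    """Interpret bits written left to right, i.e. bit n-1 first and bit 0 last."""
    n = len(bits_left_to_right)
    return sum(b << (n - 1 - pos) for pos, b in enumerate(bits_left_to_right))


# An 8-bit register drawn as b7 b6 b5 b4 b3 b2 b1 b0.
contents = register_value([1, 0, 0, 0, 0, 1, 0, 1])
leftmost = bit(contents, 7)    # bit n - 1 for n = 8
rightmost = bit(contents, 0)   # bit 0
```

Here `contents` equals 2^7 + 2^2 + 2^0, so both the leftmost and the rightmost bit read back as 1.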
Information placed in a register remains there until some other information replaces it. The process of reading information out of a register does not affect the contents of the register. In other words, when a register is read, a copy is made of its contents and the original is left undisturbed in the register. Similarly, when information is moved from one register to another, a copy is loaded into the destination register and the contents of the source register remain undisturbed.
Buses: A bus is a collection of wires used to transmit signals in parallel. For example, buses are used to allow the contents of one register to be copied to another one. A bus may be unidirectional or bidirectional. A unidirectional bus can transfer data only in one direction; whereas, a bidirectional bus can transfer data in either direction but not both simultaneously. Unidirectional buses are typically used to connect two registers, one of which is always the source and the other of which is always the destination. Bidirectional buses are typically used when any of a collection of registers can be the source and any other one can be the destination.
Many devices have the ability to connect and disconnect themselves electrically from the buses to which they are physically attached. These connections can be made or broken in nanoseconds. A bus whose devices have this property is called a tri-state (or three-state) bus (the term tri-state being a registered trademark of National Semiconductor Corp.). A tri-state buffer amplifier is used to make the connections. These tri-state buffer amplifiers are shown in Fig. 2 as triangular shapes whose inputs come from the output of the D-latch to which each is connected and each of whose outputs is connected to a single bus wire. The other input to the buffer amplifier (labeled either OE-A or OE-B) is a control (or enable) input. If this control input is in the logic zero state, then the output of its buffer amplifier is in the high-impedance state (i.e., disconnected from the bus wire to which it is attached). If the control input is in the logic one state (also called active-high) then the buffer amplifier's output value equals its input value (either logic 0 or logic 1), and the D-latch's output state is connected to the corresponding bus wire.
In most microarchitectures, some registers are connected to one or more input buses and to one or more output buses. Fig. 2 depicts an 8-bit register connected to one input bus and to two output buses. The register has three control inputs: namely, Load, OE-A, and OE-B, where OE stands for "output enable." When "Load" is in the logic zero state, the contents of the register are not affected by the signals on the C-Bus wires. When "Load" is raised to the logic 1 state the values on the C-Bus wires are copied into their corresponding D-latches in parallel. After the new values are latched "Load" can be returned to its logic zero state, and the register remembers the binary value last loaded into it.
When "OE-A" is at the logic zero level, the register is disconnected from the A-Bus (and similarly for "OE-B" with respect to the B-Bus). When "OE-A" is raised to the logic 1 level, the register is connected to the A-Bus wires (and similarly for "OE-B" with respect to the B-Bus).
In order to transfer data from this register to another register R using the A and C buses, the input to register R must be connected to the C-Bus, and "OE-A" for this register must be raised to the logic 1 level in order to place the register's contents on the A-Bus. Other circuitry such as an arithmetic and logic unit (ALU), which is not shown here, must then be used to connect the A-Bus wires to the C-Bus wires. After a short time to allow the signals on the buses to settle down and become stable, the Load signal connected to register R is raised to the logic 1 level and the information transfer is accomplished.
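The register-to-register transfer just described can be sketched behaviorally. This is a minimal model under stated assumptions: the class name `LatchRegister` and its methods are hypothetical, the high-impedance state is represented by `None`, and the ALU is replaced by a direct copy of the A-Bus to the C-Bus.

```python
class LatchRegister:
    """Behavioral model of an n-bit latch register with two tri-state outputs."""

    def __init__(self, width=8):
        self.mask = (1 << width) - 1
        self.value = 0

    def clock(self, c_bus, load):
        # The latches copy the C-Bus only while Load is 1; otherwise they hold.
        if load:
            self.value = c_bus & self.mask

    def drive_a(self, oe_a):
        # None stands for the high-impedance (disconnected) state.
        return self.value if oe_a else None

    def drive_b(self, oe_b):
        return self.value if oe_b else None


src = LatchRegister()
dst = LatchRegister()                # plays the role of register R
src.clock(c_bus=0b10100101, load=1)  # give the source register some contents

a_bus = src.drive_a(oe_a=1)          # OE-A = 1 places the contents on the A-Bus
c_bus = a_bus                        # stand-in for the ALU passing A through to C
dst.clock(c_bus, load=1)             # raise Load on R once the buses are stable
dst.clock(0, load=0)                 # with Load = 0, later bus values are ignored
```

After the transfer both registers hold the same value, which matches the earlier remark that reading a register leaves its contents undisturbed.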
Because drawing all of the wires and latches shown in Fig. 2 requires too much space, a shorthand schematic such as that shown in Fig. 3 is used instead. Fig. 3 depicts a 16-bit register that would be constructed internally in the same fashion as the 8-bit register shown in Fig. 2 but with 8 more latches, 16 more buffer amplifiers, and 24 more bus wires.
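[Figure 3: Shorthand schematic of a 16-bit register, loaded from the C-Bus (Load, Clk) with 16-bit outputs to the A-Bus (OE-A) and to the B-Bus (OE-B).]

[Figure 4 diagram: a 2 to 1 MUX with data inputs I0 and I1, select input S, and output Z.]
Figure 4: 2 to 1 Multiplexer (one for each output bit when used with registers)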
An Example Microarchitecture
The data path of our example microarchitecture is shown in Fig. 5. The data path is that part of the central processing unit (CPU) that contains the arithmetic and logic unit (ALU) and its inputs and outputs. In this case it contains 16 identical 16-bit registers, labeled PC, AC, SP, and so on, that form a scratchpad memory accessible only to the microprogramming level. The registers labeled 0, +1, and -1 will be used to hold the indicated constants (with -1 in two's complement form). The meaning of the other register names will be explained later. Each register can output its contents onto one or both of two internal buses, the A-Bus and the B-Bus, and each can be loaded from a third internal bus, the C-Bus, as shown in the figure.
[Figure 5 diagram: the 16 scratchpad registers (PC, AC, SP, IR, TIR, +1, -1, AMASK, SMASK, and others) with the A-Bus, B-Bus, and C-Bus decoders driven by the 4-bit A, B, and C fields; the A-Latch and B-Latch; the AMUX; the ALU (controls F1, F0; outputs N, Z to the micro sequencing logic); the shifter (controls S1, S0); the MAR and MBR connected to main memory; and the 4-phase clock generator (T1 to T4).]
Figure 5: The data path for example microarchitecture (Mic-1/Mac-1)
The A and B buses respectively feed the left and right inputs of a 16-bit wide ALU that can perform four functions: addition (A + B), bitwise logical AND (A.AND.B), left input straight through (A), and bitwise logical complement (i.e., 1's complement) of the content of the left input (NOT A). The function to be performed is specified by the two ALU control lines F1 and F0. The ALU generates two status bits based on the current ALU output: N, which takes the value 1 when the ALU output is negative, and Z, which takes the value 1 when the ALU output is zero. The N bit is just a copy of the high-order (bit position 15) output bit. The Z bit is the NOR of all the ALU output bits (namely, bits 0 through 15).
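A behavioral sketch of this ALU may help make the four functions and the two status bits concrete. The function name `alu` is hypothetical; the function codes follow the F1 F0 coding given with Fig. 9, and discarding the carry out of bit 15 is an assumption of this sketch.

```python
MASK16 = 0xFFFF


def alu(f, a, b):
    """Return (result, N, Z) for function code f = 2*F1 + F0 on 16-bit inputs."""
    if f == 0:
        result = (a + b) & MASK16   # addition; carry out of bit 15 is dropped (assumption)
    elif f == 1:
        result = a & b & MASK16     # bitwise AND
    elif f == 2:
        result = a & MASK16         # left input straight through
    elif f == 3:
        result = ~a & MASK16        # 1's complement of the left input
    else:
        raise ValueError("f must be in the range 0..3")
    n = (result >> 15) & 1          # N is a copy of output bit 15
    z = 1 if result == 0 else 0     # Z is the NOR of all 16 output bits
    return result, n, z


wrapped = alu(0, 0xFFFF, 0x0001)    # -1 + 1 in two's complement
inverted = alu(3, 0x0000, 0x0000)   # NOT 0
```

In this sketch `wrapped` has Z = 1 and `inverted` has N = 1, since its result has bit 15 set.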
The 16-bit ALU output goes into a shifter, which is a combinational circuit that can logically shift its input 1 bit left or right, or not at all, and gate the result to its 16-bit output. The function to be performed by the shifter is specified by the two shifter control lines S1 and S0. It is possible to perform a 2-bit left shift of a register, R, by using the ALU to compute R + R (which is a 1-bit left shift) and then shifting this sum another bit left using the shifter.
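The 2-bit left shift trick can be checked with a small sketch. The function name `shifter` is hypothetical, and the shift codes follow the S1 S0 coding given with Fig. 9 (0 = no shift, 1 = right, 2 = left).

```python
MASK16 = 0xFFFF


def shifter(s, x):
    """Logically shift the 16-bit value x according to code s = 2*S1 + S0."""
    if s == 0:
        return x & MASK16
    if s == 1:
        return (x & MASK16) >> 1    # logical right shift: a 0 enters at bit 15
    if s == 2:
        return (x << 1) & MASK16    # left shift: bit 15 is lost, a 0 enters at bit 0
    raise ValueError("shift code 3 is not used")


r = 0x0003
alu_out = (r + r) & MASK16          # the ALU computes R + R, a 1-bit left shift
result = shifter(2, alu_out)        # the shifter adds one more bit of left shift
```

With R = 3 the result is 12, i.e. R shifted left by two bit positions in a single data path cycle.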
The A-Bus decoder is used to decode a 4-bit register designator (A-field) that selects one of the 16 scratchpad registers to be gated onto the A-Bus. The outputs of the decoder are 16 output enable (OE-A) signals (one for each register) and one and only one of the OE-A signals takes the value 1. The B-Bus decoder is used to decode a 4-bit register designator (B-field) that selects one of the 16 scratchpad registers to be gated onto the B-Bus. The outputs of the decoder are 16 output enable (OE-B) signals (one for each register) and one and only one of the OE-B signals takes the value 1. The C-Bus decoder is used to decode a 4-bit C-field register designator that selects the scratchpad register to be loaded from the C-Bus. The outputs of the C-Bus decoder are 16 load clock signals (one for each register). Because all 16 possible C-field values are assigned to the 16 registers, an additional control input is needed to prevent loading any of the registers. This additional control input is ENC (for enable-C). If ENC = 0, then all 16 of the decoder's load outputs remain at logic level zero, and none of the registers is overwritten. If ENC = 1, then one and only one of the destination registers sees a load clock line = 1 at the appropriate time determined by yet another control input called T4.
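The three decoders can be sketched as one-hot functions. The names `decode_4_to_16` and `c_decoder` are hypothetical; gating the decoded C lines with ENC AND T4 follows the description of the C decoder given later in these notes.

```python
def decode_4_to_16(field):
    """One-hot decode of a 4-bit register designator into 16 enable lines."""
    if not 0 <= field <= 15:
        raise ValueError("field must be a 4-bit value")
    return [1 if i == field else 0 for i in range(16)]


def c_decoder(c_field, enc, t4):
    """Load-clock lines: the one-hot decode gated by (ENC AND T4)."""
    gate = enc & t4
    return [line & gate for line in decode_4_to_16(c_field)]


oe_a = decode_4_to_16(1)                  # A field = 1 selects AC onto the A-Bus
load_blocked = c_decoder(1, enc=0, t4=1)  # ENC = 0: no register is loaded
load_ac = c_decoder(1, enc=1, t4=1)       # ENC = 1 during T4: only AC is loaded
```

Exactly one line of `oe_a` and of `load_ac` is 1, while every line of `load_blocked` stays 0.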
Neither the A-Bus nor the B-Bus feeds the ALU directly. Instead, each one feeds a latch (i.e., a register) that in turn feeds the ALU. The latches are needed because the ALU is a combinational circuit: it continuously computes the output for the current input and function code. Feeding the left and right ALU inputs directly from the A and B buses (without the additional latches) can cause race problems. For example, consider assigning to the destination register A the sum of the contents of registers A and B, denoted A := A + B. As A is being written into, the value on the A-Bus begins to change, which causes the ALU output and thus the contents of the C-Bus to change as well. Consequently, the wrong value may be stored into A. In other words, in the assignment A := A + B, the old A on the right-hand side is the original A value, not some bit-by-bit mixture of the old and new values. By inserting latches (namely, the A-latch and B-latch) into the A and B buses, we can freeze the original A and B values there early in the cycle, so that the ALU is shielded from changes on the buses as the new value is being stored into the scratchpad.
One can think of the A-latch and the B-latch as shared slave latches for the correspondingly selected source master latches in the scratchpad. This saves the slave latches that would be required in each scratchpad register if the registers were built from master-slave flip-flops, but it complicates the timing somewhat. The A-latch and B-latch are loaded by timing control signal T2 that is generated by a 4-phase clock generator circuit shown in Fig. 6.
Computer circuits are normally driven by a clock, a device that emits a periodic sequence of pulses. These pulses define machine cycles. During each machine cycle, some activity occurs, such as the execution of a microinstruction. It is often useful to divide a cycle into subcycles so different parts of the microinstruction can be performed in a well-defined order. For example, the inputs to the ALU must be made available and allowed to become stable before the output can be stored.
[Figure 6: Four-phase clock generator (a finite state machine with Run/Stop and Reset inputs, driven by master clock pulses) and a timing diagram showing the Master Clock and the subcycle signals T1, T2, T3, and T4 over one CPU cycle, which repeats while Run = True.]
asserted. The output to the system address lines is always enabled (as is the case here) [or possibly only during reads and writes, which requires an output enable line driven by the OR of "Rd" and "Wr" (not shown)]. Because the main memory in our example microarchitecture has only 4096 16-bit words, the MAR is a 12-bit register. In our example microarchitecture the MAR is connected to the B-Bus (rather than to the C-Bus) so that both the MAR and the MBR can be loaded in the same machine cycle (i.e., loaded by the same microinstruction). Because the MAR is connected to the B-Bus (rather than to the C-Bus) and because of the way the microprogram controlling this machine is written, it is restricted in size to 12 bits. To allow for a 16-bit MAR connected in this way would require two machine cycles (i.e., two microinstructions) and the use of another scratchpad register to properly load all 16 bits into the MAR. This is a consequence of the design choices made by others; namely, the author of the text from which this example is derived. A more detailed schematic of the MAR and its connections is shown in Fig. 7.
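[Figure 7: Schematic of the MAR. The low-order 12 bits of the 16-bit B-Bus feed the 12-bit MAR, which is loaded at T3 under the MAR control bit and drives the memory address decoder.]

[Figure 8: Schematic of the 16-bit Memory Buffer Register (MBR). A 2:1 MUX (x16) selects between the C-Bus (from the shifter) and memory; the MBR is loaded at T4 under the MBR control bit, uses the RD (read) and WR (write) control bits for the memory side, and drives the AMUX.]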
To control the data path of our microarchitecture in Fig. 5 requires 60 signals that belong to the following nine functional groupings:
16 signals to control loading the A-Bus from the scratchpad
16 signals to control loading the B-Bus from the scratchpad
16 signals to control loading the scratchpad from the C-Bus
1 signal to control loading the A and B latches
2 signals to control the ALU function
2 signals to control the shifter
4 signals to control the MAR and MBR
2 signals to indicate memory read or memory write
1 signal to control the AMUX
Given the values of the 60 signals, we can perform one cycle of the data path. A cycle consists of gating values onto the A and B buses, latching them in the two bus latches, running the values through the ALU and shifter, and finally storing the results in the scratchpad and/or the MBR. In addition, the MAR can also be loaded, and a memory cycle initiated. As a first approximation we could have a 60-bit control register, with one bit for each control signal. A 1 bit means that the signal is asserted and a 0 means that it is not asserted (i.e., negated). We also need a multiphase clock generator circuit to control when things happen during the cycle.
However, at the price of a small increase in circuitry, we can greatly reduce the number of bits needed to control the data path. Using all 16 bits to control the A-Bus would allow 2^16 combinations of signal values, only 16 of which are valid because the scratchpad has only 16 registers. Therefore, we can encode the A-Bus control information in a 4-bit field and use a decoder to generate the 16 control signals. The same holds true for the B-Bus.
The situation is slightly different for the C-Bus. In principle, multiple simultaneous stores into the scratchpad are feasible, but in practice this feature is only infrequently useful, and most hardware designs do not provide for it. Therefore, we will also encode the C-Bus control into a 4-bit field. Having encoded some of the control signals into fields and in turn supplied corresponding decoder circuits, we have saved 3 × 12 = 36 bits. We now need only 24 bits to control the data path. Because the A and B latches are always loaded at a certain point in time, we can supply a multiphase clock generator circuit and use one of the clock phases (say T2) as this control input, leaving 23 control bits needed. After the values in the A and B latches settle down the MAR can be clocked (at subcycle time T3) to copy the content of the B latch if the "Mar" control bit is set to 1. More time is needed, however, for the data signals to propagate through the ALU and shifter circuitry before they have settled down and can be copied from the C-Bus into their destination(s) in the scratchpad or MBR at subcycle time T4. One additional signal that is not strictly required, but is often useful, is one to enable/disable storing the C-Bus into the scratchpad. In some situations one merely wishes to perform an ALU operation to generate the N and Z signals, but does not wish to store the result. With this extra bit, which we will call ENC (ENable C), we can indicate that the C-Bus contents are to be stored (ENC = 1) or not (ENC = 0).
With ENC included we can control the data path with a 24 bit number. Now we note that "Rd" and "Wr" can be used to control the latching of the MBR from the system's memory data bus and the enabling of the MBR onto it, respectively (as shown in Fig. 8). This observation reduces the number of independent control signals needed from 24 down to 22.
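The bit counting in the last few paragraphs can be tallied step by step. The variable names below are purely illustrative; the numbers are the ones stated in the text.

```python
# The nine functional groupings of raw control signals.
raw = 16 + 16 + 16 + 1 + 2 + 2 + 4 + 2 + 1

# Encoding the A, B, and C selections as 4-bit fields saves 12 bits each.
encoded = raw - 3 * (16 - 4)

# The A and B latches are clocked by T2, so their load signal needs no bit.
without_latch_bit = encoded - 1

# ENC adds one bit to enable or disable storing the C-Bus.
with_enc = without_latch_bit + 1

# Rd and Wr double as two of the MAR/MBR control signals.
final = with_enc - 2
```

The tally runs 60, 24, 23, 24, 22, which agrees with the 22 data path bits of the microinstruction format (the COND and ADDR fields are counted separately).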
The next step in the design of the microarchitecture is to invent a microinstruction format containing 22 bits. Fig. 9 shows such a format with two additional fields COND and ADDR, which will be described shortly. The microinstruction contains 13 fields, 11 of which are as follows:
AMUX - controls left ALU input: 0 = A-latch, 1 = MBR
ALU  - ALU function: 0 = A + B, 1 = A.AND.B, 2 = A, 3 = NOT A
SHFT - shifter function: 0 = no shift, 1 = right shift, 2 = left shift
MBR  - loads MBR from shifter: 0 = don't load MBR, 1 = load MBR
MAR  - loads MAR from B-latch: 0 = don't load MAR, 1 = load MAR
RD   - requests memory read: 0 = no read, 1 = load MBR from memory
WR   - requests memory write: 0 = no write, 1 = write MBR to memory
ENC  - controls storing into scratchpad: 0 = don't store, 1 = store
C    - selects register for storing into if ENC = 1: 0 = PC, 1 = AC, etc.
B    - selects B-Bus source: 0 = PC, 1 = AC, 2 = SP, 3 = IR, etc.
A    - selects A-Bus source: 0 = PC, 1 = AC, 2 = SP, 3 = IR, etc.
Microinstruction Format (32-bit word)

Field:  AMUX  COND  ALU  SHFT  MBR  MAR  RD  WR  ENC  C  B  A  ADDR
Bits:   1     2     2    2     1    1    1   1   1    4  4  4  8

AMUX:           0 = A-latch, 1 = MBR
COND (C1 C0):   00 = No jump, 01 = Jump if N = 1, 10 = Jump if Z = 1, 11 = Jump always
ALU (F1 F0):    00 = A + B, 01 = A and B, 10 = A, 11 = NOT A
SHFT (S1 S0):   00 = No shift, 01 = Shift right 1 bit, 10 = Shift left 1 bit, 11 = (not used)

Figure 9: Microinstruction Format (32-bits) for Mic-1/Mac-1 microarchitecture
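The field layout of Fig. 9 can be exercised with a small pack/unpack sketch. The names `FIELDS`, `pack`, and `unpack` are hypothetical, and treating the leftmost field (AMUX) as the most significant bit of the 32-bit word is an assumption of this sketch rather than something the notes state.

```python
# (name, width) pairs in the left-to-right order of Fig. 9.
# Assumption: the leftmost field occupies the most significant bits.
FIELDS = [
    ("AMUX", 1), ("COND", 2), ("ALU", 2), ("SHFT", 2),
    ("MBR", 1), ("MAR", 1), ("RD", 1), ("WR", 1), ("ENC", 1),
    ("C", 4), ("B", 4), ("A", 4), ("ADDR", 8),
]


def pack(**values):
    """Build a 32-bit microinstruction word; unspecified fields default to 0."""
    word = 0
    for name, width in FIELDS:
        v = values.get(name, 0)
        if not 0 <= v < (1 << width):
            raise ValueError(name + " out of range")
        word = (word << width) | v
    return word


def unpack(word):
    """Split a 32-bit microinstruction word back into its named fields."""
    out = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        out[name] = (word >> shift) & ((1 << width) - 1)
    return out


# AC := AC + SP, using the register numbers 1 = AC and 2 = SP from the field list.
word = pack(ALU=0, ENC=1, C=1, B=2, A=1)
fields = unpack(word)
```

The field widths sum to 32, and unpacking a packed word returns the same field values, so the two functions are inverses over valid inputs.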
The ordering of the fields is completely arbitrary. This ordering has been chosen to minimize line crossings in a subsequent figure. (Actually, this criterion is not as crazy as it sounds; line crossings in figures usually correspond to wire crossings on printed circuit boards or on integrated circuit chips, which cause difficulties in two-dimensional designs.)
Microinstruction Timing: Although our discussion of how a microinstruction can control the data path during one cycle is almost complete, we have mostly neglected one issue up until now: timing. A basic ALU cycle consists of setting up the A and B latches, giving the ALU and shifter time to do their work, and storing the results. It is obvious that these events must happen in that sequence. If we try to store the C-Bus contents into the scratchpad before the A and B latches have been loaded, garbage will be stored instead of useful data. To achieve the correct event sequencing, we use a four-phase clock, that is, a clock with four subcycles, as shown in Fig. 6. The key events during each of the four subcycles are as follows:
1. Load the next microinstruction to be executed into a register called MIR, the MicroInstruction Register.
2. Gate selected scratchpad registers onto the A and B buses and capture them in the A and B latches.
3. Now that the inputs are stable, give the ALU and shifter time to produce a stable output and load the MAR if required.
4. Now that the shifter output is stable, store the C-Bus contents into the scratchpad and load the MBR, if either is required.
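The four subcycles can be walked through in a compact behavioral sketch of one data path cycle. The function `run_cycle` and its dictionary-based microinstruction are hypothetical; main memory reads and writes are left out, and the 12-bit MAR mask follows the earlier statement that the MAR takes the low-order 12 bits of the B-Bus.

```python
MASK16 = 0xFFFF


def run_cycle(regs, mi, mbr=0):
    """Execute one data path cycle for the microinstruction fields in dict mi."""
    # Subcycle 1: the microinstruction is assumed to be in the MIR already (mi).
    # Subcycle 2: gate the selected registers onto the buses and latch them.
    a_latch = regs[mi["A"]]
    b_latch = regs[mi["B"]]
    # Subcycle 3: the ALU and shifter settle; the MAR is loaded if requested.
    left = mbr if mi["AMUX"] else a_latch
    alu_choices = [(left + b_latch) & MASK16, left & b_latch,
                   left & MASK16, ~left & MASK16]
    alu_out = alu_choices[mi["ALU"]]
    shift_choices = [alu_out, alu_out >> 1, (alu_out << 1) & MASK16]
    c_bus = shift_choices[mi["SHFT"]]
    mar = (b_latch & 0x0FFF) if mi["MAR"] else None
    # Subcycle 4: store the C-Bus into the scratchpad and/or the MBR.
    if mi["ENC"]:
        regs[mi["C"]] = c_bus
    if mi["MBR"]:
        mbr = c_bus
    return {"N": (alu_out >> 15) & 1, "Z": int(alu_out == 0), "MAR": mar, "MBR": mbr}


regs = [0] * 16
regs[1], regs[2] = 5, 7   # AC = 5, SP = 7
mi = dict(AMUX=0, ALU=0, SHFT=0, MBR=0, MAR=0, ENC=1, C=1, B=2, A=1)
status = run_cycle(regs, mi)   # AC := AC + SP
```

Because the operands are captured in `a_latch` and `b_latch` before the scratchpad is written, an assignment such as AC := AC + SP uses the original AC value, which is the purpose of the A and B latches.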
Fig. 10 presents a detailed block diagram of the complete microarchitecture of our example machine. It may look imposing initially, but it is worth studying carefully. When you fully understand every box and every line on it, you will be well on your way to understanding the microprogramming level. The block diagram has two parts, the data path on the left, which we have already discussed in detail, and the control section on the right, which we will now examine.
The largest and most important item in the control portion of the machine is the control store. This special, high-speed memory is where the microinstructions are kept. On some machines it is read-only memory (ROM); on others it is read/write memory. In our example, microinstructions are 32 bits wide and the microinstruction address space consists of 256 words, so the control store occupies a maximum of 256 × 32 = 8192 bits. By comparison, the Digital Equipment Corporation (DEC) PDP-11/40 was a popular and commercially successful microprogrammed minicomputer in the mid 1970's that also had a 256 word control store, but its microinstructions were 56 bits wide.
Like any other memory, the control store needs an MAR and an MBR. In this case we will call the MAR the MPC (MicroProgram Counter) because its only function is to point to the next microinstruction to be fetched from the memory for execution. The MBR is just the MIR as mentioned above. In this microarchitecture the control store and the main memory are different entities; the control store holds the microprogram and the main memory holds the conventional machine language program.
From Fig. 10 it is clear that the control store continuously tries to copy the microinstruction addressed by the MPC into the MIR. However, the MIR is loaded only during subcycle 1, as indicated by the dashed line from clock output T1 to it. During the other three subcycles of the clock, it is not affected, no matter what happens to the MPC.
During subcycle 2 (which lasts between the rising edge of T1 and the rising edge of T2) the MIR becomes stable, and the various fields begin controlling the data path. In particular the A and B fields select the scratchpad registers to be gated onto the A and B buses, respectively. The A and B decoder boxes provide for the 4-to-16 decoding of each field needed to drive the OE-A and OE-B lines at the scratchpad registers (see Fig. 3). Clock signal T2 loads the A and B latches, which after their outputs settle, provide stable ALU inputs for all remaining subcycles during the rest of the cycle. While data are being gated onto the A and B buses, the increment unit in the control section of the machine computes MPC + 1, in preparation for loading the next sequential microinstruction during the next cycle. By overlapping these two operations, instruction execution can be sped up.
In subcycle 3, the ALU and shifter are given time to produce valid results. The AMUX microinstruction field determines the left input to the ALU; the right input always comes from the B-latch. Although the ALU is a combinational circuit, the time it takes to compute the sum is determined by the carry-propagation time, not the normal gate delay. The carry-propagation time is proportional to the number of bits in the word. While the ALU and shifter are computing, the MAR is loaded from the output of the B-latch at T3 if the MAR field in the microinstruction is 1.
[Figure 10 diagram: the data path of Fig. 5 (16 CPU registers, A-Bus, B-Bus, and C-Bus decoders, A and B latches, AMUX, ALU, shifter, MAR, MBR) together with the control section: the 4-phase clock generator (Run/Stop, Reset, T1 to T4), the MPC, the Increment unit (MPC + 1), the MMUX, the MIR with its microinstruction fields, and the Micro Seq. Logic fed by N and Z.]
Figure 10: The complete block diagram for example microarchitecture (Mic-1/Mac-1)
During the fourth and final subcycle, the C-Bus may be stored back into the scratchpad and MBR, depending on ENC and MBR. The box labeled "C decoder" takes ENC, T4, and the C field from the microinstruction as inputs and generates one (or none) of the 16 register load signals. Internally it performs a 4-to-16 decode of the C field and then ANDs each of these 16 signals with a signal derived from ANDing subcycle 4 line T4 with ENC. Thus, a scratchpad register is loaded only if three conditions prevail:
1. ENC = 1.
2. It is subcycle 4 with T4 = 1.
3. The register has been selected by the C field.
The MBR is also loaded during subcycle 4 if MBR = 1.
Microinstruction Sequencing: The only remaining issue is how the next microinstruction is chosen. Although some of the time it is sufficient just to fetch the next microinstruction in sequence, some mechanism is needed to allow conditional jumps in the microprogram in order to enable it to make decisions. For this reason two fields are provided in each microinstruction; namely, ADDR, which is the 8-bit address of a potential successor to the current microinstruction, and COND, which determines whether the next microinstruction is fetched from the control store address that is one greater than the contents of the current MPC (i.e., MPC + 1) or from the location specified by the ADDR field. Every microinstruction potentially contains a conditional jump. The decision to allow for this in the microinstruction format was made because conditional jumps are very common in microprograms, and allowing every microinstruction to have two possible successors makes them run faster than the alternative of setting up some condition in one microinstruction and then testing it in the next.
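The choice between the two possible successors can be sketched as a small function. The names `jump_taken` and `next_mpc` are hypothetical; the COND codes are the ones listed with Fig. 9, and wrapping MPC + 1 to 8 bits is an assumption based on the 256-word control store.

```python
def jump_taken(cond, n, z):
    """Decide whether ADDR is selected, given COND = 2*C1 + C0 and the N, Z bits."""
    if cond == 0:
        return False        # no jump
    if cond == 1:
        return n == 1       # jump if N = 1
    if cond == 2:
        return z == 1       # jump if Z = 1
    if cond == 3:
        return True         # jump always
    raise ValueError("COND is a 2-bit field")


def next_mpc(cond, n, z, mpc, addr):
    """Return ADDR if the jump is taken, otherwise MPC + 1 (8-bit wrap assumed)."""
    return addr if jump_taken(cond, n, z) else (mpc + 1) & 0xFF


sequential = next_mpc(cond=0, n=1, z=1, mpc=10, addr=99)
branch_on_n = next_mpc(cond=1, n=1, z=0, mpc=10, addr=99)
```

With COND = 0 the status bits are ignored and the next address is MPC + 1; with COND = 1 and N = 1 the ADDR field is selected instead.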
The choice of address from which the next microinstruction will be fetched is determined by the box labeled "Micro Sequencing Logic" during subcycle 4, when the ALU output signals N and Z are valid. The output of this box controls the M multiplexer (MMUX), which routes either MPC + 1 or ADDR to the MPC (loaded by clock signal T4) where it will direct the fetching of the next microinstruction. The desired choice is indicated by the setting of the COND field as follows:
may not be located at the next control store address). In other words, "Rd" must be set to 1 in two consecutive microinstructions in order for the MBR to be loaded with correct data (returning from main memory) by the second T4 clock pulse following the loading of the MAR at T3. A full four clock ticks (corresponding to a full microinstruction cycle time) are needed for the main memory to respond with valid data. Thus, the data become available two microinstructions after the read was initiated. If the microprogram has nothing else useful to do in the microinstruction following the one that initiated a memory read (or write), that microinstruction's only task is then to keep Rd = 1 (or for writes Wr = 1). In the same way, a memory write also takes two microinstruction times to complete. In the microinstruction initiating the write the MAR is typically loaded with the address into which data will be written at clock pulse T3, and the data to be written are loaded into the MBR at clock pulse T4. The main memory again needs four clock ticks to decode the address and complete the write. Thus, "Wr" must be set equal to 1 in two consecutive microinstructions (the one initiating the write and the one following it in time).
An Example Macroarchitecture, the Mac-1
We now consider the instruction set architecture of the conventional machine level to be supported by the microprogrammed interpreter running on the machine of Fig. 10. For convenience, we will call the architecture of the level 2 or 3 machine the macroarchitecture to contrast it with level 1, the microarchitecture. (We will basically ignore level 3 at this point because its instructions are largely those of level 2 and the differences are not important here.) Similarly, we will call the level 2 instructions macroinstructions. Thus, the normal ADD, MOVE, and other instructions of the conventional machine level will be called macroinstructions. (The point of repeating this remark is that some assemblers have a facility to define assembly-time "macros" that are in no way related to what we mean by macroinstructions.) We will sometimes refer to our example level 1 machine as Mic-1 and the level 2 machine as Mac-1.
Stacks: A modern macroarchitecture should be designed with the needs of high-level languages in mind. One of the most important design issues is addressing. A mechanism must be provided for saving a current address pointer when a procedure (or function) is called and then returning back to where it came from in the calling program when exiting the procedure. In some high-level languages these called procedures are called subprograms, subroutines, or functions, and we will use these terms interchangeably. A way of passing parameters to the called procedure where the called procedure will know to look for them also must be made available. The called procedure itself may need to allocate some memory space for local temporary variables in order to do its work and then be able to release the allocated space when returning to the calling program. Furthermore, a hardware mechanism that will conveniently support recursive calls (i.e., procedures calling themselves) is also desirable. Block structured languages (like Pascal and others) are normally implemented in such a way that when a procedure is exited, the storage it has been using for local variables is released. The easiest way to achieve this goal is by using a data structure called a stack.
A stack is a contiguous block of memory containing some data that operates on a last-in first-out basis much like a stack of cafeteria trays on a spring loaded base. A pointer (usually implemented by a CPU register) called the stack pointer (SP) is used to point to the current top of stack location in the region of main memory where the stack is located. Just like with the cafeteria trays, when a new tray is placed on the stack, its weight pushes down on the spring in the supporting base. Thus, stacks are sometimes called push-down stacks, and the machine instruction used to place a new data item or address on the stack is usually called a PUSH instruction. On the other hand, the instruction used to remove the top item from a stack (and place it elsewhere) is variously called by different manufacturers a POP instruction or a PULL instruction. With the cafeteria tray analogy POP likely refers to the spring in the base popping up a notch when the weight of the top tray is removed. In other contexts PULL is obviously the opposite of PUSH. In the macroarchitecture described here we will include the instructions PUSH and POP for putting data items on the stack or getting them off the stack. The register file in Fig. 5 already contains a register called SP that we can use as the stack pointer register to point to the current top of stack location in memory. It also has a PC register that we can use as a program counter to point to where the next machine instruction will be found in memory. The instruction CALL will first push the content of the PC register onto the stack before jumping off to the called procedure. The jump to the called procedure is accomplished by overwriting the PC register with a new value, called the target address (or the entry point of the procedure) and then letting the computer fetch its next instruction for execution from there. By first saving the PC register contents on the stack before overwriting the PC with a new target address, the called procedure will be able to return to the calling program where it left off. The instruction RETURN, when executed by the called procedure, will simply pop the top of stack entry into the PC register, thus pointing the program counter back to a location (the return point) in the calling program, and will in effect cause a jump back to the calling program. The CALL and RETURN instructions then provide a means for saving and then restoring the contents of the PC register using the stack when entering and exiting from called procedures.
Although one could name any register in the PUSH and POP instructions as the source of the
data for a push and the destination for the data from a POP, our example machine will implicitly
use only the AC register as the source of data for a PUSH and the destination for a POP. Now
a PUSH must advance the stack pointer by one memory location before writing the contents of
the AC register into the memory location at the top of the stack. One could choose either of the
following options for how to advance the stack pointer: (1) allow the stack to grow upward from
low memory addresses to high memory addresses by incrementing SP on a PUSH; or (2) allow the
stack to grow downward from high memory addresses to low memory addresses by decrementing
SP on a PUSH. Intel has chosen option (2) for the 80X86 architectures and so will we. Because
the stack pointer points to the current top of stack location, a PUSH must first decrement (the
contents of) SP and then copy the contents of the AC to the memory location whose address is in
the SP register. A POP will first copy the contents of the top of stack location into the AC register
and then increment (the content of) the SP register.
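The stack discipline described above can be sketched in a few lines of Python. This is only an illustrative model, not part of the Mac-1 definition: the class name Mac1Stack, the starting value 4000 for SP, and the addresses 101 and 500 are all hypothetical, and no overflow or wraparound checking is attempted.

```python
class Mac1Stack:
    """Toy model: a 4096-word memory plus the AC, PC and SP registers."""

    def __init__(self, sp=4000):   # 4000 is an arbitrary, hypothetical start value
        self.m = [0] * 4096
        self.ac = 0
        self.pc = 0
        self.sp = sp

    def push(self):                # decrement SP first, then store AC at the new top
        self.sp -= 1
        self.m[self.sp] = self.ac

    def pop(self):                 # read the top of stack into AC, then increment SP
        self.ac = self.m[self.sp]
        self.sp += 1

    def call(self, target):        # save PC on the stack, then overwrite it
        self.sp -= 1
        self.m[self.sp] = self.pc
        self.pc = target

    def retn(self):                # pop the saved return point back into PC
        self.pc = self.m[self.sp]
        self.sp += 1


cpu = Mac1Stack()
cpu.pc = 101          # PC already points past the CALL being executed
cpu.call(500)         # enter the (hypothetical) procedure at address 500
cpu.ac = 7
cpu.push()            # save a value inside the procedure
cpu.ac = 0
cpu.pop()             # restore it
cpu.retn()            # resume the caller at address 101
```

Because every push is matched by a pop before the return, SP ends where it started and the saved return point is the word that RETURN finds on top of the stack.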
In order to permit programs to reserve (or delete) space on the stack for temporary local variables,
instructions are needed for incrementing (or decrementing) the contents of the SP register by
variable amounts. Hence, the instruction set will have instructions for incrementing SP (INSP) and
decrementing SP (DESP) which allow the level 2 programmer to specify the variable amount with
an 8-bit constant. Furthermore, instructions for getting at local variables or incoming parameters
on the stack relative to where the SP (or some other register) currently points are also useful;
thus, instructions providing a form of stack relative indexed addressing are also needed so that one
doesn't have to keep moving the stack pointer to get at these items. In other words, the Mac-1
needs an addressing mode that fetches or stores a word at a known distance relative to the stack
pointer (or some equivalent addressing mode). In the Mac-1 these stack pointer relative indexed
addressing mode instructions will be known as load local (LODL), store local (STOL), add local
(ADDL) and subtract local (SUBL); they will allow the level 2 programmer to specify a 12-bit
offset (or base) value and, hence, they will have a memory reference format.
The Macroinstruction Set: The instruction set (or repertoire) is the set of all instructions
that the Mac-1 is capable of executing. The Mac-1's architecture consists of a memory with 4096
16-bit words and three registers visible to the level 2 programmer. The registers are the program
counter (PC), the stack pointer (SP), and the accumulator (AC) which is used for moving data
around, for arithmetic, and for other purposes. Three addressing modes are provided: direct,
indirect, and local. Instructions using direct addressing contain a 12-bit absolute memory address
in their low-order 12 bits; and instructions using this format are usually called "memory reference
instructions". Indirect addressing allows the programmer to compute a memory address, put it in
the AC, and then read or write the word pointed at by the contents of the AC register; this mode
is sometimes called register indirect addressing. Local addressing specifies an offset from where
the SP points, and is used (among other things) to access local variables. Together, these three
addressing modes provide a simple but adequate addressing system.
MAC-1 Instruction Repertoire

OpCode            OpCode  Assembly
Binary            Hex     Mnemonic  Instruction        Meaning or Action
0000xxxxxxxxxxxx  0xxx    lodd      Load direct        ac:=m[x]
0001xxxxxxxxxxxx  1xxx    stod      Store direct       m[x]:=ac
0010xxxxxxxxxxxx  2xxx    addd      Add direct         ac:=ac+m[x]
0011xxxxxxxxxxxx  3xxx    subd      Subtract direct    ac:=ac-m[x]
0100xxxxxxxxxxxx  4xxx    jpos      Jump if positive   if ac>=0 then pc:=x
0101xxxxxxxxxxxx  5xxx    jzer      Jump if zero       if ac=0 then pc:=x
0110xxxxxxxxxxxx  6xxx    jump      Jump               pc:=x
0111xxxxxxxxxxxx  7xxx    loco      Load constant      ac:=x (0<=x<=4095)
1000xxxxxxxxxxxx  8xxx    lodl      Load local         ac:=m[x+sp]
1001xxxxxxxxxxxx  9xxx    stol      Store local        m[x+sp]:=ac
1010xxxxxxxxxxxx  axxx    addl      Add local          ac:=ac+m[x+sp]
1011xxxxxxxxxxxx  bxxx    subl      Subtract local     ac:=ac-m[x+sp]
1100xxxxxxxxxxxx  cxxx    jneg      Jump if negative   if ac<0 then pc:=x
1101xxxxxxxxxxxx  dxxx    jnze      Jump if nonzero    if ac<>0 then pc:=x
1110xxxxxxxxxxxx  exxx    call      Call procedure     sp:=sp-1; m[sp]:=pc; pc:=x
1111000000000000  f000    pshi      Push indirect      sp:=sp-1; m[sp]:=m[ac]
1111001000000000  f200    popi      Pop indirect       m[ac]:=m[sp]; sp:=sp+1
1111010000000000  f400    push      Push onto stack    sp:=sp-1; m[sp]:=ac
1111011000000000  f600    pop       Pop from stack     ac:=m[sp]; sp:=sp+1
1111100000000000  f800    retn      Return             pc:=m[sp]; sp:=sp+1
1111101000000000  fa00    swap      Swap ac, sp        tmp:=ac; ac:=sp; sp:=tmp
11111100yyyyyyyy  fcyy    insp      Increment sp       sp:=sp+y (0<=y<=255)
11111110yyyyyyyy  feyy    desp      Decrement sp       sp:=sp-y (0<=y<=255)
1111111111111111  ffff    halt      Halt machine       stops fetching instructions

xxxxxxxxxxxx is a 12-bit machine address (or constant); in column 2 it is called xxx and in
column 5 it is called x.
yyyyyyyy is an 8-bit constant; in column 2 it is called yy and in column 5 it is called y.
Although the table lists the assembly mnemonics in all lower-case letters, we will use upper-case
in this text for emphasis when talking about specific instructions. Column four gives a short
description of what the instruction does and column five specifies the action performed in a
register transfer language notation. In column five, if there is more than one action occurring,
then each part of the action sequence is separated from the next by a semicolon, and the sequence
of actions occurs in left to right order. Column five specifies the register transfers and actions
using a pseudo-Pascal language fragment. In these fragments, "m[x]" refers to memory word "x."
LODD loads the accumulator (AC register) from the memory word specified in its low-order
12 bits. LODD thus specifies direct addressing; whereas, LODL loads the accumulator from the
word at a distance "x" from where the SP register points and thus specifies indexed addressing
with the SP register acting as an index register. LODD, STOD, ADDD, and SUBD perform four
basic functions using direct addressing, and LODL, STOL, ADDL, and SUBL perform the same
functions using indexed (or local relative to the SP) addressing.
Five jump instructions are provided, one unconditional jump (JUMP) and four conditional
ones (JPOS, JZER, JNEG, and JNZE). JUMP always copies its low-order 12 bits into the program
counter (PC); whereas, the other four do so only if the specified condition is met.
LOCO loads a 12-bit constant in the range 0 to 4095 (inclusive) into the AC. PSHI pushes onto
the stack the word whose address is present in the AC register. The inverse operation is POPI,
which pops a word from the stack and stores it in the memory word whose address is in the AC
register. PSHI and POPI thus specify register indirect addressing using the implicit AC register
as the holder of the indirect address. PUSH and POP are useful for manipulating the stack in a
variety of ways. SWAP exchanges the contents of AC and SP, which provides a way of loading the
SP register with a new value. It is also useful for initializing SP at the start of execution. INSP
and DESP are used to change SP by amounts known at compile time. Because the number of
instructions to be encoded is more than a 16-bit word with a 12-bit address field will allow, it has
been necessary to trade off bits in the address field with bits in the opcode field and use "expanding
opcodes" to encode all of the instructions. The offsets for INSP and DESP are limited to 8 bits
in the (inclusive) range of 0 to 255. Finally, CALL calls a procedure, saving the return address on
the stack, and RETN returns from a procedure by popping the return address and putting it in
the PC register.
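The expanding-opcode encoding can be made concrete with a small decoder. The Python sketch below takes the bit patterns from the instruction repertoire table; the function name decode and the decision to reject any word not listed in the table are assumptions of this example (the microprogram itself may examine fewer bits and so treat unlisted patterns differently).

```python
# Opcodes 0000 through 1110 carry a 12-bit address or constant.
MEMREF = ["lodd", "stod", "addd", "subd", "jpos", "jzer", "jump", "loco",
          "lodl", "stol", "addl", "subl", "jneg", "jnze", "call"]

# Opcode 1111 "expands" into the former address field.
NO_OPERAND = {0xF000: "pshi", 0xF200: "popi", 0xF400: "push", 0xF600: "pop",
              0xF800: "retn", 0xFA00: "swap", 0xFFFF: "halt"}

def decode(word):
    """Return (mnemonic, operand) for a 16-bit Mac-1 instruction word."""
    top4 = word >> 12
    if top4 != 0xF:
        return MEMREF[top4], word & 0x0FFF      # 12-bit x
    if word in NO_OPERAND:
        return NO_OPERAND[word], None
    if (word & 0xFF00) == 0xFC00:
        return "insp", word & 0x00FF            # 8-bit y
    if (word & 0xFF00) == 0xFE00:
        return "desp", word & 0x00FF            # 8-bit y
    raise ValueError(f"not a listed Mac-1 instruction: {word:04x}")
```

For example, decode(0x0123) yields ("lodd", 0x123), while decode(0xFC05) yields ("insp", 5): the same four leading 1 bits that would otherwise select a sixteenth memory reference instruction instead open up a second level of opcodes.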
Input/Output: The Mac-1 does not have any explicit input or output instructions. Instead,
it uses memory-mapped I/O. A read from address 4092 will yield a 16-bit word with the next
ASCII character from the standard input device in the low-order 7 bits and zeros in the high-order
9 bits of the AC register. When a character is available in the data register whose address is 4092,
the standard input device will set to 1 the high-order bit of the input status register at memory
address 4093. The action of loading the content of the input data register at memory address
4092 into the AC register clears (i.e., sets to zero) the content of flip-flops in the status register at
memory address 4093. The input routine will normally sit in a tight loop waiting for the content
of 4093 to go negative. When it does, the input routine will load the AC from 4092 and return.
Output is accomplished using a similar scheme. A write (i.e., store) to the output data register
at memory address 4094 copies the low-order 7 bits in the AC register to the standard output
device and at the same time clears (i.e., sets to 0) the high-order bit of the output status register
at memory address 4095. The high-order bit in the output status register at memory address 4095
is later set to 1 by the standard output device when it is again ready to accept another character
in its data register. Standard input and output may be a terminal keyboard and visual display,
or a card reader and printer, or some other combination. (Unfortunately, the simulators used to
execute level 2 programs on this macroarchitecture have not as yet implemented the input/output
data and status registers; so input and output are not simulated.)
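The polling scheme can be illustrated with a toy model in Python. Everything except the four addresses and the use of the high-order status bit is an assumption of this sketch: the Bus class, its read and write methods, and the simplification that the output device is always ready again immediately are hypothetical stand-ins for real devices.

```python
IN_DATA, IN_STATUS, OUT_DATA, OUT_STATUS = 4092, 4093, 4094, 4095
READY = 0x8000                      # high-order bit of a 16-bit status word

class Bus:
    """Hypothetical stand-in for memory plus the two character devices."""

    def __init__(self, pending_input):
        self.pending = list(pending_input)
        self.output = []

    def read(self, addr):
        if addr == IN_STATUS:       # "negative" only when a character waits
            return READY if self.pending else 0
        if addr == IN_DATA:         # reading the data register consumes it
            return ord(self.pending.pop(0)) & 0x7F
        if addr == OUT_STATUS:      # simplification: always ready again
            return READY
        return 0

    def write(self, addr, value):
        if addr == OUT_DATA:        # only the low-order 7 bits are used
            self.output.append(chr(value & 0x7F))

def get_char(bus):
    while not (bus.read(IN_STATUS) & READY):    # tight loop on address 4093
        pass
    return bus.read(IN_DATA)

def put_char(bus, ac):
    while not (bus.read(OUT_STATUS) & READY):   # tight loop on address 4095
        pass
    bus.write(OUT_DATA, ac)

bus = Bus("Hi")
put_char(bus, get_char(bus))
put_char(bus, get_char(bus))
```

Note that get_char and put_char use nothing but ordinary reads and writes, which is exactly why the Mac-1 needs no dedicated I/O instructions.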
An Example Microprogram
Having specified both the microarchitecture and the macroarchitecture in detail, the remaining
issue is the implementation: What does a program running on the former and interpreting the latter
look like, and how does it work? Here we will examine how the hardware components are controlled
by the microprogram and how the microprogram interprets the conventional machine level. Early
computers were not microprogrammed at all and had instructions for arithmetic, Boolean operations,
shifting, comparing, looping, and so on, that were all directly executed by the hardware. Modern
day reduced instruction set computers (RISC) do likewise, but their level 2 machine instructions are
merely highly encoded microinstructions; so in this case compilers translate the high level language
statements into sequences of microinstructions that are easy to decode and directly control the
microarchitecture's data path. Microprogrammed machines, on the other hand, interpret the level
2 machine instructions using a microprogram stored in control memory. The microprogram is
written by a microprogrammer (an individual who writes microprograms and not merely a small
programmer). The compilers for microprogrammed machines usually translate high-level languages
into sequences of level 2 machine language statements that are in turn fetched and decoded by the
microprogram that directly controls the data path's microarchitecture.
We could write the microprogram to fetch, decode and execute the level 2 machine instructions
by directly specifying the sequences of 32-bit binary numbers (to be stored in control memory)
that each directly control the hardware for one machine cycle comprising the four clock ticks of
the four-phase cycle. This tedious task is what ultimately must be done, but having a higher level
symbolic language notation that is then translated into the 32-bit numbers will make the task
easier.
The Micro Assembly Language (MAL): One possible notation is to have the microprogrammer
specify one microinstruction per line, naming each nonzero field and its value. For example, to add
(the contents of the) AC to (the contents of the) A register and store the result in the
AC register, we could write
ENC = 1, C = 1, B = 1, A = 10
Many microprogramming languages look like this; however, this notation is awful.
A much better idea is to use a high-level language notation, while retaining the basic concept of
one source line per microinstruction. Conceivably, one could write microprograms in an ordinary
high-level language, but because efficiency is crucial in microprograms, we will stick to assembly
language, which we define as a symbolic language that has a one-to-one mapping onto machine
instructions. Our high-level Micro Assembly Language will be called "MAL," the French word
for "sick." In MAL, stores into the 16 scratchpad registers or MAR and MBR are denoted by
assignment statements. Thus, the above example in MAL becomes: ac:=ac + a. (Because the
intention is to make MAL Pascal-like, we adopt the usual Pascal convention of lower-case names
for identifiers.)
To indicate the use of the ALU functions 0, 1, 2, and 3, we can write, for example,
ac:=a + ac, a:=band(ir,smask), ac:=a, and a:=inv(a),
respectively, where "band" stands for "Boolean AND" and "inv" stands for "invert" (i.e., bitwise
logical complement). Shifts can be denoted by the functions "lshift" for left shifts and "rshift" for
right shifts, as in
tir:=lshift(tir + tir)
which puts the contents of the TIR register on both the A and B buses, causes the ALU to perform
an addition, and shifts the sum 1 bit to the left before storing it back into the TIR register.
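Unconditional jumps are written with goto, and conditional jumps test the ALU status bit n or z,
for example,
if n then goto 27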
Assignments and jumps can be combined on the same line. However, a slight problem arises if
we wish to test a register but not make a store. How do we specify which register is to be tested?
To solve this problem, we introduce the pseudo variable "alu," which can be used in the language to
form a valid assignment statement but which in reality has no destination farther than the ALU's
output. (Recall that the ALU is made of only combinational logic components and contains no
registers or other memory devices.) For example,
alu:=tir; if n then goto 27
means that the content of the TIR register is to be run through the ALU unchanged on the A-bus
(ALU code = 2) so its high-order bit can be tested. Note that this use of "alu" means that ENC
= 0.
To indicate memory reads and writes, we will just put "rd" and "wr" in the source program.
The order of the various parts of the source statement is, in principle, arbitrary but to enhance
readability we will try to arrange them in the order that they are carried out. Fig. 12 gives a few
examples of MAL statements along with the translated fields of the corresponding microinstructions
(shown in decimal shorthand for each field).
Statement                                AMUX COND ALU SHFT MBR MAR RD WR ENC  C  B  A ADDR
mar:=pc; rd                                0    0   2    0   0   1   1  0   0  0  0  0   00
rd                                         0    0   2    0   0   0   1  0   0  0  0  0   00
ir:=mbr                                    1    0   2    0   0   0   0  0   1  3  0  0   00
pc:=pc + 1                                 0    0   0    0   0   0   0  0   1  0  6  0   00
mar:=ir; mbr:=ac; wr                       0    0   2    0   1   1   0  1   0  0  3  1   00
alu:=tir; if n then goto 15                0    1   2    0   0   0   0  0   0  0  0  4   15
ac:=inv(mbr)                               1    0   3    0   0   0   0  0   1  1  0  0   00
tir:=lshift(tir); if n then goto 25        0    1   2    2   0   0   0  0   1  4  0  4   25
alu:=ac; if z then goto 22                 0    2   2    0   0   0   0  0   0  0  0  1   22
ac:=band(ir, amask); goto 0                0    3   1    0   0   0   0  0   1  1  8  3   00
sp:=sp + (-1); rd                          0    0   0    0   0   0   1  0   1  2  2  7   00
tir:=lshift(ir + ir); if n then goto 69    0    1   0    2   0   0   0  0   1  4  3  3   69

Figure 12: Some MAL statements and their corresponding microinstructions.
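To see how one line of Fig. 12 turns into a 32-bit control word, the following Python sketch packs the fields into a single integer. The field widths and their left-to-right order are assumptions here: they follow the column order of Fig. 12 and sum to 32 bits, but the authoritative layout is the microinstruction format defined with the microarchitecture. The function name pack is likewise hypothetical.

```python
# (field name, width in bits), most significant field first: an assumed layout.
FIELDS = [("amux", 1), ("cond", 2), ("alu", 2), ("sh", 2), ("mbr", 1),
          ("mar", 1), ("rd", 1), ("wr", 1), ("enc", 1),
          ("c", 4), ("b", 4), ("a", 4), ("addr", 8)]

def pack(**values):
    """Build a 32-bit microinstruction; fields not named default to zero."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word

# pc:=pc + 1          (ALU = 0, ENC = 1, C = 0, B = 6, A = 0)
inc_pc = pack(alu=0, enc=1, c=0, b=6, a=0)

# ac:=band(ir, amask); goto 0   (COND = 3, ALU = 1, ENC = 1, C = 1, B = 8, A = 3)
mask_and_jump = pack(cond=3, alu=1, enc=1, c=1, b=8, a=3, addr=0)
```

This is essentially the job of a MAL assembler's back end: each source line names a handful of nonzero fields, and everything else in the word is zero.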
The Example Microprogram: We have finally reached the point where we can put all the
pieces together. Fig. 13 is the microprogram that runs on the Mic-1 and interprets the Mac-1. It
is a surprisingly short program, only 81 lines. By now the choice of names for the scratchpad
registers in Fig. 5 is obvious: PC, AC, and SP are used to hold the three Mac-1 registers. IR is the
instruction register and holds the macroinstruction currently being executed. TIR is a temporary
copy of the IR, used for decoding the opcode. The next three registers hold the indicated constants.
AMASK is the address mask, $\text{0FFF}_{16}$, and is used to separate out opcode and address bits. SMASK
is the stack mask, $\text{00FF}_{16}$, and is used in the INSP and DESP instructions to isolate the 8-bit offset
value. The remaining six registers have no assigned function and can be used as scratch registers
for whatever the microprogrammer wishes.
Like all interpreters, the microprogram in Fig. 13 has a main loop that fetches, decodes, and
executes instructions from the program being interpreted, in this case level 2 instructions. Its
main loop begins on line 0, where it begins fetching the macroinstruction whose memory address
is in the PC register. While waiting for this instruction to arrive, the microprogram increments
the content of the PC and continues to assert the "Rd" bus signal. When it arrives, in line 2, it is
stored in the IR register and simultaneously the high-order bit (bit 15) is tested. If bit 15 is a 1,
decoding proceeds to line 28; otherwise, it continues on line 3. Assuming for the moment that the
instruction is a LODD, bit 14 is tested on line 3, and the TIR register is loaded with the original
instruction shifted left 2 bit positions, one shift using the adder and one using the shifter. Note
that the ALU status bit N is determined by the ALU output, in which the original bit 14 is the
high-order bit, because IR + IR shifts the IR contents left 1 bit position. The shifter output does
not affect the ALU status bit.
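The bit-by-bit decoding trick can be checked with ordinary integer arithmetic. In the Python sketch below, the helper n_bit and the sample instruction word 0x2ABC are hypothetical; the sketch only models the arithmetic of a 16-bit data path, not the timing of the microinstructions.

```python
MASK16 = 0xFFFF

def n_bit(value):
    """ALU status bit N: the high-order bit (bit 15) of a 16-bit value."""
    return (value >> 15) & 1

ir = 0x2ABC                       # 0010 1010 1011 1100, an ADDD instruction
bit15 = n_bit(ir)                 # tested as the word passes through the ALU

alu_out = (ir + ir) & MASK16      # the adder acts as a 1-bit left shift
bit14 = n_bit(alu_out)            # N reflects the ALU output, not the shifter

tir = (alu_out << 1) & MASK16     # the shifter supplies the second left shift
bit13 = n_bit(tir)                # exposed when TIR is later run through the ALU
```

Each test therefore costs one pass through the ALU, and the shifter is used to line up the following opcode bit for the next test at no extra cost.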
All instructions having 00 in their two high-order bits eventually come to line 4 to have bit 13
tested, with the instructions beginning with 000 going to line 5 and those beginning with 001 going
to line 11. Line 5 is an example of a microinstruction with ENC = 0; it just tests the content of the
TIR register, but does not change it. Depending on the outcome of this test, the code for LODD
or STOD is selected.
For LODD, the microcode must first fetch the word directly addressed by loading the low-order
12 bits of the IR into the MAR. In this case, the high-order 4 bits are all zero, but for STOD and
other instructions they are not. However, because the MAR is only 12 bits wide and connected to
only the low-order 12 bits on the B-bus, the opcode bits do not affect the choice of the word to be
read. In line 7, the microprogram has nothing to do, so it just waits. When the word arrives, it
is copied into the AC register and the microprogram jumps back to the top of the loop where the
instruction fetch cycle begins. STOD, ADDD, and SUBD are similar. The only noteworthy point
concerning them is how subtraction is done.
Recall that in radix $r$ the radix complement (RC) of a number $x$ is defined to be $RC(x) = r^n - x$.
Similarly, the diminished radix complement (DRC) of $x$ (also called the $(r-1)$'s complement) is
defined to be $DRC(x) = r^n - r^{-m} - x$. When $m = 0$ so that we are dealing only with $n$-bit registers
containing integers, then the 1's complement of $x$ is $1\text{'s}(x) = 2^n - 2^0 - x = 2^n - 1 - x$. The 2's
complement of $x$ is then $2\text{'s}(x) = 2^n - x = 1\text{'s}(x) + 1$, where the 1's complement of $x$ is the same
as the bitwise logical complement of the $n$-bit number $x$. Thus, SUBD makes use of the fact that
$x - y = x + (-y) = x + (\bar{y} + 1) = x + 1 + \bar{y}$
in two's complement. The addition of 1 to the content of the AC is done on line 16 (using the
commutativity of addition); otherwise line 16 would be wasted like line 13.
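The identity used by SUBD is easy to verify numerically. The Python sketch below assumes a 16-bit word, matching the Mac-1; the function names inv and sub are illustrative and mirror the MAL function "inv" rather than any actual microprogram routine.

```python
BITS = 16
MASK = (1 << BITS) - 1            # 2^n - 1 for n = 16

def inv(y):
    """Bitwise logical complement of an n-bit word, i.e. 2^n - 1 - y."""
    return ~y & MASK

def sub(x, y):
    """x - y computed as x + 1 + inv(y), keeping only the low n bits."""
    return (x + 1 + inv(y)) & MASK

difference = sub(10, 3)           # an ordinary positive result
negative = sub(3, 10)             # -7 appears as 2^16 - 7
```

Discarding the carry out of bit 15 is what makes the sum of x and the 2's complement of y equal to the difference, so no separate subtraction hardware is required.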
The microcode for JPOS begins on line 21. If the content of the AC < 0, the branch fails
and JPOS is terminated immediately by jumping back to the main loop and fetching the next
instruction in sequence. If, however, the content of the AC >= 0, the low-order 12 bits of the IR are
extracted by ANDing them with the $\text{0FFF}_{16}$ mask in the AMASK register and storing the result
in the PC register. It does not cost anything extra to remove the opcode bits here, so we might
as well do it. If it had cost an extra microinstruction, however, we would have had to look very
carefully to see if having garbage in the high-order 4 bits of the PC could cause trouble later.
In a certain sense, JZER (line 23) works the opposite of JPOS. With JPOS, if the test condition
is met, the jump fails and control returns to the main loop. With JZER, if the test condition is met,
the jump is taken. Because the code for performing the jump is the same for all jump instructions,
we can save microcode by just going to line 22 whenever feasible. This style of programming
generally would be considered uncouth in an application program, but in a microprogram no holds
are barred. Performance is everything.
[Figure 13: microprogram listing, control store addresses 0 through 80, with columns for the
address ("Adr:"), the MAL statement, and a comment.]
The execution cycle for each decoded MAC-1 instruction begins at the control store address whose
line is labeled with a comment showing the assembly language mnemonic for the corresponding
instruction (capitalized for emphasis). "Adr:" is the control store address. The instruction fetch
cycle begins at control store address zero.
Figure 13: Microinstructions to fetch, decode, and execute Mac-1 instructions on the example Mic-1
microarchitecture
JUMP and LOCO are straightforward, so the next interesting execution routine is for LODL.
First the absolute memory address to be referenced is computed by adding the offset contained in
the instruction to the content of the SP register. Then the memory read is initiated. Because the
rest of the code is the same for LODL and LODD, we might as well use lines 7 and 8 for both
of them. Not only does this save control store space with no loss of execution speed but it also
means fewer routines to debug. Analogous code is used for STOL, ADDL, and SUBL. The code for
JNEG and JNZE is similar to JZER and JPOS, respectively (not the other way around). CALL
first decrements the content of the SP register, then pushes the return address (which is the current
content of the PC register) onto the stack, and finally jumps to the called procedure. Line 49 is
almost identical to line 22; if it had been exactly the same, we could have eliminated line 49 by
putting an unconditional jump to 22 in 48. Unfortunately, we must continue to assert "Wr" for
another microinstruction.
The rest of the macroinstructions all have 1111 as their high-order 4 bits, so decoding of (at
least some of) the low-order 12 bits in these instructions is required to tell them apart. The actual
execution routines are straightforward so we will not comment on them further.
A few more points are worth making. In Fig. 13 we increment the content of the PC register
in line 1. It could equally well have been done in line 0, thus freeing line 1 for something else while
waiting for memory to respond. In this machine there is nothing else to do, but in a real machine
the microprogram might use this opportunity to check for I/O devices awaiting service, refresh
dynamic RAM, or something else.
If we leave line 1 the way it is, however, we could speed up the machine by modifying line 8 to
read
mar:=pc; ac:=mbr; rd; goto 1;
In other words, we can start fetching the next instruction before we have really finished with the
current one. This capability provides a primitive form of instruction pipelining. The same trick
can be applied to other execution routines as well.
It is clear that a substantial amount of the execution time of each macroinstruction is devoted
to decoding it bit by bit. This observation suggests that it might be useful to be able to load
the MPC register under microprogram control. On many existing computers the microarchitecture
has hardware support for extracting macroinstruction opcodes and stuffing them directly into the
MPC to effect a multiway branch. If, for example, we could shift the IR 9 bits to the right and
put the resulting number into the MPC, we would have a 128-way branch to locations 0 through
127. Each of these words would contain the first microinstruction in the execution sequence for
the corresponding macroinstruction. Although this approach wastes control store space, it greatly
speeds up the machine, so something like it is nearly always used in practice.
By using memory-mapped I/O, the CPU is not aware of the difference between true memory
addresses and I/O device registers. The microprogram handles reads and writes to the top four
words of the address space the same way it handles any other reads and writes.
Designing a machine as a series of levels is done for efficiency and simplicity because each level
deals only with another level of abstraction. The level 0 designer worries about how to squeeze the
last few nanoseconds out of the ALU by using some means to reduce carry-propagation time. The
microprogrammer worries about how to get the most mileage out of each microinstruction, typically
by exploiting as much of the hardware's inherent parallelism as possible. The macroinstruction set
designer worries about how to provide an interface that both the compiler writer and microprogrammer
can learn to love, and be efficient at the same time. Clearly, each level has different goals,
problems, techniques, and in general, a different way of looking at the machine. By splitting the
total machine design problem into several subproblems, we can attempt to master the inherent
complexity in designing a modern computer.