Lab Report-32bit ALU and ROM (Final)

HOON MIEW JIE
BK10110096
DISHEN A/L KESEVA KUMAR
BK10110062
MOHD AIZZAT BIN MASTAN
BK10160339
SHERIELYN SAFIRA RANTI
BK10110269
TITLE: 32-BIT ALU and ROM
1. INTRODUCTION
The computer is one of the most important inventions of the human race, and the CPU (central
processing unit) is its main brain. The computer may be one of the most used appliances in the
world today: almost every household has one, and it supports a multi-million dollar industry
that is still growing. The CPU processes the information needed by the computer, much as the
brain does when we decide to move a part of the body. A CPU performs four basic tasks: fetch,
decode, manipulate and output. CPU speed is rated in MHz, although this rating alone is not an
accurate measure of performance. The CPU chip comprises a million embedded logic gates that
carry out a variety of operations. These gates work with a clock that regulates the rate at
which the CPU is fed data. The CPU comprises five basic components: RAM, registers, buses, the
ALU and the control unit. RAM is built by combining latches with a decoder: the latches form
circuitry that can remember a value, while the decoder allows an individual memory location to
be selected. Registers are special memory locations that can be accessed very quickly. The
three registers are the instruction register, the program counter and the accumulator. Buses
are the information highways of the CPU: groups of tiny wires that carry data between
components. The most important are the address bus, the data bus and the control bus. The ALU,
or arithmetic logic unit, performs all the mathematical calculations of the CPU. It is composed
of complex circuitry and is one of the most important components. The ALU can add, subtract,
multiply and divide binary numbers, and perform many other calculations as well.
2. LITERATURE REVIEW
2.1 16-Bit RISC Processor Architecture: Controller State Machines and Functional Verification
Using Verilog HDL

Ismail Saad et al. [1] present the design and simulation of a behavioral model of a 16-bit RISC
processor architecture, based on an HDL methodology using Verilog HDL. The processor system
consists of ROM, RAM, I/O and a CPU. The CPU module is merely a shell that instantiates the
real processor definition in the cpu_core.v, control.v, datapath.v and alsu.v files. The
behavioral model of the control module, which comprises the controller state machine, the
Instruction Register (IR) and a group of control signals, is explained thoroughly. The modeling
of the read, write and tri-state buffer operations for the datapath module is also explained in
depth. The functionality of the processor design was tested by executing three types of
instruction. The work shows that Verilog HDL can be used to improve the design process of a new
microprocessor architecture.
Figure 2.1 (a) Processor Control Unit Architecture and (b) Processor Datapath Unit
Architecture [1]
2.2 32-Bit ALU Design
In his book, Paul V. Bolotoff presents the general ALU function. Microprocessors tend to
have a single module that performs arithmetic operations on integer values, because
many of the different arithmetic and logical operations can be performed using similar
(if not identical) hardware. The component that performs the arithmetic and logical
operations is known as the Arithmetic Logic Unit, or ALU [2].
The ALU is one of the most important components in a microprocessor, and is
typically the part of the processor that is designed first. Once the ALU is designed, the
rest of the microprocessor is implemented to feed operands and control codes to the
ALU.
Logic operations and addition are among the simplest and also the most common
operations. For this reason, typical ALUs are designed to handle these operations
directly, while other operations, such as multiplication and division, are handled in a
separate module.
Note also that the ALUs discussed here handle only integer datatypes, not
floating-point data. Once integer ALU and multiplier units have been designed, those
units can be used to create floating-point units (FPUs). The following is an example of
a basic 2-bit ALU. The boxes on the right-hand side of the image are multiplexers,
used to select between the various operations: OR, AND, XOR and addition.
Figure 2.2 A basic 2-bit ALU [2]

Unfortunately, implementing a true 32-bit ALU with a similar approach would quickly
push the project out of scale. Therefore, the project takes advantage of a simplification
made in the Verilog ALU module.
3. METHODOLOGY
A 32-bit ALU is developed by modifying the work of Ismail Saad et al. [1]. For the 32-bit
microprocessor design, several major modifications are made; the modification to each
module and its justification are explained in detail below.
3.1 Opcodes module
An opcode is an operation code; the opcodes module contains the list of defined functions
used by the microprocessor. The module itself does not perform any task. It only gives each
function a name, which makes the program easier to understand: an opcode can be called by
name without remembering its value, which may be a very long binary number.
//Macro Opcodes
`define ADDr   {4'b0000, `ADD}
`define ADDi   {4'b0100, `ADD}
`define ADDrcc {4'b0010, `ADD}
`define ADDicc {4'b0110, `ADD}
`define CADDrZ {4'b0001, `ADD}
`define CADDiZ {4'b0101, `ADD}
Figure 3.1 Opcodes module
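As a rough illustration of how these macros expand, the sketch below prints a few macro opcodes as 8-bit values. It is not part of the original code: the module name opcode_demo is hypothetical, the `SUBi macro is assumed to follow the same prefix pattern as `ADDi, and the `ADD and `SUB values are taken from the opcodes.v listing in the appendix.

```verilog
// Hypothetical demo module; only the macro definitions mirror opcodes.v.
`define ADD    4'd0
`define SUB    4'd4
`define ADDr   {4'b0000, `ADD}
`define ADDrcc {4'b0010, `ADD}
`define SUBi   {4'b0100, `SUB}

module opcode_demo;
  initial begin
    // Each macro is a 4-bit prefix concatenated with a 4-bit ALU function code,
    // so every opcode is 8 bits wide.
    $display("ADDr   = %b", `ADDr);
    $display("ADDrcc = %b", `ADDrcc);
    $display("SUBi   = %b", `SUBi);
    $finish;
  end
endmodule
```

The upper four bits line up with the ModeBit, setbit and testbit fields that the control module extracts from IR[31:28], while the lower four bits are the ALU function.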

3.2 32-Bit ALU Module
A 32-bit ALU makes it possible to implement floating-point arithmetic (which has a 32-bit
format) and to complete complex calculations within a very short cycle time: an 8-bit processor
handling a floating-point value needs at least 4 cycles for data transfer, compared with a
32-bit processor, which requires at least 1 cycle.
The ALU function field is expanded to 4 bits, equivalent to 16 instructions in total. With
this, extra ALU functions are implemented.
By increasing the number of bits, the IR instruction set can be enlarged: more ALU functions
are implemented and the number of addressable registers is increased to 256 (2^8).
However, in this project only 8 registers (R0 to R7) are created, to keep the project
manageable and as a proof of concept.
always @(input1 or input2 or Function)
  case (Function)
    `ADD     : result_temp = input1 + input2;
    `SUB     : result_temp = input1 - input2;
    `AND     : result_temp = input1 & input2;
    `OR      : result_temp = input1 | input2;
    `XOR     : result_temp = input1 ^ input2;
    `NOT     : result_temp = ~input1;
    `SRA     : result_temp = input1 >> 1;
    `SLA     : result_temp = input1 << 1;
    `MOD     : result_temp = input1 % input2;           //new function
    `EQL     : result_temp = (input1==input2)? 1:0;     //new function
    `GREATER : result_temp = (input1 > input2)? 1:0;    //new function
    `SMALLER : result_temp = (input1 < input2)? 1:0;    //new function
    `MULT    : result_temp = input1 * input2;           //new function
    `DIV     : result_temp = input1 / input2;           //new function
    default  : result_temp = input1;
  endcase
Figure 3.2 32-Bit Alu Module
The ALU performs the arithmetic and logical operations of the microprocessor, selecting the
operation according to the function codes defined in opcodes.v. The ALU's 16 functions are
addition, subtraction, AND, OR, XOR, XNOR, NOT, modulus, logical left and right shift,
arithmetic left and right shift, multiplication, division, and selection of the 1st or 2nd
input. The inputs and output are 32 bits wide. The new functions are labelled with the
"//new function" comment (shown in green).
An additional status flag, the overflow flag, is also designed into the ALU module.
The calculation is made in a 64-bit buffer register. When the result is larger than 2^32, the
overflow logic is triggered. Note that, for troubleshooting purposes, the members set the
result to return the remainder value if the resultant calculation is less than 2^32. In this
case, the function of the carry-over bit is simulated. If the resultant calculation is bigger
than 2^33, the result returns the 2^32 value. In practice, this ALU implementation is wrong.
ALU modules are designed bit by bit with an inbuilt carry-over bit, as in a ripple-carry
adder. The overflow bit is actually the carry-over bit of the MSB. This design serves as a
proof of concept.
always @(result_temp)
begin
  if((result_temp > 32'hFFFFFFFF) || (result_temp == 32'hFFFFFFFF))//2^32
  begin
    Overflow = 1;
    result = result_temp-64'hF00000000;
    if (result_temp > 32'hFFFFFFFF) result = 32'hFFFFFFFF;
  end
  else
  begin
    Overflow = 0;
    result=result_temp;
  end
end
The overflow logic notifies the control module, which sets the overflow flag if the setbit is
selected.
Figure 3.2 32-Bit Alu Module
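For comparison with the proof-of-concept logic above, the following is a minimal sketch of the more conventional approach the report alludes to, in which the overflow indication of an unsigned addition is simply the carry out of the most significant bit. The module and signal names (add32_carry, carry_out) are hypothetical and are not taken from the project code, and the ANSI-style port list assumes a Verilog-2001 simulator.

```verilog
// Sketch only: unsigned 32-bit add whose overflow flag is the carry out of bit 31.
module add32_carry (
  input  wire [31:0] a,
  input  wire [31:0] b,
  output wire [31:0] sum,
  output wire        carry_out
);
  // The 33-bit left-hand side makes the addition 33 bits wide,
  // so the extra carry bit is captured instead of being discarded.
  assign {carry_out, sum} = a + b;
endmodule

module add32_carry_tb;
  reg  [31:0] a, b;
  wire [31:0] sum;
  wire        carry_out;

  add32_carry dut (.a(a), .b(b), .sum(sum), .carry_out(carry_out));

  initial begin
    a = 32'hFFFF_FFFF; b = 32'h0000_0001;   // wraps around: carry expected
    #1 $display("sum=%h carry_out=%b", sum, carry_out);
    a = 32'h0000_0005; b = 32'h0000_0001;   // fits in 32 bits: no carry
    #1 $display("sum=%h carry_out=%b", sum, carry_out);
    $finish;
  end
endmodule
```

This covers addition only; multiplication would still need a wider product register, as in the report's 64-bit result_temp.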
3.3 32-Bit Control Module and register R4, R5, R6, R7
The control module executes operations in the proper sequence by means of the controller state
machine (CSM) shown in Figure 3.3. The control module generates the control signals, shown in
Figure 2.1, that cause each instruction to be executed. In the original design [1], the
instruction register is a 16-bit register with IR[15:9] reserved for opcodes, IR[8:6] for the
destination register (Rd), IR[5:3] for Rs1 and IR[2:0] for Rs2.
The CSM has three states, Fetch1 (00), Fetch2 (11) and Execute (01), which are coded
using Gray code. The controller state machine is based on the Mealy machine, as described in
reference [1]. Details of the state transitions are shown in the state diagram in the figure
below.
In addition, it has four memory-cycle sub-states: address_setup (00), address_hold
(01), data_setup (11) and data_hold (10). The data_hold sub-state of the memory cycle and the
2-bit mode field of the instruction are used to distinguish the transitions from one state to
another.
In the figure, TRUE or FALSE represents the presence of data_hold in the sub-state
cycle, the 2-bit values (00, 01, 11, 10) represent the possible values of the mode bits, and XX
denotes a don't-care condition.
Figure 3.3 : Controller State Machine State Diagram

The following code is used to set the 32-bit IR in control module.
assign Opcode  = IR[31:24];
assign ModeBit = IR[31:30];
assign Rd      = IR[23:16];
assign Rs1     = IR[15:8];
assign Rs2     = IR[7:0];
assign ALUfunc = IR[27:24];
assign setbit  = IR[29];
assign testbit = IR[28];
Figure 3.4 set the 32-bit IR in control module
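To make the field layout concrete, the sketch below decodes one instruction word using the same bit slices. The word is the MULT instruction used later in the report, {4'b0010, `MULT, `R7, `R1, `R2}, with the macro values from opcodes.v written out explicitly; the module name ir_decode_demo is hypothetical.

```verilog
// Sketch only: splits a 32-bit instruction word into the fields used by control.v.
module ir_decode_demo;
  reg  [31:0] IR;
  wire [7:0]  Opcode  = IR[31:24];
  wire [1:0]  ModeBit = IR[31:30];
  wire        setbit  = IR[29];
  wire        testbit = IR[28];
  wire [3:0]  ALUfunc = IR[27:24];
  wire [7:0]  Rd      = IR[23:16];
  wire [7:0]  Rs1     = IR[15:8];
  wire [7:0]  Rs2     = IR[7:0];

  initial begin
    // prefix 0010, MULT = 4'd8, R7 = 8'b00111111, R1 = 8'b00000001, R2 = 8'b00000010
    IR = {4'b0010, 4'd8, 8'b00111111, 8'b00000001, 8'b00000010};
    #1;
    $display("IR=%h Opcode=%h ModeBit=%b setbit=%b testbit=%b ALUfunc=%0d",
             IR, Opcode, ModeBit, setbit, testbit, ALUfunc);
    $display("Rd=%h Rs1=%h Rs2=%h", Rd, Rs1, Rs2);
    $finish;
  end
endmodule
```

For this word the prefix 0010 gives ModeBit = 00 with setbit = 1 and testbit = 0, which is why the "rcc" style instructions update the condition flags.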
Since the register address field is extended to 8 bits, up to 2^8 registers can be addressed.
Four new registers are designed as a proof of concept to demonstrate the benefit of the 32-bit
architecture. The following code adds a register in the control module.
input ReadR4_1;
wire ReadR4_1;
assign WriteR4 = ( En_wrt_dec && ( Rd == `R4)) ? 1 : 0;
assign ReadR4_1 = ( En_read_dec && Rs1 == `R4) ? 1 : 0;
Figure 3.5 add a register in control module.

The overflow logic notifies the control module, which sets the overflow flag if the setbit is
selected. The following condition is set in Verilog by the previous author [1].
always @ (posedge Clock)
begin
  if ((((state == `Execute && ModeBit == 2'b01) && sub_state == `data_hold) || (state
  == `Fetch1 && ModeBit == 2'b00 && sub_state == `data_setup)) && setbit == 1'b1)
    zero_flag_reg <= Zero;
  overflow_flag_reg <= Overflow;
end
Figure 3.6 Verilog by previous author
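One detail of the listing above is worth noting: in Verilog, an if without begin ... end governs only the single statement that follows it, so overflow_flag_reg is updated on every rising clock edge, while only zero_flag_reg is gated by the condition. If the intent is for both flags to update only when the condition holds (an assumption, since the report does not say), the two assignments can be grouped. The sketch below uses a hypothetical one-bit input named update as a stand-in for the long state/ModeBit/setbit condition, and the module name flag_regs is also hypothetical.

```verilog
// Sketch only: both flag registers are gated by the same condition.
module flag_regs (
  input  wire Clock,
  input  wire update,     // hypothetical stand-in for the state/ModeBit/setbit condition
  input  wire Zero,
  input  wire Overflow,
  output reg  zero_flag_reg,
  output reg  overflow_flag_reg
);
  always @(posedge Clock) begin
    if (update) begin
      // begin ... end makes the if cover both assignments
      zero_flag_reg     <= Zero;
      overflow_flag_reg <= Overflow;
    end
  end
endmodule
```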
Figure 3.7 32-Bit Instruction Register used in the CPU
3.4 32-Bit Datapath Module and register R4, R5, R6, R7

The datapath consists of functional units such as the ALU and performs the data processing
operations. The datapath stores data in the registers or the main memory. The datapath
module models the read and write operations, using Rs1 and Rs2, on the registers available in
the microprocessor. The datapath contains tri-state buffers (TrisALU, TrisPC, TrisRs2, TrisRd,
nTrisRd) that control the flow of data in the module. Our datapath module uses 8 registers:
PC, R1, R2, R3, R4, R5, R6 and R7. Each extra register is created by assigning a new code in
the IR instruction set. The selected register is then assigned its control signals in the
control unit, and a further assignment is made in the datapath module. The code below shows
the assignment for the new register.
assign Rs1 = (ReadR4_1) ? R4 : 32'bz;
Figure 3.8 assignment of the new register
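The idea behind this assignment is that several registers share the Rs1 wire, and each one drives it only when its read-enable is asserted, leaving it at high impedance otherwise. A minimal sketch with two registers is shown below; the module name rs1_bus_demo and the test values are hypothetical, while the signal names follow the report's naming pattern.

```verilog
// Sketch only: two registers sharing one tri-state bus, one driver enabled at a time.
module rs1_bus_demo;
  reg  [31:0] R1, R4;
  reg         ReadR1_1, ReadR4_1;
  wire [31:0] Rs1;

  // Each continuous assignment releases the bus (z) unless its enable is set.
  assign Rs1 = ReadR1_1 ? R1 : 32'bz;
  assign Rs1 = ReadR4_1 ? R4 : 32'bz;

  initial begin
    R1 = 32'h0000_0005; R4 = 32'h0000_00A4;
    ReadR1_1 = 1; ReadR4_1 = 0;
    #1 $display("Rs1 = %h (R1 selected)", Rs1);
    ReadR1_1 = 0; ReadR4_1 = 1;
    #1 $display("Rs1 = %h (R4 selected)", Rs1);
    $finish;
  end
endmodule
```

If both enables were asserted with different data, the bus would resolve to unknown (x) on the conflicting bits, which is why the control module must enable exactly one reader per source bus.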

3.5 32-bit ROM Module and 4x10^9 Addressable Memory Addresses
Read-only memory (ROM) functions as permanent (non-volatile) storage for microprogramming,
library subroutines, system programs (BIOS), function tables and embedded system code.
The output of the ROM module is a 16-bit data bus, data_bus; the input, an 8-bit address
bus, Address_bus, determines which of the data stored in the ROM is selected and output.
As stated before, the ROM address limit is also increased, to 2^32, which is about 4x10^9.
Again, to keep the project manageable, the maximum ROM address is set to only 1024.
reg [31:0] Data_stored [1024:0];
Figure 3.9 Data Stored
TrisMem controls the data transfer from the ROM to the Sysbus.
assign Sysbus = (TrisMem) ? Data_in: 32'bz;
Figure 3.10 TrisMem
Figure 3.11 The ROM block diagram and ROM implementation

A better memory management unit could be implemented instead of feeding the data directly
from the ROM module to the CPU module. However, it was not implemented, as the members
had trouble with the tri-state (TRIS) timing when reading the ROM data. The approach the group
chose is to select the data address in the ROM module manually and transfer the data to the
CPU module via the testbench.
TestBench
#480 Address_bus=0;
#480 Address_bus=1;
ROM Module
Data_stored[0] = {`LD, `R1, `R1, `R1};
Data_stored[1] = 32'hFFFF0000;
Figure 3.12 select the data address from the ROM
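The arrangement described in this section can be sketched as a small ROM whose word is placed on the shared bus only while TrisMem is asserted. This is an illustrative sketch under stated assumptions, not the project's ROM: the module name rom_sketch, the 10-bit address width, the direct indexing of Data_stored (the report routes the word through Data_in) and the placeholder word at address 0 are all assumptions.

```verilog
// Sketch only: a small ROM driving a shared bus through a tri-state buffer.
module rom_sketch (
  input  wire [9:0]  Address_bus,   // 10-bit address is an assumption (1024 words)
  input  wire        TrisMem,
  output wire [31:0] Sysbus
);
  reg [31:0] Data_stored [0:1023];

  initial begin
    Data_stored[0] = 32'h0000_0000;  // placeholder word, not a real instruction
    Data_stored[1] = 32'hFFFF_0000;  // same data word as in the report's example
  end

  // Drive the bus only when TrisMem is asserted; otherwise leave it floating.
  assign Sysbus = TrisMem ? Data_stored[Address_bus] : 32'bz;
endmodule

module rom_sketch_tb;
  reg  [9:0]  Address_bus;
  reg         TrisMem;
  wire [31:0] Sysbus;

  rom_sketch rom (.Address_bus(Address_bus), .TrisMem(TrisMem), .Sysbus(Sysbus));

  initial begin
    Address_bus = 0; TrisMem = 0;
    #1 $display("TrisMem=0: Sysbus=%h", Sysbus);
    Address_bus = 1; TrisMem = 1;
    #1 $display("TrisMem=1: Sysbus=%h", Sysbus);
    $finish;
  end
endmodule
```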
4.0 RESULTS AND DISCUSSIONS
4.1 Verification of Waveform
For this project, a 32-bit word size is implemented in the processor architecture. By changing
the number of bits from 16 to 32, more ALU functions and registers can be used. For a 32-bit
ROM, the maximum number of addresses is increased from 65x10^3 (2^16) to 4x10^9 (2^32).
Load and Store Operation
For the verification, the data 00FFFFFF (hex) and 0000FFFF (hex) are written to register1 and
register2 respectively. The following code is set in the ROM. The testbench then transfers the
data from the ROM into the CPU core using predefined timing.
//R1 <- 32'h00FFFFFF
Data_stored[1] = 32'h00FFFFFF;
//R2 <- 32'h0000FFFF
//
Data_stored[3] = 32'h0000FFFF;
Figure 4.1 Load and Store Operation

The waveform of the timing result is displayed below. The times at which the registers change
value are highlighted.
Figure 4.2 Loading data 00FFFFFF (hex) and 0000FFFF (hex) into register1 and register2
respectively.
Register-Register Operation - Multiplication and Division

At address 12, a MULTIPLICATION instruction, register1 * register2, is performed and the
result is written to register7. Subsequently, the instruction at address 13 performs a
DIVISION and the result is written into register3. The division of 00FFFFFF hex
(16777215 dec) by 0000FFFF hex (65535 dec) gives 00000100 hex (256 dec). The ALU is shown
performing the operations; a 32-bit ALU can represent a maximum of 2^32 values.
Data_stored[12] = {4'b0010,`MULT, `R7, `R1, `R2};

Data_stored[13] = {4'b0010,`DIV, `R3, `R1, `R2};
Figure 4.3 Loading data 00FFFFFF(hex) and 0000FFFF(hex) into register1 and
register2 respectively.
For the MULTIPLICATION operation, the result is larger than 2^32, so it triggers the overflow
flag. When the overflow flag is triggered and the result is larger than 2^33, the result is
automatically registered as 2^32 as an indicator for the user. In real operation, the overflow
flag should trigger immediate program termination via an interrupt. Register7 is used in this
case to test the implementation.
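The arithmetic behind this test case can be checked with a short stand-alone simulation. The sketch below is not part of the project code (the module name muldiv_check and the register names are hypothetical); it computes the full 64-bit product and the integer quotient and remainder of the two operands used above. The exact product is 0xFFFEFF0001 (1,099,494,785,025 in decimal), which is well above 2^33, and the integer division gives 256 with a remainder of 255.

```verilog
// Sketch only: checks the multiplication and division operands used in the report.
module muldiv_check;
  reg [31:0] r1, r2;
  reg [63:0] product;
  reg [31:0] quotient, remainder;

  initial begin
    r1 = 32'h00FF_FFFF;
    r2 = 32'h0000_FFFF;
    product   = r1 * r2;   // operands are widened to 64 bits by the assignment context
    quotient  = r1 / r2;   // integer division discards the fractional part
    remainder = r1 % r2;
    $display("product   = %h", product);
    $display("exceeds 32 bits: %b", product > 64'h0000_0000_FFFF_FFFF);
    $display("quotient  = %0d, remainder = %0d", quotient, remainder);
    $finish;
  end
endmodule
```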
Register-Register Operation - Equal, Greater, Smaller
At address 14, an EQUAL comparison is made between the data in register1 and register2.
The data are not equal; therefore the result is zero, and it is written into register3.
Addresses 15 and 16 perform the GREATER and SMALLER operations respectively. The results are
1 and 0 respectively, input1 being greater than input2. All the results are written into
register3. The following code is used to perform the operations.
Data_stored[14] = {4'b0010,`EQL, `R3, `R1, `R2};

Data_stored[15] = {4'b0010,`GREATER, `R3, `R1, `R2};
Data_stored[16] = {4'b0010,`SMALLER, `R3, `R1, `R2};
Figure 4.4 Register-Register Operation

At address 14, the EQUAL operation returns 0 (false): the data in register1 and register2 are
unequal, so the result is correct. The zero flag is also set, as the result is zero. At
address 15, the GREATER operation returns 1 (true), as the data in register1 is larger than
the data in register2. Consequently, the zero flag is cleared. At address 16, the SMALLER
operation returns 0 (false) and the zero flag is set again.
Figure 4.5 Register-Register Operation

Summary
Logic comparators are essential for implementing algorithms, since if-else, for and while
logic is based on them. The multiplication and division, on the other hand, are rather
unrealistic, as each operation completes within a single cycle. Real multiplication and
division are multi-step instructions built from conditional binary shifts. Although this
implementation is wrong, it serves as a proof of concept. The overflow flag and zero flag are
also shown to work in the example.
4.2 Problems Encountered During The Design
1) inout, wire, assign declaration

Initially, the Verilog declarations were confusing and intimidating. The declarations for
inputs and outputs between connected modules therefore needed to be clarified and understood.
**Error: Declaration for 'Sysbus_wire' incompatible with earlier vectored declaration at
C:/Users/Windows 7/Documents/New folder/cpu.v(16).
This error is due to a mismatch between an input declared with reg and an output declared
with wire.
**Error: (vlog-2110) Illegal reference to net "result"
** Error: Port mode is incompatible with declaration: input1
** Error: LHS in procedural continuous assignment may not be a net: Overflow.
The errors above result when a wire is declared as a reg, or vice versa.
2) Register-Register Addition Operation

Half of the time was spent troubleshooting the code, trying to understand why an operation
such as adding register1 and register2 and putting the result into register2 did not work as
expected. For troubleshooting, the waveforms of the wires Rs1, Rs2 and Rd were extracted from
the datapath module. By carefully monitoring the waveforms with constant reference to the
code, the Verilog code was painstakingly analyzed. It was later found that TrisALU needs to
be enabled for the result to be written into a register. The condition for activating TrisALU
is given below.
assign TrisALU = ((( sub_state == `address_setup ) || (sub_state == `address_hold)) &&
( state == `Execute ) && (ModeBit == 2'b11 || ModeBit == 2'b10));
R2 <- R1 SUBr R2
In the following waveform, an instruction was given to subtract register2 from register1 and
put the result in register3. The values 5 and 1 had been loaded into register1 and register2
respectively. The subtraction result is 4, which is correct. However, the result was not
written into register3.
Data_stored[5] = {`SUBr, `R3, `R1, `R2};
Data_stored[6] = {`ADDr, `R3, `R1, `R2};
Result is correct and is passed to the Rd wire but is not written to the register.
Figure 4.6 Timing diagram of register and register operation
The clock timing was later identified as the problem, by increasing the delay between
subsequent instructions to 360 units. A load or store instruction requires 3 memory cycles to
complete. Since 1 clock cycle is set to 10 units in the code, 3 memory cycles are equivalent
to 36 clock cycles, or 360 units.
3) Understanding the Clock-Cycle
In this program the pulse is set to #5 units, so 1 clock cycle (consisting of a rising and a
falling edge) is #10 units, where 1 unit is 100 ps. A state has 4 sub-states, and each
sub-state takes 3 clock cycles, so a state takes at least 12 clock cycles (1 memory cycle), or
12000 ps, or 120 units. Some instructions take 2 states to complete, for which 240 units are
necessary. This information is important for the system to work. Although it was given at the
beginning of the lesson, it was understood very late, and a lot of time was wasted on
senseless trial-and-error troubleshooting. Unfortunately, most of the members were
overwhelmed by the information presented at the beginning of the lesson and had a hard time
discerning what was essential. The members' lack of initiative and preparation hindered
progress as well.
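The timing relationship described here (a #5 half period, a 10-unit clock cycle and a 120-unit memory cycle) can be checked with a very small testbench. The sketch below is an illustration only; the module name clock_count_demo and the rising_edges counter are hypothetical, while the timescale matches the one used in alu.v. It counts the rising clock edges seen in the first 120 time units, which should correspond to the 12 clock cycles of one memory cycle.

```verilog
`timescale 100ps / 10ps
// Sketch only: counts rising clock edges within one 120-unit memory cycle.
module clock_count_demo;
  reg     Clock;
  integer rising_edges;

  initial begin
    Clock = 0;
    rising_edges = 0;
  end

  // Half period of 5 units, so one full clock cycle is 10 units (1000 ps).
  always #5 Clock = ~Clock;

  always @(posedge Clock) rising_edges = rising_edges + 1;

  initial begin
    #120;   // one state: 4 sub-states x 3 clock cycles x 10 units
    $display("rising edges in 120 units: %0d", rising_edges);
    $finish;
  end
endmodule
```

Spacing testbench stimuli by multiples of 120 units (for example 240 for two-state instructions, or 360 for load and store) keeps each change aligned with a whole number of memory cycles.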
4) No Proper Coding Documentation
Although the paper presents extensive information and block diagrams, the lack of code
documentation makes it very hard to relate the concepts to the code, especially when the code
is given in parts. The full picture and the usage of each module were understood only at a
very late stage. Even then, it is very difficult to understand why the author chose a
particular coding. For example, in the opcodes module it is difficult to see what the assigned
binary codes are used for. There is no explanation of what ADDr, ADDi, ADDrcc and so on are,
and no proper documentation on how to use them. Secondly, there is no explanation of how the
author set the macro opcodes in relation to the control module. Third, most of the explanation
given is based on the paper, which helps little, as most of the members struggled to implement
their ideas in Verilog because of their limited understanding of Verilog coding.
4.3 Drawbacks Of The 32-Bit ALU Design
1) Complex Architecture Implementation

Electronic implementation may be very difficult to achieve, as sophisticated equipment is
necessary for fabrication. High processing capability is also essential to make use of 32-bit
data processing. As a result, the cost of fabrication is very high and may be out of reach
for most people. A detailed structured design is also necessary to fully utilize the 32-bit
processing capability. Extensive research and time would need to be invested, which is out of
the scope of this project.
2) Used for High-end Application
A 32-bit processor is usually used in processing-intensive devices. For this project, an
implementation of a 32-bit processor therefore offers no real benefit, as there is most likely
no application within the scope of this study that requires such processing capability. The
implementation is, however, useful for students in understanding and learning how a processor
can be designed and simulated in a Verilog environment. It also helps students relate design
and implementation to real-world engineering.
REFERENCES
[1] Ismail Saad, P. V. (2004). HDL-BASED DESIGN METHODOLOGY OF 16-bit RISC
MICROPROCESSOR. 3rd International Conference on Advanced Manufacturing
Technology (ICAMT 2004). Kuala Lumpur, Malaysia.
[2] CPU designers have used a variety of names for the arithmetic logic unit, including
"ALU", "integer execution unit", and "E-box". Paul V. Bolotoff. "Functional Principles of
Cache Memory" 2007.
[3] Andrei-Sorin Gheorghe, C. B. (2010). GENERAL PURPOSE MICROPROCESSOR. SAVAGE16
16-BIT RISC ARCHITECTURE, 1.
[4] Bhaaskaran, V. S. (2011). 16-Bit RISC Processor Design for Convolution Application. 1.
[5] J., H. (1998). Computer architecture and organisation. Mcgraw Hill.
[6] Kui YI, Y.-H. D. (2009). 32-bit RISC based on MIPS. instruction fetch module design, 1.
[7] Liu, Y. (2012). HISC: A computer architecture using operand descriptor. 1.
[8] Rahman, A. B. (August 2004). HDL-BASED DESIGN METHODOLOGY OF 16-bit RISC
MICROPROCESSOR. A NEW 16-BITS RISC PROCESSOR ARCHITECTURE: CONTROLLER
STATE MACHINES AND FUNCTIONAL VERIFICATION USING VERILOG HDL, 12.
[9] Raj Kumar Singh Parihar, S. R. (2006). REPORT ON DESIGN OF 16 BIT RISC
PROCESSOR.
[10] Repak, M. (2012). A Simple Simulator for a Basic microprocessor.
[11] Sawitzki, S. (2012). Processor design using a functional hardware description language.
1.
[12] Xiao Tiejun, L. F. (2008). 16-Bit Teaching Microprocessor Design and Application. 1.
APPENDIX: CODING
opcodes.v
//ALU functions
`define ADD     4'd0
`define SUB     4'd4
`define AND     4'd1
`define OR      4'd2
`define XOR     4'd3
`define NOT     4'd5
`define SRA     4'd6
`define SLA     4'd7
`define MULT    4'd8   //multiply
`define DIV     4'd9   //divide
`define EQL     4'd10  //equal
`define GREATER 4'd11  //greater
`define SMALLER 4'd12  //smaller
`define MOD     4'd13  //modulus
//Registers
`define PC 8'b00000100 //program counter
`define DR 8'b00000101
`define IR 8'b00000110 //Instruction Register
`define R0 8'b00000000
`define R1 8'b00000001
`define R2 8'b00000010
`define R3 8'b00000011
`define R4 8'b00000111
`define R5 8'b00001111
`define R6 8'b00011111
`define R7 8'b00111111
//Macro Opcodes
`define ADDr   {4'b0000, `ADD}
`define ADDi   {4'b0100, `ADD}
`define ADDrcc {4'b0010, `ADD}
`define ADDicc {4'b0110, `ADD}
`define CADDrZ {4'b0001, `ADD}
`define CADDiZ {4'b0101, `ADD}
`define SUBr   {4'b0000, `SUB}
`define SUBi   {4'b0100, `SUB}
`define SUBrcc {4'b0010, `SUB}
`define SUBicc {4'b0110, `SUB}
`define CSUBrZ {4'b0001, `SUB}
`define CSUBiZ {4'b0101, `SUB}
`define ANDr   {4'b0000, `AND}
`define ANDi   {4'b0100, `AND}
`define ANDrcc {4'b0010, `AND}
`define ANDicc {4'b0110, `AND}
`define CANDrZ {4'b0001, `AND}
`define CANDiZ {4'b0101, `AND}
`define ORr    {4'b0000, `OR}
`define ORi    {4'b0100, `OR}
`define ORrcc  {4'b0010, `OR}
`define ORicc  {4'b0110, `OR}
`define CORrZ  {4'b0001, `OR}
`define CORiZ  {4'b0101, `OR}
`define XORr   {4'b0000, `XOR}
`define XORi   {4'b0100, `XOR}
`define XORrcc {4'b0010, `XOR}
`define XORicc {4'b0110, `XOR}
`define CXORrZ {4'b0001, `XOR}
`define CXORiZ {4'b0101, `XOR}
`define NOTr   {4'b0000, `NOT}
`define NOTrcc {4'b0010, `NOT}
`define CNOTrZ {4'b0001, `NOT}
`define SRAr   {4'b0000, `SRA}
//add R1 with R0
`define LD     {8'b10000000}
`define ST     {8'b11000000}
// The following line indicates that a file "monitor.v" exists and contains
// custom monitoring information
//`define special_monitor
// The following line indicates that a file "stimulus.v" exists and contains
// custom stimulus information
// (not required for simple simulations)
//
//`define special_stimulus
// The following code specifies the default value of the input switches
// for the simulation if not already defined from the command line
//
`ifdef switch_value
// already defined - do nothing
`else
`define switch_value 7
`endif
alu.v
`include "opcodes.v"
`timescale 100ps / 10ps
module alu( Zero, Overflow, Underflow, result , input1, input2, Function);
output [31:0] result;
output Zero, Overflow, Underflow;
input [31:0] input1, input2;
input [3:0] Function;

wire Zero;
reg Overflow, Underflow;
reg [31:0] result;
reg [63:0] result_temp;
wire [31:0] input1, input2;
assign Zero = (result_temp == 0);
always @(result_temp)
begin
if((result_temp > 32'hFFFFFFFF) || (result_temp == 32'hFFFFFFFF))//2^32
begin
Overflow = 1;
result = result_temp-64'hF00000000;
if (result_temp > 32'hFFFFFFFF) result = 32'hFFFFFFFF;
end
else
begin
Overflow = 0;
result=result_temp;
end
end
always @(input1 or input2 or Function)
  case (Function)
    `ADD     : result_temp = input1 + input2;
    `SUB     : result_temp = input1 - input2;
    `AND     : result_temp = input1 & input2;
    `OR      : result_temp = input1 | input2;
    `XOR     : result_temp = input1 ^ input2;
    `NOT     : result_temp = ~input1;
    `SRA     : result_temp = input1 >> 1;
    `SLA     : result_temp = input1 << 1;
    `MOD     : result_temp = input1 % input2;           //new function
    `EQL     : result_temp = (input1==input2)? 1:0;     //new function
    `GREATER : result_temp = (input1 > input2)? 1:0;    //new function
    `SMALLER : result_temp = (input1 < input2)? 1:0;    //new function
    `MULT    : result_temp = input1 * input2;           //new function
    `DIV     : result_temp = input1 / input2;           //new function
    default  : result_temp = input1;
  endcase
endmodule
control.v
`define Fetch1 0
`define Execute 1
`define Fetch2 3
`define address_setup 0
`define address_hold 1
`define data_setup 3
`define data_hold 2

module control( Sysbus, nOE, RnW, nME, nALE, Zero, Overflow, Underflow, Function,
Clock,
nReset,TrisPC, TrisALU, ENB, TrisMem, TrisRs2, TrisRd, nTrisRd, ReadPC_1,
ReadPC_2,ReadR0_1,
ReadR0_2, ReadR1_1, ReadR1_2, ReadR2_1, ReadR2_2,ReadR3_1,ReadR3_2, ReadR4_1, ReadR4_2,
ReadR5_1, ReadR5_2, ReadR6_1, ReadR6_2, ReadR7_1, ReadR7_2, WriteR1, WriteR2, WriteR3,
WriteR4, WriteR5, WriteR6, WriteR7, PC_inc, Rs2_sel, LoadDR, LoadPC, sub_state_wire,
state_wire, Rs2_sel_wire, zero_flag_reg_wire, overflow_flag_reg_wire,
underflow_flag_reg_wire);
inout [31:0] Sysbus;

input Zero, Overflow, Underflow, Clock, nReset;
output [3:0] Function;
output TrisPC, TrisALU, ENB, TrisMem, TrisRs2, TrisRd, nTrisRd, nME, nALE,RnW, nOE;
output ReadPC_1, ReadPC_2, ReadR0_1, ReadR0_2, ReadR1_1, ReadR1_2,
ReadR2_1,ReadR2_2,ReadR3_1,ReadR3_2;
output WriteR1, WriteR2,WriteR3, Rs2_sel, PC_inc, LoadDR, LoadPC;
output [1:0] sub_state_wire, state_wire;
input ReadR4_1, ReadR4_2, ReadR5_1, ReadR5_2, ReadR6_1, ReadR6_2, ReadR7_1, ReadR7_2;
input WriteR4, WriteR5, WriteR6, WriteR7;
output Rs2_sel_wire, zero_flag_reg_wire, overflow_flag_reg_wire,
underflow_flag_reg_wire;
reg [1:0] state;
reg [1:0] sub_state;
reg [31:0] IR;
reg zero_flag_reg, overflow_flag_reg, underflow_flag_reg;
wire [7:0] Rd;

wire [7:0] Rs1;
wire [7:0] Rs2;
wire [7:0] Opcode;
wire [1:0] ModeBit;
wire [3:0] ALUfunc;
wire setbit,testbit;
wire TrisPC, TrisALU, TrisRs2, Rs2_sel, PC_inc, TrisMem, LoadDR;
wire memory_write;
wire ReadPC_1, ReadPC_2, ReadR0_1, ReadR0_2, ReadR1_1, ReadR1_2, ReadR2_1,ReadR2_2,
ReadR3_1, ReadR3_2;
wire WriteR1, WriteR2,WriteR3, Mux1_out;
wire ReadR4_1, ReadR4_2, ReadR5_1, ReadR5_2, ReadR6_1, ReadR6_2, ReadR7_1, ReadR7_2;
wire WriteR4, WriteR5, WriteR6, WriteR7;
wire En_wrt_dec, En_read_dec;
wire LoadPC, WritePC;
assign state_wire = state;
assign sub_state_wire = sub_state;
assign Rs2_sel_wire = Rs2_sel;
assign zero_flag_reg_wire = zero_flag_reg;
assign overflow_flag_reg_wire = overflow_flag_reg;
//
// Divide instruction into Opcode, Rd, Rs1, Rs2 and Immediate
assign Opcode = IR[31:24];
assign ModeBit = IR[31:30];
assign Rd = IR[23:16];
assign Rs1 = IR[15:8];
assign Rs2 = IR[7:0];
assign ALUfunc = IR[27:24];
assign setbit = IR[29];
assign testbit = IR[28];
//
// Identify memory write and generate memory control signals
//
assign memory_write = (Opcode == `ST) && (state == `Execute);
assign nME = ( sub_state == `address_setup ) || ( sub_state == `data_hold );
assign nALE = ( sub_state == `address_setup );
assign RnW = ( sub_state == `address_setup ) || ( sub_state == `address_hold ) ||
~memory_write;
assign nOE = ( sub_state == `address_setup ) || ( sub_state == `address_hold ) ||
memory_write;
assign ENB = ~nOE;
//
// Generate tri-state control signals for SysBus

// SysBus is driven by exactly one driver in each cycle
//
assign TrisMem = ( (sub_state == `data_setup || sub_state == `data_hold) && (state ==
`Fetch1 || state == `Fetch2 || ( state == `Execute && (ModeBit== 2'b10 || ModeBit ==
2'b01))));
assign TrisALU = ((( sub_state == `address_setup ) || (sub_state == `address_hold)) && (
state == `Execute ) && (ModeBit == 2'b11 || ModeBit ==2'b10));
assign TrisPC = (((state == `Fetch1) || ( state == `Fetch2) || ( state== `Execute &&
ModeBit == 2'b01)) && (sub_state == `address_setup ||sub_state == `address_hold ));
assign TrisRs2 = ((( sub_state == `data_setup ) || ( sub_state ==`data_hold )) &&
memory_write) && (ModeBit == 2'b11);
assign TrisRd = (( state == `Execute ) && ( sub_state == `data_setup) &&( ModeBit ==
2'b10));
assign nTrisRd = ~TrisRd;
assign LoadPC = WritePC || PC_inc;
//
// Generate control signals for datapath
//
assign En_wrt_dec = (((state == `Fetch1 && sub_state == `data_hold &&ModeBit == 2'b00)
|| (state == `Execute && sub_state == `data_setup && ModeBit == 2'b10) || (state ==
`Execute && sub_state == `data_hold && ModeBit== 2'b01 )) && ((zero_flag_reg && testbit
== 1'b1) || ~testbit));
assign WriteR1 = (En_wrt_dec && Rd == 3'b001) ? 1 : 0;
assign WriteR2 = ( En_wrt_dec && ( Rd == 3'b010)) ? 1 : 0;
assign WriteR3 = ( En_wrt_dec && ( Rd == 3'b011)) ? 1 : 0;
assign WritePC = ( En_wrt_dec && ( Rd == 3'b100)) ? 1 : 0;
assign Mux1_out = ( En_wrt_dec && ( Rd == 3'b100)) ? 1 : 0;
assign En_read_dec = ((state == `Fetch1 && ModeBit == 2'b00) || (state ==Èxecute &&
(ModeBit == 2'b01 || ModeBit == 2'b10 || ModeBit == 2'b11)) ||(state == `Fetch2 &&
(ModeBit == 2'b11 || ModeBit == 2'b10)));
assign ReadR0_1 = ( En_read_dec && Rs1 == 3'b000) ? 1 : 0;
assign ReadPC_1 = ( En_read_dec && Rs1 == 3'b100) ? 1 : 0;
assign ReadPC_2 = ( En_read_dec && Rs2 == 3'b100) ? 1 : 0;
assign Rs2_sel = ~((( sub_state == àddress_setup || sub_state ==àddress_hold ) &&
( state == Èxecute && ( ModeBit == 2'b11 || ModeBit ==2'b10))) || ( sub_state ==
`data_hold && ((state == Èxecute && ModeBit ==2'b01) ||state == `Fetch2 )));
assign PC_inc = ((sub_state == àddress_hold ) && (( state == `Fetch1 )|| ( state ==
`Fetch2 ) || (state == Èxecute && ModeBit == 2'b01 ) ));
assign LoadDR = (( state == `Fetch2 ) && ( sub_state == `data_setup)) ||(( state ==
Èxecute ) && ( sub_state == `data_setup) && (ModeBit == 2'b01));
//
// Generate ALU control
//
assign Function = ALUfunc;
//
// Conditional Instruction
always @ (posedge Clock)
begin
// Only the zero flag update is guarded by the if; the overflow and
// underflow flags are updated on every clock edge.
if ((((state == `Execute && ModeBit == 2'b01) && sub_state == `data_hold) ||
    (state == `Fetch1 && ModeBit == 2'b00 && sub_state == `data_setup)) && setbit == 1'b1)
    zero_flag_reg <= Zero;
overflow_flag_reg <= Overflow;
underflow_flag_reg <= Underflow;
end
// All instructions complete in exactly 12 clock cycles
// state is Fetch1 (R+R), Fetch2(R+I) & Execute(L/S)
// sub_state controls memory cycle
//
always @(posedge Clock)
begin
if (nReset == 0)
    state <= 32'bz;
else
    begin
    case (state)
    0: if (sub_state == `data_hold && (ModeBit == 2'b00)) state <= #20 0;
       else if (sub_state == `data_hold && ModeBit == 2'b01) state <= #20 1;
       else if (sub_state == `data_hold && ((ModeBit == 2'b10) || (ModeBit == 2'b11)))
           state <= #20 3;
       else if (ModeBit == 2'b01 || ModeBit == 2'b00) state <= #20 0;
    3: if (sub_state == `data_hold && ((ModeBit == 2'b10) || (ModeBit == 2'b11)))
           state <= #20 1;
       else if (ModeBit == 2'b10 || ModeBit == 2'b11) state <= #20 3;
    1: if (sub_state == `data_hold ) state <= #20 0;
       else state <= #20 1;
    endcase
    sub_state[0] <= #20 ~sub_state[1];
    sub_state[1] <= #20 sub_state[0];
    end
end
//
// Update IR as required
//
always @(posedge Clock)
begin
if ((state == `Fetch1) && ( sub_state == `data_setup ))
    IR <= #20 Sysbus;
end
//
// Asynchronous reset
//
//
always @(nReset)
if (!nReset)
    begin
    assign state = 0;
    assign sub_state = 0;
    assign IR = 0;
    assign zero_flag_reg = 0;
    assign overflow_flag_reg = 0;
    assign underflow_flag_reg = 0;
    end
else
    begin
    deassign state;
    deassign sub_state;
    deassign IR;
    deassign zero_flag_reg;
    deassign overflow_flag_reg;
    deassign underflow_flag_reg;
    end
endmodule
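As a quick check of the instruction format used by the control module above, the field split can be exercised on its own. The sketch below is not part of the original design files: the module name ir_decode_demo and the sample instruction word are assumptions chosen for illustration, and only the bit ranges are taken from the control module.

```verilog
// Hypothetical stand-alone demo (not part of the original design files).
// Only the bit ranges below are taken from the control module.
module ir_decode_demo;
  reg  [31:0] IR;
  wire [7:0]  Opcode  = IR[31:24];
  wire [1:0]  ModeBit = IR[31:30];
  wire        setbit  = IR[29];
  wire        testbit = IR[28];
  wire [3:0]  ALUfunc = IR[27:24];
  wire [7:0]  Rd      = IR[23:16];
  wire [7:0]  Rs1     = IR[15:8];
  wire [7:0]  Rs2     = IR[7:0];

  initial
    begin
      // Arbitrary sample word: mode 00, setbit 1, testbit 0, function 0101,
      // Rd = 3, Rs1 = 1, Rs2 = 2
      IR = {2'b00, 1'b1, 1'b0, 4'b0101, 8'd3, 8'd1, 8'd2};
      #1 $display("Opcode=%h ModeBit=%b setbit=%b testbit=%b ALUfunc=%b Rd=%0d Rs1=%0d Rs2=%0d",
                  Opcode, ModeBit, setbit, testbit, ALUfunc, Rd, Rs1, Rs2);
    end
endmodule
```

Note that ModeBit, setbit, testbit and ALUfunc all overlap the 8-bit Opcode field, so the opcode macros have to encode the mode and flag bits as well as the ALU function.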
datapath.v
module datapath( Sysbus, Zero, Overflow, Underflow, Function, TrisALU, TrisPC, TrisRs2, TrisRd,
    nTrisRd, Clock, nReset, ReadPC_1, ReadPC_2, ReadR0_1, ReadR0_2, ReadR1_1, PC_inc, ReadR1_2,
    ReadR2_1, ReadR2_2, ReadR3_1, ReadR3_2, ReadR4_1, ReadR4_2, ReadR5_1, ReadR5_2,
    ReadR6_1, ReadR6_2, ReadR7_1, ReadR7_2, PC_inc, WriteR1, WriteR2, WriteR3, WriteR4,
    WriteR5, WriteR6, WriteR7, Rs2_sel, LoadDR, LoadPC,
    PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire, R0_wire,
    DR_wire, Rd_wire, Rs1_wire, Rs2_wire, Result_wire);
inout [31:0] Sysbus;
output Zero, Overflow, Underflow;
input [3:0] Function;
input TrisALU, TrisPC, TrisRs2, TrisRd, nTrisRd, Rs2_sel,ReadPC_1, ReadPC_2;
input ReadR0_1, ReadR0_2, ReadR1_1, ReadR1_2,ReadR2_1, ReadR2_2, ReadR3_1,
ReadR3_2;
input PC_inc, WriteR1,WriteR2, WriteR3, LoadDR, LoadPC;
input ReadR4_1, ReadR4_2, ReadR5_1, ReadR5_2, ReadR6_1, ReadR6_2, ReadR7_1,
ReadR7_2;
input WriteR4, WriteR5, WriteR6, WriteR7;
input Clock, nReset;
output [31:0] PC_wire, R1_wire, R2_wire, R3_wire, R0_wire, DR_wire;
output [31:0] Rd_wire, Rs1_wire, Rs2_wire, Result_wire;
output [31:0] R4_wire, R5_wire, R6_wire, R7_wire;
reg [31:0] PC, R1, R2, R3, R4, R5, R6, R7, R0, DR;
wire Zero, Overflow, Underflow;
wire [31:0] Rd, Rs1, Rs2, result;
wire [31:0] Mux2_out, Mux1_out;
assign Rs1 = ( ReadPC_1) ? PC : 32'bz;
assign Rs1 = ( ReadR0_1) ? R0 : 32'bz;
assign Rs1 = ( ReadR1_1) ? R1 : 32'bz;
assign Rs1 = ( ReadR2_1) ? R2 : 32'bz;
assign Rs1 = ( ReadR3_1) ? R3 : 32'bz;
assign Rs1 = ( ReadR4_1) ? R4 : 32'bz;
assign Rs1 = ( ReadR5_1) ? R5 : 32'bz;
assign Rs1 = ( ReadR6_1) ? R6 : 32'bz;
assign Rs1 = ( ReadR7_1) ? R7 : 32'bz;
assign Rs2 = ( ReadPC_2) ? PC : 32'bz;
assign Rs2 = ( ReadR0_2) ? R0 : 32'bz;
assign Rs2 = ( ReadR1_2) ? R1 : 32'bz;
assign Rs2 = ( ReadR2_2) ? R2 : 32'bz;
assign Rs2 = ( ReadR3_2) ? R3 : 32'bz;
assign Rs2 = ( ReadR4_2) ? R4 : 32'bz;
assign Rs2 = ( ReadR5_2) ? R5 : 32'bz;
assign Rs2 = ( ReadR6_2) ? R6 : 32'bz;
assign Rs2 = ( ReadR7_2) ? R7 : 32'bz;
alu ALU ( alu_Zero, alu_Overflow, alu_Underflow, result, Rs1, Mux2_out, Function);

always @(posedge Clock)
begin
if (LoadPC) PC = Mux1_out;
if (WriteR1) R1 = Rd;
if (LoadDR) DR = Sysbus;
end
assign Mux2_out = ( Rs2_sel ) ? Rs2 : DR;
assign Zero = (result == 0)? 1:0;
// Note: result is a 32-bit unsigned wire, so the two comparisons below can never be true
assign Overflow = (result > 32'hFFFFFFFF)? 1:0;
assign Underflow = (result < 0)? 1:0;
assign Mux1_out = ( PC_inc ) ? PC + 1 : Rd;
assign Rd = (TrisRd) ? Sysbus : 32'bz;
assign Rd = (nTrisRd) ? result : 32'bz;
assign Sysbus = ( TrisALU ) ? result : 32'bz;
assign Sysbus = ( TrisPC ) ? PC : 32'bz;
assign Sysbus = ( TrisRs2) ? Rs2 : 32'bz;
always @(nReset)
if (!nReset)
    begin
    assign PC = 0;
    assign R1 = 0;
    assign R2 = 0;
    assign R3 = 0;
    assign R4 = 0;
    assign R5 = 0;
    assign R6 = 0;
    assign R7 = 0;
    assign DR = 0;
    assign R0 = 0;
    end
else
    begin
    deassign PC;
    deassign R1;
    deassign R2;
    deassign R3;
    deassign R4;
    deassign R5;
    deassign R6;
    deassign R7;
    deassign DR;
    end
assign PC_wire=PC;
assign R1_wire=R1;
assign R2_wire=R2;
assign R3_wire=R3;
assign R4_wire=R4;
assign R5_wire=R5;
assign R6_wire=R6;
assign R7_wire=R7;
assign DR_wire=DR;
assign R0_wire=R0;
assign Rs1_wire=Rs1;
assign Rs2_wire=Rs2;
assign Rd_wire=Rd;
assign Result_wire=result;
endmodule
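In the datapath above, Overflow is derived by comparing the 32-bit result with 32'hFFFFFFFF, a value that a 32-bit quantity cannot exceed. One common alternative for unsigned addition, sketched here as an assumption rather than as the authors' method, is to form the sum in 33 bits so that the carry out of bit 31 is preserved. The module name overflow_demo and the operand values are hypothetical.

```verilog
// Hypothetical stand-alone demo (not part of the original design files).
// Shows unsigned carry detection by widening the addition to 33 bits.
module overflow_demo;
  reg  [31:0] a, b;
  wire [32:0] sum       = {1'b0, a} + {1'b0, b}; // 33-bit sum keeps the carry
  wire        carry_out = sum[32];               // 1 when a + b needs more than 32 bits

  initial
    begin
      // Arbitrary operands chosen so that the addition wraps around
      a = 32'hFFFFFFFF;
      b = 32'h00000001;
      #1 $display("sum[31:0]=%h carry_out=%b", sum[31:0], carry_out);
    end
endmodule
```

The same widening idea would have to be applied inside the ALU itself, since the datapath only sees the 32-bit result after the carry has been discarded.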
cpu_core.v
//cpu_core.v
module cpu_core( Sysbus, Data_in, ENB, nME, nALE, RnW, nOE, SDO, Clock, nReset,
Test, SDI, PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire,
R0_wire, DR_wire, Rd_wire, Rs1_wire, Rs2_wire, Result_wire,
sub_state_wire, state_wire, Rs2_sel_wire, TrisMem_wire, Function,
zero_flag_reg_wire, overflow_flag_reg_wire, underflow_flag_reg_wire);
//
// I/O declarations
//
input [31:0] Data_in;
output ENB;
output nME, nALE, RnW, nOE, SDO;
//control memory
output[31:0] PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire;
output[31:0] Rd_wire, Rs1_wire, Rs2_wire, R0_wire, DR_wire, Result_wire;
output[3:0] Function;
output[1:0] sub_state_wire, state_wire;
output Rs2_sel_wire, TrisMem_wire;
output zero_flag_reg_wire, overflow_flag_reg_wire, underflow_flag_reg_wire;
input Clock, nReset, Test, SDI;
wire [31:0] Sysbus;
wire TrisMem;
assign TrisMem_wire = TrisMem;
/////////////////////////////////////////////////////////////////////
// Simulation of CPU core
/////////////////////////////////////////////////////////////////////
// This system doesn't simulate the scan path
//
assign SDO = SDI;
// This system has a single internal system address/data bus
// Data_in must pass through a tri-state buffer before connection to SysBus
//
assign Sysbus = (TrisMem) ? Data_in: 32'bz;
//
// This system is built from two smaller modules
//
control Control( Sysbus, nOE, RnW, nME, nALE, Zero, Overflow, Underflow, Function, Clock,
nReset,TrisPC, TrisALU, ENB, TrisMem, TrisRs2, TrisRd, nTrisRd, ReadPC_1,
ReadPC_2,ReadR0_1,
ReadR0_2, ReadR1_1, ReadR1_2, ReadR2_1, ReadR2_2,ReadR3_1,ReadR3_2, ReadR4_1,
ReadR4_2,
ReadR5_1, ReadR5_2, ReadR6_1, ReadR6_2, ReadR7_1, ReadR7_2, WriteR1, WriteR2,
WriteR3,
WriteR4, WriteR5, WriteR6, WriteR7, PC_inc, Rs2_sel, LoadDR, LoadPC, sub_state_wire,
state_wire, Rs2_sel_wire, zero_flag_reg_wire, overflow_flag_reg_wire,
underflow_flag_reg_wire);
datapath Datapath( Sysbus, Zero, Overflow, Underflow, Function, TrisALU, TrisPC, TrisRs2,
TrisRd,
nTrisRd, Clock, nReset, ReadPC_1, ReadPC_2, ReadR0_1, ReadR0_2, ReadR1_1, PC_inc,
ReadR1_2,
ReadR2_1, ReadR2_2, ReadR3_1, ReadR3_2, ReadR4_1, ReadR4_2, ReadR5_1, ReadR5_2,
ReadR6_1, ReadR6_2,
ReadR7_1, ReadR7_2, PC_inc, WriteR1, WriteR2, WriteR3, WriteR4, WriteR5, WriteR6,
WriteR7, Rs2_sel,
LoadDR, LoadPC,PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire,
R0_wire,
DR_wire, Rd_wire, Rs1_wire, Rs2_wire, Result_wire);
endmodule
rom.v
//rom module
//words 0-255
`timescale 100ps/10ps
module rom (Data_bus, Address_bus, notOE, notCE);
output [31:0] Data_bus;
input [31:0] Address_bus;
input notOE, notCE;
specify
specparam tViolate=250;
$width (negedge notCE, tViolate);
$width (posedge notCE, tViolate);
$setuphold (edge[10, 01] notCE, Address_bus, tViolate, tViolate);
$setuphold (edge[10, 01] notCE, notOE, tViolate, tViolate);
endspecify
reg [31:0] Data_stored [1024:0];
assign Data_bus = ((notOE == 0) && (notCE == 0))? Data_stored [Address_bus] : 32'bz;
initial
begin
//CODE:
// opcode(modebit; setbit; testbit;) rd, rs1, rs2
//alu rs1 with mux2(rs2 or dr)
//R1 <- 32'h0000FFFF
Data_stored[1] = 32'h0000FFFFF;
//R2 <- 32'hFFFF0000
Data_stored[3] = 32'h000000FF;
//
Data_stored[4] = {`SUBr, `R3, `R1, `R2};
Data_stored[5] = {`ADDr, `R4, `R1, `R2};
Data_stored[6] = {`ORr, `R5, `R1, `R2};
Data_stored[7] = {`ANDr, `R6, `R1, `R2};
Data_stored[8] = {`XORr, `R7, `R1, `R2};
Data_stored[9] = {`NOTr, `R3, `R1, `R1};
Data_stored[10] = {`SRAr, `R4, `R1, `R1};
//Data_stored[11] = {4'b0000,`SLA, `R3, `R1, `R1};
Data_stored[12] = {4'b0000,`MULT, `R3, `R1, `R2};
Data_stored[13] = {4'b0000,`DIV, `R4, `R1, `R2};
Data_stored[14] = {4'b0000,`EQL, `R5, `R1, `R2};
Data_stored[15] = {4'b0000,`GREATER, `R6, `R1, `R2};
Data_stored[16] = {4'b0000,`SMALLER, `R7, `R1, `R2};
end
endmodule
cpu.v
//cpu.v
module cpu( nReset, Clock, Address_bus, Sysbus, PC_wire, R1_wire, R2_wire, R3_wire,
R4_wire, R5_wire, R6_wire, R7_wire, R0_wire, DR_wire, Rd_wire, Rs1_wire, Rs2_wire,
Result_wire, sub_state_wire, state_wire, Rs2_sel_wire, TrisMem_wire , Function,
zero_flag_reg_wire, overflow_flag_reg_wire);
input nReset, Clock;
input [31:0] Address_bus;
output[31:0] PC_wire, R1_wire, R2_wire, R3_wire, R0_wire, DR_wire;
output[31:0] Rd_wire, Rs1_wire, Rs2_wire, Result_wire;
output[31:0] R4_wire, R5_wire, R6_wire, R7_wire;
output[3:0] Function;
output[1:0] sub_state_wire, state_wire;
output Rs2_sel_wire, TrisMem_wire;
output zero_flag_reg_wire, overflow_flag_reg_wire;
wire [31:0] Data_out, Data_in;
reg initial_value;
wire nOE;
cpu_core CPU_CORE( Sysbus, Data_in, ENB, nME, nALE, RnW, nOE, SDO, Clock, nReset,
Test, SDI, PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire,
R0_wire, DR_wire, Rd_wire, Rs1_wire, Rs2_wire, Result_wire,
sub_state_wire, state_wire, Rs2_sel_wire, TrisMem_wire, Function,
zero_flag_reg_wire, overflow_flag_reg_wire, underflow_flag_reg_wire);
rom ROM( Data_out, Address_bus, nOE, nCE );

assign Data_in = Data_out;
assign nOE = initial_value;
assign nCE = initial_value;
always @(nReset)
if (!nReset)
begin
assign initial_value = 0;
end
else
begin
//deassign initial_value;
initial_value=1;
end
endmodule
test_bench7_cpu.v
//Module Name: cpu_tst.v
//Test Bench for cpu model.
`timescale 100ps/10ps
module test_bench7_cpu;
//data type declarations
reg nReset, Clock;
reg [31:0] Address_bus;
wire [31:0] PC_wire, R1_wire, R2_wire, R3_wire;
wire [31:0] R4_wire, R5_wire, R6_wire, R7_wire, R0_wire, DR_wire;
wire [31:0] Rd_wire, Rs1_wire, Rs2_wire, Result_wire, Sysbus;
wire [3:0] Function;
wire [1:0] sub_state_wire, state_wire;
wire zero_flag_reg_wire, overflow_flag_reg_wire;
wire Rs2_sel_wire, TrisMem_wire;
//instance of the design

cpu CPU( nReset, Clock, Address_bus, Sysbus, PC_wire, R1_wire, R2_wire, R3_wire,
R4_wire, R5_wire, R6_wire, R7_wire, R0_wire, DR_wire, Rd_wire, Rs1_wire, Rs2_wire,
Result_wire, sub_state_wire, state_wire, Rs2_sel_wire, TrisMem_wire, Function,
zero_flag_reg_wire, overflow_flag_reg_wire);
//input stimulus
initial
begin
Address_bus = 0;
Clock = 0;
nReset = 0;
#480 nReset = 1;
#480 Address_bus=0;
#480 Address_bus=1;
#480 Address_bus=2;
#240 Address_bus=3;
#480 Address_bus=4;
#480 Address_bus=5;
#480 Address_bus=6;
#480 Address_bus=7;
#480 Address_bus=8;
#480 Address_bus=9;
#480 Address_bus=10;
#480 Address_bus=11;
#480 Address_bus=12;
#480 Address_bus=13;
#480 Address_bus=14;
#480 Address_bus=15;
#480 Address_bus=16;
#480 Address_bus=17;
#480 Address_bus=18;
#480 Address_bus=19;
#480 Address_bus=20;
#480 Address_bus=32;
#480 $stop;
//#480 $finish;
end
initial begin
forever #5 Clock = ~Clock;
end
endmodule
