BK10110096
BK10160339
BK10110269
TITLE:
1. INTRODUCTION
The computer is one of the most important inventions of the human race. The CPU (central
processing unit) is the brain of this invention, which is among the most widely used
appliances in the world today. Almost every household has a computer, and computing is a
multi-million dollar industry that is still growing. The CPU processes the information needed
by the computer, much as the brain does when we decide to move a part of the body. A CPU
performs four basic tasks: fetch, decode, manipulate and output. CPU speed is usually rated
in MHz, although the clock rate alone is not an accurate measure of performance. The CPU chip
comprises a million embedded logic gates that carry out a variety of operations. These gates
work with a clock that regulates the rate at which the CPU is fed data. The CPU is described
here in terms of five basic components: RAM, registers, buses, the ALU and the control unit.
RAM is built by combining latches with a decoder. The latches form circuitry that can
remember a value, while the decoder allows an individual memory location to be selected.
Registers are special memory locations that can be accessed very quickly. The three registers
are the instruction register, the program counter and the accumulator. Buses are the
information highway of the CPU: bundles of tiny wires that carry data between components.
The most important buses are the address bus, the data bus and the control bus. The ALU, or
arithmetic logic unit, performs all the mathematical calculations of the CPU. It is composed
of complex circuitry, which makes it a very important component. The ALU can add, subtract,
multiply and divide binary numbers, and perform many other calculations on them as well.
2. LITERATURE REVIEW
2.1 16-Bit RISC Processor Architecture: Controller State Machines And
In his book, Paul V. Bolotoff presents the general ALU function. Microprocessors tend to
have a single module that performs arithmetic operations on integer values. This is
because many of the different arithmetic and logical operations can be performed
using similar (if not identical) hardware. The component that performs the arithmetic
and logical operations is known as the Arithmetic Logic Unit, or ALU. [2]
The ALU is one of the most important components in a microprocessor, and is
typically the part of the processor that is designed first. Once the ALU is designed, the
rest of the microprocessor is implemented to feed operands and control codes to the
ALU.
Logic and addition are some of the easiest, but also the most common
operations. For this reason, typical ALUs are designed to handle these operations
specially, and other operations, such as multiplication and division, are handled in a
separate module.
Notice also that the ALUs discussed here handle only integer data types, not
floating-point data. Luckily, once integer ALU and multiplier units have
been designed, those units can be used to create floating-point units (FPU). The
following is an example of a basic 2-bit ALU. The boxes on the right-hand side of the
image are multiplexers, which select between the various operations: OR, AND,
XOR, and addition.
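The multiplexer idea can be illustrated with a small behavioural model. The sketch below is written in Python rather than Verilog so that it can be run directly; the function name alu2 and the string operation labels are assumptions made for illustration and do not come from the cited book. All four candidate results are computed, and the selector picks one of them, as a multiplexer would.

```python
def alu2(op, a, b):
    """Behavioural model of a 2-bit ALU: compute every candidate, then select one."""
    a &= 0b11  # keep only the two low bits of each operand
    b &= 0b11
    candidates = {
        "OR": a | b,
        "AND": a & b,
        "XOR": a ^ b,
        "ADD": (a + b) & 0b11,  # the carry out of bit 1 is dropped in this sketch
    }
    # The dictionary lookup plays the role of the output multiplexer.
    return candidates[op]


if __name__ == "__main__":
    for name in ("OR", "AND", "XOR", "ADD"):
        print(name, format(alu2(name, 0b11, 0b01), "02b"))
```

In hardware the four results exist simultaneously and the multiplexer only chooses which one reaches the output, which is why the model computes all of them before selecting.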
3. METHODOLOGY
A 32-bit ALU is modified based on the work by Ismail Saad et al. [1]. For the 32-bit
microprocessor design, some major modifications are made; the modification of each
module and its justification are explained in detail below.
3.1 Opcodes module
The opcodes (operation codes) module contains a list of defined functions that are used by
the microprocessor. The code itself does not perform any specific task. It only defines
the listed functions so that the program is easier to understand: an opcode can be called
by name without remembering its value, which might be a very long binary number.
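//Macro Opcodes
`define ADDr {4'b0000, `ADD}
`define ADDi {4'b0100, `ADD}
`define ADDrcc {4'b0010, `ADD}
`define ADDicc {4'b0110, `ADD}
`define CADDrZ {4'b0001, `ADD}
`define CADDiZ {4'b0101, `ADD}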
A 32-bit ALU makes it possible to implement floating-point arithmetic (which has a 32-bit
format) and to perform complex calculations within a very short cycle time: an 8-bit
processor using the floating-point format needs at least 4 cycles for data transfer, whereas
a 32-bit processor requires at least 1 cycle.
The ALU function field is expanded to 4 bits, which allows 16 functions in total. With this,
extra ALU functions are implemented.
By increasing the number of bits, the IR instruction set can be enlarged: more ALU
functions are implemented and the number of addressable registers increases to 256 (2^8).
However, in this project only 8 registers (R0 to R7) are created, to keep the project
manageable and as a proof of concept.
always @(input1 or input2 or Function)
case (Function)
`ADD : result_temp = input1 + input2;
`SUB : result_temp = input1 - input2;
`AND : result_temp = input1 & input2;
`OR : result_temp = input1 | input2;
`XOR : result_temp = input1 ^ input2;
`NOT : result_temp = ~input1;
`SRA : result_temp = input1 >> 1;
`SLA : result_temp = input1 << 1; //new function
`MOD : result_temp = input1 % input2; //new function
`EQL : result_temp = (input1==input2)? 1:0; //new function
`GREATER : result_temp = (input1 > input2)? 1:0; //new function
`SMALLER : result_temp = (input1 < input2)? 1:0; //new function
`MULT : result_temp = input1 * input2; //new function
`DIV : result_temp = input1 / input2; //new function
default : result_temp = input1;
endcase
Figure 3.2 32-Bit ALU Module
The ALU performs the arithmetic and logical operations of the microprocessor, according to
the function codes defined in opcodes.v. The 16 ALU functions are listed as addition,
subtraction, AND, OR, XOR, XNOR, NOT, modulus, logical left and right shift, arithmetic
left and right shift, multiplication, division, and selection of the 1st or 2nd input. The
inputs and the output are 32 bits wide. The new functions are labelled with green comments.
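To make the function selection easier to follow, the case statement above can be mirrored by a behavioural model. The sketch below is in Python rather than Verilog so it can be run without a simulator. The function codes are taken from opcodes.v in the appendix; the 64-bit width of result_temp, the treatment of operands as unsigned, and the name alu are assumptions made for this sketch.

```python
MASK64 = (1 << 64) - 1  # assumed width of the result_temp buffer

# Function codes as listed in opcodes.v (appendix)
ADD, AND, OR, XOR, SUB, NOT, SRA, SLA = 0, 1, 2, 3, 4, 5, 6, 7
MULT, DIV, EQL, GREATER, SMALLER, MOD = 8, 9, 10, 11, 12, 13

OPERATIONS = {
    ADD: lambda a, b: a + b,
    SUB: lambda a, b: a - b,
    AND: lambda a, b: a & b,
    OR: lambda a, b: a | b,
    XOR: lambda a, b: a ^ b,
    NOT: lambda a, b: ~a,
    SRA: lambda a, b: a >> 1,
    SLA: lambda a, b: a << 1,
    MOD: lambda a, b: a % b,
    EQL: lambda a, b: 1 if a == b else 0,
    GREATER: lambda a, b: 1 if a > b else 0,
    SMALLER: lambda a, b: 1 if a < b else 0,
    MULT: lambda a, b: a * b,
    DIV: lambda a, b: a // b,  # raises ZeroDivisionError where Verilog would give x
}


def alu(function, input1, input2):
    """Return the value of result_temp for unsigned 32-bit operands."""
    operation = OPERATIONS.get(function, lambda a, b: a)  # default: pass input1
    return operation(input1, input2) & MASK64


if __name__ == "__main__":
    print(hex(alu(DIV, 0x00FFFFFF, 0x0000FFFF)))
    print(alu(GREATER, 0x00FFFFFF, 0x0000FFFF))
```

An unknown function code falls through to the default branch and returns input1, as in the Verilog listing.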
An additional status flag, the overflow flag, is also designed in the ALU module.
The calculation is made in a 64-bit buffer register. When the result is larger than 2^32, the
overflow logic is triggered. Note that, for troubleshooting purposes, the members set the
result to return the remainder value if the resultant calculation is less than 2^32. In this
case, the function of the carry-over bit is simulated. If the resultant calculation is bigger
than 2^33, the result returns the value 2^32. In practice, this implementation of the ALU
design is wrong. ALU modules are designed bit by bit with a built-in carry bit, as in a
ripple-carry adder. The overflow bit is actually the carry-over bit of the MSB. This design
serves only as a proof of concept.
always @(result_temp)
begin
// Overflow is flagged once the buffer reaches 32'hFFFFFFFF (2^32 - 1)
if (result_temp >= 32'hFFFFFFFF)
begin
Overflow = 1;
result = result_temp-64'hF00000000;
// Any value wider than 32 bits is reported as 32'hFFFFFFFF
if (result_temp > 32'hFFFFFFFF) result = 32'hFFFFFFFF;
end
else
begin
Overflow = 0;
result=result_temp;
end
end
The overflow logic notifies the control module, which sets the overflow flag if the setbit is
selected.
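The remark that a real ALU is built bit by bit, with the carry out of the MSB acting as the overflow indication, can be illustrated with a ripple-carry adder model. The sketch below is in Python for ease of running; the function name ripple_carry_add and the width parameter are assumptions made for illustration, and operands are treated as unsigned.

```python
def ripple_carry_add(a, b, width=32):
    """Add two unsigned numbers one bit at a time, as a chain of full adders would.

    Returns (sum, carry_out), where carry_out is the carry leaving the MSB.
    """
    carry = 0
    total = 0
    for i in range(width):
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        sum_bit = a_bit ^ b_bit ^ carry                      # full-adder sum
        carry = (a_bit & b_bit) | (carry & (a_bit ^ b_bit))  # full-adder carry
        total |= sum_bit << i
    return total, carry


if __name__ == "__main__":
    result, overflow = ripple_carry_add(0xFFFFFFFF, 1)
    print(hex(result), overflow)
```

With this structure no wider buffer register is needed: the flag is simply the last carry in the chain.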
3.3 32-Bit Control Module and register R4, R5, R6, R7
The control module executes operations in the proper sequence by means of a controller state
machine (CSM), as shown in Figure 3. The control module generates the control signals shown in
Figure 2, which cause each instruction to be executed. In the 16-bit design, the instruction
register is a 16-bit register with IR[15:9] reserved for opcodes, IR[8:6] for the destination
register (Rd), IR[5:3] for Rs1 and IR[2:0] for Rs2.
The CSM has three states, Fetch1 (00), Fetch2 (11) and Execute (01), which are coded
using Gray code. The controller state machine is based on the Mealy machine, as described in
reference [1]. Details of the state transitions are shown in the state diagram in the figure
below.
In addition, it has 4 memory-cycle sub-states: address_setup (00), address_hold
(01), data_setup (11) and data_hold (10). To distinguish the transitions from one state
to another, the data_hold sub-state of the memory cycle and the 2-bit mode field of the
instruction are used.
Referring to the figure, TRUE or FALSE represents the presence of data_hold in the sub-state
cycle, the 2-bit values (00, 01, 11, 10) represent the possible values of the mode bits, and XX
denotes a don't-care condition.
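The state and sub-state sequencing can be checked with a small behavioural model. The sketch below is in Python so that it can be run directly; it follows the transition rules found in control.v in the appendix, but the function names next_sub_state and next_state are assumptions made for illustration, and the #20 delays and the reset logic are left out.

```python
FETCH1, EXECUTE, FETCH2 = 0b00, 0b01, 0b11
ADDRESS_SETUP, ADDRESS_HOLD, DATA_SETUP, DATA_HOLD = 0b00, 0b01, 0b11, 0b10


def next_sub_state(sub_state):
    """Gray-code memory cycle: new bit 1 = old bit 0, new bit 0 = NOT old bit 1."""
    bit0 = sub_state & 1
    bit1 = (sub_state >> 1) & 1
    return (bit0 << 1) | (bit1 ^ 1)


def next_state(state, sub_state, mode_bit):
    """The state may only change when the memory cycle is in data_hold."""
    if sub_state != DATA_HOLD:
        return state
    if state == FETCH1:
        if mode_bit == 0b00:
            return FETCH1   # register-register: done after one memory cycle
        if mode_bit == 0b01:
            return EXECUTE
        return FETCH2       # mode 10 or 11: fetch a second word first
    if state == FETCH2:
        return EXECUTE if mode_bit in (0b10, 0b11) else FETCH2
    return FETCH1           # Execute always returns to Fetch1


if __name__ == "__main__":
    sub_state = ADDRESS_SETUP
    for _ in range(5):
        print(format(sub_state, "02b"))
        sub_state = next_sub_state(sub_state)
```

Only one bit changes between consecutive sub-states, which is the property that the Gray coding is meant to provide.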
assign Opcode = IR[31:24];
assign ModeBit = IR[31:30];
assign Rd = IR[23:16];
assign Rs1 = IR[15:8];
assign Rs2 = IR[7:0];
assign ALUfunc = IR[27:24];
assign setbit = IR[29];
assign testbit = IR[28];
Figure 3.4 Setting the 32-bit IR fields in the control module
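As a worked example of these bit fields, the sketch below decodes one 32-bit instruction word in Python. The function name decode is an assumption made for illustration. The sample word 0x80010101 corresponds to {`LD, `R1, `R1, `R1} using the values of LD and R1 given in opcodes.v in the appendix.

```python
def decode(ir):
    """Split a 32-bit instruction word into the fields used by the control module."""
    return {
        "Opcode": (ir >> 24) & 0xFF,   # IR[31:24]
        "ModeBit": (ir >> 30) & 0x3,   # IR[31:30]
        "setbit": (ir >> 29) & 0x1,    # IR[29]
        "testbit": (ir >> 28) & 0x1,   # IR[28]
        "ALUfunc": (ir >> 24) & 0xF,   # IR[27:24]
        "Rd": (ir >> 16) & 0xFF,       # IR[23:16]
        "Rs1": (ir >> 8) & 0xFF,       # IR[15:8]
        "Rs2": ir & 0xFF,              # IR[7:0]
    }


if __name__ == "__main__":
    for name, value in decode(0x80010101).items():
        print(name, format(value, "b"))
```

Note that ModeBit, setbit, testbit and ALUfunc are all slices of the same 8-bit Opcode field, so the opcode byte carries the mode, the flag controls and the ALU function together.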
Since the register address is extended to 8 bits, up to 2^8 registers can be addressed. Four
new registers are designed as a proof of concept to demonstrate the benefit of the 32-bit
architecture. The following code adds a register in the control module.
input ReadR4_1;
wire ReadR4_1;
assign WriteR4 = ( En_wrt_dec && ( Rd == `R4)) ? 1 : 0;
assign ReadR4_1 = ( En_read_dec && Rs1 == `R4) ? 1 : 0;
assign ReadR4_2 = ( En_read_dec && Rs2 == `R4) ? 1 : 0;
3.4
operation. The datapath stores the data in the register or the main memory. The datapath
module is written to suit the definition of the designed processor and models read and write
operations using Rs1 and Rs2 from the registers available in the microprocessor. The datapath
contains tri-state buffers (TrisALU, TrisPC, TrisRs2, TrisRd, nTrisRd), which control the flow
of data in the module. Our datapath module uses 8 registers: PC, R1, R2, R3, R4, R5, R6
and R7. An extra register is made by assigning a new variable in the IR instruction set. The
selected register is then assigned its control variable in the control unit, and a further
assignment is made in the datapath module. The code below shows the assignment of
the new register.
assign Rs1 = ( ReadR4_1) ? R4 : 32'bz;
4.0
4.1 Verification of Waveform
For this project, a 32-bit word size is implemented in the processor architecture. By changing
the number of bits from 16 to 32, more ALU functions and registers can be used. For a 32-bit
ROM, the maximum number of addresses is increased from about 65x10^3 (2^16) to about
4x10^9 (2^32).
Load and Store Operation
For the verification, the data 00FFFFFF (hex) and 0000FFFF (hex) are written to register1 and
register2 respectively. The following code is set in the ROM. The testbench then transfers the
data from the ROM into the CPU core using predefined timing.
//R1 <- 32'h00FFFFFF
Data_stored[0] = {`LD, `R1, `R1, `R1};
Data_stored[1] = 32'h00FFFFFF;
//R2 <- 32'h0000FFFF
Data_stored[2] = {`LD, `R2, `R2, `R2};
Data_stored[3] = 32'h0000FFFF;
The division of 00FFFFFF hex (16777215 dec) by 0000FFFF hex (65535 dec) is
00000100 hex (256 dec). The ALU is shown performing this operation, and a 32-bit ALU can
represent a maximum of 2^32 values.
Figure 4.3 Loading data 00FFFFFF(hex) and 0000FFFF(hex) into register1 and
register2 respectively.
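The ROM contents and the expected quotient can be cross-checked with a few lines of Python. This is only a sketch: the helper name pack and the dictionary used as a ROM are assumptions made for illustration, while the values of LD, R1 and R2 are those given in opcodes.v in the appendix.

```python
LD = 0b10000000            # `LD from opcodes.v
R1, R2 = 0b00000001, 0b00000010


def pack(opcode, rd, rs1, rs2):
    """Concatenate four 8-bit fields, as {opcode, rd, rs1, rs2} does in Verilog."""
    return (opcode << 24) | (rd << 16) | (rs1 << 8) | rs2


rom = {
    0: pack(LD, R1, R1, R1),
    1: 0x00FFFFFF,         # data loaded into register1
    2: pack(LD, R2, R2, R2),
    3: 0x0000FFFF,         # data loaded into register2
}

quotient = rom[1] // rom[3]    # integer division, as performed by the ALU

if __name__ == "__main__":
    for address in sorted(rom):
        print(address, format(rom[address], "08X"))
    print("quotient", format(quotient, "08X"))
```

Each load therefore occupies two ROM words: the instruction word followed by the data word.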
For the MULTIPLICATION operation, the result is larger than 2^32, so it triggers the overflow
flag. When the overflow flag is triggered and the result is larger than 2^33, the result is
automatically registered as 2^32 as an indicator for the user. In real operation, the overflow
flag should trigger immediate program termination via an interrupt. Register7 is used in this
case to test the implementation.
Register-Register Operation Equal, Greater, Smaller
At address 14, an EQUAL comparison is made between the data in register1 and register2.
The data are not equal; therefore the result is zero and is written into register3. Addresses
15 and 16 perform the GREATER and SMALLER operations respectively. The results are 1
and 0 respectively, input1 being greater than input2. All results are written into register3.
The following code is used to perform the operations.
larger than data in register2. Consequently, the zero flag is cleared. At address 15, the
SMALLER operation returns a 0 (false) value and the zero flag is triggered again.
4.2
Since 1 clock cycle is defined in the code as 10 units, 3 memory cycles are equivalent to 36
clock cycles, or 360 units.
3) Understanding the Clock-Cycle
In this program the clock toggles every #5 units, so 1 clock cycle (one rising and one
falling edge) is #10 units, where 1 unit is 100 ps. A state has 4 sub-states and each
sub-state takes 3 clock cycles, so a state takes at least 12 clock cycles (1 memory cycle), or
12000 ps, or 120 units. Some instructions take 2 states to complete, for which 240 units
are necessary. This information is essential for the system to work. Although it was
given at the beginning of the lesson, it was understood very late, and a lot of time
was wasted on trial-and-error troubleshooting. Unfortunately, most of the members were
overwhelmed by the information presented at the beginning of the lesson and had a hard time
discerning what was essential. The members' lack of initiative and preparation hindered
progress as well.
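The timing arithmetic above can be written out explicitly. The short Python sketch below only restates the figures given in the text; the constant names are assumptions made for illustration.

```python
UNIT_PS = 100                 # 1 simulation unit is 100 ps (timescale 100ps)
HALF_PERIOD_UNITS = 5         # the clock toggles every #5 units
SUB_STATES_PER_STATE = 4      # address_setup, address_hold, data_setup, data_hold
CLOCKS_PER_SUB_STATE = 3

clock_units = 2 * HALF_PERIOD_UNITS                              # 10 units per clock cycle
clocks_per_state = SUB_STATES_PER_STATE * CLOCKS_PER_SUB_STATE   # 12 clock cycles
units_per_state = clocks_per_state * clock_units                 # 120 units
ps_per_state = units_per_state * UNIT_PS                         # 12000 ps
units_two_states = 2 * units_per_state                           # 240 units
units_three_memory_cycles = 3 * units_per_state                  # 360 units

if __name__ == "__main__":
    print(clocks_per_state, units_per_state, ps_per_state)
    print(units_two_states, units_three_memory_cycles)
```

Writing the delays in the testbench as multiples of units_per_state makes it easier to see how many memory cycles each wait covers.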
4) No Proper Coding Documentation
Although the paper presents extensive information and block diagrams, the lack of code
documentation makes it very hard to relate the concepts to the code, especially when the code
is given in parts. The full picture and the usage of each module were understood only at a
very late stage. Even then, it is very difficult to understand why the author chose a
particular coding. For example, in the opcodes module it is difficult to see what the assigned
binary codes are used for. There is no explanation of what ADDr, ADDi, ADDrcc and so on are,
and no proper documentation on how to use them. Secondly, there is no explanation of how the
author sets the macro opcodes in relation to the control module. Thirdly, the majority of the
explanation given is based on the paper, which helps little, as most of the members were
struggling to implement their ideas in Verilog because of their limited understanding of
Verilog coding.
4.3
This implementation is nevertheless useful for students in understanding and learning how a
processor can be designed and simulated in a Verilog environment. It also helps students to
relate design and implementation to real-world engineering.
REFERENCES
[1]
[2] CPU designers have used a variety of names for the arithmetic logic unit, including
"ALU", "integer execution unit", and "E-box". Paul V. Bolotoff. "Functional Principles of
Cache Memory" 2007.
[3]
[4]
[5]
[6] Kui YI, Y.-H. D. (2009). 32-bit RISC based on MIPS. instruction fetch module design, 1.
[7]
[8]
[9] Raj Kumar Singh Parihar, S. R. (2006). REPORT ON DESIGN OF 16 BIT RISC
PROCESSOR.
APPENDIX: CODING
opcodes.v
//ALU functions
`define ADD 4'd0
`define SUB 4'd4
`define AND 4'd1
`define OR 4'd2
`define XOR 4'd3
`define NOT 4'd5
`define SRA 4'd6
`define SLA 4'd7
`define MULT 4'd8 //multiply
`define DIV 4'd9 //divide
`define EQL 4'd10 //equal
`define GREATER 4'd11 //greater
`define SMALLER 4'd12 //smaller
`define MOD 4'd13 //modulus
//Registers
`define PC 8'b00000100 //program counter
`define DR 8'b00000101
`define IR 8'b00000110 //Instruction Register
`define R0 8'b00000000
`define R1 8'b00000001
`define R2 8'b00000010
`define R3 8'b00000011
`define R4 8'b00000111
`define R5 8'b00001111
`define R6 8'b00011111
`define R7 8'b00111111
//Macro Opcodes
`define ADDr {4'b0000, `ADD}
`define ADDi {4'b0100, `ADD}
`define ADDrcc {4'b0010, `ADD}
`define ADDicc {4'b0110, `ADD}
`define CADDrZ {4'b0001, `ADD}
`define CADDiZ {4'b0101, `ADD}
`define SUBr {4'b0000, `SUB}
`define SUBi {4'b0100, `SUB}
`define SUBrcc {4'b0010, `SUB}
`define SUBicc {4'b0110, `SUB}
`define CSUBrZ {4'b0001, `SUB}
`define CSUBiZ {4'b0101, `SUB}
`define ANDr {4'b0000, `AND}
`define ANDi {4'b0100, `AND}
`define ANDrcc {4'b0010, `AND}
`define ANDicc {4'b0110, `AND}
`define CANDrZ {4'b0001, `AND}
`define CANDiZ {4'b0101, `AND}
`define ORr {4'b0000, `OR}
`define ORi {4'b0100, `OR}
`define ORrcc {4'b0010, `OR}
`define ORicc {4'b0110, `OR}
`define CORrZ {4'b0001, `OR}
`define CORiZ {4'b0101, `OR}
`define XORr {4'b0000, `XOR}
`define XORi {4'b0100, `XOR}
`define XORrcc {4'b0010, `XOR}
`define XORicc {4'b0110, `XOR}
`define CXORrZ {4'b0001, `XOR}
`define CXORiZ {4'b0101, `XOR}
`define NOTr {4'b0000, `NOT}
`define NOTrcc {4'b0010, `NOT}
`define CNOTrZ {4'b0001, `NOT}
`define SRAr {4'b0000, `SRA}
//add R1 with R0
`define LD {8'b10000000}
`define ST {8'b11000000}
alu.v
`include "opcodes.v"
`timescale 100ps / 10ps
module alu( Zero, Overflow, Underflow, result , input1, input2, Function);
output [31:0] result;
output Zero, Overflow, Underflow;
input [31:0] input1, input2;
// (remainder of the module body: the case statement and overflow logic listed in the ALU module section)
endmodule
control.v
`include "opcodes.v"
`define Fetch1 0
`define Execute 1
`define Fetch2 3
`define address_setup 0
`define address_hold 1
`define data_setup 3
`define data_hold 2
reg [1:0] state;
reg [1:0] sub_state;
reg [31:0] IR;
reg zero_flag_reg, overflow_flag_reg, underflow_flag_reg;
assign state_wire = state;
assign sub_state_wire = sub_state;
assign Rs2_sel_wire = Rs2_sel;
assign zero_flag_reg_wire = zero_flag_reg;
assign overflow_flag_reg_wire = overflow_flag_reg;
//
// Divide instruction into Opcode, Rd, Rs1, Rs2 and Immediate
assign Opcode = IR[31:24];
assign ModeBit = IR[31:30];
assign Rd = IR[23:16];
assign Rs1 = IR[15:8];
assign Rs2 = IR[7:0];
assign ALUfunc = IR[27:24];
assign setbit = IR[29];
assign testbit = IR[28];
//
// Identify memory write and generate memory control signals
//
assign memory_write = (Opcode == `ST) && (state == `Execute);
assign nME = ( sub_state == `address_setup ) || ( sub_state ==`data_hold );
assign nALE = ( sub_state == `address_setup );
assign RnW = ( sub_state == `address_setup ) || ( sub_state ==`address_hold )|| ~memory_write;
assign nOE = ( sub_state == `address_setup ) || ( sub_state ==`address_hold ) || memory_write;
assign ENB = ~nOE;
//
// Conditional Instruction
always @ (posedge Clock)
begin
if ((((state == `Execute && ModeBit == 2'b01) && sub_state ==`data_hold) || (state == `Fetch1 && ModeBit == 2'b00 && sub_state ==`data_setup)) && setbit == 1'b1)
begin
// update all three flags only when the set bit is selected
zero_flag_reg <= Zero;
overflow_flag_reg <= Overflow;
underflow_flag_reg <= Underflow;
end
end
// All instructions complete in exactly 12 clock cycles
// state is Fetch1 (R+R), Fetch2(R+I) & Execute(L/S)
// sub_state controls memory cycle
//
always @(posedge Clock)
begin
if (nReset == 0)
state <= 32'bz;
else
begin
case (state)
0:if (sub_state == `data_hold && (ModeBit == 2'b00)) state <= #20 0;
else if (sub_state == `data_hold && ModeBit == 2'b01) state <= #20 1;
else if (sub_state == `data_hold && ((ModeBit == 2'b10) || (ModeBit == 2'b11))) state <= #20 3;
else if (ModeBit == 2'b01 || ModeBit == 2'b00) state <= #20 0;
3:if (sub_state == `data_hold && ((ModeBit == 2'b10) || (ModeBit == 2'b11))) state <= #20 1;
else if (ModeBit == 2'b10 || ModeBit == 2'b11) state <= #20 3;
1:if (sub_state == `data_hold ) state <= #20 0;
else state <= #20 1;
endcase
sub_state[0] <= #20 ~sub_state[1];
sub_state[1] <= #20 sub_state[0];
end
end
//
// Update IR as required
//
always @(posedge Clock)
begin
if ((state == `Fetch1) && ( sub_state == `data_setup ))
IR <= #20 Sysbus;
end
//
// Asynchronous reset
//
//
always @(nReset)
if (!nReset)
begin
assign state = 0;
assign sub_state = 0;
assign IR = 0;
assign zero_flag_reg = 0;
assign overflow_flag_reg = 0;
assign underflow_flag_reg = 0;
end
else
begin
deassign state;
deassign sub_state;
deassign IR;
deassign zero_flag_reg;
deassign overflow_flag_reg;
deassign underflow_flag_reg;
end
endmodule
datapath.v
`include "opcodes.v"
`timescale 100ps / 10ps
module datapath( Sysbus, Zero, Overflow, Underflow, Function, TrisALU, TrisPC, TrisRs2,
TrisRd,
nTrisRd, Clock, nReset, ReadPC_1, ReadPC_2, ReadR0_1, ReadR0_2, ReadR1_1, PC_inc,
ReadR1_2,
ReadR2_1, ReadR2_2, ReadR3_1, ReadR3_2, ReadR4_1, ReadR4_2, ReadR5_1, ReadR5_2,
ReadR6_1, ReadR6_2,
ReadR7_1, ReadR7_2, PC_inc, WriteR1, WriteR2, WriteR3, WriteR4, WriteR5, WriteR6,
WriteR7, Rs2_sel,
LoadDR, LoadPC,PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire,
R0_wire,
DR_wire, Rd_wire, Rs1_wire, Rs2_wire, Result_wire);
inout [31:0] Sysbus;
output Zero, Overflow, Underflow;
input [3:0] Function;
input TrisALU, TrisPC, TrisRs2, TrisRd, nTrisRd, Rs2_sel,ReadPC_1, ReadPC_2;
input ReadR0_1, ReadR0_2, ReadR1_1, ReadR1_2,ReadR2_1, ReadR2_2, ReadR3_1,
ReadR3_2;
input PC_inc, WriteR1,WriteR2, WriteR3, LoadDR, LoadPC;
input ReadR4_1, ReadR4_2, ReadR5_1, ReadR5_2, ReadR6_1, ReadR6_2, ReadR7_1,
ReadR7_2;
input WriteR4, WriteR5, WriteR6, WriteR7;
input Clock, nReset;
output [31:0] PC_wire, R1_wire, R2_wire, R3_wire, R0_wire, DR_wire;
output [31:0] Rd_wire, Rs1_wire, Rs2_wire, Result_wire;
output [31:0] R4_wire, R5_wire, R6_wire, R7_wire;
reg [31:0] PC, R1, R2, R3, R4, R5, R6, R7, R0, DR;
wire Zero, Overflow, Underflow;
wire [31:0] Rd, Rs1, Rs2, result;
wire [31:0] Mux2_out, Mux1_out;
assign Rs1 = ( ReadPC_1) ? PC : 32'bz;
assign Rs1 = ( ReadR0_1) ? R0 : 32'bz;
assign Rs1 = ( ReadR1_1) ? R1 : 32'bz;
assign Rs1 = ( ReadR2_1) ? R2 : 32'bz;
assign Rs1 = ( ReadR3_1) ? R3 : 32'bz;
assign Rs1 = ( ReadR4_1) ? R4 : 32'bz;
assign Rs1 = ( ReadR5_1) ? R5 : 32'bz;
assign Rs1 = ( ReadR6_1) ? R6 : 32'bz;
assign Rs1 = ( ReadR7_1) ? R7 : 32'bz;
assign Rs2 = ( ReadR2_2) ? R2 : 32'bz;
assign Rs2 = ( ReadR3_2) ? R3 : 32'bz;
assign Rs2 = ( ReadR4_2) ? R4 : 32'bz;
assign Rs2 = ( ReadR5_2) ? R5 : 32'bz;
assign Rs2 = ( ReadR6_2) ? R6 : 32'bz;
assign Rs2 = ( ReadR7_2) ? R7 : 32'bz;
PC;
R1;
R2;
R3;
R4;
R5;
R6;
R7;
DR;
assign PC_wire=PC;
assign R1_wire=R1;
assign R2_wire=R2;
assign R3_wire=R3;
assign R4_wire=R4;
assign R5_wire=R5;
assign R6_wire=R6;
assign R7_wire=R7;
assign DR_wire=DR;
assign R0_wire=R0;
assign Rs1_wire=Rs1;
assign Rs2_wire=Rs2;
assign Rd_wire=Rd;
assign Result_wire=result;
endmodule
cpu_core.v
//cpu_core.v
`timescale 100ps / 10ps
module cpu_core( Sysbus, Data_in, ENB, nME, nALE, RnW, nOE, SDO, Clock, nReset,
Test, SDI, PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire,
R0_wire, DR_wire, Rd_wire, Rs1_wire, Rs2_wire, Result_wire,
sub_state_wire, state_wire, Rs2_sel_wire, TrisMem_wire, Function,
zero_flag_reg_wire, overflow_flag_reg_wire, underflow_flag_reg_wire);
//
// I/O declarations
//
inout [31:0] Sysbus;
input [31:0] Data_in;
output ENB;
output nME, nALE, RnW, nOE, SDO;
//control memory
output[31:0] PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire;
output[31:0] Rd_wire, Rs1_wire, Rs2_wire, R0_wire, DR_wire, Result_wire;
output[3:0] Function;
output[1:0] sub_state_wire, state_wire;
output Rs2_sel_wire, TrisMem_wire;
output zero_flag_reg_wire, overflow_flag_reg_wire, underflow_flag_reg_wire;
input Clock, nReset, Test, SDI;
wire [31:0] Sysbus;
wire TrisMem;
assign TrisMem_wire = TrisMem;
/////////////////////////////////////////////////////////////////////
// Simulation of CPU core
/////////////////////////////////////////////////////////////////////
// This system doesn't simulate the scan path
//
assign SDO = SDI;
// This system has a single internal system address/data bus
// Data_in must pass through a tri-state buffer before connection to Sysbus
//
// Sysbus is 32 bits wide, so the released bus must be 32'bz
assign Sysbus = (TrisMem) ? Data_in: 32'bz;
//
// This system is built from two smaller modules
//
control Control( Sysbus, nOE, RnW, nME, nALE, Zero, Overflow, Underflow, Function, Clock,
nReset,TrisPC, TrisALU, ENB, TrisMem, TrisRs2, TrisRd, nTrisRd, ReadPC_1,
ReadPC_2,ReadR0_1,
ReadR0_2, ReadR1_1, ReadR1_2, ReadR2_1, ReadR2_2,ReadR3_1,ReadR3_2, ReadR4_1,
ReadR4_2,
ReadR5_1, ReadR5_2, ReadR6_1, ReadR6_2, ReadR7_1, ReadR7_2, WriteR1, WriteR2,
WriteR3,
WriteR4, WriteR5, WriteR6, WriteR7, PC_inc, Rs2_sel, LoadDR, LoadPC, sub_state_wire,
state_wire, Rs2_sel_wire, zero_flag_reg_wire, overflow_flag_reg_wire,
underflow_flag_reg_wire);
datapath Datapath( Sysbus, Zero, Overflow, Underflow, Function, TrisALU, TrisPC, TrisRs2,
TrisRd,
nTrisRd, Clock, nReset, ReadPC_1, ReadPC_2, ReadR0_1, ReadR0_2, ReadR1_1, PC_inc,
ReadR1_2,
ReadR2_1, ReadR2_2, ReadR3_1, ReadR3_2, ReadR4_1, ReadR4_2, ReadR5_1, ReadR5_2,
ReadR6_1, ReadR6_2,
ReadR7_1, ReadR7_2, PC_inc, WriteR1, WriteR2, WriteR3, WriteR4, WriteR5, WriteR6,
WriteR7, Rs2_sel,
LoadDR, LoadPC,PC_wire, R1_wire, R2_wire, R3_wire, R4_wire, R5_wire, R6_wire, R7_wire,
R0_wire,
DR_wire, Rd_wire, Rs1_wire, Rs2_wire, Result_wire);
endmodule
rom.v
//rom module
//words 0-255
`include "opcodes.v"
`timescale 100ps/10ps
module rom (Data_bus, Address_bus, notOE, notCE);
output [31:0] Data_bus;
input [31:0] Address_bus;
input notOE, notCE;
specify
specparam tViolate=250;
$width (negedge notCE, tViolate);
$width (posedge notCE, tViolate);
$setuphold (edge[10, 01] notCE, Address_bus, tViolate, tViolate);
$setuphold (edge[10, 01] notCE, notOE, tViolate, tViolate);
endspecify
reg [31:0] Data_stored [1024:0];
assign Data_bus = ((notOE == 0) && (notCE == 0))? Data_stored [Address_bus] : 32'bz;
initial
begin
//CODE:
// opcode(modebit; setbit; testbit;) rd, rs1, rs2
//alu rs1 with mux2(rs2 or dr)
//R1 <- 32'h0000FFFF
Data_stored[0] = {`LD, `R1, `R1, `R1};
Data_stored[1] = 32'h0000FFFFF;
//R2 <- 32'hFFFF0000
Data_stored[2] = {`LD, `R2, `R2, `R2};
Data_stored[3] = 32'h000000FF;
//
test_bench7_cpu.v
//Module Name: cpu_tst.v
//Test Bench for cpu model.
`timescale 100ps/10ps
module test_bench7_cpu;
//data type declarations
reg nReset, Clock;
reg [31:0] Address_bus;
wire
wire
wire
wire
wire
wire
wire
= 1;
Address_bus=0;
Address_bus=1;
Address_bus=2;
Address_bus=3;
Address_bus=4;
Address_bus=5;
Address_bus=6;
Address_bus=7;
#480 Address_bus=8;
#480 Address_bus=9;
#480 Address_bus=10;
#480 Address_bus=11;
#480 Address_bus=12;
#480 Address_bus=13;
#480 Address_bus=14;
#480 Address_bus=15;
#480 Address_bus=16;
#480 Address_bus=17;
#480 Address_bus=18;
#480 Address_bus=19;
#480 Address_bus=20;
#480 Address_bus=32;
#480 $stop;
//#480 $finish;
end
initial begin
forever #5 Clock = ~Clock;
end
endmodule