Basic functions such as driving the fire truck, squirting water from the hose
and capturing images/video from the mounted camera are to be coordinated by an
on-board electronic system consisting of a custom-built micro-controller together with
some peripherals. Separate motors are used for the rear wheel drive, the steering and
the pump.
[Figure: layout of the fire truck, with labels Pump, Camera, Front Wheels and Rear Wheels]
The objective of the project is to design both the electronic control system and
a Graphical User Interface (GUI) for the PC. The electronic control system was to be
designed in two phases.
Phase one consisted of the design of an I2C interface, which was to be the
camera control interface, and the pulse width modulators (PWM) for driving the
motors.
The above units were to be designed, implemented and fully tested for phase
two.
The USB interface links the whole electronic system with the PC GUI. This
interface should meet the video streaming data rate and also be able to transfer
control/status information as well as program instructions to Yoda whenever
necessary.
2.0 Yoda’s Memory Organisation
Yoda is a logistics based controller so it was designed to perform data transfer
operations efficiently. Usually, this involves the data transfer between two peripheral
units within the system. Therefore, the data memory organisation of Yoda was
chosen to have a linear address space so that the peripherals can be mapped into
the data memory of Yoda. This was expected to facilitate the transfer of data. Figure
2 below shows the data memory map intended to be used for Yoda.
As can be seen from Figure 2, Yoda consists of only one page of data
memory which is 8 bits wide and 256 levels deep. The I2C unit was allocated four
registers and each of the 5 PWM units was allocated one register (for setting the
desired pulse width) as shown. Since the Video and the USB units had not yet been
designed, generous blocks of memory were reserved for them: 128 locations for the
Video unit and 64 for the USB unit. The rest of the memory space consisted of 33
general purpose registers which can be used to hold constants or variables as
desired by the end-user.
The FSR (File Select Register) and the INDF (Indirect File Register) together
act as a pointer. FSR holds the address the pointer is pointing to, while the
contents of that address can be accessed via INDF. This simple pointer can be used
for simple indexing operations and will be useful when dealing with video streams,
for example.
The program memory was designed to be a RAM block of 8x256. The memory
depth was more than sufficient and the full program memory will rarely be used.
However, extra memory was included to allow for a worst case scenario. Figure 3
below illustrates the program memory map for Yoda:
Address   Description
0x00      Reset Vector
0x01      Instruction
0x02      Instruction
...       ...
0xFE      Instruction
0xFF      Interrupt Vector
Figure 3: Program Memory Map
As can be seen from above, the reset vector is set to the first address of
program RAM. This allows the program to flow naturally from top to bottom since a
device reset or the burning of a new program will force the program counter to reset
to 0.
It was decided that there should be a single interrupt vector for the controller
and this interrupt vector address was assigned to the last memory location. Normal
program instructions are highly unlikely to be stored in this location. Usually, a “goto”
statement will be placed in this location to divert the program to an interrupt service
routine located elsewhere in the program.
3.0 Instruction Set of Yoda
An instruction set was designed to provide Yoda with the basic
functions that it was expected to perform. Table 1 details the instruction set:
Encoding  Mnemonic  Operand 1  Operand 2     D  Description
0x00      NOP       X          X             X  No operation
0x01      MOVLW     Literal    X             X  W=Literal
0x02      MOVWF     Addr       X             X  F(addr)=W
0x03      MOVFW     Addr       X             X  W=F(addr)
0x04      MOVLF     Literal    Addr          X  F(addr)=Literal
0x05      MOVFF     Addr1      Addr2         X  F(addr2)=F(addr1)
0x06      ADDWF     Addr       X             0  W=F(addr) + W
                                             1  F(addr)=F(addr) + W
0x07      ADDLW     Literal    X             X  W=Literal + W
0x08      SUBWF     Addr       X             0  W=F(addr) - W
                                             1  F(addr)=F(addr) - W
0x09      SUBLW     Literal    X             X  W=Literal - W
0x0A      INCF      Addr       X             0  W=F(addr) + 1
                                             1  F(addr)=F(addr) + 1
0x0B      DECF      Addr       X             0  W=F(addr) - 1
                                             1  F(addr)=F(addr) - 1
0x0C      ANDWF     Addr       X             0  W=W & F(addr)
                                             1  F(addr)=F(addr) & W
0x0D      ANDLW     Literal    X             X  W=W & Literal
0x0E      ORWF      Addr       X             0  W=W OR F(addr)
                                             1  F(addr)=F(addr) OR W
0x0F      ORLW      Literal    X             X  W=W OR Literal
0x10      XORWF     Addr       X             0  W=W XOR F(addr)
                                             1  F(addr)=F(addr) XOR W
0x11      XORLW     Literal    X             X  W=W XOR Literal
0x12      RESET     X          X             X  Reset program counter
0x13      BTFSS     Addr       Bit Location  X  Skip next instruction if bit in F(addr) is set
0x14      BTFSC     Addr       Bit Location  X  Skip next instruction if bit in F(addr) is clear
0x15      BSF       Addr       Bit Location  X  Set bit in F(addr)
0x16      BCF       Addr       Bit Location  X  Clear bit in F(addr)
0x17      RRF       Addr       X             X  Rotate F(addr) to right once
0x18      RLF       Addr       X             X  Rotate F(addr) to left once
0x19      GOTO      Addr       X             X  Goto Addr in program memory
0x1A      CALL      Addr       X             X  Call subroutine at Addr
0x1B      RET       X          X             X  Return from subroutine
Table 1: Instruction Set Summary
Note that ‘W’ in the above corresponds to the working register (which is the
only accumulator in Yoda) and F denotes a file register and Addr denotes an
address referring to either a data or program memory location. The “Bit Location”
stated in Table 1 denotes the position of the bit to be set/cleared or tested. Also, ‘D’
above denotes the destination and is only a 1-bit field. Clearing D to 0 usually returns
the result to W whereas setting it to 1 returns the result to a file register.
[Figure: instruction word format, labelled "22-Bit Field"]
The entire datapath illustrated above is 8 bits wide except for the ALU control
signals, the Data RAM Write_En bit and the Jump signal; Write_En and Jump are each
1 bit wide. The following subsections detail the functions of each of the
functional blocks in Figure 5.
4.1 ALU
The function of the ALU is to perform all the arithmetic and logic operations.
The ALU has two data input ports, namely input A and input B. Input A is always the
contents of the W register (accumulator) while input B can either be contents of a
memory location or a literal as shown in Figure 5. The Mode signal to the ALU
defines the operation to be performed on the data from inputs A and B. Note that the
arithmetic operations are all unsigned and unprotected against overflow. Therefore,
these events must be checked in software. Table 2 below summarises the different
modes of the ALU:
The data RAM block shown in Figure 5 was an 8x256 memory block. Memory
mapping of the peripherals was not carried out at this stage of the design. The
primary aim was to implement a fully working processor before performing the
memory mappings to peripherals.
The data RAM was selected to be two-port RAM since this allows
simultaneous read and write capability which is necessary for fast data transfers (1
cycle as opposed to 2) and this is desired for a logistics processor. This meant that
the RAM itself would have a separate read and write address.
For present purposes, peripheral mapping was excluded and the following
VHDL entry was used to implement this RAM block:
The Status register is also an 8-bit register which holds primarily two important
bits: the C_bit and the Z_bit. The C_bit is set whenever there is an overflow from the
ALU as a result of an arithmetic operation and the Z_bit is set if the result is zero.
Checking for zero is especially important when implementing loops (such as a for
loop) and the status register enables this. The C_bit can be used to check for
overflow in software since the ALU arithmetic is unprotected against overflow.
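The flag behaviour described above can be illustrated with a small behavioural sketch in Python (not the project's VHDL). The function name `alu_add` and the tuple it returns are illustrative assumptions; only the 8-bit unsigned addition, the C_bit on overflow and the Z_bit on a zero result are taken from the text.

```python
def alu_add(a, b):
    """Unsigned 8-bit addition, returning (result, c_bit, z_bit).

    Hypothetical helper: the name and return shape are assumptions made
    for illustration, not part of the original design files.
    """
    total = (a & 0xFF) + (b & 0xFF)
    result = total & 0xFF               # the datapath keeps only 8 bits
    c_bit = 1 if total > 0xFF else 0    # set when the sum overflows 8 bits
    z_bit = 1 if result == 0 else 0     # set when the 8-bit result is zero
    return result, c_bit, z_bit


# Software is expected to check the C_bit itself, since the ALU does not saturate.
result, c_bit, z_bit = alu_add(200, 100)
if c_bit:
    print("overflow: 200 + 100 wrapped to", result)
```

Note that 255 + 1 sets both flags in this sketch, because the wrapped result is zero.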
4.4 Pointer
A pointer was introduced into the system via the FSR (File Select Register)
and INDF register. These registers were memory mapped into data RAM as shown
below in Figure 8:
Address Description
0x04 FSR
0x05 INDF
Figure 8: Position of FSR and INDF in Data RAM
The FSR held the address of the pointer while the contents of the pointer can
be accessed via the INDF register. A pointer is a desirable asset to most
microprocessors and it enables software to index through a set of repetitive
operations on a consecutive sequence of memory locations.
Referring to Figure 5, the FSR can be loaded with a new address via the datapath.
The Address Assignment block is simply an address comparator enabling the FSR to
be loaded if address location 0x04 is written to, as shown below in Figure 9. This
effectively maps the FSR to the data memory and, consequently, memory location
0x04 in data memory also stores the same data.
Whenever the pointer is used to access a memory location, the INDF register
will be read. When address 0x05 is read, the Address Assignment (Address
Comparator) block selects the read address to the Data Memory to be the address
value in FSR. This enables the contents of the pointer to be read. Similarly, if the
write address is 0x05, the Address Comparator block selects the write address to the
data memory as the contents of the FSR. This allows the contents of the pointer to
be written to. Figure 9 below provides an insight into the implementation of the pointer.
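The address redirection described above can be sketched as a behavioural model in Python. This is a sketch under stated assumptions rather than the project's VHDL: the class name `DataRam` and its methods are hypothetical, and it is assumed that the comparator acts on the address given in the instruction (0x04 or 0x05) before any redirection.

```python
FSR_ADDR = 0x04   # File Select Register: holds the pointer's target address
INDF_ADDR = 0x05  # Indirect File Register: accesses redirect to F(FSR)


class DataRam:
    """Hypothetical model of the 8x256 data RAM with the FSR/INDF pointer."""

    def __init__(self):
        self.cells = [0] * 256
        self.fsr = 0

    def _resolve(self, addr):
        # Address comparator: an access to INDF is redirected to the address in FSR.
        return self.fsr if addr == INDF_ADDR else addr

    def write(self, addr, value):
        value &= 0xFF
        self.cells[self._resolve(addr)] = value
        if addr == FSR_ADDR:
            # Writing location 0x04 also loads the FSR register itself.
            self.fsr = value

    def read(self, addr):
        return self.cells[self._resolve(addr)]


ram = DataRam()
ram.write(FSR_ADDR, 0)                        # pointer -> F(0)
ram.write(INDF_ADDR, 127)                     # F(0) = 127 through the pointer
ram.write(FSR_ADDR, ram.read(FSR_ADDR) + 1)   # pointer -> F(1)
ram.write(INDF_ADDR, 127)                     # F(1) = 127 through the pointer
```

The four writes follow the same sequence as the pointer test in section 7.4.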
The simplified block diagram in Figure 10 below details the design of the
instruction cycle processor.
Program memory was chosen to be 22-bits wide and 256 levels deep. The
depth of the memory is sufficient to accommodate 256 lines of code which is suitable
for this application.
The IR (instruction) register is a 22 bit wide register with a clock and load
input. This register retains instructions from program memory so that the instruction
decoder can identify and process the instructions. The output from the IR register is
separated into the opcode (5-bits wide), operand 1 (8 bits wide), operand 2 (8 bits
wide) and D (1 bit wide). The controller FSM (Finite State Machine) would use the
opcode to determine the operation to perform while both operands and D will be used
by the datapath unit (which includes both the Instruction Cycle Processor and the
Instruction Set Processor).
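To make the field split concrete, the following Python sketch packs and unpacks a 22-bit instruction word. The field widths (5, 8, 8 and 1 bits) come from the text, and the D bit is taken as bit 0 as the report states for the 'Dest' signal. The ordering of the remaining fields (opcode in the top bits, then operand 1, then operand 2) is an assumption, as are the function names `encode` and `decode`.

```python
def encode(opcode, operand1=0, operand2=0, d=0):
    """Pack the four fields into a 22-bit word (assumed field order)."""
    if not 0 <= opcode < (1 << 5):
        raise ValueError("opcode must fit in 5 bits")
    if not 0 <= operand1 < (1 << 8) or not 0 <= operand2 < (1 << 8):
        raise ValueError("operands must fit in 8 bits")
    if d not in (0, 1):
        raise ValueError("D must be 0 or 1")
    # Assumed layout: [21:17] opcode, [16:9] operand 1, [8:1] operand 2, [0] D
    return (opcode << 17) | (operand1 << 9) | (operand2 << 1) | d


def decode(word):
    """Split a 22-bit word back into (opcode, operand1, operand2, d)."""
    return (word >> 17) & 0x1F, (word >> 9) & 0xFF, (word >> 1) & 0xFF, word & 0x1


# ADDWF (0x06) on file register 0x10 with D=1, i.e. F(0x10) = F(0x10) + W
word = encode(0x06, 0x10, 0x00, 1)
print(f"{word:022b}", decode(word))
```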
The PC (program counter) register holds the address of the next instruction to
be executed. During normal program flow, the PC is incremented, while during jump
or call operations, the PC value may change arbitrarily depending on the destination
of the jump.
5.3 Stack
Whenever a function call is made, the stack records the value of the PC
register. This is done via a push operation. When the function returns, the stack
returns the recorded value to the PC register so that the main process can be
resumed. The stack is 'popped' in this way via a pull operation. It is also possible
to make a function call within another function. This requires a total of two memory
locations within the stack to record the two return locations. Therefore, the maximum
number of nested function calls allowed without pulling from the stack is equal
to the depth of the stack.
In a simple application such as this project, the use of functions will be limited.
Therefore, it is anticipated that a 4 level stack should be sufficient. This allows a
maximum of 4 consecutive function calls to be performed without need for popping
the stack. However, if this is exceeded, the stack will overflow and the processor will
fail to return to the correct address.
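The overflow behaviour can be illustrated with a behavioural sketch in Python. The class name `ReturnStack` is hypothetical, and the wrap-around of the stack pointer on overflow is an assumption made for illustration; the text only states that the return address is lost once the depth is exceeded.

```python
class ReturnStack:
    """Hypothetical model of a 4-level, 8-bit-wide return stack."""

    DEPTH = 4

    def __init__(self):
        self.slots = [0] * self.DEPTH
        self.sp = 0  # stack pointer: index of the next free slot

    def push(self, pc):
        self.slots[self.sp] = pc & 0xFF
        # Assumption: the pointer wraps, so a fifth push overwrites the oldest entry.
        self.sp = (self.sp + 1) % self.DEPTH

    def pop(self):
        self.sp = (self.sp - 1) % self.DEPTH
        return self.slots[self.sp]


stack = ReturnStack()
for return_address in (0x10, 0x20, 0x30, 0x40, 0x50):  # five nested calls
    stack.push(return_address)

returns = [stack.pop() for _ in range(5)]
# The last return address comes back as 0x50 instead of 0x10: the first
# call's return address was overwritten by the fifth push.
print([hex(r) for r in returns])
```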
The stack was implemented in two stages. First, a simple 8-bit wide 4 level
deep memory with data in and data out ports was generated. The address input to
the memory (called the stack pointer) determines where data is read or written. The
following code shows how this stack memory was implemented in VHDL:
As can be seen in Figure 14, the clock to the stack_core (stack memory) is
inverted. This is to ensure that the stack pointer has enough time to increment before
a new address is registered into the stack.
5.4 INC Blocks
The INC blocks in Figure 10 represent simple incrementers. These allow the
PC value to be incremented (for normal program flow) or allow the program to jump
one instruction forward (during btfss or btfsc commands).
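As a small illustration of how the incrementers interact with the bit-test instructions, the Python sketch below computes the next PC value. The helper names are hypothetical, and the wrap of the PC at 256 is an assumption based on the 256-deep program memory.

```python
def btfss_skips(file_value, bit):
    """BTFSS: skip the next instruction if the tested bit is set."""
    return ((file_value >> bit) & 1) == 1


def btfsc_skips(file_value, bit):
    """BTFSC: skip the next instruction if the tested bit is clear."""
    return ((file_value >> bit) & 1) == 0


def next_pc(pc, skip=False):
    """One increment for normal flow, two when the next instruction is skipped."""
    step = 2 if skip else 1
    return (pc + step) & 0xFF  # assumed 8-bit PC wrap


# F = 0b00000100: bit 2 is set, so BTFSS skips and BTFSC does not.
print(next_pc(0x10, btfss_skips(0b00000100, 2)))
print(next_pc(0x10, btfsc_skips(0b00000100, 2)))
```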
6.0 Integrating the Processor & Development of Controller
The instruction cycle and instruction set processors were combined into a single
controller unit as shown below. This controller was responsible for generating all the
control signals (shown below) when necessary.
[Figure 15: controller unit block diagram; labels include 'Dest']
The two multiplexers in Figure 15 are used to divert operand 1 and operand 2
values into the rd_address (read address of RAM), wr_address (write address of
RAM), or literal_val (literal input) as necessary when carrying out the various
instructions. These two multiplexers are controlled by the control unit as shown.
Note that the ‘Dest’ signal shown above is the ‘D’ bit extracted from the
instruction word, i.e. bit 0 of the instruction word.
As can be seen, there are only three signals entering Yoda. These are the
system clock, program data and a program enable signal. The program data line can
be used to "burn" the program memory of Yoda with a new set of commands. The
Prog_En signal must be toggled to indicate that a new program needs to be
written.
The green state above represents the start state where all control signals are
cleared. This is followed by the fetch state which loads the PC and the IR registers
with the appropriate values. The next state is the decode state. The states that are
positioned in a circle represent instructions from the instruction set and depending on
the value of IR (the op-code) the FSM will select one of these states to execute.
Within each of these states will be the control signals necessary to execute that
instruction. Once executed, the FSM will return to start and will process the next
instruction.
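The start, fetch, decode and execute sequence can be sketched as a small interpreter loop in Python. This is a behavioural sketch covering only a handful of opcodes from Table 1, not the FSM itself: the function name `run`, the representation of a program as a list of `(opcode, operand1, operand2, d)` tuples, and halting when the PC runs past the end of the list are all assumptions made for illustration.

```python
NOP, MOVLW, MOVWF, MOVLF, ADDWF, GOTO = 0x00, 0x01, 0x02, 0x04, 0x06, 0x19


def run(program, max_steps=100):
    """Execute a list of (opcode, operand1, operand2, d) tuples."""
    ram = [0] * 256   # data memory
    w = 0             # working register (accumulator)
    pc = 0            # program counter
    for _ in range(max_steps):
        if pc >= len(program):
            break                              # assumed halt condition
        opcode, op1, op2, d = program[pc]      # fetch and decode
        pc += 1                                # normal program flow
        if opcode == MOVLW:
            w = op1                            # W = Literal
        elif opcode == MOVWF:
            ram[op1] = w                       # F(addr) = W
        elif opcode == MOVLF:
            ram[op2] = op1                     # F(addr) = Literal
        elif opcode == ADDWF:
            result = (ram[op1] + w) & 0xFF     # unsigned, unprotected add
            if d == 0:
                w = result                     # D=0: result goes to W
            else:
                ram[op1] = result              # D=1: result goes to F(addr)
        elif opcode == GOTO:
            pc = op1                           # jump
        elif opcode != NOP:
            raise ValueError(f"opcode {opcode:#04x} not modelled in this sketch")
    return w, ram


program = [
    (MOVLW, 5, 0, 0),      # W = 5
    (MOVLF, 3, 0x10, 0),   # F(0x10) = 3
    (ADDWF, 0x10, 0, 1),   # F(0x10) = F(0x10) + W = 8
    (ADDWF, 0x10, 0, 0),   # W = F(0x10) + W = 13
]
w, ram = run(program)
print(w, ram[0x10])
```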
The three states at the top left corner in Figure 16 are entered when the
device is in programming mode. Here, the PC is reset to 0 and is counted up to 255.
During each clock cycle a new instruction is recorded into program memory.
Once all 256 memory locations have been written, the FSM resets again and returns
to the start state to resume normal operation.
The call operation as shown in Figure 16 was implemented by two states. The
call state pushes the PC value onto the stack. Then the goto state is executed and
the PC is forced to jump to the specified value.
The full details of the control signalling in the FSM are not provided here
due to the sheer size of this FSM. For more detailed information, the reader is
referred to section 10 of the report where the project files are attached.
7.0 Testing Yoda
Each unit was tested independently before integrating all of these to form
Yoda. Note that at this stage, peripheral mapping had not been included. This section
presents the final phase of testing of Yoda.
The test bench was used as an interface to "burn" different programs into the
program memory of Yoda. After programming has finished, the test bench generates
only a clock to Yoda. After programming, Yoda is configured to automatically
reset and execute the instructions. The results can be observed in the signal
display window to confirm Yoda's functionality.
To simplify the entry of instructions into the test bench, the corresponding
op-codes were assigned to the instruction names using the following statements:
The encodings in the test bench shown above enabled the desired instruction
to be written using the instruction name as opposed to its op-code. This simplified the
task.
The same was done to encode memory locations and file registers as shown
below:
The declarations given above enabled the test bench to be prepared without
ambiguity since the instruction and the operands can now be selected from the list of
constants above and can be concatenated to give almost any combination as
desired.
The following subsections will describe the different tests performed on Yoda
to fully test all of its functionality.
7.1 Testing the Data Transfer Instructions
The following data transfer instructions were tested with this test procedure:
1. Movlw – W=Literal
2. Movwf – F(address)=W
3. Movfw – W=F(address)
4. Movlf – F(address)=Literal
5. Movff – F(address2)=F(address1)
All of these instructions were tested together through the test bench listed in
Appendix 1. The program written in the test bench is summarised in Table 3
below. The test bench 'burnt' this program into the program memory of Yoda and
became idle. This allowed us to observe how Yoda executed the instructions written
to its program memory.
The results obtained for the program are shown in Figure 18 below. As can be
seen, all the data transfer instructions work as expected and in one clock cycle.
[Figure 18: simulation waveform with points A to E labelled]
Figure 18 and the interpretation of the points labelled above confirm
that the program was executed correctly. Therefore, the data transfer tests were
successfully performed. Yoda behaved as expected and carried out the program
written to its instruction memory.
7.2 Testing Arithmetic & Logic Instructions
The objective of this test process is to confirm that the arithmetic and logic
based instructions are carried out successfully by Yoda. This test will be performed
on the following instructions:
Note that since all the arithmetic and logic operations are carried out by the
ALU block, testing only the instructions shown above is fairly adequate to confirm
that all arithmetic operations will be performed properly.
Similar to the previous tests, a new test bench was prepared to test the
instructions above. As before, the test bench will burn these instructions into the
program memory and become idle (only generating a clock signal). After
programming, Yoda will auto-reset and begin executing the program sequentially.
The results can then be analysed through the simulator. The table below lists the
program used for this test. The full test bench is listed in Appendix two.
The events at the different points in Figure 19 above are explained below:
As can be observed from the above interpretation, the program was executed
by Yoda correctly so this test was completely successful.
7.3 Testing Instruction Cycle Operations
This section aims to confirm the instruction cycle operations. These operations
are all involved with the control of program flow. The following instructions are tested
in this test process:
1. Btfss
2. Goto
3. Call
4. Ret
As before, a new test bench was prepared to test the instructions above. This
is provided in Appendix 3. Table 5 below summarises the sequence of the program
to be executed by Yoda.
[Figure: simulation waveform with points A to F labelled]
The important changes occurring at the points labelled above are explained
below:
The above results confirm that the instruction cycle operations are functional.
The test was successful and the testing of all the features of the instruction set is
nearly complete. The next section will test the last feature which is the pointer.
7.4 Testing the Pointer
The pointer is a useful feature in Yoda and therefore must be fully tested.
Table 6 below details the program used to test the pointer. The
full listing of the test bench for this is given in Appendix 4. Note again that the FSR
register (whose address is 0x04) sets the address of the pointer while the INDF
register (whose address is 0x05) is the contents of the pointer.
Figure 21 below shows the results obtained. The points labelled in Figure 21
are explained below:
[Figure 21: simulation waveform with points A to D labelled]
A- The FSR register (with address 0x04) is loaded with the value 0. Therefore, the
pointer now points to memory location 0
B- The contents of the pointer are written with a value of 127, so F(0)=127
C- The FSR is incremented so that the pointer points to F(1)
D- F(1) is set to 127 through the pointer
The pointer test was successful as can be seen. With this final test, it can be
said that most of Yoda's functionality has been tested thoroughly and it can be
confirmed that the instruction set has been successfully implemented.
8.0 Conclusion
The microcontroller for Cyclops (called Yoda) was designed and thoroughly
tested for phase two of this project. The design was started with the preparation of a
suitable instruction set. The instruction set consisted of a total of 27 instructions
including data transfer, arithmetic and logic and branching instructions.
Data transfer instructions enabled transfer of data between any two memory
locations or between the accumulator (W register) and any memory location within
one clock cycle. This was important since Yoda is a data transfer oriented processor
so it is necessary to have fast data transfers.
The instruction set also provided some simple arithmetic operations such as
addition and subtraction. However, overflow saturation and rounding were not added
since this complexity was not necessary for the application. Multiplication and
division by 2 were also incorporated into the design as these were relatively simple to
implement.
The width of each instruction was 22 bits. The op-code was 5 bits, operands 1
and 2 were 8 bits each and the destination bit was 1 bit wide. This meant that the
program memory had to be 22 bits wide. The depth of the program memory was
chosen to be 256 so that any program for this simple application can be
accommodated.
The datapath was 8 bits wide. Therefore, the data memory also had to be 8
bits wide. The data memory was chosen to be 256 levels deep so that there were
sufficient registers available to perform even the most demanding task in Cyclops.
The data RAM was chosen to be a two-port RAM so as to speed up data transfers
between the RAM and the ALU/accumulator/RAM. Two-port RAMs can be read and
written simultaneously so all arithmetic operations on memory locations can be
performed in only one instruction cycle as opposed to two. Also, memory-to-memory
transfers can occur in one clock cycle. These benefits are a necessity for Yoda since
it should be able to process video, which requires fast processing speed.
A simple pointer was implemented by using two registers, namely the FSR
and INDF. The FSR register holds the address of the pointer while the INDF register
represents the contents of the pointer. Writing to INDF modifies the contents of the
pointer. Having a pointer in Yoda is also desired. A pointer can be used for example
to index through a pixel stream.
A 4-level stack was added to Yoda so that function calls can be used in
software. This allowed a maximum of 4 nested function calls to be performed without
the danger of stack overflow. For this application, a 4 level stack was sufficient.
The full details of the controller FSM were not included in the report due to its
sheer size. The FSM was responsible for asserting the various control signals for the
datapath unit.
Phase two was concerned with the design of the microprocessor and this has
been mostly completed. However, the mapping of the various peripherals to data
memory still remains. This task was not performed at this stage since not all of
the peripherals had been implemented: the USB interface and the video processor
unit were yet to be designed.
8.1 Suggestions for Further Work
The Video processing block and the USB interface must next be designed and
implemented. The peripherals must then be mapped to Yoda's memory space. Finally,
a GUI must be designed to communicate with the electronics system in Cyclops. The
GUI must be able to reconfigure the program memory in Yoda so that Cyclops can
be controlled in real time.
9.0 Group Details
This task was tackled by a team of three: Thant Sin, Uthayateban Logarajah
and Lakshman Athukorala. The designs were brainstormed together as a group and
other members of the class were also consulted. It was decided that each member
implement the proposed designs individually so as to gain experience with the design
software. Any disagreements or problems faced during simulations were resolved
thereafter.
Designs were drafted with FPGA Advantage 5.0 software package which was
available to each group member on their personal computers. Although the later 7.2
version was available, due to the various problems encountered and the
sluggishness of the virtual machines, version 5.0 was preferred.
CD Drive:\SystemDesignProject\I2C
CD Drive:\SystemDesignProject\Yoda
Note that the designs were implemented in FPGA Advantage version 5.0 and
some components may need to be re-compiled before simulations can be started.