

Berner Fachhochschule

Hochschule für Technik und Architektur Biel

University of Applied Sciences Berne

(Biel School of Engineering and Architecture)

VLSI System Design

FSM-D Architecture Model

Dr. Marcel Jacomet

MicroLab-I3S January 2003





FSM-D Architecture Model

Implementing Algorithms
Once an algorithm is designed, it has to be implemented in order to have a running machine. There are numerous technologies to implement an algorithm. The algorithms of the first calculators were implemented using mechanical technology, which is no longer suitable for such purposes today. Nevertheless, the designer must still decide whether the algorithm should be implemented in digital hardware, in analog hardware or in software. The choice of implementation technology depends on design constraints such as cost or performance. In this chapter, digital hardware is used as the implementation technology. A very suitable architectural model for general purpose algorithms is the FSM-D model.


Logic gates and flip-flops are the basic elements used to build digital hardware. Small circuits can simply be designed by putting these elements together in a clever way. As soon as the digital hardware becomes somewhat more complex, such an ad-hoc approach is no longer feasible, and the need for a structured approach becomes clear. The finite state machine data-path model (FSM-D) is a general purpose architectural model which supports a structured approach to designing digital hardware of medium complexity.


1.3.1. FSM-D Structure
The FSM-D model is composed of two basic blocks, the finite state machine (FSM) and the data-path. These two blocks work in close collaboration to execute the task or algorithm assigned to the FSM-D circuit. Finite state machines and data-paths are very distinct hardware structures, each with its own characteristics and thus each with its preferred type of task. Typically, a finite state machine handles control tasks very efficiently, whereas a data-path is the preferred candidate for executing calculations and for moving or storing data. The FSM-D model benefits from these very different characteristics by assigning each of its constituent blocks its preferred type of task.




Figure 1: Structure of the FSM-D architecture model.

1.3.2. Architecture Philosophy

In a more philosophical view, we may compare the task assignment to the two basic blocks with the relationship between a manager and his workers. The FSM has good control skills, thus it may be compared to a manager. Like a manager, an FSM is able to control a team by taking decisions, by defining the sequence of sub-tasks and by assigning the sub-tasks to the available team members or resources. The data-path may be compared to the team members or workers. Our team members are perfect workers, that is, they are specialists in their field.

Our FSM-D model establishes a somewhat old-fashioned model of collaboration between the finite state machine (manager) and the data-path (workers). The FSM behaves like a manager who never does any manual work himself but only gives instructions to the workers in the data-path. The FSM (manager) takes decisions based upon the information it receives from the data-path (workers). On the other hand, the data-path (workers) does not execute any sub-task unless the FSM orders it to do so. Once a data-path element is asked to execute a sub-task, it does so immediately and with the speed and precision of a specialist. To summarize, the data-path is born to work and not to make decisions (thinking), whereas the FSM is educated to make decisions and to manage the execution of the whole task.

As in real life, it is not always easy to draw such a strict line between the tasks of a manager and those of a worker. But when designing an FSM-D based circuit, the primary goal is to assign to each block the sub-tasks it can do best. Counting is a good example. In digital design text books, a counter is often the first example designed as a finite state machine, which shows that most sub-tasks can be done by either block. But according to our FSM-D architecture philosophy, counting (or incrementing by 1) is a typical sub-task for the data-path, as it has nothing to do with decision taking or controlling.

1.3.3. Finite State Machine

As described in numerous basic digital design text books, we distinguish between three FSM types.

The Mealy machine is the most general case. The outputs of the Mealy machine depend directly on its inputs as well as on its state.




Figure 2: Structure of the Mealy type finite state machine.

The Moore machine is slightly restricted in functionality compared to the Mealy machine. The Moore machine outputs depend only on the state and no longer directly on the inputs. The Moore machine is the most frequently used FSM implementation type in digital circuits, as a direct dependency of the outputs on the inputs is often not necessary.

Figure 3: Structure of the Moore type finite state machine.

The Medwedjew machine performs exactly the same function as a Moore machine. The only difference is that a Medwedjew machine takes its outputs directly from the state register, so its outputs are hazard-free, which neither of the other two FSM types can guarantee. Hazard-free signals are required for signals used as asynchronous set/reset inputs or as clocks of flip-flops.

Figure 4: Structure of the Medwedjew type finite state machine.
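The difference between the three output styles can be sketched in a few lines of VHDL. The following is only an illustrative sketch and not taken from the original figures: the entity name fsm_output_styles, the input go, the two-bit state encoding and the three output names are all assumptions made for this example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical four-state example; all names are illustrative only.
entity fsm_output_styles is
  port (
    clk, rst : in  std_logic;  -- rising-edge clock, active-low asynchronous reset
    go       : in  std_logic;  -- the only FSM input
    y_mealy  : out std_logic;  -- depends on the state and on the input
    y_moore  : out std_logic;  -- decoded from the state only
    y_medv   : out std_logic   -- taken directly from a state flip-flop
  );
end entity fsm_output_styles;

architecture rtl of fsm_output_styles is
  signal state : std_logic_vector(1 downto 0);
begin
  -- State register with transition logic: advance whenever go is active.
  process (clk, rst)
  begin
    if rst = '0' then
      state <= "00";
    elsif rising_edge(clk) then
      if go = '1' then
        case state is
          when "00"   => state <= "01";
          when "01"   => state <= "11";
          when "11"   => state <= "10";
          when others => state <= "00";
        end case;
      end if;
    end if;
  end process;

  -- Mealy: combinational function of state and input.
  y_mealy <= '1' when (state = "11" and go = '1') else '0';

  -- Moore: combinational function of the state only.
  y_moore <= '1' when (state = "11") else '0';

  -- Medwedjew: a state bit is the output, with no logic in between.
  y_medv <= state(1);
end architecture rtl;
```

Only y_medv is driven by a flip-flop output without any decoding logic, which is why it is the only one of the three that is hazard-free by construction.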

1.3.4. Data-Path Structure

A typical data-path consists of three types of basic elements:

- buses, multiplexers, de-multiplexers
- functional units, like comparators, adders, barrel shifters, etc.
- memory elements, like flip-flops, registers, register-files, etc.

The art of data-path design is to find an optimal data-path structure that performs the desired functionality. Besides the functionality, the design goal is either to use as few data-path elements as possible or to obtain a data-path structure that is as fast as possible. Depending on the designer's needs, area and/or speed optimization should be achieved.

Figure 5: Typical data-path elements: bus elements (top), functional unit (bottom left), memory element (bottom right).




Memory elements used in the data-path are slightly different from the ones used in the control-path. Besides the clock and possibly asynchronous set/reset control inputs, the data-path memory elements have an additional so-called enable input. This enable input gives the finite state machine the possibility to control the data-path's memory behavior. Normally a register stores the new input value at every active clock edge. With an enable input, the register stores a new value only at an active clock edge at which the enable signal is active, and otherwise keeps its old value. This gives the control-path full control over the data-path. Figure 6 illustrates the behavior of a memory element with an additional enable input.

Figure 6: Memory elements in the data-path have an additional enable input signal in order to be fully controllable by the FSM.
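A register with an enable input can be sketched in the RTL style as follows. This is a minimal sketch under assumptions: the entity name reg_en, the generic WIDTH and the port names ena, d and q are chosen for illustration and do not appear in the original figures. Clock edge and reset level follow the conventions used later in the tutorial (rising edge, active-low asynchronous reset).

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical data-path register with enable; names are illustrative.
entity reg_en is
  generic (WIDTH : positive := 5);
  port (
    clk : in  std_logic;                           -- active on rising edge
    rst : in  std_logic;                           -- asynchronous reset, active low
    ena : in  std_logic;                           -- enable, active high (FSM output)
    d   : in  std_logic_vector(WIDTH-1 downto 0);
    q   : out std_logic_vector(WIDTH-1 downto 0)
  );
end entity reg_en;

architecture rtl of reg_en is
begin
  process (clk, rst)
  begin
    if rst = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      -- Without an active enable the register keeps its old value.
      if ena = '1' then
        q <= d;
      end if;
    end if;
  end process;
end architecture rtl;
```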


FSM-D Design Procedure

The FSM-D architecture model supports a structured design approach. Strictly following such a structured approach largely simplifies a design and also speeds up the design procedure significantly [1]. Iterative design loops can be avoided if each design step is carefully executed and documented. The following nine design steps for an FSM-D architecture model have proven to be very successful. A key element in the design procedure is the interface definition at the various design steps.

Step 1: Definition of the algorithm
Step 2: FSM-D interface definition
Step 3: Data-path design
Step 4: Data-path interface definition
Step 5: FSM interface definition
Step 6: FSM state definition
Step 7: FSM design
Step 8: VHDL coding
Step 9: Test-bench design and simulation

1.4.1. Design Step 1: Defining the Algorithm


To discuss the design steps in detail, a tutorial design example is introduced. Its only purpose is to serve as a vehicle to demonstrate and validate the design steps on a real but small design. The question of which sub-tasks should be implemented by which of the two blocks (FSM or data-path) can then be discussed on a concrete example.

The algorithm to be implemented with the FSM-D architectural model is that of a black jack player, that is, the digital circuit should be able to play the black jack game. The goal of the game is to get as close as possible to 21 points by asking for cards and adding up their points. If the player exceeds 21 points, he has lost the game. The player can ask for as many cards as he wants. The cards have the following points: jack (10), queen (10), king (10), 2, 3, 4, 5, 6, 7, 8, 9, 10 and Ace (11). If the player has an Ace (11 points) in his card pile, he can choose whether to count the Ace as 11 points or as just 1 point.

We have to tell our digital black jack player how much risk he should take. He should ask for new cards as long as his summed-up points do not exceed 16. He should initially count 11 points for an Ace. If he then exceeds 21 points, he gets a second chance by replacing the 11 points for the Ace with 1 point.

[1] Structured design approaches definitely have their advantages. On the other hand, the designer has to be aware that a structured approach may limit the solution space. A structured approach guides the designer along an existing road and thus normally prevents him from leaving the road and discovering or inventing new solutions.

Figure 7: The black jack game.

1.4.2. Design Step 2: FSM-D interface

The goal of design step 2 is to define the interface of the overall circuit. This is not always an easy task, as not all input/output signals of the digital circuit are evident at such an early design step. Nevertheless it is important to define the interface in every detail in order to prevent tedious iterative design loops. Defining the interface is not limited to fixing signals and their names but also includes the communication protocol of the interface itself.

Tutorial Example: In our black jack tutorial example the interface may be defined using the signals listed below. Clock signals and control signals need to be defined together with their sensitive edge (for example: rising edge) and their active state (for example: active low), respectively. Figure 8 illustrates the interface graphically.

- Clock signal, active on rising edge: clk
- Asynchronous reset signal, which can also be used to start the circuit, active low: start
- 4 bit data bus input for the card points: cardValue(3:0)
- 5 bit data bus output for the resulting card points: score(4:0)
- Handshake request and acknowledge signals for the data bus, active low: newCard and cardReady
- Status information signals to inform if the game is lost or finished, active high: lost and finished




Figure 8: FSM-D architecture interface definition step 2.
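The interface defined in step 2 maps directly onto a VHDL entity declaration. The following is a sketch, not the author's listing. The signal names and widths are taken from the list above. The port directions of the handshake signals are an assumption (newCard as the request issued by the player, cardReady as the acknowledge coming from outside), and the comments repeat the polarities stated in the text.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Sketch of the top-level FSM-D interface; port directions of the
-- handshake signals are assumed, see the lead-in text.
entity BlackJack is
  port (
    clk       : in  std_logic;                     -- active on rising edge
    start     : in  std_logic;                     -- asynchronous reset/start, active low
    cardValue : in  std_logic_vector(3 downto 0);  -- points of the offered card
    cardReady : in  std_logic;                     -- handshake acknowledge, active low
    newCard   : out std_logic;                     -- handshake request, active low
    score     : out std_logic_vector(4 downto 0);  -- resulting card points
    lost      : out std_logic;                     -- active high
    finished  : out std_logic                      -- active high
  );
end entity BlackJack;
```

Writing the polarity and the sensitive edge as comments next to each port keeps the protocol information of step 2 together with the signal names.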

1.4.3. Design Step 3: Data-Path Design

The data-path has to be able to execute all functional operations needed for the algorithm to be designed. A clear separation between the tasks to be implemented in the control-path and those in the data-path has to be made, as described earlier using the manager/worker analogy. Specifically, no control tasks should be placed in the data-path. In addition, be sure to send only the relevant information to the control-path, just as the manager in our manager/worker model is only interested in condensed and relevant information. When designing the data-path, we have to imagine that some virtual manager (FSM) is telling us what kind of operations to do. All functional operations expected from the data-path have to be implemented.

Tutorial Example: The data-path will be composed of memory elements, buses and multiplexers for storing data and moving it from one place to another. To store small amounts of data we will use registers. Registers in a data-path will always be edge-sensitive flip-flops with asynchronous reset and enable inputs, as introduced in chapter 1.3.4. Besides storing and moving data, a data-path needs to execute functional operations like adding, comparing, etc. Designing the data-path for the BlackJack player step by step might look as follows:

- If the data-path receives a card value, it has to store it in a register. Let's call the register's stored value regLoad and its enable signal enaLoad.
- The control-path needs to know when an Ace is stored. This means that only the information "an Ace is present" has to be delivered to the control-path and not the card value itself. A comparator following the register will do this job perfectly.

Figure 9: Designing the data-path: loading and comparing a card value.

- The BlackJack player will ask for several cards and sum them up. An adder followed by a register to store the result is necessary. Let's call the summed and stored result regAdd and its enable signal enaAdd.
- To accumulate the card values, a feedback path from the result register to one of the adder's inputs is needed.




Figure 10: Designing the data-path: accumulating the card values.

- To decide when to stop asking for cards, the control-path does not need to know the summed card values themselves but only whether certain value boundaries have been crossed. The data-path thus needs to provide the information whether more than 16 points have been accumulated (cmp16) or whether more than 21 points have been accumulated (cmp21).
- As soon as enough points are accumulated, the BlackJack player will show his score, thus an output register is necessary to block or pass the score to the outside world.

Figure 11: Designing the data-path: comparing the accumulated card values and showing end result.

- The data-path sometimes has to change an Ace (11 points) to 1 point. This can be done by subtracting 10 points from the accumulated card values. Instead of subtracting 10 points, we can also add -10 points using the already available adder. As the adder has no third input for adding -10 points, a multiplexer can help. By selecting the multiplexer's path, either the cardValue or -10 can be loaded into the first register.

Figure 12: Designing the data-path: a multiplexer at the input is used to subtract the constant 10. This figure shows the final data-path for the BlackJack player.




1.4.4. Design Step 4: Data-Path Interface Definition

The goal of design step 4 is to define the interface of the data-path block. This is quite an easy task, as all inputs and outputs of the data-path have already been introduced in design step 3. Nevertheless it is important to define the interface in every detail in order to prevent tedious iterative design loops. Defining the interface is not limited to fixing signals and their names but also includes the communication protocol of the interface itself as well as the active state of the signals.

Tutorial Example: In our BlackJack tutorial example the data-path interface is defined using the following signals. Clock signals and control signals need to be defined together with their sensitive edge (for example: rising edge) and their active state (for example: active low), respectively. Keep the clock edge sensitivity and the active reset level identical to those chosen in design step 2, where the global FSM-D interface was defined. Figure 13 illustrates the interface graphically.

Figure 13: Data-path interface definition.

- Clock signal, active on rising edge: clk
- Asynchronous reset signal, active low: rst
- 4 bit data bus input for the card points: cardValue(3:0)
- 5 bit data bus output for the score points: score(4:0)
- Information signals for card value comparison, active high: cmp11 (1 if Ace present), cmp16 (1 if more than 16 points), cmp21 (1 if more than 21 points)
- Multiplexer input selection signal: sel (0 for selection of -10, 1 for selection of cardValue)
- Data-path register enable signals, active high: enaLoad, enaAdd, enaScore
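With the interface fixed, the data-path developed in step 3 can be sketched as one entity. This is one possible reading of Figures 9 to 12, not the author's listing. The entity name BlackJackDataPath, the internal signal muxOut, the constant MINUS_TEN, the use of IEEE.numeric_std and the 5-bit internal width are assumptions made for this sketch. With 5 bits, -10 is represented modulo 32 as 22, so adding it to a sum between 22 and 31 yields the sum minus 10.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Sketch of the BlackJack data-path; internal names and widths are assumed.
entity BlackJackDataPath is
  port (
    clk       : in  std_logic;                     -- active on rising edge
    rst       : in  std_logic;                     -- asynchronous reset, active low
    cardValue : in  std_logic_vector(3 downto 0);
    sel       : in  std_logic;                     -- '0': -10, '1': cardValue
    enaLoad   : in  std_logic;
    enaAdd    : in  std_logic;
    enaScore  : in  std_logic;
    cmp11     : out std_logic;                     -- '1' if an Ace is loaded
    cmp16     : out std_logic;                     -- '1' if more than 16 points
    cmp21     : out std_logic;                     -- '1' if more than 21 points
    score     : out std_logic_vector(4 downto 0)
  );
end entity BlackJackDataPath;

architecture rtl of BlackJackDataPath is
  -- -10 in 5-bit two's complement, i.e. 22 when read as unsigned.
  constant MINUS_TEN : unsigned(4 downto 0) := to_unsigned(22, 5);
  signal muxOut  : unsigned(4 downto 0);
  signal regLoad : unsigned(4 downto 0);
  signal regAdd  : unsigned(4 downto 0);
begin
  -- Input multiplexer in front of the first register.
  muxOut <= resize(unsigned(cardValue), 5) when sel = '1' else MINUS_TEN;

  -- Comparators: only condensed information goes to the control-path.
  cmp11 <= '1' when regLoad = 11 else '0';
  cmp16 <= '1' when regAdd > 16 else '0';
  cmp21 <= '1' when regAdd > 21 else '0';

  -- All three registers share clock and reset and have their own enable.
  process (clk, rst)
  begin
    if rst = '0' then
      regLoad <= (others => '0');
      regAdd  <= (others => '0');
      score   <= (others => '0');
    elsif rising_edge(clk) then
      if enaLoad = '1' then
        regLoad <= muxOut;
      end if;
      if enaAdd = '1' then
        regAdd <= regAdd + regLoad;  -- 5-bit addition, wraps modulo 32
      end if;
      if enaScore = '1' then
        score <= std_logic_vector(regAdd);
      end if;
    end if;
  end process;
end architecture rtl;
```

The sketch contains no decisions at all. Whether and when -10 is added is entirely up to the FSM, which sees only cmp11, cmp16 and cmp21.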

1.4.5. Design Step 5: FSM Interface Definition

The goal of design step 5 is to define the interface of the control-path block. The necessary information can be retrieved from design step 2 and design step 4, where the interface of the overall design and the interface of the data-path are defined. All signals not yet connected define the interface of the finite state machine. For the FSM we therefore do not have to introduce and describe new signals, but only have to group the already existing signals into FSM input and FSM output signals.

Tutorial Example: In our BlackJack tutorial example the FSM interface is defined using the input and output signals shown in Figure 14. Clock signals and asynchronous reset signals are treated separately and thus do not appear in our input and output signal lists. Keep the clock edge sensitivity and the active reset level identical to those chosen in design step 2, where the global FSM-D interface was defined.




Figure 14: FSM input and output signals.

All interfaces are now defined. Starting with the overall interface in design step 2 and continuing with the interfaces of the two sub-blocks namely data-path in design step 4 and FSM in design step 5, we now have the completed hierarchical FSM-D architecture design (see Figure 15).

Figure 15: Hierarchy of FSM-D design illustrating the three interfaces.

1.4.6. Design Step 6: FSM State Definition

Before starting to design the state diagram, the state itself has to be defined. In a Moore type finite state machine, a state is defined by its unique name and the values of the FSM output signals. It is quite helpful to draw a skeleton state and to copy it dozens of times onto a sheet of paper, ready for designing the state diagram in design step 7.

Tutorial Example: In our BlackJack tutorial example the FSM skeleton state is defined by a placeholder for its name and the names of the seven output signals. Figure 16 illustrates a skeleton state for the BlackJack player.




[Skeleton state: a placeholder for the state name followed by the seven output signals finished, lost, newCard, sel, enaLoad, enaAdd and enaScore, copied many times over the page.]

Figure 16: FSM skeleton state and A4 paper with numerous copied skeleton states.

1.4.7. Design Step 7: FSM Design

In design step 7 the finite state machine is designed. Each state of the FSM orders the data-path to execute a sub-task, which may consist of one or more micro-instructions running in parallel. The unique state name refers to the sub-task; the state output signals control the resources in the data-path block needed to execute it. Arrows connect the states with each other and define the state transitions, which in turn define the sequence of the sub-tasks. The state transitions may have conditions, which are composed of FSM inputs only [2]. Conditional state transitions introduce branches in the control-path.

Figure 17: Timing of FSM-D architectural models using Moore type FSM. The letters in brackets define the signal source, data-path (D) or control-path (FSM).

For a single clock cycle scheme, an FSM-D architectural model follows the timing illustrated in Figure 17. When designing state diagrams, great care has to be taken with the timing of the signals between the FSM and the data-path. Basically we have to distinguish between the following two situations.

In the first situation, an artificial state LoadReg in Figure 17 asks the data-path to load a register (the register is in the data-path) and thus issues a register enable signal (enable is an FSM output). The new value in the register will be available in the next state, CheckVal. A signal inform, generated in the data-path to inform the FSM about the result of some operation on the register's contents, will therefore not be available in the LoadReg state but only in the next state CheckVal. Thus the finite state machine can use the inform signal as a transition condition only when leaving the CheckVal state.

A different situation is illustrated in state OpenData. In OpenData the FSM uses the signal select (select is an FSM output) to ask the data-path to put some data on a data bus. This is done immediately, as long as only combinatorial logic such as multiplexers is involved in the operation. In this case the data on the data bus may already be used when leaving the state OpenData.

Tutorial Example: Starting from the skeleton state defined in design step 6, with its placeholder for the state name and its seven output signal names, the state diagram is drawn. Figure 18 shows part of the resulting FSM design. It is up to the reader of this text to design the complete state diagram of the BlackJack player.

[2] In contrast to Moore type finite state machines, the output values in Mealy type machines are not only defined by the states but also by the state transitions. To show this in Mealy machine diagrams, input conditions as well as output values need to be attached to the transition arrows.

Figure 18: Part of an FSM design.

1.4.8. Design Step 8: FSM-D Coding

Coding digital logic with VHDL or another hardware description language cannot be compared to software programming, as the modeled hardware elements basically run in parallel. In addition, a VHDL model does not represent the underlying circuit's behavior exactly, since it introduces a level of abstraction (timing, signal races and thus spikes are not modeled). Synthesizing hardware from VHDL is therefore much more successful if the designer uses proven coding styles. Very common and easy to use is the register-transfer level (RTL) coding style. For both blocks, data-path and control-path, skeleton VHDL code in the RTL coding style is presented.

Tutorial Example: Data-Path: As the data-path is quite small, the whole data-path is coded in one process plus some conditional signal assignments for the combinatorial logic (comparators). All registered signals using the same clock signal and the same asynchronous reset can be packed into one single process. Be aware that all registers in the data-path need the enable control inputs described in chapter 1.3.4. The enabled registers can easily be identified in the VHDL code below. Figure 19 illustrates the BlackJack player data-path schematic, the VHDL process and the conditional signal assignments.

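    -- Comparators: combinatorial logic coded as conditional signal assignments.
    -- Assumes regLoad and regAdd are 5-bit unsigned signals (IEEE.numeric_std).
    cmp11 <= '1' when (regLoad = "01011") else '0';  -- Ace (11 points) loaded
    cmp16 <= '1' when (regAdd > "10000") else '0';   -- more than 16 points
    cmp21 <= '1' when (regAdd > "10101") else '0';   -- more than 21 points

    -- All data-path registers share the clock and the asynchronous reset.
    process(clk, rst)
    begin
      if (rst = '0') then
        regLoad <= "00000";
        regAdd  <= "00000";
        score   <= "00000";
      elsif (clk'event and clk = '1') then
        -- Enabled register: regAdd changes only while enaAdd is active.
        if (enaAdd = '1') then
          regAdd <= regAdd + regLoad;  -- accumulate the loaded card value
        end if;
        -- The registers regLoad (enaLoad) and score (enaScore) follow the same pattern.
      end if;
    end process;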

Figure 19: Data-Path and its VHDL coding.

FSM: The Moore type finite state machine is coded in two VHDL processes. A first process describes the transitions between the states. In a second process the output values are assigned for each state. In both processes a large case statement is used for the numerous states (see Figure 20).

1.4.9. Design Step 9: Test-Bench Design and Simulation

The last step in the design procedure is the design of a test-bench to verify the correct functionality of the circuit. VHDL simulators normally offer some interactive stimuli capture feature. But entering stimuli interactively is a tedious task, especially if several iterations between design correction and simulation are necessary. Writing re-usable test-benches in VHDL is the state of the art. In the VLSI design flow such a VHDL test-bench can be re-used several times: at architectural simulation, at logic simulation, after placing and routing and finally on the test machines for verifying the real chips.

A test-bench can be seen as a fully equipped and autonomous laboratory where the chips are tested. A laboratory uses three elements to fulfill the test task: the device-under-test (DUT), the signal and pattern generators, and the logic analyzers or oscilloscopes. Our VHDL test-bench must therefore also be based on these three elements. Figure 22 illustrates the structure of a test-bench with a stimuli generator, the device-under-test and the response analyzer.

In a real test-bench, and later on with the running chip, it has to be guaranteed that the setup and hold times of the registers are respected. It is therefore usual to divide a test sequence into multiple test-cycles. Each test-cycle has the same basic structure: inputs to the DUT are applied only at the beginning of a test-cycle, and the response of the DUT is always captured at a given offset from the start of the test-cycle. The clock is activated and negated at identical relative times within each test-cycle. In our test-bench, a separate signal sync is introduced to signal the start of a test-cycle (see Figure 21).

[Block diagram: the input i[k] feeds the transition logic, which computes the next state s[k+1] for the state register; the output logic derives the output o[k] from the state.]

    -- Process 1: state register and state transitions.
    process(clk, rst)
    begin
      if (rst = '0') then
        state <= StartState;
      elsif (clk'event and clk = '1') then
        case state is
          when StartState =>
            state <= CallCard;
          when CallCard =>
            if (cardReady = '1') then
              state <= LoadCard;
            end if;
          when others =>
            state <= IllegalState;  -- used for VHDL analysis
            -- null;                -- used for synthesis
        end case;
      end if;
    end process;

    -- Process 2: Moore outputs, assigned once per state.
    process(state)
    begin
      case state is
        when StartState => outvec <= "000--00";
        when CallCard   => outvec <= "001--00";
        when others     => outvec <= "UUUUUUU";  -- used for VHDL analysis
                           -- null;               -- used for synthesis
      end case;
    end process;

    -- The output vector is split into the individual FSM output signals.
    finished <= outvec(6);
    lost     <= outvec(5);
    newCard  <= outvec(4);
    ...

Figure 20: Moore type finite-state machine and its VHDL coding.

Figure 21: Test-cycle. Stimuli are always applied at the beginning of a test-cycle, response is always captured after the active clock edge at the same offset time relative to the beginning of a test-cycle.
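The test-cycle idea can be sketched with a tiny, self-contained test-bench. This is an illustration only and not part of the BlackJack test-bench: the stand-in DUT (a single flip-flop), the names tb_test_cycle, d and q, and the capture offset of 90 ns are assumptions. The clock and sync timing follows the 100 ns cycle used in the tutorial. Unlike the tutorial skeleton, the sketch also checks the response with assert statements.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity tb_test_cycle is
end entity tb_test_cycle;

architecture sketch of tb_test_cycle is
  constant CYCLE  : time := 100 ns;  -- length of one test-cycle
  constant SAMPLE : time := 90 ns;   -- assumed capture offset within a cycle
  signal sync, clock : std_logic := '0';
  signal d, q        : std_logic := '0';
begin
  -- Stand-in DUT (hypothetical): a single rising-edge flip-flop.
  DUT: process (clock)
  begin
    if rising_edge(clock) then
      q <= d;
    end if;
  end process;

  -- Every test-cycle: sync pulse at 0 ns, clock high from 20 ns to 70 ns.
  -- This process repeats forever; the simulator run time ends the simulation.
  ClockGenerator: process
  begin
    clock <= '0', '1' after 20 ns, '0' after 70 ns;
    sync  <= '1', '0' after 1 ns;
    wait for CYCLE;
  end process;

  -- Stimuli are applied at the start of a cycle, the response is
  -- captured at the same offset after the active clock edge.
  StimuliAndCheck: process
  begin
    wait until rising_edge(sync);   -- test cycle 1
    d <= '1';
    wait for SAMPLE;
    assert q = '1' report "cycle 1: expected q = '1'" severity error;

    wait until rising_edge(sync);   -- test cycle 2
    d <= '0';
    wait for SAMPLE;
    assert q = '0' report "cycle 2: expected q = '0'" severity error;

    report "test sequence finished" severity note;
    wait;
  end process;
end architecture sketch;
```

Because stimuli change 20 ns before the rising clock edge and the response is read 70 ns after it, setup and hold times are respected with a wide margin in every cycle.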

Tutorial Example: The VHDL code of a test-bench has an entity without any input or output signals, like a laboratory where no signals are wired through the lab's door. The structure of the VHDL test-bench for our BlackJack player may look like the skeleton shown in Figure 22. Be aware that automatic response verification is not implemented in our sample test-bench. As the test sequence is quite short, it is the designer's duty to verify the correct DUT response visually.




    library IEEE;
    use IEEE.std_logic_1164.all;

    -- A test-bench entity has no ports.
    entity tb_BlackJack is
    end tb_BlackJack;

    architecture skeleton of tb_BlackJack is
      signal sync, clock, start, newCard : std_logic;
      signal cardReady, finished, lost   : std_logic;
      signal cardValue : std_logic_vector(3 downto 0);
      signal score     : std_logic_vector(4 downto 0);

      component BlackJack
        port(clock, start, cardReady : in  std_logic;
             newCard, finished, lost : out std_logic;
             cardValue : in  std_logic_vector(3 downto 0);
             score     : out std_logic_vector(4 downto 0));
      end component;
    begin
      -- Device-under-test, connected by position.
      DUT: BlackJack
        port map (clock, start, cardReady,
                  newCard, finished, lost, cardValue, score);

      -- One test-cycle lasts 100 ns; sync marks its start.
      ClockGenerator: process
      begin
        clock <= '0', '1' after 20 ns, '0' after 70 ns;
        sync  <= '1', '0' after 1 ns;
        wait for 100 ns;
      end process;

      TestPatternGenerator: process
      begin
        start <= '0';            -- test cycle 1
        wait for 100 ns;
        start <= '1';            -- test cycle 2
        wait until (sync'event and sync = '1' and newCard = '1');
        cardValue <= "1010";     -- test cycle: example card worth 10 points
        cardReady <= '1';
        wait until (sync'event and sync = '1' and newCard = '0');
        cardValue <= "UUUU";     -- test cycle: bus value undefined again
        cardReady <= '0';
        wait until (sync'event and sync = '1' and newCard = '1');
        ...
        wait;
      end process;
    end skeleton;

    configuration tb_cfg of tb_BlackJack is
      for skeleton
      end for;
    end configuration tb_cfg;

Figure 22: VHDL test-bench. A test-bench is composed of three elements: device-under-test, pattern and signal generator, logic analyzer.




A simulation of the BlackJack player based on the test-bench methodology is shown in Figure 23. Visualizing internal VHDL signals, such as the data-path registers and especially the state of the finite state machine, is very helpful.

Figure 23: Simulation of the BlackJack player using a test-bench.


Errors and Pitfalls

When designing an FSM-D architectural model for an algorithm, one potentially disastrous pitfall should always be considered. The FSM-D model is in principle a synchronous design. Nevertheless, the external inputs are not synchronous to our clock and thus may change their values at any time. If an FSM has an asynchronous input, the state machine risks jumping to a wrong or non-existing state. This situation arises when an asynchronous input changes its value shortly before an active clock edge at the state register, which might provoke hazards at the state register inputs. Such situations certainly do not arise at every change of the asynchronous input. Let's imagine that a hazard pulse of 0.1 ns can be captured by the state register and that the clock frequency is 100 MHz, that is, the clock period is 10 ns. The critical window then covers 1 % of the clock period, so on average every 100th change of the asynchronous input signal could be captured wrongly. An input that changes, say, 10 000 times per second would then cause about 100 errors per second. The solution to this problem is to synchronize such inputs using a simple flip-flop running on the system clock (see Figure 24). This technique is called input synchronization.

Figure 24: Synchronizing external, asynchronous inputs to FSM-D clock.
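Input synchronization as described above can be sketched as a single flip-flop in front of the FSM input. The entity name input_sync and the port names async_in and sync_out are assumptions made for this sketch. Clock edge and reset level follow the conventions of the tutorial.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical input synchronizer; names are illustrative only.
entity input_sync is
  port (
    clk      : in  std_logic;  -- system clock, active on rising edge
    rst      : in  std_logic;  -- asynchronous reset, active low
    async_in : in  std_logic;  -- external input, may change at any time
    sync_out : out std_logic   -- changes only after an active clock edge
  );
end entity input_sync;

architecture rtl of input_sync is
begin
  process (clk, rst)
  begin
    if rst = '0' then
      sync_out <= '0';
    elsif rising_edge(clk) then
      -- The FSM only ever sees the sampled copy of the input.
      sync_out <= async_in;
    end if;
  end process;
end architecture rtl;
```

All state flip-flops now see the same sampled value, so the transition logic can no longer receive an input that changes just before the clock edge. In practice two such flip-flops are often placed in series to further reduce the chance that a metastable value reaches the FSM.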





Summary and Conclusion

The finite state machine / data-path (FSM-D) architectural model supports the structured design approach with a general purpose model. Using the nine design steps described, an algorithm can be implemented efficiently in digital hardware. It is crucial to have a well designed separation of tasks between the two elements: controlling for the finite state machine and execution for the data-path. Nevertheless, there is not always a crisp border between the tasks of the two hardware units. Consider again the BlackJack player tutorial example: the FSM needs to remember whether an Ace is present. If a second Ace arrives later in the game, who should remember or count the Aces? Following our manager/worker analogy, we might decide that counting Aces is an execution-like sub-task for the specialized workers rather than a decision-like sub-task for the manager. Nevertheless, counting just up to 2 might simply be implemented in the FSM instead of introducing a counter, which would need new control signals such as an enable for counting up and an enable for counting down.

