
CHAPTER 1

INTRODUCTION

1.1 BACKGROUND
The family of IEEE 802.11 wireless LAN standards was designed to extend 802.3 (wired Ethernet). The first 802.11 specification was introduced in 1997. It supports transmission rates of 1 and 2 Mbps in the 2.4 GHz band with either frequency hopping spread spectrum (FHSS) or direct sequence spread spectrum (DSSS). 802.11a and 802.11b were released in September 1999. 802.11a supports transmission rates of 6 to 54 Mbps with orthogonal frequency-division multiplexing (OFDM) at 5 GHz, and 802.11b supports 1 to 11 Mbps with DSSS at 2.4 GHz. 802.11g was ratified in June 2003. It operates at 2.4 GHz, the same band as 802.11b, but its maximum transmission rate is 54 Mbps with OFDM, as in 802.11a.

1.2 MOTIVATION

IEEE 802.11 is a standard protocol proposed for wireless networks. It includes a physical layer (PHY) and a medium access control layer (MAC), which provide a variety of functions that support the operation of a wireless LAN.

Although 802.11a is faster than 802.11b, it has more disadvantages, such as a shorter coverage range and a higher cost, and the two are not compatible with each other. 802.11g extends 802.11b and is backward compatible with it. As a result, wireless networks in public areas currently still use the 802.11b specification. This thesis therefore focuses on the 802.11b specification to design the wireless MAC functions. Wireless MAC designs based on the 802.11a specification have been presented in many papers. There are several implementation methods, such as CPU-based, cell-based and FPGA. A CPU-based implementation has the advantage of design flexibility, but it is complex, slow and costly. A cell-based implementation is simple to design, fast and easy to simulate, but it is inflexible and needs a long development time [1]. An FPGA combines the advantages of the CPU-based and cell-based approaches, such as high processing speed, design flexibility and short development time. It is currently a popular implementation method and is also suitable for MAC design and verification. In this thesis, we propose a new architecture implemented in an FPGA to perform the MAC functions [2] [3] [4].

1.3 THESIS ORGANIZATION


There are seven chapters in this thesis. Chapter 2 describes the basic structure of 802.11. Chapter 3 describes the MAC functions in detail, such as the MAC frame format, the MAC architecture (DCF and PCF) and CSMA/CA. Chapter 4 describes the system architecture and the function of each module in the proposed design. Chapter 5 introduces FPGA design. Chapter 6 describes the simulation results from ModelSim and the verification results on a logic analyzer. Chapter 7 presents the conclusions and future prospects, and the list of references is at the end of the thesis.

CHAPTER 2
IEEE 802.11 OVERVIEW

2.1 SUMMARY OF 802.11 STANDARDS


The IEEE 802 family consists of a series of local area network (LAN) technology specifications, and 802.11 is one of its members. The relationship between the members of the 802 family and the OSI model is shown in Figure 2.1 [5].
The services and protocols specified in IEEE 802 map to the lower two layers (data link and physical) of the OSI model. IEEE 802 splits the OSI data link layer into two sub-layers: the LLC (logical link control) sub-layer and the MAC (media access control) sub-layer. The LLC sub-layer adopts the 802.2 protocol used in wired networks, while the MAC sub-layer depends tightly on the physical layer. The 802.11 standard therefore defines only the MAC sub-layer and the PHY; the MAC controls access to the medium and passes data between the LLC sub-layer and the PHY [5].

Figure 2.1: IEEE 802 family and its relation to the OSI model

The 802.11 standard focuses on the physical layer (PHY) and the medium access control layer (MAC). The original design transmitted data at 1 Mbps and 2 Mbps over the ISM frequency band. 802.11b pushed the data rate up to 5.5 Mbps and 11 Mbps, and 802.11g raised it to 54 Mbps. A comparison is shown in Table 2.1.

Table 2.1: Comparison of 802.11 Standards

2.2 802.11 NETWORK TYPE

2.2.1 Independent Basic Service Set

The basic service set (BSS) is the basic building block of an 802.11 LAN. When the stations of a BSS recognize each other and are connected via the wireless medium in a peer-to-peer fashion, the network topology is referred to as an independent basic service set (IBSS) or an ad hoc network [6]. With this topology, computers can be connected conveniently in any place, and the network is likely to be needed only temporarily. The IBSS topology is shown in Figure 2.2.

Figure 2.2: Independent basic service set (IBSS)

2.2.2 Infrastructure Basic Service Set

An infrastructure basic service set includes a BSS with a component called an access point (AP), which is connected to a wired network and also provides a local relay function for the BSS [6]. All computers communicate directly with the AP, and all frames between stations are relayed by the access point. Mobile devices can use resources on the wired network transparently through the AP. The AP may also provide a connection to a distribution system, as shown in Figure 2.3.

Figure 2.3: Infrastructure basic service set

2.2.3 Extended Service Set

An extended service set (ESS) is a set of one or more interconnected basic service sets (BSSs) and integrated local area networks (LANs) that appears as a single BSS to the logical link control layer at any station associated with one of those BSSs [7]. The APs perform this communication via the distribution system. All the access points in an ESS are given the same service set identifier (SSID), which serves as a network name for the users [5]. An ESS is shown in Figure 2.4.

Figure 2.4: Extended service set

2.2.4 Distribution System (DS)

The distribution system is used to forward frames to their destination. In most commercial products, the distribution system is implemented as a combination of a bridging engine and a distribution system medium, which is the backbone network used to relay frames between access points [5].

Figure 2.5: Distribution System

CHAPTER 3
802.11 MAC LAYER

3.1 MAC ARCHITECTURE


The architecture of the MAC sub-layer specifies two different mechanisms: the distributed coordination function (DCF), which uses a contention scheme, and the point coordination function (PCF), which uses a contention-free polling scheme to support real-time applications. At present, DCF is the dominant MAC mechanism implemented by IEEE 802.11 compliant products. The MAC architecture is shown in Figure 3.1; the PCF is built on top of the DCF [7].

Figure 3.1: MAC ARCHITECTURE

3.1.1 DCF (Distributed Coordination Function)

The DCF is the basis of the standard carrier sense multiple access with collision avoidance (CSMA/CA) access mechanism [8]. It works on a listen-before-talk scheme: every station must first sense that the medium is idle before transmitting. Support for the contention-based DCF makes IEEE 802.11 equipment a popular choice for wireless ad hoc networks. In this thesis, we consider only the implementation of the DCF on an ad hoc network topology.

3.1.1.1 CSMA/CA Mechanism

Carrier sensing is used to determine whether the medium is available or busy when a station tries to transmit data. As in 802.3, the MAC senses the physical medium, but in 802.11 the medium status depends on both the physical (PHY) and the virtual carrier sense mechanisms. Virtual carrier sensing is provided by the NAV (Network Allocation Vector). The NAV is a timer that indicates the amount of time, in microseconds, for which the medium will be reserved. 802.11 frames carry a Duration field for the NAV, which a station sets to the time for which it expects to use the medium. While the NAV is nonzero, the medium is considered busy; when the NAV counts down to zero, the medium is considered available. Figure 3.2 shows the NAV for virtual carrier sensing.
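The combination of physical and virtual carrier sensing can be sketched in Python as follows. This is a minimal illustration rather than part of the proposed design: the class name `Nav` and its methods are hypothetical, and the rule of keeping the longer reservation when a new Duration value arrives is an assumption that the text above does not state.

```python
class Nav:
    """Network Allocation Vector modelled as a countdown in microseconds."""

    def __init__(self) -> None:
        self.remaining_us = 0

    def update(self, duration_us: int) -> None:
        # Assumption: when reservations overlap, keep the longer one.
        self.remaining_us = max(self.remaining_us, duration_us)

    def tick(self, elapsed_us: int) -> None:
        # Count down toward zero as time passes.
        self.remaining_us = max(0, self.remaining_us - elapsed_us)

    def medium_busy(self, physical_carrier_sensed: bool) -> bool:
        # The medium is busy if either mechanism reports it as busy.
        return physical_carrier_sensed or self.remaining_us > 0


nav = Nav()
nav.update(300)                 # Duration value heard in a received frame
nav.tick(100)
print(nav.medium_busy(False))   # NAV still nonzero
nav.tick(200)
print(nav.medium_busy(False))   # NAV has reached zero
```

A station would consult `medium_busy` before starting any back-off or transmission.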

Figure 3.2: NAV for virtual carrier sensing


3.1.1.2 Inter-frame Space (IFS)

The inter-frame space is the time interval by which successive frames must be separated. 802.11 defines four different inter-frame spaces, which provide multiple priority levels for medium access: SIFS, PIFS, DIFS and EIFS. Figure 3.3 shows the inter-frame space relationships [7].

Figure 3.3: Inter-frame space relationships

3.1.1.2.1 Short IFS (SIFS)

SIFS is the shortest of the inter-frame spaces. It is used to separate transmissions belonging to a single dialogue and applies to the highest-priority transmissions, such as RTS/CTS frames [5]. Once such a transmission starts, the medium becomes busy, so these frames can be transmitted as soon as the SIFS elapses and do not need to wait for the longer intervals.


3.1.1.2.2 PCF inter-frame space (PIFS)

PIFS is used by the access point, acting as point coordinator, to gain access to the medium before other stations during contention-free operation.

PIFS = SIFS + one slot time

3.1.1.2.3 DCF inter-frame space (DIFS)

DIFS is the minimum medium idle time for contention-based services [5]. It is used by stations operating under the DCF to transmit data frames and management frames.

DIFS = SIFS + two slot times

3.1.1.2.4 Extended inter-frame space (EIFS)

EIFS is the longest IFS and is used only when a frame transmission has been received in error [5].
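The two relationships above (PIFS = SIFS + one slot time, DIFS = SIFS + two slot times) can be checked with a short Python sketch. The SIFS value of 10 µs is the 802.11b figure used later in this thesis; the slot time of 20 µs is an assumption that is consistent with the 50 µs DIFS quoted in Chapter 4. The function names are hypothetical.

```python
SIFS_US = 10   # 802.11b SIFS in microseconds, as used in Chapter 4
SLOT_US = 20   # assumed 802.11b slot time in microseconds


def pifs_us(sifs_us: int = SIFS_US, slot_us: int = SLOT_US) -> int:
    """PIFS = SIFS + one slot time."""
    return sifs_us + slot_us


def difs_us(sifs_us: int = SIFS_US, slot_us: int = SLOT_US) -> int:
    """DIFS = SIFS + two slot times."""
    return sifs_us + 2 * slot_us


print(pifs_us(), difs_us())  # 30 50
```

Because SIFS < PIFS < DIFS, a frame that only has to wait for a shorter interval always reaches the medium first, which is how the priority levels arise.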

3.1.1.3 Random Back-off Mechanism

As in Ethernet, a random back-off is used to keep multiple users from beginning transmission at the same time, which reduces the probability of collision. The random back-off time is given by the following formula:

Back-off Time = INT (CW * Random ()) * Slot-time

CW is the contention window. Its values are successive integer powers of 2 minus 1 (for example 31, 63, 127, 255). Each time the retry counter increases, the CW moves to the next power of 2 minus 1.

The contention window is limited by the physical layer: the minimum CW value is 31 slots and the maximum is 1,023 slots.
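The back-off formula and the growth of the contention window can be sketched in Python as follows. This is an illustrative sketch only: the function names are hypothetical, the 20 µs slot time is an assumed 802.11b value, and Random() is taken to be uniform on [0, 1) as in Python's `random.random()`.

```python
import random

SLOT_US = 20                 # assumed 802.11b slot time in microseconds
CW_MIN, CW_MAX = 31, 1023    # contention window limits given in the text


def contention_window(retry_count: int) -> int:
    """CW after retry_count retries: 31, 63, 127, ... capped at 1023."""
    return min((CW_MIN + 1) * 2 ** retry_count - 1, CW_MAX)


def backoff_time_us(retry_count: int, rng: random.Random) -> int:
    """Back-off Time = INT(CW * Random()) * Slot-time."""
    cw = contention_window(retry_count)
    return int(cw * rng.random()) * SLOT_US


print([contention_window(r) for r in range(7)])
# [31, 63, 127, 255, 511, 1023, 1023]
print(backoff_time_us(0, random.Random()))  # a multiple of 20 below 620
```

Passing the random generator in explicitly keeps the sketch reproducible when a seeded `random.Random` is supplied.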

3.1.1.4 RTS and CTS Mechanism

To reduce the overhead imposed on MAC data by collisions on the medium and to improve system throughput, IEEE 802.11 provides the Request To Send and Clear To Send (RTS/CTS) exchange, which the DCF also uses. When the sender transmits an RTS frame, the receiver replies with a CTS frame after a SIFS period. Once RTS and CTS have been exchanged successfully, the sender begins to transmit the data frame. After a SIFS, the receiver replies with an ACK frame and the transmission procedure is complete. The procedure is also shown in Figure 3.2.

3.1.1.4.1 RTS (Request to Send)

The RTS frame is defined in Figure 3.4. The frame subtype in the Frame Control field is set to 1011 to indicate an RTS frame. The RA of the RTS frame is the address of the STA that is the intended recipient of the pending frame; the TA is the address of the STA transmitting the RTS frame.
The duration value is the time, in microseconds, required to transmit the pending data or management frame, plus one CTS frame, one ACK frame and three SIFS intervals (3 * SIFS + CTS + ACK + frame time).

3.1.1.4.2 CTS (Clear to Send)

The CTS frame is defined in Figure 3.5. The frame subtype in the Frame Control field is set to 1100 to indicate a CTS frame. The MAC copies the transmitter address of the RTS frame into the receiver address of the CTS frame.

Figure 3.4: RTS frame format



Figure 3.5: CTS frame format

The duration value is the value obtained from the Duration field of the immediately previous RTS frame, minus the time, in microseconds, required to transmit the CTS frame and its SIFS interval (RTS duration - CTS time - one SIFS) [7].
3.1.1.4.3 ACK (Acknowledgment)

The ACK frame is defined in Figure 3.6. The frame subtype in the Frame Control field is set to 1101 to indicate an ACK frame. The receiver address is copied from the transmitter address of the frame being acknowledged.

Technically, it is copied from the Address 2 field of the immediately previous directed data, management or PS-Poll control frame. If the More Fragments bit in the Frame Control field was set to 0, the duration value is set to 0. If the More Fragments bit was set to 1, the duration value is the value obtained from the Duration field of the immediately previous data or management frame, minus the time, in microseconds, required to transmit the ACK frame and its SIFS interval (fragment duration - ACK time - one SIFS) [7].

Figure 3.6: ACK frame format


3.1.1.5 PHY Frame Format

Because 802.11 uses radio waves as its transmission medium, it needs a more complicated PHY. 802.11 splits the PHY into two components: the PLCP (Physical Layer Convergence Procedure), which maps MAC frames onto the transmission medium, and the PMD (Physical Medium Dependent), which transmits those frames, as shown in Figure 3.7 [5].

In this section, we illustrate only the 802.11b PHY frame format. The IEEE 802.11b PHY is an extension of the original Direct Sequence Spread Spectrum (DSSS) PHY. It operates in the 2.4 GHz ISM band and provides 5.5 and 11 Mbps PHY rates in addition to the 1 and 2 Mbps rates supported by the original DSSS PHY. The 802.11b PHY is also called HR/DS or HR/DSSS. Figure 3.8 shows the PLCP framing specified in 802.11b.

Figure 3.7: PHY component


Figure 3.8: HR/DSSS PLCP framing

To ensure backward compatibility with the installed base of 802.11 direct sequence hardware, the HR/DSSS PHY can transmit and receive at 1.0 Mbps or 2.0 Mbps. Any transmissions at the slower rates must use long headers. Table 3.1 shows the IEEE 802.11b PHY parameters.

Table 3.1: IEEE 802.11b PHY parameters


3.1.2 PCF (Point Coordination Function)

The 802.11 standard includes a coordination function to support real-time services, called the PCF. The PCF is usable only in an infrastructure network configuration. The PCF is a contention-free protocol that enables stations to transmit data frames synchronously, with regular time delays between data frame transmissions [8]. It may be activated to support time-sensitive information such as audio or video.

The PCF in IEEE 802.11 is based on a polling scheme. All transmissions during the contention-free period are separated only by the SIFS [9]. If a polled station does not respond before a PIFS has elapsed, the access point polls the next station. By using the PCF inter-frame space, the access point ensures that it retains access to the medium. Figure 3.9 shows PCF operation, which consists of a contention-free period (CFP) and a contention period (CP).

Figure 3.9: PCF operation



3.2 MAC Frame Formats


To meet the challenges posed by a wireless data link, the MAC was forced to adopt several unique features. All stations shall be able to properly construct frames for transmission and decode frames upon reception. Each frame consists of the following basic components [7]:

a. MAC header: contains the frame control, duration/ID, address (1 to 4) and sequence control information.

b. Frame body: contains information specific to the frame type and has a variable length (0 to 2312 bytes).

c. Frame check sequence (FCS): contains the IEEE 32-bit cyclic redundancy check (CRC), the result of applying the CRC-32 polynomial to the MAC header and frame body.

3.2.1 General frame format

The MAC frame format comprises a set of fields that occur in a fixed order in all frames [7]. Figure 3.10 shows the generic 802.11 MAC frame; fields are transmitted from left to right.

Figure 3.10: Generic 802.11 MAC frame


3.2.2 Frame field

3.2.2.1 Frame control field

The Frame Control field is two bytes long and consists of the following subfields: Protocol Version, Type, Subtype, To DS, From DS, More Fragments, Retry, Power Management, More Data, WEP (Wired Equivalent Privacy) and Order. The subfields of the Frame Control field are illustrated in Figure 3.11.

Figure 3.11: Frame control field

3.2.2.1.1 Protocol version

The Protocol Version field is 2 bits in length and is invariant in size and placement across all revisions of this standard. At present, only one version of the 802.11 MAC has been developed, and the value of the protocol version is 0. All other values are reserved and will appear when the IEEE standardizes changes to the MAC that render it incompatible with the initial specification. A device that receives a frame with a higher revision level than it supports will discard the frame without indication to the sending station or to the LLC.


3.2.2.1.2 Type and subtype fields


The Type field is 2 bits in length, and the Subtype field is 4 bits in length. Together, the Type and Subtype fields identify the type of frame used. To cope with noise and unreliability, a number of management functions are incorporated into the 802.11 MAC [5]. There are three frame types: control, data and management. The type and subtype identifiers of management, control and data frames are shown in Tables 3.2, 3.3 and 3.4 respectively.
Table 3.2: Type and subtype identifiers (Management Frame)

Table 3.3: Type and subtype identifiers (Control Frame)

Table 3.4: Type and subtype identifiers (Data Frame)

Table 3.5: To/From DS combinations in data type frames

To DS | From DS | Meaning
0     | 0       | A data frame sent directly from one STA to another STA within the same IBSS, as well as all management and control type frames.
1     | 0       | Data frame destined for the DS.
0     | 1       | Data frame exiting the DS.
1     | 1       | Wireless distribution system (WDS) frame being distributed from one AP to another AP.

3.2.2.1.3 To DS and From DS fields


The To DS and From DS fields are each 1 bit in length. These bits indicate whether a frame is destined for, or is exiting, the distribution system. The To DS field is set to 1 in data type frames destined for the DS and to 0 in all other frames. The From DS field is set to 1 in data type frames exiting the DS and to 0 in all other frames. The combinations are detailed in Table 3.5.

3.2.2.1.4 More fragments field

The More Fragments field is 1 bit in length. When a higher-level packet has been fragmented by the MAC, the initial fragment and any following non-final fragments set this bit to 1. Large data frames and some management frames may be large enough to require fragmentation; all other frames set this bit to 0.

3.2.2.1.5 Retry field

The Retry field is 1 bit in length. From time to time, frames may be retransmitted. The field is set to 1 in any data or management type frame that is a retransmission of an earlier frame and to 0 in all other frames. It helps the receiving station eliminate duplicate frames.

3.2.2.1.6 Power Management field

The Power Management field is 1 bit in length. This field indicates whether the sending station will be in power saving mode after the successful completion of the current atomic frame exchange sequence. A value of 1 indicates that the station will be in power save mode, and 0 indicates that the station will be in active mode. The AP performs a number of important management functions and is not allowed to save power, so this field is always set to 0 in frames transmitted by an AP.

3.2.2.1.7 More data field

The More Data field is 1 bit in length. To accommodate stations in power saving mode, APs buffer frames that arrive from the distribution system for those stations. The More Data field is valid in directed data or management type frames transmitted by an AP to an STA in power saving mode. A value of 1 indicates that at least one frame is buffered at the AP for the mobile station. The AP can also set this field to 1 in multicast frames to indicate that more multicast frames are buffered at the AP.

3.2.2.1.8 WEP field

The WEP field is 1 bit in length. Wireless transmissions are inherently easier to intercept than transmissions on a fixed network. The field is set to 1 if the Frame Body field contains information that has been processed by the WEP algorithm. It is set to 0 in all other frames.

3.2.2.1.9 Order field

The Order field is 1 bit in length. Frames and fragments can be transmitted in strict order at the cost of additional processing by both the sending and receiving MACs. The field is set to 1 when the content of the data frame was provided to the MAC with a request for strictly ordered service. It is set to 0 in all other frames.
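To illustrate how the subfields described in this section fit into the two-byte Frame Control field, the following Python sketch packs them into a 16-bit value. It assumes that the subfields occupy the field in the order listed, starting from bit 0 (Protocol Version in bits 0-1, Type in bits 2-3, Subtype in bits 4-7, then the eight one-bit flags), and that the control frame type is encoded as 01; neither detail is spelled out in the text above, and the function name is hypothetical.

```python
def pack_frame_control(ftype: int, subtype: int, *, protocol_version: int = 0,
                       to_ds: int = 0, from_ds: int = 0, more_frag: int = 0,
                       retry: int = 0, pwr_mgt: int = 0, more_data: int = 0,
                       wep: int = 0, order: int = 0) -> int:
    """Pack the Frame Control subfields into a 16-bit integer."""
    value = (protocol_version & 0x3)      # bits 0-1
    value |= (ftype & 0x3) << 2           # bits 2-3
    value |= (subtype & 0xF) << 4         # bits 4-7
    flags = [to_ds, from_ds, more_frag, retry, pwr_mgt, more_data, wep, order]
    for position, flag in enumerate(flags):
        value |= (flag & 1) << (8 + position)   # bits 8-15
    return value


TYPE_CONTROL = 0b01      # assumed encoding of the control frame type
SUBTYPE_RTS = 0b1011     # RTS subtype given in section 3.1.1.4.1

print(f"{pack_frame_control(TYPE_CONTROL, SUBTYPE_RTS):#06x}")  # 0x00b4
```

Setting `retry=1` on the same frame would set bit 11 and give 0x08b4.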
3.2.2.2 Duration/ID field

The Duration/ID field is 16 bits in length. It is used in three ways:

a. Duration (setting the NAV): When bit 15 is 0, the Duration/ID field is used to set the NAV, and bits 14-0 represent the remaining duration of the frame exchange. The value is the number of microseconds for which the medium is expected to remain busy for the transmission currently in progress. Stations that receive this frame update their NAV and do not begin a transmission, that is, they block access to the medium, for that additional time [5].

b. Frames transmitted during the CFP (contention-free period): During contention-free periods, bit 15 is set to 1 and all other bits are 0, so the Duration/ID field takes a value of 32,768. This value is interpreted as a NAV, so that any station that did not hear the Beacon announcing the contention-free period updates its NAV and avoids collisions [5].

c. PS-Poll frames: Bits 14 and 15 are both set to 1 in PS-Poll frames. To save battery power, stations may turn their antennas off. When stations wake up periodically, they must ensure that no frames have been lost, so waking stations send a PS-Poll frame that incorporates the association ID (AID), which a station uses to retrieve the frames buffered for it at the AP. The AID may range from 1 to 2,007. Only the PS-Poll frame contains the AID [5].

The encoding of the Duration/ID field is shown in Table 3.6.
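The three uses of the Duration/ID field can be summarized with a small Python decoder. This is a sketch based only on the bit rules stated above; the function name and the returned labels are hypothetical, and any bit pattern not covered by the three cases is simply reported as reserved.

```python
def decode_duration_id(value: int) -> tuple[str, int]:
    """Classify a 16-bit Duration/ID value according to bits 15 and 14."""
    value &= 0xFFFF
    if value & 0x8000 == 0:
        # Bit 15 is 0: bits 14-0 hold the NAV duration in microseconds.
        return ("duration", value)
    if value == 0x8000:
        # Bit 15 is 1 and all other bits are 0: frame sent during the CFP.
        return ("cfp", 32768)
    if value & 0xC000 == 0xC000:
        # Bits 15 and 14 are 1: PS-Poll frame carrying the AID.
        aid = value & 0x3FFF
        if 1 <= aid <= 2007:
            return ("aid", aid)
    return ("reserved", value)


print(decode_duration_id(0x016C))  # ('duration', 364)
print(decode_duration_id(0x8000))  # ('cfp', 32768)
print(decode_duration_id(0xC001))  # ('aid', 1)
```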

Table 3.6: Duration/ID field encoding

3.2.2.3 Address field

There are four address fields in the MAC frame format. The address fields are numbered because different fields are used for different purposes depending on the frame type. The four address fields carry the BSSID (Basic Service Set ID), DA (Destination Address), SA (Source Address), RA (Receiver Address) and TA (Transmitter Address). In general, Address 1 is used for the receiver, Address 2 for the transmitter, and Address 3 for filtering by the receiver.

Address fields are 48 bits in length. If the first bit sent to the physical medium is 0, the address represents a single station (unicast). If the first bit is 1, it represents a group of physical stations (multicast). If all bits are 1, the frame is a broadcast and is delivered to all stations connected to the wireless medium.
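The address rules above can be expressed as a short Python check. The sketch assumes the usual IEEE 802 bit ordering, in which the first bit sent to the medium is the least significant bit of the first octet; the text above does not state this explicitly, and the function name is hypothetical.

```python
def classify_address(addr: bytes) -> str:
    """Return 'unicast', 'multicast' or 'broadcast' for a 48-bit address."""
    if len(addr) != 6:
        raise ValueError("a MAC address is 48 bits (6 octets) long")
    if addr == b"\xff" * 6:
        return "broadcast"          # all 48 bits are 1
    # Assumption: the first transmitted bit is the LSB of the first octet.
    return "multicast" if addr[0] & 0x01 else "unicast"


print(classify_address(bytes.fromhex("00a0c9141234")))  # unicast
print(classify_address(bytes.fromhex("01005e000001")))  # multicast
print(classify_address(bytes.fromhex("ffffffffffff")))  # broadcast
```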


3.2.2.3.1 BSSID

The BSSID field is a 48-bit field that distinguishes different wireless LANs in the same area. In an infrastructure BSS, the BSSID is the MAC address used by the station in the AP of the BSS. In an ad hoc network, the value of this field is generated as a random number, with the universal/local bit of the address set to 1 and the individual/group bit set to 0. A BSSID with all bits set to 1 is the broadcast BSSID. It may be used only in the BSSID field of management frames of subtype probe request [7].

3.2.2.3.2 Destination address

The destination address contains an IEEE MAC individual or group address. It identifies the final recipient, which will hand the frame to higher protocol layers for processing.

3.2.2.3.3 Source address

The source address contains an IEEE MAC individual address. It identifies the source of the transmission. Because only one station can be the source, the individual/group bit is always 0 in the source address.
3.2.2.3.4 Receiver address

The receiver address contains an IEEE MAC individual or group address. It identifies the immediate recipient station that processes the frame.

3.2.2.3.5 Transmitter address

The transmitter address contains an IEEE MAC individual address. It identifies the station that has transmitted the frame onto the wireless medium. The transmitter address is used only in wireless bridging. The individual/group bit is always 0 in the transmitter address. The functions of the address fields in relation to To DS/From DS are shown in Table 3.7.


Table 3.7: Address field functions

Figure 3.12: Sequence control field

3.2.2.4 Sequence Control field

The Sequence Control field is 16 bits in length and is used both for defragmentation and for discarding duplicate frames. It consists of two subfields, Sequence Number and Fragment Number, as shown in Figure 3.12.


3.2.2.4.1 Sequence Number field

The Sequence Number field is 12 bits in length and is assigned sequentially by the sending station to each MSDU. It operates as a modulo-4096 counter of the frames transmitted: it starts from 0 and increments by 1 for each MSDU. If an MSDU is fragmented, all fragments have the same sequence number. When frames are retransmitted, the sequence number does not change.

3.2.2.4.2 Fragment Number field

The Fragment Number field is 4 bits in length and indicates the number of each fragment of an MSDU. The first or only fragment of an MSDU is assigned a fragment number of 0, and each successive fragment increments the number by 1. When fragments are retransmitted, they keep their original sequence numbers to assist in reassembly.
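The behaviour of the two subfields can be sketched in Python as follows. The sketch assumes that the 4-bit fragment number occupies the low-order bits of the 16-bit Sequence Control field and the 12-bit sequence number the high-order bits; this layout is an assumption about Figure 3.12, and the function names are hypothetical.

```python
def pack_sequence_control(sequence_number: int, fragment_number: int) -> int:
    """Combine a 12-bit sequence number and a 4-bit fragment number."""
    return ((sequence_number % 4096) << 4) | (fragment_number & 0xF)


def next_sequence_number(sequence_number: int) -> int:
    """Modulo-4096 counter: incremented once per MSDU."""
    return (sequence_number + 1) % 4096


# Three fragments of one MSDU share sequence number 5.
print([hex(pack_sequence_control(5, frag)) for frag in range(3)])
# ['0x50', '0x51', '0x52']
print(next_sequence_number(4095))  # 0
```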

3.2.2.5 Frame Body field

The Frame Body field is a variable-length field that is used in data and management frames. 802.11 can transmit frames with a maximum payload of 2,304 bytes of higher-level data. In practice, a frame body of 2,312 bytes has to be supported to accommodate WEP. Because 802.2 LLC headers use 8 bytes, the maximum network protocol payload is 2,296 bytes.

3.2.2.6 FCS field

The FCS field is 32 bits in length and is calculated over all the fields of the MAC header and the frame body. The FCS is often referred to as the CRC (cyclic redundancy check). It is common to the IEEE 802 LAN standards and is generated in the same way as in IEEE 802.3.

The FCS is calculated when a frame is sent to the wireless medium. The receiver calculates the FCS from the received frame and compares it with the received FCS. If the two match, there is a high probability that the frame was not damaged in transit.


CHAPTER 4
DESIGN AND IMPLEMENTATION OF MAC
TRANSMITTER

4.1 SYSTEM ARCHITECTURE OVERVIEW


In this chapter, we describe the implementation of the MAC transmission controller in more detail. The controller includes four modules: the Transmission Control Module, Build Frame Module, Transmit Frame Module and Timer Module. The system architecture is shown in Figure 4.1.

[Block diagram: System (LLC) at the top; the MAC Tx block containing the Transmission Control Module, Build Frame Module, Transmit Frame Module and Timer Module; PHY at the bottom]

Figure 4.1: System architecture


4.2 MODULE FUNCTION

4.2.1 Timer Module

The Timer Module consists of two components: one for the back-off time and one for the inter-frame spaces (DIFS and SIFS). When the medium is not available, this module first counts one DIFS period (50 µs) and then generates a random back-off count from the contention window, until the Transmission Control Module informs the Timer Module, through the transmission-ready bit, that the medium is available and the system can start to transmit the frame. In the 802.11b specification, the minimum CW value is 31 and the maximum is 1023. In other words, 802.11b defines a back-off time of up to 620 µs for the minimum contention window and up to 20,460 µs for the maximum.

After a frame (RTS, CTS, ACK or data) has been transmitted, the Transmission Control Module requests the Timer Module to count a time interval (DIFS, SIFS or another). When the count is finished, the Timer Module asserts the Timer_count_End signal to inform the Transmission Control Module.

In the 802.11b specification, SIFS is 10 µs and DIFS is 50 µs. In this design, the SIFS count value is set to 1000 and the DIFS count value to 5000, based on a 100 MHz clock (10 ns period). Table 3.1 shows more detailed parameters of 802.11b. Figure 4.2 shows the Timer Module block diagram and Figure 4.3 shows its flowchart.
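The count values quoted above follow directly from the clock period. A minimal Python check, assuming the timer simply counts clock cycles for the requested interval (the function name is hypothetical):

```python
CLOCK_HZ = 100_000_000   # 100 MHz system clock, 10 ns period


def count_value(duration_us: int, clock_hz: int = CLOCK_HZ) -> int:
    """Number of clock cycles that span duration_us microseconds."""
    return duration_us * clock_hz // 1_000_000


print(count_value(10))      # SIFS -> 1000
print(count_value(50))      # DIFS -> 5000
print(count_value(20460))   # longest back-off -> 2046000
```

The last value suggests that a back-off counter in this design would need at least 21 bits, since 2,046,000 lies between 2^20 and 2^21.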

Figure 4.2: Timer module block diagram


[Flowchart: Idle; Need to count?; Time mode select: Back-off time (time mode = 001), SIFS (010), DIFS (011), PIFS (100), EIFS (101). Back-off branch: set random count value and count; Count finish?; generate Backoff_Count_End. IFS branch: set count value and count; Count finish?; generate Timer_Count_End. End]

Figure 4.3: Timer module flow chart


4.2.2 Transmission Control Module

The main function of the Transmission Control Module is to handle information or data from the upper layer and to coordinate all the other modules in this system: the Timer Module, Build Frame Module and Transmit Frame Module.

In this module, the RTS threshold determines whether a frame received from the upper layer must be preceded by an RTS/CTS exchange. In the 802.11b specification, the RTS threshold is set to 2,347 bytes. If network throughput is low or there is a high number of frame retransmissions, RTS clearing can be enabled by decreasing the RTS threshold.

This module also controls the frame retry count. Every station has two retry limits: a long retry limit and a short retry limit. The long retry limit applies to frames longer than the RTS threshold and has a default value of 4; the short retry limit applies to frames shorter than the RTS threshold and has a default value of 7. Once the retry count exceeds the applicable limit, the frame is discarded and this is reported to higher-level protocols.
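The retry rule can be sketched in Python as follows. The default limits and the RTS threshold are the values given above; the function names are hypothetical, and treating a frame as discarded as soon as its retry count reaches the limit is an assumption about the exact comparison.

```python
RTS_THRESHOLD = 2347      # bytes, default given in the text
LONG_RETRY_LIMIT = 4      # frames longer than the RTS threshold
SHORT_RETRY_LIMIT = 7     # frames not longer than the RTS threshold


def retry_limit(frame_length: int, rts_threshold: int = RTS_THRESHOLD) -> int:
    """Select the long or short retry limit from the frame length."""
    if frame_length > rts_threshold:
        return LONG_RETRY_LIMIT
    return SHORT_RETRY_LIMIT


def should_discard(frame_length: int, retry_count: int,
                   rts_threshold: int = RTS_THRESHOLD) -> bool:
    """Assumed rule: discard once the retry count reaches the limit."""
    return retry_count >= retry_limit(frame_length, rts_threshold)


print(retry_limit(1500))                              # 7
print(retry_limit(1500, rts_threshold=500))           # 4
print(should_discard(1500, 4, rts_threshold=500))     # True
```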

All actions in this design are controlled or arranged by the Transmission Control Module. Figures 4.4, 4.5 and 4.6 show the Transmission Control Module flow chart.


[Flowchart: Idle; System reset? (Reset = 1: clear Timer, Transmit and Build Frame Module registers); Transmit ready?; Medium busy? (CCA = 1?); Fragment number >= 1?; DCF process; Transmit frame; Complete transmit?; Request retry by Rx?; reset Timer, Transmit and Build Frame Module registers; Idle]

Figure 4.4: Transmit control flow chart A


[Flowchart: Retry request?; Retry limit is reached? (retry count reached 7?: reset Timer, Transmit and Build Frame Module registers; Idle); Medium busy? (CCA = 1?); Random back-off count interrupted before? (Y: restore random back-off count value; N: set random back-off time count value); DIFS count end?; Medium busy? (CCA = 1?: lock and save random back-off time value); Random back-off time count end?; Sequence number = 0?; RTS/CTS process]

Figure 4.5: Transmit control flow chart B


[Flowchart: Transmit process; Multicast?; Frame size > RTS threshold?; Tx has sent RTS successfully?; Send RTS; Timer counts a SIFS; Receive CTS and correct?; Request retry by Rx?; Timer counts a SIFS; Send data frame; Receive ACK and correct?; Request retry by Rx? (DCF process); reset Timer, Transmit and Build Frame Module registers; Idle]

Figure 4.6: Transmit control flow chart C


4.2.3 Build Frame Module

All transmitted frames (RTS, CTS, Data and ACK) are generated by the Build Frame Module. Using information provided by the upper layer and under the control of the Transmission Control Module, it builds the frame and sets values such as Retry and To DS/From DS in the Frame Control field. This module also calculates the duration time and the FCS value (CRC-32) and then builds the complete frame. Figure 4.7 shows the Build Frame Module flowchart.

[Flowchart: Build frame start; Reset = 1?: reset build frame setting; Frame type select (codes 001, 010, 011, 100) among RTS, CTS, Data and ACK; calculate the duration time for the selected frame; build the RTS, CTS, Data or ACK frame; calculate FCS; output frame to Transmit Module; Build frame end]

Figure 4.7: Build frame module flow chart


4.2.3.1 Calculation Duration Time


In this section we calculate the duration time according to the 802.11b specification, using the RTS/CTS exchange as an example. We assume that the frame body of the data frame is 1 byte.

The duration time of the RTS frame is calculated as follows:

Short PLCP preamble and header = 72 bits at 1 Mbps + 48 bits at 2 Mbps
                               = 72 µs + 24 µs = 96 µs

CTS transmission time = 96 µs + CTS frame length (14 bytes) / 11 Mbps
                      = 107 µs

ACK transmission time = 96 µs + ACK frame length (14 bytes) / 11 Mbps
                      = 107 µs

Data frame transmission time = 96 µs + data frame length (32 bytes) / 11 Mbps
                             = 120 µs

RTS duration time = 3 x SIFS + CTS + ACK + data frame time
                  = 3 x 10 µs + 107 µs + 107 µs + 120 µs
                  = 364 µs

In this thesis, internal transmission is 32 bits wide, so the value 0x016C (364 µs) is set in the RTS Duration field.
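The arithmetic above can be reproduced with a few lines of Python. The sketch assumes that each transmission time is rounded up to a whole microsecond, which matches the 107 µs and 120 µs figures; the function name is hypothetical. It also derives the CTS duration from the rule in section 3.1.1.4.2 (RTS duration minus the CTS time and one SIFS).

```python
SIFS_US = 10
PLCP_SHORT_US = 72 + 48 // 2    # 72 bits at 1 Mbps + 48 bits at 2 Mbps = 96 us
RATE_MBPS = 11


def tx_time_us(frame_bytes: int) -> int:
    """PLCP overhead plus payload time at 11 Mbps, rounded up to 1 us."""
    bits = frame_bytes * 8
    payload_us = -(-bits // RATE_MBPS)    # ceiling division
    return PLCP_SHORT_US + payload_us


cts_time = tx_time_us(14)
ack_time = tx_time_us(14)
data_time = tx_time_us(32)

rts_duration = 3 * SIFS_US + cts_time + ack_time + data_time
cts_duration = rts_duration - cts_time - SIFS_US

print(cts_time, ack_time, data_time)          # 107 107 120
print(rts_duration, f"{rts_duration:#06x}")   # 364 0x016c
print(cts_duration)                           # 247
```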
4.2.3.2 Calculation FCS Value

An FCS value is calculated with the CRC-32 algorithm. CRC-32 is a cyclic redundancy code, and 32 is the length of the checksum in bits. It uses the following standard generator polynomial:

G(x) = x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x + 1
Calculating FCS, the frame is represented (original frame, before
adding the FCS) as a polynomial, M(x), then multiplies M(x) by x32, and divides
36

the result by a standard generator polynomial G(x) (all of procedure is completed


by using Modulo-2 operation). The resulting remainder of the above operation is
FCS. The FCS is appended to the original frame to form coded frame which will
be transmitted. The whole of the above operation can be mathematically
represented as

CRC = Remainder of (M(x) * X 32 /G(x)).
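The division described above can be sketched bit by bit as follows. This is a minimal illustration of the simplified formula only: it processes bits most-significant first with a zero initial register, and it omits the register preset and final complement that practical CRC-32 implementations add, so its output will not match library routines such as zlib.crc32. The names are illustrative assumptions.

```python
CRC32_POLY = 0x04C11DB7  # G(x) with the x^32 term implied


def crc32_remainder(data):
    """Remainder of M(x) * x^32 divided by G(x), modulo 2, MSB first."""
    reg = 0
    for byte in data:
        reg ^= byte << 24                     # bring the next 8 message bits in
        for _ in range(8):
            if reg & 0x80000000:              # top bit set: subtract (XOR) G(x)
                reg = ((reg << 1) ^ CRC32_POLY) & 0xFFFFFFFF
            else:
                reg = (reg << 1) & 0xFFFFFFFF
    return reg


frame = bytes([0x01])                          # M(x) = 1
fcs = crc32_remainder(frame)
coded_frame = frame + fcs.to_bytes(4, "big")   # frame with the FCS appended
print(hex(fcs), crc32_remainder(coded_frame))
```

Because the coded frame is an exact multiple of G(x), running the same division over the frame plus its FCS leaves a zero remainder, which is how a receiver can check the frame.
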


Figure 4.8 shows an example of CRC generation.

Figure 4.8: Example of CRC generation

Figure 4.9 and Table 4.1 show the CRC-32 circuit and the parallel CRC-32
algorithm.

Figure 4.9: CRC-32 circuit

Table 4.1: Parallel CRC-32 algorithm

4.2.4 Transmit Frame Module

The Transmit Frame Module interfaces to the PHY to transmit frames. In this
module, the parallel frame is converted to serial form and stored in a FIFO,
then sent out through the interface to the baseband module of the physical
layer. When transmission completes, the module issues a Transmit End signal to
inform the Transmit Control Module that transmission has finished. The module
contains one FIFO. We assume that the interface between the baseband and the
MAC follows the 4-bit-wide MII specification; the MII transmission clock is
250 kHz at 1 Mbps [29]. In this thesis, internal data transmission is 32 bits
wide and the clock is 100 MHz, so this module uses a FIFO with a 1600-bit width
and a 32-bit depth.
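As a rough illustration of the width conversion, the sketch below splits one 32-bit internal word into 4-bit values for an MII-style interface and derives the interface clock from the bit rate. The nibble order (least significant first) and the function names are assumptions made for the example; the thesis does not state them.

```python
def word_to_nibbles(word):
    """Split a 32-bit word into eight 4-bit values, least significant first."""
    return [(word >> shift) & 0xF for shift in range(0, 32, 4)]


def mii_clock_hz(bit_rate_bps, bus_width_bits=4):
    """Clock needed to carry bit_rate_bps over a bus of the given width."""
    return bit_rate_bps / bus_width_bits


print(word_to_nibbles(0x12345678))
print(mii_clock_hz(1_000_000))
```

At 1 Mbps a 4-bit bus needs one transfer every 4 µs, which is the 250 kHz clock quoted above.
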


CHAPTER 5
INTRODUCTION TO FPGA DESIGN

5.1. FIELD PROGRAMMABLE GATE ARRAY (FPGA)


Field Programmable Gate Arrays are so called because, rather
than having a structure similar to a PAL or other programmable device, they
are structured very much like a gate array ASIC. This makes FPGAs well suited
to prototyping ASICs, or to use in places where an ASIC will eventually be
used. For example, an FPGA may be used in a design that needs to get to
market quickly regardless of cost. Later an ASIC can be used in place of the
FPGA when the production volume increases, in order to reduce cost.

5.2. FPGA ARCHITECTURES

Figure 5.1: FPGA


Each FPGA vendor has its own FPGA architecture, but in
general terms they are all a variation of that shown in Figure 5.1. The
architecture consists of configurable logic blocks, configurable I/O blocks, and
programmable interconnect. Also, there will be clock circuitry for driving the
clock signals to each logic block, and additional logic resources such as
ALUs, memory, and decoders may be available. The two basic types of
programmable elements for an FPGA are Static RAM and anti-fuses.

5.3. CONFIGURABLE LOGIC BLOCKS


Configurable Logic Blocks (CLBs) contain the logic for the
FPGA. In a large-grain architecture, these CLBs contain enough logic
to create a small state machine. In a fine-grain architecture, more like a true
gate array ASIC, the CLB contains only very basic logic. The diagram in
Figure 5.2 would be considered a large-grain block. It contains RAM for
creating arbitrary combinatorial logic functions [25]. It also contains
flip-flops for clocked storage elements, and multiplexers to route the logic
within the block and to and from external resources. The multiplexers also allow
polarity selection and reset and clear input selection.
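The idea of using RAM to create an arbitrary combinatorial function can be sketched as a lookup table: the inputs form an address, and the stored bit at that address is the output. This is a behavioral illustration only; the helper name and the example truth tables are assumptions, not part of any vendor's CLB.

```python
def make_lut(truth_table):
    """Return a function whose output is the stored bit addressed by its inputs."""
    def lut(*inputs):
        address = 0
        for bit in inputs:                 # first input is the most significant bit
            address = (address << 1) | (bit & 1)
        return truth_table[address]
    return lut


# Two different 3-input functions from the same structure, different contents.
xor3 = make_lut([0, 1, 1, 0, 1, 0, 0, 1])
majority = make_lut([0, 0, 0, 1, 0, 1, 1, 1])

print(xor3(1, 0, 1), majority(1, 0, 1))
```

Changing the function means rewriting the table contents only, which is what reprogramming the RAM in a CLB amounts to.
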

Figure 5.2: FPGA Configurable Logic Block [25]


5.4. CONFIGURABLE I/O BLOCKS


A Configurable I/O Block, shown in Figure 5.3, is used to bring
signals onto the chip and send them back off again. It consists of an input
buffer and an output buffer with three-state and open collector output
controls. Typically there are pull-up resistors on the outputs and sometimes
pull-down resistors. The polarity of the output can usually be programmed for
active high or active low output, and often the slew rate of the output can be
programmed for fast or slow rise and fall times. In addition, there is often a
flip-flop on outputs so that clocked signals can be output directly to the pins
without encountering significant delay. The same is done for inputs so that
there is not much delay on a signal before it reaches a flip-flop, which would
otherwise increase the device hold time requirement.

Figure 5.3: FPGA Configurable I/O Block

5.5. PROGRAMMABLE INTERCONNECT


Interconnect of an FPGA is very different from that of a CPLD,
but is rather similar to that of a gate array ASIC. In Figure 5.4, a hierarchy of
interconnect resources can be seen. There are long lines which can be used to
connect critical CLBs that are physically far from each other on the chip
without inducing much delay. They can also be used as buses within the chip.
There are also short lines which are used to connect individual CLBs which
are located physically close to each other. There are often one or several
switch matrices, like that in a CPLD, to connect these long and short
lines together in specific ways. Programmable switches inside the chip
allow the connection of CLBs to interconnect lines and interconnect lines to
each other and to the switch matrix [25].

Three-state buffers are used to connect many CLBs to a long
line, creating a bus. Special long lines, called global clock lines, are specially
designed for low impedance and thus fast propagation times. These are
connected to the clock buffers and to each clocked element in each CLB. This
is how the clocks are distributed throughout the FPGA.

Figure 5.4: FPGA Programmable Interconnect

5.6. CLOCK CIRCUITRY


Special I/O blocks with special high drive clock buffers, known
as clock drivers, are distributed around the chip. These buffers are connected
to clock input pads and drive the clock signals onto the global clock lines
described above. These clock lines are designed for low skew times and fast
propagation times. As we will discuss later, synchronous design is a must
with FPGAs, since absolute skew and delay cannot be guaranteed. Only
when using clock signals from clock buffers can the relative delays and skew
times be guaranteed.

5.7. SMALL V/S LARGE GRANULARITY


Small grain FPGAs resemble ASIC gate arrays in that the CLBs
contain only small, very basic elements such as NAND gates, NOR gates,
etc. The philosophy is that small elements can be connected to make larger
functions without wasting too much logic. In a large grain FPGA, where the
CLB can contain two or more flip-flops, a design which does not need
many flip-flops will leave many of them unused. Unfortunately, small
grain architectures require much more routing resources, which take up
space and insert a large amount of delay which can more than compensate for
the better utilization.

Small Granularity            Large Granularity
Better utilization           Fewer levels of logic
Direct conversion to ASIC    Less interconnect delay

A comparison of the advantages of each type of architecture is
shown in the table above. The choice of which architecture to use depends on
your specific application.

5.8. SRAM V/S ANTI-FUSE PROGRAMMING


There are two competing methods of programming FPGAs. The first, SRAM
programming, involves small Static RAM bits for each programming element.
Writing the bit with a zero turns off a switch, while writing with a one turns
on a switch. The other method involves anti-fuses, which consist of microscopic
structures which, unlike a regular fuse, normally make no connection. A certain
amount of current during programming of the device causes the two sides of the
anti-fuse to connect.

The advantages of SRAM based FPGAs are that they use a
standard fabrication process that chip fabrication plants are familiar with and
are always optimizing for better performance. Since the SRAMs are
reprogrammable, the FPGAs can be reprogrammed any number of times,
even while they are in the system, just like writing to a normal SRAM
[25].

The disadvantages are that they are volatile, which means a
power glitch could potentially change the programming. Also, SRAM based
devices have large routing delays.

The advantages of anti-fuse based FPGAs are that they are
non-volatile and the delays due to routing are very small, so they tend to be
faster.

The disadvantages are that they require a complex fabrication
process, they require an external programmer to program them, and once they
are programmed, they cannot be changed.

5.9. EXAMPLE OF FPGA FAMILIES

Examples of SRAM based FPGA families include the following:

Altera FLEX family
Atmel AT6000 and AT40K families
Lucent Technologies ORCA family
Xilinx XC4000 and Virtex families

Examples of anti-fuse based FPGA families include the following:

Actel SX and MX families
QuickLogic pASIC family

5.10. THE DESIGN FLOW


This section examines the design flow for any device,
whether it is an ASIC, an FPGA, or a CPLD. This is the entire process for
designing a device that guarantees that you will not overlook any steps and
that you will have the best chance of getting back a working prototype that
functions correctly in your system. The design flow consists of the steps in
Figure 5.5.

Write a Specification
Specification Review
Design
Simulate
Design Review
Synthesize
Place and Route
Resimulate
Final Review
Chip Test
System Integration and Test
Chip Product

Figure 5.5: FPGA Design Flow



5.11. WRITING A SPECIFICATION


The importance of a specification cannot be overstated. It is an
absolute must, especially as a guide for choosing the right technology and for
making your needs known to the vendor. A specification allows each engineer
to understand the entire design and his or her piece of it. It allows the engineer
to design the correct interface to the rest of the pieces of the chip. It also saves
time and avoids misunderstanding. There is no excuse for not having a
specification.

A specification should include the following information:


An external block diagram showing how the chip fits into the system
An internal block diagram showing each major functional section
A description of the I/O pins including
    output drive capability
    input threshold level
Timing estimates including
    setup and hold times for input pins
    propagation times for output pins
    clock cycle time
Estimated gate count
Package type
Target power consumption
Target price
Test procedures

It is also very important to understand that this is a living
document. Many sections will have best guesses in them, but these will
change as the chip is being designed.

5.11.1 Choosing a Technology

Once a specification has been written, it can be used to
find the best vendor with a technology and price structure that best meets
your requirements.

5.11.2 Choosing a Design Entry Method

You must decide at this point which design entry method you
prefer. For smaller chips, schematic entry is often the method of choice,
especially if the design engineer is already familiar with the tools. For larger
designs, however, a hardware description language (HDL) such as Verilog or
VHDL is used because of its portability, flexibility, and readability. When
using a high level language, synthesis software will be required to
synthesize the design. This means that the software creates low level gates
from the high level description.

5.11.3 Choosing a Synthesis Tool

You must decide at this point which synthesis software you will
be using if you plan to design the FPGA with an HDL. This is
important since each synthesis tool has recommended or mandatory
methods of designing hardware so that it can correctly perform synthesis. It
will be necessary to know these methods up front so that sections of the chip
will not need to be redesigned later on. At the end of this phase it is very
important to have a design review. All appropriate personnel should review the
decisions to be certain that the specification is correct, and that the correct
technology and design entry method have been chosen.

5.11.4 Designing the Chip

It is very important to follow good design practices. This means
taking into account the following design issues that we discuss in detail later in
this chapter.

Top-down design
Use logic that fits well with the architecture of the device you have chosen
Macros
Synchronous design
Protect against metastability
Avoid floating nodes
Avoid bus contention

5.11.5 Simulating - design review

Simulation is an ongoing process while the design is being done.
Small sections of the design should be simulated separately before hooking
them up to larger sections. There will be much iteration of design and
simulation in order to get the correct functionality. Once design and
simulation are finished, another design review must take place so that the
design can be checked.

It is important to get others to look over the simulations and
make sure that nothing was missed and that no improper assumption was
made. This is one of the most important reviews because it is only with
correct and complete simulation that you will know that your chip will work
correctly in your system.

5.11.6 Synthesis
If the design was entered using an HDL, the next step is to
synthesize the chip. This involves using synthesis software to optimally
translate your register transfer level (RTL) design into a gate level design
that can be mapped to logic blocks in the FPGA. This may involve
specifying switches and optimization criteria in the HDL code, or adjusting
parameters of the synthesis software in order to ensure good timing and
utilization.

5.11.7 Place and Route

The next step is to lay out the chip, resulting in a real physical design
for a real chip. This involves using the vendor's software tools to optimize
the programming of the chip to implement the design. Then the design is
programmed into the chip.


5.11.8 Re-simulating - final review

After layout, the chip must be re-simulated with the new timing
numbers produced by the actual layout. If everything has gone well up to this
point, the new simulation results will agree with the predicted results.
Otherwise, there are three possible paths to go in the design flow. If the
problems encountered here are significant, sections of the FPGA may need to
be redesigned. If there are simply some marginal timing paths or the design
is slightly larger than the FPGA, it may be necessary to perform another
synthesis with better constraints or simply another place and route with
better constraints. At this point, a final review is necessary to confirm that
nothing has been overlooked.

5.11.9 Testing

For a programmable device, you simply program the device
and immediately have your prototypes. You then have the responsibility to
place these prototypes in your system and determine that the entire system
actually works correctly. If you have followed the procedure up to this point,
chances are very good that your system will perform correctly with only
minor problems. These problems can often be worked around by modifying
the system or changing the system software. These problems need to be
tested and documented so that they can be fixed on the next revision of the
chip. System integration and system testing are necessary at this point to
ensure that all parts of the system work correctly together. When the chips are
put into production, it is necessary to have some sort of burn-in test of your
system that continually tests your system over some long amount of time. If a
chip has been designed correctly, it will only fail because of electrical or
mechanical problems that will usually show up with this kind of stress testing.

5.12 DESIGN ISSUES


In the next sections of this chapter, we will discuss those
areas that are unique to FPGA design or that are particularly critical to these
devices.

5.12.1 Top-Down Design

Top-down design is the design method whereby high level
functions are defined first, and the lower level implementation details are
filled in later. A schematic can be viewed as a hierarchical tree as shown in
Figure 5.6. The top-level block represents the entire chip. Each lower level
block represents major functions of the chip. Intermediate level blocks may
contain smaller functionality blocks combined with gate-level logic. The
bottom level contains only gates and macro functions, which are
vendor-supplied high level functions. Fortunately, schematic capture software
and hardware description languages used for chip design easily allow use of the
top-down design methodology.

Figure 5.6: Top-Down Design

Top-down design is the preferred methodology for chip design
for several reasons. First, chips often incorporate a large number of gates
and a very high level of functionality. This methodology simplifies the
design task and allows more than one engineer, when necessary, to design the
chip. Second, it allows flexibility in the design. Sections can be removed and
replaced with higher-performance or optimized designs without affecting
other sections of the chip. Also important is the fact that simulation is
much simplified using this design methodology. Simulation is an extremely
important consideration in chip design since a chip cannot be blue-wired after
production. For this reason, simulation must be done extensively before the
chip is sent for fabrication. A top-down design approach allows each module
to be simulated independently from the rest of the design [25]. This is important
for complex designs where an entire design can take weeks to simulate and
days to debug.


5.12.2 Keep the Architecture in Mind


Look at the particular architecture to determine which logic
devices fit best into it. The vendor may be able to offer advice about this.
Many synthesis packages can target their results to a specific FPGA or CPLD
family from a specific vendor, taking advantage of the architecture to provide
you with faster, more optimal designs.

5.12.3 Synchronous Design

One of the most important concepts in chip design, and one of
the hardest to enforce on novice chip designers, is that of synchronous design.
Once a chip designer uncovers a problem due to asynchronous design and
attempts to fix it, he or she usually becomes an evangelical convert to
synchronous design. This is because asynchronous design problems are due
to marginal timing problems that may appear intermittently, or may
appear only when the vendor changes its semiconductor process.
Asynchronous designs that work for years in one process may suddenly fail
when the chip is manufactured using a newer process. Synchronous design
simply means that all data is passed through combinatorial logic and
flip-flops that are synchronized to a single clock. Delay is always controlled by
flip-flops, not combinatorial logic. No signal that is generated by
combinatorial logic can be fed back to the same group of combinatorial
logic without first going through a synchronizing flip-flop. Clocks cannot
be gated - in other words, clocks must go directly to the clock inputs of the
flip-flops without going through any combinatorial logic. The following
sections cover common asynchronous design problems and how to fix them
using synchronous logic.

5.13 RACE CONDITIONS


Figure 5.7 shows an asynchronous race condition where a clock
signal is used to reset a flip-flop. When SIG2 is low, the flip-flop is reset to a
low state. On the rising edge of SIG2, the designer wants the output to change
to the high state of SIG1. Unfortunately, since we do not know the exact
internal timing of the flip-flop or the routing delay of the signal to the clock
versus the reset input, we cannot know which signal will arrive first: the
clock or the reset. This is a race condition. If the clock rising edge appears
first, the output will remain low. If the reset signal appears first, the output
will go high. A slight change in temperature, voltage, or process may cause a
chip that works correctly to suddenly work incorrectly. A more reliable
synchronous solution is shown in Figure 5.8. Here a faster clock is used, and
the flip-flop is reset on the rising edge of the clock. This circuit performs the
same function, but as long as SIG1 and SIG2 are produced synchronously (that
is, they change only after the rising edge of CLK), there is no race condition.

Figure 5.7: Asynchronous: Race Condition

Figure 5.8: Synchronous: No Race Condition


5.14 DELAY DEPENDENT LOGIC


Figure 5.9 shows logic used to create a pulse. The pulse width
depends very explicitly on the delay of the individual logic gates. If the process
should change, making the delay shorter, the pulse width will shorten also, to
the point where the logic that it feeds may not recognize it at all. A
synchronous pulse generator is shown in Figure 5.10. This pulse depends only
on the clock period. Changes to the process will not cause any significant change in
the pulse width.
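A clock-based pulse generator can be modeled cycle by cycle. The sketch below assumes one common form, a rising-edge detector built from a single flip-flop; the circuit in Figure 5.10 may differ, and the function name is illustrative. Each list entry represents the input value seen at one rising clock edge.

```python
def sync_pulse(samples):
    """One-clock-wide pulse on each rising edge of a sampled input."""
    prev = 0                      # flip-flop holding the previous sample
    pulses = []
    for sig in samples:           # one iteration per rising clock edge
        pulses.append(sig & (1 - prev))
        prev = sig                # flip-flop captures the input on this edge
    return pulses


print(sync_pulse([0, 1, 1, 1, 0, 1]))
```

The pulse always lasts exactly one clock period, whatever the gate delays are, which is the property the synchronous version is meant to provide.
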

Figure 5.9: Asynchronous: Delay Dependent Logic

Figure 5.10: Synchronous: Delay Independent Logic


5.15 HOLD TIME VIOLATIONS


Figure 5.11 shows an asynchronous circuit with a hold time
violation. Hold time violations occur when data changes around the same
time as the clock edge. It is uncertain which value will be registered by the
clock. The circuit in Figure 5.12 fixes this problem by putting both flip-flops
on the same clock and using a flip-flop with an enable input. A pulse
generator creates a pulse that enables the flip-flop.

Figure 5.11: Asynchronous: Hold Time Violation

5.16 GLITCHES
A glitch can occur due to small delays in a circuit such as that
shown in Figure 5.12. An inverting multiplexer produces a glitch when
switching between two signals, both of which are high. The output should stay
low, yet due to the delay in the inverter, the output goes high for a very
short time.

Figure 5.12: Asynchronous: Glitch


Synchronizing this output by sending it through a flip-flop, as
shown in Figure 5.13, ensures that this glitch will not appear on the
output and will not affect logic further downstream.

Figure 5.13: Synchronous: No Glitch



5.17 BAD CLOCKING

Figure 5.14 shows an example of asynchronous clocking. This
kind of clocking will produce problems of the type discussed previously. The
correct way to enable and disable outputs is not by putting logic on the clock
input, but by putting logic on the data input as shown in Figure 5.15.
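Putting the enable logic on the data input amounts to a multiplexer in front of the flip-flop: when the enable is low the flip-flop reloads its own output, and the clock itself is never gated. The cycle-level sketch below illustrates this behavior; the function names are assumptions made for the example.

```python
def enabled_ff(q, d, enable):
    """Next state of a flip-flop whose enable is applied on the data path."""
    return d if enable else q     # hold the old value when not enabled


def run(data, enables, q=0):
    """Clock the flip-flop once per (data, enable) pair and record its output."""
    trace = []
    for d, en in zip(data, enables):
        q = enabled_ff(q, d, en)
        trace.append(q)
    return trace


print(run([1, 0, 1, 0], [1, 0, 0, 1]))
```

Every flip-flop still sees the same free-running clock, so the timing of the output depends only on the clock edge and not on the delay of the enable logic.
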

Figure 5.14: Asynchronous: Bad Clocking

Figure 5.15: Synchronous: Good Clocking

5.18 METASTABILITY
One of the great buzzwords, and often misunderstood
concepts, of synchronous design is metastability. Metastability refers to
a condition which arises when an asynchronous signal is clocked into a
synchronous flip-flop. While chip designers would prefer a completely
synchronous world, the unfortunate fact is that signals coming into a chip will
depend on a user pushing a button or an interrupt from a processor, or will be
generated by a clock which is different from the one used by the chip. In these
cases, the asynchronous signal must be synchronized to the chip clock so that
it can be used by the internal circuitry.

The designer must be careful how to do this in order to
avoid metastability problems as shown in Figure 5.16. If the ASYNC_IN
signal goes high around the same time as the clock, we have an unavoidable
race condition [25]. The output of the flip-flop can actually go to an undefined
voltage level that is somewhere between a logic 0 and logic 1. This is
because an internal transistor did not have enough time to fully charge to the
correct level. This meta level may remain until the transistor voltage leaks
off or decays, or until the next clock cycle. During the clock cycle, the
gates that are connected to the output of the flip-flop may interpret this level
differently. In the figure, the upper gate sees the level as logic 1 whereas the
lower gate sees it as logic 0. In normal operation, OUT1 and OUT2 should
always be the same value. In this case, they are not and this could send the
logic into an unexpected state from which it may never return. This
metastability can permanently lock up your chip.

Figure 5.16: Metastability - The Problem


The "solution" to this metastability problem is to place a
synchronizer flip-flop in front of the logic. The synchronized input will then
be sampled by only one device, the second flip-flop, and be interpreted only as
logic 0 or 1. The upper and lower gates will both sample the same logic level,
and the metastability problem is avoided. Or is it? The word "solution" is in
quotation marks for a very good reason. There is a very small but non-zero
probability that the output of the synchronizer flip-flop will not decay to a
valid logic level within one clock period. In this case, the next flip-flop will
sample an indeterminate value, and there is again a possibility that the output
of that flip-flop will be indeterminate. At higher frequencies, this possibility is
greater. Unfortunately, there is no certain solution to this problem. Some
vendors provide special synchronizer flip-flops whose output transistors decay
very quickly. Also, inserting more synchronizer flip-flops reduces the
probability of metastability but it will never reduce it to zero. The correct
action involves discussing metastability problems with the vendor, and
including enough synchronizing flip-flops to reduce the probability so that it is
unlikely to occur within the lifetime of the product.

5.19 TIMING SIMULATION


This method of timing analysis is growing less and less popular.
It involves including timing information in a functional simulation so that
the real behavior of the chip is simulated. The advantage of this kind of
simulation is that timing and functional problems can be examined and
corrected. Also, asynchronous designs must use this type of analysis because
static timing analysis only works for synchronous designs. This is another
reason for designing synchronous chips only.

As chips become larger, though, this type of compute-intensive
simulation takes longer and longer to run. Also, simulations can
miss particular transitions that result in worst case results. This means that
certain long delay paths never get evaluated and a chip with timing problems
can pass timing simulation. If you do need to perform timing simulation, it
is important to do both worst case simulation and best case simulation. The
term best case can be misleading. It refers to a chip that, due to voltage,
temperature, and process variations, is operating faster than the typical chip.
However, hold time problems become apparent only during the best case
conditions.


CHAPTER 6
VERIFICATION AND SIMULATION RESULTS

6.1 DESIGN FLOW


In this thesis, we use VHDL to design the four modules described
in Chapter 3. The design is simulated with ModelSim and compiled in ISE.
After the simulation passes, the design is downloaded to the V2MB1000
development board and its function is verified with a logic analyzer.

[Flowchart: START; define specification; edit behavioral code in VHDL; compile
with ISE (ok?); simulate with ModelSim (ok?); hardware environment (ok?);
implementation; download into FPGA; verify result (ok?); End.]

Figure 6.1 Design flow Chart


6.2 IMPLEMENT ENVIRONMENT


The Virtex-II family is a platform FPGA developed for high
performance, low to high-density designs utilizing IP cores and customized
modules. We use the Virtex-II V2MB1000 development board, which provides
FPGA peripheral circuits and the P160 expansion slot for application-specific
add-on cards. The Virtex-II V2MB1000 is shown in Figure 6.2.

Figure 6.2: Virtex-II V2MB1000

6.2.1 Implementation status

Figure 6.3 shows the Xilinx FPGA array of the design. Figure 6.4 and
Figure 6.5 show the floorplan and the implementation information. From this
information, the design uses a total of 6,976 equivalent gates, the maximum
combinational critical path delay is 9.43 ns, and the maximum operating
frequency is 112.41 MHz.

Figure 6.3: Xilinx FPGA array

Figure 6.4: Implement information A

Figure 6.5: Implement information B

6.3 SIMULATION RESULT BY SOFTWARE


6.3.1 Timer Module Simulation

6.3.1.1 Back-off Time function

The random back-off time is given by:

Back-off Time = Random() x aSlotTime

Random() is a pseudorandom integer drawn from a uniform distribution over the
interval [0, CW], where CW is an integer between CWmin and CWmax. In the
802.11b specification, CW ranges from 31 to 1023. aSlotTime is the value of the
correspondingly named PHY characteristic; in the 802.11b specification, the
slot time is 20 µs.

In this simulation result, the back-off function starts after one DIFS
(50 µs). The process is as follows:

a. The timer counts 50 µs (Time mode set to 010). If the count finishes and the
medium is not available (CCA = 1), the back-off process is entered (Time mode
set to 000).

b. In this simulation, five pseudorandom values are obtained: 31, 182, 58, 203,
and 89.

For the first value, 31 x 20 µs = 620 µs. The clock is set to 100 MHz (10 ns
period), so the timer should count 62000 cycles and then issue
Back_off_Count_End when the count finishes. The simulation result is shown in
Figure 6.6.
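The conversion from a random value to a timer count can be reproduced with a few lines. This is a sketch of the arithmetic only, using the slot time and clock period stated above; the function name is an illustrative assumption.

```python
SLOT_TIME_US = 20       # 802.11b slot time
CLOCK_PERIOD_NS = 10    # 100 MHz system clock


def backoff_cycles(random_value, slot_time_us=SLOT_TIME_US,
                   clock_period_ns=CLOCK_PERIOD_NS):
    """Clock cycles the timer must count for a given Random() value."""
    backoff_us = random_value * slot_time_us
    return backoff_us * 1000 // clock_period_ns


for value in (31, 182, 58, 203, 89):
    print(value, value * SLOT_TIME_US, backoff_cycles(value))
```

For the first value, 31, the script gives 620 µs and 62000 cycles, matching the count expected in the simulation.
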

Figure 6.6: Back-Off time function simulation result

6.3.1.2 Inter-frame space function

This function works as follows:

a. Select the Time mode for SIFS or DIFS. In this design, Time mode is set to
000 for back-off time, 001 for SIFS, and 010 for DIFS. In the 802.11b
specification, SIFS is 10 µs and DIFS is 50 µs.

b. Once the Time mode is set, a count value is assigned. When the count reaches
zero, the Time_Count_End signal is issued.

Figures 6.7 and 6.8 show the simulation results.
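The two steps can be sketched as a down-counter loaded according to the Time mode. The mode codes, intervals, and 100 MHz clock are taken from the text; the function names and the software loop are illustrative assumptions and do not describe the VHDL implementation.

```python
CLOCK_HZ = 100_000_000
IFS_US = {"001": 10, "010": 50}   # Time mode -> interval in us (SIFS, DIFS)


def load_count(time_mode):
    """Count value assigned when the Time mode is set."""
    return IFS_US[time_mode] * CLOCK_HZ // 1_000_000


def run_timer(time_mode):
    """Count down to zero, then report Time_Count_End."""
    count = load_count(time_mode)
    cycles = 0
    while count > 0:
        count -= 1                # one decrement per clock cycle
        cycles += 1
    time_count_end = True         # asserted when the count reaches zero
    return cycles, time_count_end


print(run_timer("001"), run_timer("010"))
```

With a 10 ns clock period, SIFS corresponds to 1000 cycles and DIFS to 5000 cycles.
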

Figure 6.7: Inter-frame space function simulation result A


Figure 6.8: Inter-frame space function simulation result B

6.3.2 Transmit Frame Module Simulation

6.3.2.1 FIFO function

In the original design, the FIFO depth is 32 bits and the width is 1600 bits.
To make the FIFO function easier to verify, the FIFO width is changed to
4 bytes while the depth is kept at 32 bits. The values 1 to 7 are written to
the FIFO and read back to check that the FIFO operates correctly. The FIFO
simulation result is shown in Figure 6.9.
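The write-then-read check described above can be mirrored in software as a reference for what the waveform should show. The class below is a minimal behavioral model; its name, its capacity of 8 entries, and its full and empty handling are assumptions made for the example, not details of the thesis design.

```python
from collections import deque


class Fifo:
    """Minimal first-in, first-out buffer with a fixed capacity."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = deque()

    def write(self, word):
        """Store a word; return False if the FIFO is full."""
        if len(self._items) >= self.capacity:
            return False
        self._items.append(word)
        return True

    def read(self):
        """Return the oldest word, or None if the FIFO is empty."""
        return self._items.popleft() if self._items else None


fifo = Fifo(capacity=8)
for value in range(1, 8):          # write the values 1 to 7
    fifo.write(value)
read_back = [fifo.read() for _ in range(7)]
print(read_back)
```

A correct FIFO returns the values in the order they were written, so the read-back sequence should again be 1 to 7.
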


Figure 6.9: FIFO simulation result A

6.3.2.2 Frame Transmission

An RTS frame is used to demonstrate this simulation. The frame transmission
function (frame parallel in, serial out) is shown in Figure 6.10.

Figure 6.10: Frame Transmission simulation result.

CHAPTER 7
CONCLUSIONS

In this thesis, we implement the transmitter of the IEEE 802.11b
MAC layer functions in an FPGA. For this system architecture, we have realized
a MAC transmitter that includes all features and protocol functions defined for
DCF, such as frame generation, back-off time, duration time, and CRC-32
calculation, according to the 802.11 specification. Some functions, such as
PCF, are not implemented in this thesis.

Access point vendors seem reluctant to activate PCF even though it
is defined in the 802.11 specification. In other words, PCF has not been widely
implemented so far, and the Wi-Fi Alliance does not include PCF functionality in
its interoperability standard [37]. For real-time applications, however, PCF
operation should be implemented in the future.

Wireless LANs are generally used in mobile devices such as
laptops, so power saving also needs to be considered. The 802.11 specification
defines power-saving functions, which can be included to make the MAC function
more complete.

Finally, co-design of firmware, software, and hardware, such as an
SoC implementation, can be considered. It could make the design more flexible
and yield better performance in the future.

REFERENCES

[1] Bononi L., Conti M. and Gregori E., Design and performance evaluation of
an asymptotically optimal backoff algorithm for IEEE 802.11 Wireless LANs.
Department of Computer Science University of Bologna, Jan. 2000, pp. 1-10.
[2] Cimini J., Leung K., McNair B. and Winters J., Outdoor IEEE 802.11
cellular networks: MAC protocol design and performance. AT&T Labs-Research, May 2002, pp. 595-599.
[3] Cali F., Conti M. and Gregori E., IEEE 802.11 wireless LAN: capacity
analysis and protocol enhancement. Italian National Council of Research, Apr.
1998, pp. 142-149.
[4] IEEE standard for information technology - telecommunications and
information exchange between systems - local and metropolitan area networks -
specific requirements. Part 11: wireless LAN medium access control (MAC) and
physical layer (PHY) specifications, 1999.
[5] Dietterle D., Panic G., Stamenkovic Z. and Tittelbach-Helmrich K. A
system-on-chip implementation of IEEE 802.11a MAC layer. Im
Technologiepark 25, Sept. 2003, pp. 319-324.
[6] Hou J. and Hwangnam K. Improving protocol capacity for UDP/TCP traffic
with model-based frame scheduling in IEEE 802.11-operated WLANs. IEEE
journal on selected areas in communications, Dec. 2004. 72, pp. 1987-2003.
[7] Foh C., Lee B. and Tantra J. An efficient scheduling scheme for high speed
IEEE 802.11 WLANs. Centre of Multimedia and Network Technology, School of
Computer Engineering, Nanyang Technological University, Oct. 2003, pp.
2589-2593.
[8] Han R. and Sheth A. Adaptive power control and selective radio activation
for low-power infrastructure-mode 802.11 LANs. Department of Computer
Science, University of Colorado, May 2003, pp. 812-818.
[9] Chhaya H. and Gupta S. Performance of asynchronous data transfer methods
of IEEE 802.11 MAC protocol. IEEE Personal Communications, Oct. 1996, pp.
8-15.
[10] Gupta H., Khanna V. and Maheshwari S. Contention free data transfers in
IEEE 802.11 ad-hoc wireless LAN protocol: an analysis. Electrical Engineering
Department, Indian Institute of Technology, Oct. 2003, pp. 1062-1066.
[11] Tang C. and Wang C. A probability-based algorithm to adjust contention
window in IEEE 802.11 DCF. Department of Computer Science and
Engineering, University of Arkansas, Fayetteville, USA, June 2004, pp. 418-422.
[12] Cai Y., Luo F., Zhang H. and Zhou Z. Novel design and implementation of
IEEE 802.11 medium access control. Department of Electronic Engineering,
Tsinghua University, 29 Aug. 2004, pp. 278-282.
[13] Liu Y., Liu B. MAC implementation with embedded system.
Microelectronic Institute, Tsinghua University, Oct. 2003, pp. 757-760.
[14] Lee D., Lee G. and Park S. Effective co-verification of IEEE 802.11a
MAC/PHY combining Emulation and Simulation technology. System Integration
Technology Institute Information and Communications University, Apr. 2005,
pp. 138-146.
[15] Chang M. Wireless LAN Systems. Department of Electrical and Computer
Engineering Iowa State University, Dec. 13-14, 2003.
[16] Lin J., Qu X., Rao X., Shajian M., Wang Q. and Yeong J. 802.11a
MAC layer: firmware/hardware co-design. Institute for Infocomm Research,
Dec. 2003.
[17] Choi S. and Pavon P. 802.11g CP: a solution for IEEE 802.11g and
802.11b inter-working. School of Electrical Engineering Seoul National
University, Apr. 2003, pp. 690-694.
[18] Acharya A., Bansal S. and Misra A. High-performance architectures for
IP-based multi-hop 802.11 networks. Stanford University, Oct. 2003, pp. 22-28.
[19] McFarland B., Meng T., Su D. and Thomson S. Design and
implementation of an all-CMOS 802.11a wireless LAN chipset. Stanford
University, 2003, pp. 160-168.
[20] Nishida Y. Enhancing 802.11 DCF MAC for TCP/IP Communication. Sony
Computer Science Laboratories, Inc., Mar. 2005, pp. 13-17.
[21] Bruno R., Conti M. and Gregori E. IEEE 802.11 optimal performances:
RTS/CTS mechanism vs. basic access. Italian National Council of Research,
Sept. 2002, pp. 1747-1751.
[22] Liu H. and Wu J. Packet Telephony support for the IEEE 802.11 Wireless
LAN. IEEE communications letters, Sept. 2000, pp. 286-268.
[23] Cho K., Haewon J., Lee H. and Youjin K. MAC implementation for
IEEE 802.11 Wireless LAN. Router Technology Department, Electronics &
Telecommunications Research Institute, Apr. 2001, pp. 191-195.

PROGRAMMING CODE

1. Txtop.vhd
library ieee;
use ieee.std_logic_1164.all;
entity txtop is
port ( mpdu:out std_logic_vector(7 downto 0);
clk,rst,phy_idle,nav_str,probereply,
authreply,assoreply,rtsreply,datareply:in std_logic;
datain,add1:in std_logic_vector(7 downto 0);
data_msdu_in:in std_logic_vector(7 downto 0));
end ;
architecture txtop of txtop is

component assoreq_block
port(clk,rst,asso_req,header_end:in std_logic;
datain:in std_logic_vector(7 downto 0);
crc_stp_ass:out std_logic;
crcout:in std_logic_vector(31 downto 0);
assofrm:out std_logic_vector(7 downto 0);
asso_end:out std_logic);
end component;
component authreq_block
port(clk,rst,auth_req,header_end:in std_logic;
datain:in std_logic_vector(7 downto 0);
crc_stp_auth:out std_logic;
crcout:in std_logic_vector(31 downto 0);
authfrm:out std_logic_vector(7 downto 0);
auth_end:out std_logic);
end component;
component probreq_block
port(clk,rst,probe_req:in std_logic;
datain:in std_logic_vector(7 downto 0);
header_end: in std_logic;
crc_stp_prob:out std_logic;
crcout:in std_logic_vector(31 downto 0);
probfrm:out std_logic_vector(7 downto 0);
prob_end:out std_logic);
end component;
component rtsfrmo
port(clk,rst,rts_en:in std_logic;
datain:in std_logic_vector(7 downto 0);
crc_stp_rts:out std_logic;
crcout:in std_logic_vector(31 downto 0);
rtsfrm:out std_logic_vector(7 downto 0);
rts_end:out std_logic);
end component;
component backoff
port(clk_1M,rst,backoff_str,tx_fail,phy_idle:in std_logic;
backoff_end,backoff_hault:out std_logic);
end component;
component framecntrl
port
(rst,tx_fail:in std_logic;
contrl_frame:out std_logic_vector(15 downto 0);
typ:in std_logic_vector(1 downto 0);
subtyp:in std_logic_vector(3 downto 0));
end component;
component frame_enblock
port(rst,probreq,assoreq,authreq,rtsreq,datareq:in std_logic;
typ :out std_logic_vector(1 downto 0);
subtyp :out std_logic_vector(3 downto 0));
end component;

component control_state_tx
port(clk,rst,phy_idle,backoff_end,backoff_hault:in std_logic;
nav_str,difs_end,probereply,tx_fail,sifs_end:in std_logic;
authreply,assoreply:in std_logic;
rtsreply,datareply:in std_logic;
prob_end,auth_end,asso_end,rts_end,data_end,header_end:in std_logic;
difs_str,backoff_str,sifs_str,header_en,mngmnt,data_req,probe_req,
auth_req,data_en,auth_en,asso_req,asso_en,rts_req,rts_en,crc_en,mach,
probreq,assoreq,authreq,rtsreq,datareq:out std_logic);
end component;
component header2
port(add1:in std_logic_vector(7 downto 0);
clk,rst:in std_logic;
datain:in std_logic_vector(7 downto 0);
header_en:in std_logic;
mngmnt,mach: in std_logic;
header_end : out std_logic;
headerout:out std_logic_vector(7 downto 0);
contrl_frame:in std_logic_vector(15 downto 0));
end component;
component crc
port(clk,rst: in std_logic;
crc_en,crc_stp : in std_logic;
datain : in std_logic_vector(7 downto 0);
crcout: out std_logic_vector(31 downto 0));
--crcout1error: out std_logic;
end component;
component mpdufrm
port(header_end,header_en,probe_req,asso_req,auth_req,rts_en,
data_en:in std_logic;
mpdu:out std_logic_vector(7 downto 0);
probfrm,assofrm,authfrm,data_msdu,rtsfrm,headerout:in
std_logic_vector(7 downto 0));
end component;
component data_tx_block
port(data_en,clk,rst:in std_logic;
crc_stp_data,data_end:out std_logic;
data_msdu_in:in std_logic_vector(7 downto 0);
data_msdu:out std_logic_vector(7 downto 0));
end component;
component difs
port(clk_1M,rst,difs_str,sifs_str:in std_logic;
difs_end,sifs_end:out std_logic);
end component;
component clk1M_gen
port(clk,rst:in std_logic;
clk_1M:out std_logic);
end component;
component crc_en_stp_block
port(crc_stp_ass,crc_stp_auth,crc_stp_data,crc_stp_prob,
crc_stp_rts:in std_logic;
crc_stp:out std_logic);
end component;
signal typ:std_logic_vector( 1 downto 0);
signal subtyp:std_logic_vector(3 downto 0);
signal crcout:std_logic_vector(31 downto 0);
signal
asso_req,header_end,crc_stp,asso_end,auth_req,auth_end,crc_stp_ass,
crc_stp_auth,crc_stp_prob,crc_stp_rts,crc_stp_data,
prob_req,prob_end,rts_en,rts_end,clk_1M,backoff_str,tx_fail,
backoff_end,backoff_hault,
mngmnt,mach,crc_en,difs_end,sifs_end,data_req,probe_req,data_en,
auth_en,asso_en,rts_req,data_end,
difs_str,sifs_str,header_en,probreq,assoreq,authreq,rtsreq,datareq:
std_logic;
signal
assofrm,authfrm,probfrm,rtsfrm,headerout,data_msdu,din:
std_logic_vector(7 downto 0);
signal contrl_frame :std_logic_vector(15 downto 0);
begin
association:assoreq_block port
map(clk,rst,asso_req,header_end,datain,crc_stp_ass,
crcout,assofrm,asso_end);
authentication:authreq_block port
map(clk,rst,auth_req,header_end,datain,crc_stp_auth,
crcout,authfrm,auth_end);
probeframe:probreq_block port
map(clk,rst,probe_req,datain,header_end,crc_stp_prob,crcout,probfrm,
prob_end);

rtsframe_block:rtsfrmo port
map(clk,rst,rts_en,datain,crc_stp_rts,crcout,rtsfrm,rts_end);
backoffunit :backoff port map
(clk_1M,rst,backoff_str,tx_fail,phy_idle,backoff_end,backoff_hault);
framecontrol :framecntrl port map(rst,tx_fail,contrl_frame,typ,subtyp);
frameenable :frame_enblock port
map(rst,probreq,assoreq,authreq,rtsreq,datareq,
typ,subtyp);
statemach :control_state_tx port map
(clk,rst,phy_idle,backoff_end,backoff_hault,
nav_str,difs_end,probereply,tx_fail,sifs_end,
authreply,assoreply,rtsreply,datareply,
prob_end,auth_end,asso_end,rts_end,data_end,header_end,
difs_str,backoff_str,sifs_str,header_en,mngmnt,data_req,probe_req,
auth_req,data_en,auth_en,asso_req,asso_en,rts_req,rts_en,crc_en,mach,
probreq,assoreq,authreq,rtsreq,datareq);

header :header2 port map
(add1,clk,rst,datain,header_en,mngmnt,mach,header_end,headerout,
contrl_frame);
crcblock :crc port map (clk,rst,crc_en,crc_stp,datain,crcout);
mpduframe :mpdufrm port
map(header_end,header_en,probe_req,asso_req,auth_req,rts_en,data_en,
mpdu,probfrm,assofrm,authfrm,data_msdu,rtsfrm,headerout);
datablock :data_tx_block port map
(data_en,clk,rst,crc_stp_data,data_end,data_msdu_in,data_msdu);
difsblock : difs port map(clk_1M,rst,difs_str,sifs_str,difs_end,sifs_end);
clock1M: clk1M_gen port map(clk,rst, clk_1M);
crcstop: crc_en_stp_block port
map(crc_stp_ass,crc_stp_auth,crc_stp_data,crc_stp_prob,crc_stp_rts,
crc_stp );
end txtop;

2. Assoreg.vhd
library ieee;
use ieee.std_logic_1164.all;
entity assoreq_block is
port(clk,rst,asso_req,header_end:in std_logic;
datain:in std_logic_vector(7 downto 0);
crc_stp_ass:out std_logic;
crcout:in std_logic_vector(31 downto 0);
assofrm:out std_logic_vector(7 downto 0);
asso_end:out std_logic);
end;
architecture assoreq_block of assoreq_block is
  type stateasso is (idleasso,sprt_rt,cap_inf,list_int,ssid,fcs);
  signal st_asso:stateasso;
  signal countr: integer;                        -- byte index within the frame
  signal crcsig :std_logic_vector(31 downto 0);  -- FCS shift register
begin
  -- Emits the association request body one byte per clock while asso_req is high.
  process(clk,rst)
  begin
    if(rst='1') then
      assofrm<="00000000";
      countr<=0;
      st_asso<=idleasso;
      crc_stp_ass<='Z';
    elsif(clk'event and clk='1') then
      if(asso_req='1') then
        case st_asso is
          when idleasso=>
            -- the 24-byte MAC header occupies bytes 0 to 23
            asso_end<='0';
            crc_stp_ass<='Z';
            st_asso<=cap_inf;
            countr<=24;
          when cap_inf=>   -- capability information, bytes 24 to 25
            if(countr=25) then
              st_asso<=list_int;
              assofrm<=datain;
              countr<=26;
            else
              assofrm<=datain;
              countr<=countr+1;
            end if;
          when list_int=>  -- listen interval, bytes 26 to 27
            if(countr=27) then
              st_asso<=ssid;
              assofrm<=datain;
              countr<=28;
            else
              assofrm<=datain;
              countr<=countr+1;
            end if;
          when ssid=>      -- SSID, bytes 28 to 59
            if(countr=59) then
              st_asso<=sprt_rt;
              assofrm<=datain;
              countr<=60;
            else
              assofrm<=datain;
              countr<=countr+1;
            end if;
          when sprt_rt=>   -- supported rates, bytes 60 to 67
            if(countr=67) then
              st_asso<=fcs;
              assofrm<=datain;
              crc_stp_ass<='1';  -- stop the CRC unit and latch its result
              countr<=68;
              crcsig<=crcout;
            else
              assofrm<=datain;
              countr<=countr+1;
            end if;
          when fcs=>       -- frame check sequence, bytes 68 to 71
            if(countr=71) then
              asso_end<='1';
              st_asso<=idleasso;
              countr<=0;
              assofrm<=crcsig(31 downto 24);
            else
              -- send the top byte, then shift the next byte into place
              assofrm<=crcsig(31 downto 24);
              crcsig<=crcsig(23 downto 0)&x"00";
              countr<=countr+1;
            end if;
        end case;
      end if;
    end if;
  end process;
end assoreq_block;

3. Authreg_block.vhd
library ieee;
use ieee.std_logic_1164.all;
entity authreq_block is
port(clk,rst,auth_req,header_end:in std_logic;
datain:in std_logic_vector(7 downto 0);
crc_stp_auth:out std_logic;
crcout:in std_logic_vector(31 downto 0);
authfrm:out std_logic_vector(7 downto 0);
auth_end:out std_logic);
end;
architecture authreq_block of authreq_block is
  type stateauth is (idleauth,algno,transeq_no,st_cde,chllg_txt,fcs);
  signal st_auth:stateauth;
  signal countr: integer;                       -- byte index within the frame
  signal crcsig:std_logic_vector(31 downto 0);  -- FCS shift register
begin
  -- Emits the authentication request body one byte per clock while auth_req is high.
  process(clk,rst)
  begin
    if (rst='1') then
      authfrm<="00000000";
      countr<=0;
      st_auth<=idleauth;  -- added so that reset also restarts the state machine
      crc_stp_auth<='Z';
    elsif (clk'event and clk='1') then
      if(auth_req='1') then
        case st_auth is
          when idleauth=>
            -- the 24-byte MAC header occupies bytes 0 to 23
            auth_end<='0';
            st_auth<=algno;
            countr<=24;
            authfrm<=datain;
            crc_stp_auth<='Z';
          when algno=>       -- algorithm number, bytes 24 to 25
            if (countr=25) then
              st_auth<=transeq_no;
              authfrm<=datain;
              countr<=26;
            else
              authfrm<=datain;
              countr<=countr+1;
            end if;
          when transeq_no=>  -- transaction sequence number, bytes 26 to 27
            if(countr=27) then
              st_auth<=st_cde;
              authfrm<=datain;
              countr<=28;
            else
              authfrm<=datain;
              countr<=countr+1;
            end if;
          when st_cde=>      -- status code, bytes 28 to 29
            if(countr=29) then
              authfrm<=datain;
              st_auth<=chllg_txt;
              countr<=30;
            else
              authfrm<=datain;
              countr<=countr+1;
            end if;
          when chllg_txt=>   -- challenge text, bytes 30 to 93
            if(countr=93) then
              authfrm<=datain;
              st_auth<=fcs;
              crc_stp_auth<='1';  -- stop the CRC unit and latch its result
              countr<=94;
              crcsig<=crcout;
            else
              authfrm<=datain;
              countr<=countr+1;
            end if;
          when fcs=>         -- frame check sequence, bytes 94 to 97
            if(countr=97) then
              auth_end<='1';
              st_auth<=idleauth;
              countr<=0;
              authfrm<=crcsig(31 downto 24);
            else
              -- send the top byte, then shift the next byte into place
              authfrm<=crcsig(31 downto 24);
              crcsig<=crcsig(23 downto 0)&x"00";
              countr<=countr+1;
            end if;
        end case;
      end if;
    end if;
  end process;
end authreq_block;
4. Backoff.vhd
library ieee;
use ieee.std_logic_1164.all;
entity backoff is
port(clk_1M,rst,backoff_str,tx_fail,phy_idle:in std_logic;
backoff_end,backoff_hault:out std_logic);
end;
architecture backoff of backoff is
  signal random,cnt_back:integer;  -- back-off slot count, microsecond count
  signal slot_end:std_logic;       -- high for one clock at the end of each slot
begin
  -- Back-off counter: counts down by one for every idle slot.
  process(clk_1M,rst)
  begin
    if(rst='1') then
      backoff_end<='0';
      backoff_hault<='0';
      random<=31;  -- initial count, equal to aCWmin for 802.11b
    elsif(clk_1M='1' and clk_1M'event) then
      if(random=0) then
        backoff_end<='1';
      elsif(tx_fail='1') then
        random<=(random*2);  -- lengthen the back-off after a failed transmission
      elsif(phy_idle='0' and backoff_str='1') then
        random<=random;      -- medium busy: freeze the counter
        backoff_hault<='1';
      elsif(phy_idle='1' and backoff_str='1' and slot_end='1') then
        random<=(random-1);
      end if;
    end if;
  end process;
  -- Slot timer: 20 cycles of the 1 MHz clock make one 20 us slot.
  process(clk_1M,rst)
  begin
    if(rst='1') then
      cnt_back<=0;
      slot_end<='0';
    elsif(clk_1M='1' and clk_1M'event) then
      if(backoff_str='1') then
        if(cnt_back=19) then
          cnt_back<=0;
          slot_end<='1';
        else
          cnt_back<=cnt_back+1;
          slot_end<='0';
        end if;
      end if;
    end if;
  end process;
end backoff;
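
The back-off listing above doubles its counter on tx_fail without an upper
bound. For comparison, the 802.11 rule grows the contention window as
CW = 2(CW + 1) - 1 up to CWmax and returns to CWmin after a success. The
following is a minimal sketch of that rule only, not part of the thesis
design: the entity name cw_update, the tx_ok input and the generics are
hypothetical, and tx_fail and tx_ok are assumed to be one-clock pulses.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical helper entity, shown for illustration only.
entity cw_update is
  generic (
    CW_MIN : natural := 31;    -- aCWmin for the 802.11b DSSS PHY
    CW_MAX : natural := 1023   -- aCWmax for the 802.11b DSSS PHY
  );
  port (
    clk_1M  : in  std_logic;
    rst     : in  std_logic;
    tx_fail : in  std_logic;   -- assumed one-clock pulse: attempt failed
    tx_ok   : in  std_logic;   -- assumed one-clock pulse: attempt succeeded
    cw      : out natural range 0 to CW_MAX
  );
end entity cw_update;

architecture rtl of cw_update is
  signal cw_reg : natural range 0 to CW_MAX := CW_MIN;
begin
  process (clk_1M, rst)
  begin
    if rst = '1' then
      cw_reg <= CW_MIN;
    elsif rising_edge(clk_1M) then
      if tx_ok = '1' then
        cw_reg <= CW_MIN;               -- success: back to the minimum window
      elsif tx_fail = '1' then
        if cw_reg >= (CW_MAX - 1) / 2 then
          cw_reg <= CW_MAX;             -- saturate at the maximum window
        else
          cw_reg <= 2 * cw_reg + 1;     -- same as 2*(CW+1)-1
        end if;
      end if;
    end if;
  end process;

  cw <= cw_reg;
end architecture rtl;
```

With the default generics, repeated failures step the window through 31, 63,
127, 255, 511 and 1023, where it stays until a success. The back-off count
itself would then be drawn uniformly from 0 to cw, which this sketch does not
cover.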
List of publications

[1] Mittal N., Akashe S. and Sharma S. Implementation of High Performance
Wi-Fi MAC layer for Transmitter on FPGA. Institute for Mathematics,
Bio-Informatics, Information-Technology and Computer-Science IMBIC, Vol. 6,
Apr. 2011, pp. 1-12.

[2] Mittal N. and Agarwal N. VHDL Modeling of Wi-Fi MAC Layer for
Transmitter. AICTE Sponsored National Seminar on Recent Trends & Advances
in VLSI Design, Gwalior, December 5-7, 2011, pp-15.
