1.0 Introduction
There are many companies that develop core IP for SoC products. The
interfaces to these cores can differ from company to company and can
sometimes be proprietary in nature. The SoC developer then must
expend time, effort, and money to create bridge or glue logic that
allows all of the cores inside the SoC to communicate properly with each
other. Incompatible interfaces are thus barriers to both IP developers and
SoC developers. SoC integrated circuits envisioned by this subcommittee
span a wide breadth of applications, target system costs, and levels of
performance and integration.
Integrated circuits have entered the era of System-on-a-Chip (SoC),
which refers to integrating all components of a computer or other
electronic system into a single chip. It may contain digital, analog, mixed-signal, and often radio-frequency functions all on a single chip
substrate. As design sizes increase, reusable IP becomes an inevitable choice for
SoC design. The widespread use of many kinds of IP has changed the
nature of the design flow, making On-Chip Buses (OCB) essential to the
design.
Of all OCBs existing in the market, the AMBA bus system is widely
used as the de facto standard SoC bus, particularly in high-performance
SoC designs. On March 8, 2010, ARM announced availability of the
AMBA 4.0 specifications. The AMBA specification defines an on-chip
communication standard for designing high-performance embedded
microcontrollers.
designs with large numbers of controllers and peripherals.
technical direction more than reality: increasing chip
as route directly from ARM between designs with large numbers of
controllers and of SoCs with different power, performance and area
requirements. With its ACE, AXI, AHB and APB interface protocols,
AMBA 4 has the flexibility to match every requirement.
2. Multi-Layer
The Multi-layer architecture acts as a crossbar switch between
masters and slaves in an AMBA 3 AXI or AHB system. The parallel
wide adoption of AMBA specifications throughout the modular system
design to improve processor
In this project
between modules. ARM then introduced AHB and AXI in later
versions of AMBA.
The ARM AMBA (Advanced Microcontroller Bus Architecture) family includes a set of
buses: ASB, AHB and AXI. The AHB (Advanced High-performance Bus) is
considered (despite the bus name) a mid-performance bus and the AXI
(Advanced eXtensible Interface) bus is considered a high-performance
bus. The AMBA family of buses was chosen for consideration because of
their widespread acceptance in the industry and large amount of existing
IP cores.
History of High-Performance Buses
1. ASB (Advanced System Bus)
2. AHB (Advanced High-performance Bus)
3. AXI (Advanced eXtensible Interface)
   1. AXI4
   2. AXI4-Lite
   3. AXI4-Stream
1. ASB (Advanced System Bus)
The Advanced System Bus (ASB) specification defines a high-performance bus that can be used in the design of high-performance 16
and ARADDR, but this could be extended in some implementations.
The write and read data buses (WDATA and RDATA) may be defined
under the specification as any power-of-two width (2^n), from 8-bit to
1024-bit. Assuming that both address buses are 32-bit and that the data
buses are 128-bit, the write address, write data, and write response
channels would require 56, 139, and 8 I/O, respectively. The read
address and read data channels would require 56 and 137 I/O,
respectively. Thus, each 128-bit AXI master has 396 I/O total.
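The total follows directly from the per-channel figures. The short Python check below simply adds the five counts quoted in the text; the dictionary keys are illustrative names, and the per-channel numbers are taken as given rather than derived signal by signal.

```python
# Per-channel I/O counts quoted in the text for a 128-bit AXI master
# (32-bit address buses, 128-bit data buses). The key names are
# illustrative; the counts are taken from the text as given.
channel_io = {
    "write_address": 56,
    "write_data": 139,
    "write_response": 8,
    "read_address": 56,
    "read_data": 137,
}

total_io = sum(channel_io.values())
print(total_io)  # 396
```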
interconnect, which routes master requests and write data to the proper
slave, and returns read data to the requesting master. The interconnect
also maintains ordering based on tags if, for example, a single master
pipelines read requests to different slaves.
AXI uses a handshake between VALID and READY signals. VALID is
driven by the source, and READY is driven by the destination. Transfer of
information, either address and control or data, occurs when both VALID
and READY are sampled high. AXI, the third generation of AMBA
interface defined in the AMBA 3 specification, is targeted at high
performance, high clock frequency system designs and includes features
which make it very suitable for high speed sub-micrometer interconnect.
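The VALID/READY rule can be illustrated with a small cycle-by-cycle model. This is a minimal sketch, not a bus-functional model: the function name `transfers` and the sample waveforms are hypothetical, and each list element stands for the value sampled at one rising clock edge.

```python
def transfers(valid, ready):
    """Return the cycle indices at which a transfer occurs.

    valid and ready are per-cycle samples (0 or 1) of the source's VALID
    and the destination's READY. A transfer happens only in cycles where
    both are sampled high.
    """
    return [cycle for cycle, (v, r) in enumerate(zip(valid, ready)) if v and r]


# The source asserts VALID in cycle 1 and holds it until the destination
# raises READY in cycle 3; a second item is transferred in cycle 5.
valid = [0, 1, 1, 1, 0, 1]
ready = [1, 0, 0, 1, 1, 1]
print(transfers(valid, ready))  # [3, 5]
```

Because either side can stall the other, the same rule works unchanged on all five AXI channels.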
Wait states.
Error reporting.
Transaction protection.
latest bridge.
1. ASB to APB Bridge
The APB Bridge provides an interface between the ASB and the
Advanced Peripheral Bus (APB). It continues the pipelining of the ASB by
inserting wait cycles on the ASB only when they are needed. It inserts
them for burst transfers or read transfers when the ASB must wait for
the APB [Ref 9].
In the AMBA specification, APB accesses are word-wide (32 bits). The
AHB2APB provides the signal pbyte_enable[3:0] to allow byte and halfword accesses. These can be used by the APB peripheral as necessary.
The AHB2APB does not perform any alignment of the data, but transfers
data from the AHB to the APB for write cycles and from the APB to the
AHB for read cycles.
The AHB2APB does not support burst transfers and converts any
burst transfers into a series of APB accesses. The AHB slave interface
supplied by the AHB2APB does not make use of the split response
protocol [Ref 10].
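As an illustration of how byte enables can describe byte and halfword accesses on a 32-bit bus, the sketch below derives a 4-bit enable mask from an address and a transfer size. The encoding is an assumption made for this example (bit i enables byte lane i, little-endian lanes, aligned accesses only); the text does not give the exact pbyte_enable[3:0] encoding used by the AHB2APB, and the function name `byte_enable` is hypothetical.

```python
def byte_enable(addr, size_bytes):
    """Return a 4-bit byte-enable mask for an aligned access on a 32-bit bus.

    Assumption for illustration only: bit i of the mask enables byte
    lane i, and the lane is selected by the two low address bits.
    """
    if size_bytes not in (1, 2, 4):
        raise ValueError("size must be 1, 2 or 4 bytes")
    if addr % size_bytes != 0:
        raise ValueError("unaligned access")
    lane = addr & 0x3                          # byte lane within the word
    return ((1 << size_bytes) - 1) << lane     # contiguous enabled lanes


print(format(byte_enable(0x1002, 1), "04b"))  # 0100 (byte in lane 2)
print(format(byte_enable(0x1002, 2), "04b"))  # 1100 (upper halfword)
print(format(byte_enable(0x1000, 4), "04b"))  # 1111 (full word)
```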
Features of bridge
The Xilinx AXI to APB Bridge is a soft IP core with these features:
1. AXI interface is based on the AXI4-Lite specification
2. APB interface is based on the APB3 specification, supports optional APB4 selection
3. Supports 1:1 (AXI:APB) synchronous clock ratio
4. Connects as a 32-bit slave on 32-bit AXI4-Lite
5. Connects as a 32-bit master on 32-bit APB3/APB4
6. Supports optional data phase time out
features or sufficient bandwidth to support common SoC applications.
3. The ease of interfacing to Power Architecture processors: Given that
the main goal of Power.org is to promote Power Architecture processors, it
would make little sense for the Power.org Bus
Advanced eXtensible Interface. The APB address and data bus widths are fixed at 32 bits.
read/write response.
RAM, and Processor are directly interfaced to low
of clock timings between the modules. To
2. enables high-frequency operation without using complex bridges
3. meets the interface requirements of a wide range of components
4. is suitable for memory controllers with high initial access
latency
5. provides flexibility in the implementation of interconnect
architectures
6. is backward-compatible with existing AHB and APB interfaces.
The key features of the AXI protocol are:
1. Separate address/control and data phases
2. Support for unaligned data transfers, using byte strobes
3. Burst-based transactions with only the start address
issued
4. Separate read and write data channels that can provide low-cost
DMA
5. Support for issuing multiple outstanding addresses
6. Support for out-of-order transaction completion
7. Permits easy addition of register stages to provide timing
closure.
The AXI protocol also includes optional extensions that cover signaling
for low-power operation.
AXI Revisions
1. AXI4
2. AXI4-Lite
3. AXI4-stream
Note: Of these, AXI4-Lite is used.
1. AXI4:
The AXI4 protocol is an update to AXI3 to enhance the performance
and utilization of the interconnect when used by multiple masters. It
includes the following enhancements:
1. Support for burst lengths up to 256 beats
2. Quality of Service signaling
3. Support for multiple region interfaces
2. AXI4-Lite:
3. AXI4-Stream:
This allows for full bus utilization, even with interfaces with extremely
long latency.
that reads with the same ARID from different slaves return read data in
the same order in which the requests were made.
5. Write Data Interleaving:
Multiple AXI masters may attempt to write data to a single slave at
the same time. Some masters buffer all of the write data before making
the request, but others assemble data to send after the request is made.
AXI slaves that support write data interleaving can accept write data
from both types of masters in what looks like a single data tenure, with
the AWID value switching between the masters sending write data. This
allows the interconnect to avoid stalling the write data bus to the slave
while waiting for write data to be assembled. Instead, the interconnect sends
write data from a buffered write master in between beats of write data
from the assembling write master.
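From the slave's point of view, interleaved write data is simply a stream of beats tagged with an ID. The sketch below shows how such a stream could be regrouped per ID while keeping each master's beats in order. It is an illustration only: the `(id, data)` tuple format and the function name `deinterleave` are assumptions, and real slaves do this in hardware with per-ID buffers.

```python
from collections import defaultdict


def deinterleave(beats):
    """Group interleaved write beats by ID, preserving per-ID order.

    beats is a list of (id, data) tuples in the order they appear on
    the write data bus.
    """
    bursts = defaultdict(list)
    for beat_id, data in beats:
        bursts[beat_id].append(data)
    return dict(bursts)


# ID 0 is a buffered master whose data is ready; ID 1 assembles its data
# slowly, so ID 0's beats fill the gaps between ID 1's beats.
beats = [(1, "B0"), (0, "A0"), (0, "A1"), (1, "B1"), (0, "A2")]
print(deinterleave(beats))  # {1: ['B0', 'B1'], 0: ['A0', 'A1', 'A2']}
```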
6. Separate Data Buses:
AXI doubles the peak bandwidth at a given frequency by using
separate buses for read and write data. These buses are used in
conjunction with data pipelining to increase performance.
7. Handshaking:
8. Exclusive Access:
AXI supports an exclusive access mechanism that enables
semaphore types of operations without requiring the bus to be locked.
The process is:
1) A master requests an exclusive read from an address location.
2) The same master, some time later, attempts to complete the exclusive
operation by attempting an exclusive write to the same address. The
exclusive write is only able to complete successfully if no other master
has written to that location between the exclusive read and write.
This mechanism could be used to improve the performance of Power
processors attached to AXI.
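The exclusive access sequence can be sketched as a small software model of a slave-side monitor. This is a simplified illustration under stated assumptions: the class `ExclusiveMonitor`, its method names, and the one-reservation-per-master policy are hypothetical. The response names follow the AXI convention that a successful exclusive write returns EXOKAY and a failed one returns OKAY without updating memory.

```python
class ExclusiveMonitor:
    """Toy model of exclusive access tracking (one reservation per master)."""

    def __init__(self):
        self.memory = {}
        self.reservations = {}  # master id -> reserved address

    def exclusive_read(self, master, addr):
        # Step 1: record that this master is monitoring addr.
        self.reservations[master] = addr
        return self.memory.get(addr, 0)

    def write(self, master, addr, value):
        # Any write to addr cancels other masters' reservations on it.
        self.memory[addr] = value
        for other, reserved in list(self.reservations.items()):
            if other != master and reserved == addr:
                del self.reservations[other]

    def exclusive_write(self, master, addr, value):
        # Step 2: succeed only if the reservation is still intact.
        if self.reservations.get(master) != addr:
            return "OKAY"      # exclusive write failed, memory unchanged
        self.write(master, addr, value)
        del self.reservations[master]
        return "EXOKAY"        # exclusive write succeeded


mon = ExclusiveMonitor()
mon.exclusive_read("M0", 0x100)
mon.exclusive_read("M1", 0x100)
print(mon.exclusive_write("M1", 0x100, 1))  # EXOKAY
print(mon.exclusive_write("M0", 0x100, 2))  # OKAY (M1 wrote in between)
print(mon.memory[0x100])                    # 1
```

The bus is never locked: the losing master simply sees the failed response and retries the read-modify-write sequence.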
9. No Slave Burst Termination:
Once an AXI slave acknowledges a burst transfer, it is responsible for
accepting all of the write data or generating all the read data associated
with that burst. This simplifies master designs, since the master does not
have to prepare to make a subsequent request if the slave terminated the
original request before all of its data was transferred. Giving slaves the
ability to terminate bursts is not required when the burst length is
declared with the request, and is reasonably short. AXI3 bursts are 16
beats or fewer (AXI4 extends INCR bursts to 256 beats).
1. A write data channel to transfer data from the master to the slave.
In a write transaction, the slave uses the write response channel to
signal the completion of the transfer to the master.
2. A read data channel to transfer data from the slave to the master.
The AXI protocol:
1. Permits address information to be issued ahead of the actual data
transfer
2. Supports multiple outstanding transactions
3. Supports out-of-order completion of transactions.
Figure (5.2) shows how a read transaction uses the read address and
read data channels.
include a LAST signal to indicate the transfer of the final data item in a
transaction.
1. Read and write address channels
Read and write transactions each have their own address channel
which carries all of the required address and control information for a
transaction.
2. Read data channel
The read data channel conveys both the read data and read
response information from the slave back to the master. It includes:
1. The data bus, which can be 8, 16, 32, 64, 128, 256, 512, or 1024
bits wide (RDATA)
2. A read response indicating the completion status of the read
transaction.
3. Data and response signals, which are held until the RREADY
signal is asserted.
3. Write data channel
The write data channel conveys the write data from the master to
the slave and includes:
1. the data bus, which can be 8, 16, 32, 64, 128, 256, 512, or 1024
APB revisions
The APB Specification Rev E, released in 1998, is now obsolete and
is superseded by the following three revisions:
1. AMBA 2 APB Specification
2. AMBA 3 APB Protocol Specification v1.0
3. AMBA APB Protocol Specification v2.0.
Of these, this work uses the AMBA APB Protocol Specification v2.0.
The APB Bus is the lowest performance bus in the AMBA family.
There are separate address (PADDR), write data (PWDATA), and read data
(PRDATA) buses, up to 32-bits each. With 7 additional control signals,
there can be up to 103 I/O for each APB slave. There is one APB master,
usually the bridge from a higher performance bus that begins a transfer
by asserting the appropriate PSELn signal with PADDR. PWRITE is active
for a write and inactive on a read. PENABLE is asserted in the second
clock, and is held active until PREADY is returned by the slave. The
minimum transfer, read or write, is two clocks. APB slaves also have the
option of inserting wait states for reads or writes by withholding PREADY.
There is an optional PSLVERR signal used by the slave to report an error
on a read or write with PREADY.
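The APB transfer timing described above can be sketched as a per-clock trace. The model below is a simplified illustration, not a full protocol model: the function name `apb_transfer`, the phase labels, and the tuple layout are assumptions made for this example. It shows the setup clock (PSEL high, PENABLE low) followed by access clocks (PENABLE high) that repeat until the slave returns PREADY.

```python
def apb_transfer(wait_states=0):
    """Return a per-clock trace of one APB transfer.

    Each entry is (phase, PSEL, PENABLE, PREADY). PREADY is shown as
    None in the setup phase because it is not sampled there.
    """
    trace = [("SETUP", 1, 0, None)]         # first clock: PSELn with PADDR/PWRITE
    for _ in range(wait_states):
        trace.append(("ACCESS", 1, 1, 0))   # slave withholds PREADY
    trace.append(("ACCESS", 1, 1, 1))       # slave returns PREADY, transfer ends
    return trace


print(len(apb_transfer()))               # 2 (minimum transfer is two clocks)
print(len(apb_transfer(wait_states=3)))  # 5
```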
Figure (5.1) shows the component signal connections. The bridge uses
AMBA AXI-Lite and APB signals as described in the AMBA AXI-Lite 4.0
protocol specification.
independent clock frequency and phase. For every AXI channel, invalid
commands are not forwarded and an error response is generated. That is,
if the peripheral being accessed does not exist, the APB Bridge generates
a DECERR response through the response channel (read or write). If
the target peripheral exists but asserts PSLVERR, the bridge returns a
SLVERR response.
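The response selection described above can be summarized as a small decision function. This is a sketch under stated assumptions: the function name `bridge_response` and the `address_map` format (a list of `(base, size)` ranges that decode to APB peripherals) are hypothetical. The numeric values are the standard AXI response encodings (OKAY = 0b00, SLVERR = 0b10, DECERR = 0b11).

```python
OKAY, SLVERR, DECERR = 0b00, 0b10, 0b11  # AXI RRESP/BRESP encodings


def bridge_response(addr, address_map, pslverr):
    """Choose the AXI response for one bridged access.

    address_map is a list of (base, size) ranges that decode to an APB
    peripheral; pslverr is the PSLVERR value returned by the selected
    peripheral (ignored when no peripheral is selected).
    """
    selected = any(base <= addr < base + size for base, size in address_map)
    if not selected:
        return DECERR                     # no peripheral at this address
    return SLVERR if pslverr else OKAY    # peripheral exists; forward its error


address_map = [(0x4000_0000, 0x1000), (0x4000_1000, 0x1000)]  # hypothetical map
print(bridge_response(0x4000_0004, address_map, pslverr=0))   # 0 (OKAY)
print(bridge_response(0x4000_1008, address_map, pslverr=1))   # 2 (SLVERR)
print(bridge_response(0x5000_0000, address_map, pslverr=0))   # 3 (DECERR)
```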
6.1 Asynchronous FIFO
An asynchronous FIFO refers to a FIFO design where data values
are written to a FIFO buffer from one clock domain and the data values
are read from the same FIFO buffer from another clock domain, where
the two clock domains are asynchronous to each other.
Asynchronous FIFOs are used to safely pass data from one clock domain
to another clock domain.
There are many ways to do asynchronous FIFO design, including
many wrong ways. Most incorrectly implemented FIFO designs still
function properly 90% of the time. Most almost-correct FIFO designs
function properly 99%+ of the time. Unfortunately, FIFOs that work
properly 99%+ of the time have design flaws that are usually the most
difficult to detect and debug (if you are lucky enough to notice the bug
before shipping the product), or the most costly to diagnose and recall.
6.1.1 Passing multiple asynchronous signals:
Attempting to synchronize multiple changing signals from one
clock domain into a new clock domain and ensuring that all changing
signals are synchronized to the same clock cycle in the new clock domain
has been shown to be problematic. FIFOs are used in designs to safely
pass multi-bit data words from one clock domain to another.
Data words are placed into a FIFO buffer memory array by control
signals in one clock domain, and the data words are removed from
another port of the same FIFO buffer memory array by control signals
data word. If the receiver first had to increment the read pointer before
reading a FIFO data word, the receiver would clock once to output the
data word from the FIFO, and clock a second time to capture the data
word into the receiver. That would be needlessly inefficient.
The FIFO is empty when the read and write pointers are both equal.
This condition happens when both pointers are reset to zero during a
reset operation, or when the read pointer catches up to the write pointer,
having read the last word from the FIFO.
A FIFO is full when the pointers are again equal, that is, when the
write pointer has wrapped around and caught up to the read pointer.
This is a problem. The FIFO is either empty or full when the pointers are
equal, but which?
One design technique used to distinguish between full and empty is to
add an extra bit to each pointer. When the write pointer increments past
the final FIFO address, the write pointer will increment the unused MSB
while setting the rest of the bits back to zero as shown in Figure 1 (the
FIFO has wrapped and toggled the pointer MSB). The same is done with
the read pointer. If the MSBs of the two pointers are different, it means
that the write pointer has wrapped one more time than the read pointer.
If the MSBs of the two pointers are the same, it means that both
pointers have wrapped the same number of times.
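The extra-MSB technique can be tried out with a few lines of Python. This is a minimal single-clock sketch with plain binary pointers: it illustrates only the full/empty comparison, not the clock-domain synchronization an asynchronous FIFO also needs. The depth of 8 words and all names are assumptions made for the example.

```python
ADDR_BITS = 3                            # hypothetical FIFO depth of 8 words
DEPTH = 1 << ADDR_BITS
PTR_MASK = (1 << (ADDR_BITS + 1)) - 1    # pointers carry one extra MSB


def increment(ptr):
    """Advance a pointer, wrapping the address bits and toggling the MSB."""
    return (ptr + 1) & PTR_MASK


def is_empty(wptr, rptr):
    # All bits equal, including the extra MSB.
    return wptr == rptr


def is_full(wptr, rptr):
    # Address bits equal but MSBs differ: the write pointer has wrapped
    # one more time than the read pointer.
    return (wptr ^ rptr) == DEPTH


wptr = rptr = 0
print(is_empty(wptr, rptr), is_full(wptr, rptr))   # True False
for _ in range(DEPTH):                             # write 8 words
    wptr = increment(wptr)
print(is_empty(wptr, rptr), is_full(wptr, rptr))   # False True
for _ in range(DEPTH):                             # read 8 words
    rptr = increment(rptr)
print(is_empty(wptr, rptr), is_full(wptr, rptr))   # True False
```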
simply a state machine, is a mathematical model used to
which are electronic design automation, communication [...] other
engineering applications. [...] artificial intelligence research, state
machines or [...] of state machines are sometimes used to
7.3 AXI4-Lite
7.3.0 Definition of AXI4-Lite
This section defines the functionality and signal requirements of
AXI4-Lite components.
The key functionality of AXI4-Lite operation is:
1. All transactions are of burst length 1
2. All data accesses use the full width of the data bus.
such a device, a master must use a burst length that exactly matches the
size of the required data transfer.
In AXI4, transactions with INCR burst type and length greater than
16 can be converted to multiple smaller bursts,
2. Burst size
The maximum number of bytes to transfer in each data transfer, or beat,
in a burst, is specified by:
ARSIZE [2:0], for read transfers
AWSIZE [2:0], for write transfers.
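The burst size field encodes the number of bytes per beat as a power of two (bytes = 2^AxSIZE), and a long INCR burst can be broken into shorter ones by advancing the start address by the number of bytes already covered. The sketch below illustrates both ideas. It is a simplified example under stated assumptions: the function names are hypothetical, the start address is assumed to be aligned to the beat size, and address-boundary restrictions on bursts are not modeled.

```python
def bytes_per_beat(axsize):
    """Decode ARSIZE/AWSIZE[2:0] into bytes per beat (2**axsize)."""
    if not 0 <= axsize <= 7:
        raise ValueError("AxSIZE is a 3-bit field")
    return 1 << axsize


def split_incr_burst(start_addr, beats, axsize, max_beats=16):
    """Split an INCR burst into bursts of at most max_beats beats.

    Returns a list of (start_address, beat_count) pairs. Assumes an
    aligned start address; boundary rules are not checked.
    """
    step = bytes_per_beat(axsize)
    bursts = []
    addr = start_addr
    while beats > 0:
        count = min(beats, max_beats)
        bursts.append((addr, count))
        addr += count * step
        beats -= count
    return bursts


print(bytes_per_beat(0b010))                     # 4
print(split_incr_burst(0x1000, 40, 0b010))       # [(4096, 16), (4160, 16), (4224, 8)]
```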
7.4 Applications
Product Type
Application
Computing
Mobile Handset
Automotive
Infotainment, Navigation
Digital Home
Enterprise
Wireless
Infrastructure
References
1. Design and Implementation of APB Bridge based on AMBA 4.0 (IEEE
2011), ARM Limited.
2. http://en.wikipedia.org/wiki/System_on_a_chip#Structure
3. Power.org Embedded Bus Architecture Report Presented by the Bus
Architecture TSC Version 1.0 11 April 2008
4. http://www.arm.com/products/system-ip/amba/amba-openspecifications.php
5. ARM, "AMBA Protocol Specification 4.0", www.arm.com, 2010
6. ARM, AMBA Specification (Rev 2.0).
7. AMBA 4 AXI4, AXI4-Lite, and AXI4-Stream Protocol Assertions
Revision: r0p0 User Guide.
8. AMBA APB Protocol Version: 2.0 Specifications.
9. ASB Example AMBA System Technical Reference Manual Copyright
1998-1999 ARM Limited.
10. AHB to APB Bridge (AHB2APB) Technical Data Sheet, Part Number: TCS-PR-0005-100, Document Number: I-IPA01-0106-USR, Rev 05, March 2007.
11. LogiCORE IP AXI to APB Bridge (v1.00a) DS788 June 22, 2011 Product
Specification.
12. Simulation and Synthesis Techniques for Asynchronous FIFO Design,
Clifford E. Cummings, Sunburst Design, Inc., cliffc@sunburst-design.com,
SNUG San Jose 2002, Rev 1.2; FIFO Architecture, Functions, and
Applications, SCAA042A, November 1999.
13. ARM, "AMBA Protocol Specification 4.0", www.arm.com, 2010.