Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
S O L U T I O N S F O R A P R O G R A M M A B L E W O R L D
Motor Drives
Migrate to Zynq SoC
with Help from
MATLAB 32
www.xilinx.com/xcell
Xilinx SpeedWay
Design Workshops
Avnet, Inc. 2014. All rights reserved. AVNET is a registered trademark of Avnet, Inc.
Xilin, Zynq and Vivado are trademarks or registered trademarks of Xilinx, Inc.
Announcing
HAPS Developer eXpress Solution
Pre-integrated hardware and software for fast prototyping of complex IP systems
500K-144M ASIC gates
Ideal for IP and Subsystem
Validation
Design Implementation and
Debug Software Included
Flexible Interfaces for FMC
and HapsTrak
Plug-and-play with
HAPS-70
Designs come in all sizes. Choose a prototyping system that does too.
HAPS-DX, an extension of Synopsys HAPS-70 FPGA-based prototyping
product line, speeds prototype bring-up and streamlines the integration of
IP blocks into an SoC prototype.
ART DIRECTOR
Jacqueline Damian
Scott Blair
X comer, having joined the company in 2008, I find it remarkable how far and fast
Xilinx and the devices we make have advanced in just the last six years. I cant imag-
ine what it must be like for the employees who have been with the company essentially from
DESIGN/PRODUCTION Teie, Gelwicks & Associates the beginning, when Xilinx was a tiny startup on Hamilton Avenue in San Jose offering what
1-800-493-5551
to many seemed like a crazy technology with an even crazier business model.
FPGAs and the fabless semiconductor business model are mainstays of the industry
ADVERTISING SALES Dan Teie
1-800-493-5551 today, but back in 1984 that wasnt the case. Years ago, I had the pleasure of interviewing
xcelladsales@aol.com
Bill Carter, who in the Xilinx world is a bit of a living legend, having laid out the industrys
INTERNATIONAL Melissa Zhang, Asia Pacific
very first FPGA, the XC2064, and later becoming Xilinxs first CTO. Of the many great
melissa.zhang@xilinx.com recollections Bill shared regarding his years at Xilinx, one of the most memorable was the
story of his job interview with Xilinxs co-founders: Ross Freeman (who invented the
Christelle Moraga, Europe/
Middle East/Africa FPGA), Bernie Vonderschmitt and Jim Barnett. Basically, the founders laid out their plans
christelle.moraga@xilinx.com to come up with a reprogrammable device and have it manufactured by Seiko-Epson.
I took one look at the architecture and thought they were nuts, said Carter. Back then,
Tomoko Suto, Japan
tomoko@xilinx.com silicon real estate was precious and their new circuit structure took so many transistors to
implement. But the idea of being able to reprogram the hardware was revolutionary.
REPRINT ORDERS 1-800-493-5551
The business model was sort of nuts too for that time. Back in 1984, everyone manufac-
tured their own chips and the biggest barrier to entry into the business was securing enough
funding to build a factory. Carter recalled that Vonderschmitt knew manufacturing from his
days at RCA and could call on his friends at Seiko (to whom he had shown the silicon-man-
ufacturing ropes while working at RCA) to produce the chips.
So here was Bill Carter, with a young family and a mortgage to pay, working a solid job
for established Zilog, creator of the Z80. He walked away to take a chance with Xilinx. The
rest is history. Xilinx introduced FPGAs to the world in 1985, and years later the fabless
model under the leadership of TSMC became the mainstream way to produce chips.
What I realize 30 years after the fact is that while many personalities have come and gone
and shaped the character of Xilinx, the risk-taking, innovative spirit that launched the com-
pany is still thriving here. In the last six years, Ive been an eyewitness to Xilinx gaining a
Generation Ahead lead over the competition at the 28-nanometer node by means of innova-
www.xilinx.com/xcell/ tive silicon (7 series All Programmable FPGAs, SoCs and 3D ICs) and tools (the Vivado
Design Suite). Whats more, Xilinx is stretching that lead with first-to-market 20-nm
Xilinx, Inc. UltraScale devices. And you can count on even bigger innovations coming down the pike
2100 Logic Drive
San Jose, CA 95124-3400 as the era of the FinFET arrives. This pioneering spirit is charged by the remarkable inno-
Phone: 408-559-7778
FAX: 408-879-4780
vations Xilinx customers have created with our chips over the last 30 years. Im sure well
www.xilinx.com/xcell/ see even more customer innovations in the decades ahead.
Xcellence in Industrial
Xcellence in Industrial
Cover Story
32
8 Xilinxs New SDNet Environment
Enables Softly Defined Networks
SECOND QUARTER 2014, ISSUE 87
Calculating Mathematically
Complex Functions 44
Tools of Xcellence
54
XTRA READING
Xtra, Xtra The latest Xilinx tool updates and patches,
as of April 2014 60
Excellence in Magazine & Journal Writing Excellence in Magazine & Journal Design
2010, 2011 2010, 2011, 2012
COVER STORY
SDNets softly defined network ap- streamlines the creation of a high-lev- arrange into packet data flows. These
proach goes to the root of this prob- el behavioral specification of the line subengines can include user-provid-
lem and allows network system design card and automatically generates RTL ed engines. The SDNet specification
teams to quickly design line cards that blocks for implementation in Xilinx All environment contains no implementa-
are correct by construction. In partic- Programmable devices, firmware and a tion details. That gives customers the
ular, SDNet focuses on automating the validation testbench. freedom to scale the performance and
most complex aspect of line card de- With SDNet, system architects speci- resources of their design without the
signnamely, the design and program- fy the what, not the how, said Possley. need to understand the details of the
ming of the packet-processor and traf- System architects specify the exact underlying architecture. The SDNet
fic-manager functions in modern line services they are looking to deploy specifications are also not limited to
cards (Figure 3). without regard as to how they are being any specific network protocols.
Instead of having two separate, deployed in the underlying silicon. Possley said that SDNet is simple,
discrete ASSPs handling these func- In the SDNet flow, system architects and the select few customers with
tions, network systems teams can define line-card functionality using whom Xilinx beta tested it have found
integrate packet processing and traf- a high-level functional specification it very intuitive and easy to use. It dra-
fic management as well as other line (Figure 4). SDNet allows architects matically cuts the amount of code they
card functionality on a single Xilinx to describe the required behavior of have to produce into a simple and in-
All Programmable FPGA or SoC. They various types of packet-processing en- tuitive specification and is, therefore,
can ensure they are creating optimal gines, including parsing, editing, search orders-of-magnitude less effort com-
implementations for their targeted ap- and quality-of-service (QoS) policy en- pared with microcoding a network
plications. In addition to integrating gines. Architects can describe engines processor, he said.
the functionality of many chips into hierarchically in terms of simpler sub- Once architects have finished defining
one All Programmable device, SDNet engines that they can interconnect and the system engines and flows in the SD-
Figure 2 SDNet brings flexibility and automation to the data plane, enabling a softly defined
network approach for the design and upgrade of next-generation networks.
Figure 3 With SDNet, companies can create a highly integrated All Programmable line cards.
creating an All Programmable line card compiler, means that users can per- They can run the updating software
on a chip. form hitless upgrades, whereby the on an embedded soft processor or
Whats more, SDNet generates data firmware is changed and placed into on an external processor. Of course,
for functional verification and valida- service without disrupting the flow of if they implemented the design on a
tion to guide correct-by-construction packets, said Brebner. In this way, Xilinx Zynq -7000 All Programma-
design. Specifically, SDNets compiler companies can perform significant ble SoC, he added, they can run the
accepts a collection of test packets for service upgrades without any interrup- software on the devices embedded
testing input and output of the design. tion to the service. This revolutionary ARM processor.
Architects can use the packets in the development is achieved through the SDNet offers full hardware pro-
specification-definition phase of the unique nature of the SDNet technology grammability under software control,
process to ensure they are creating an and its coupling of high-level specifi- which is why we call it softly defined
accurate interpretation of the SDNet cations with Xilinx All Programmable networking, Possley said.
description. Network engineers can devices (Figure 5). For more information on the SDNet
use test packets during the simulation SDNets ability to generate datapath specification environment, including
of the RTL description generated by processing functions that support a video demonstration of SDNet in
the SDNet compiler. Last but not least, hitless, in-service updates is unique, action, visit www.xilinx.com/sdnet.
the packets can help with hardware said Possley. Carriers can update At the same site, you will find an in-
validation of the final implementation line card components with new fea- depth white paper entitled Software
of the design using network test equip- tures or capabilities using a software Defined Specification Environment for
ment. In addition, SDNet will generate controller via standard SDNet APIs. Networking (SDNet).
corresponding contents for search
engine lookup tables. This verifica-
tion-and-validation ability vastly re-
duces design time and eliminates iter-
ations between system architects and
network hardware engineers, allowing
the teams to get highly differentiated
products to market faster.
Gordon Brebner, distinguished engi-
neer at Xilinx, said the compiler auto-
matically generates custom firmware
operations and their binary encodings
for each individual component in the
architecture. This gives architects an
intimate level of control over the pro-
cessing, he said. SDNet has a utility
that keeps a record of runs and stores
details of the generated architecture
and its firmware. When users rerun
the compiler with an updated SDNet
description as input, it determines
whether the change can be accom-
modated with a firmware update only
(without generation of new hardware),
or whether a regeneration of the hard-
ware (and firmware) is needed. In
most cases, medium-scale updates,
such as adding or subtracting a pro-
tocol the line card will handle, can be
done through firmware updates only.
The intimate connection between
Figure 5 After deployment, SDNet allows vendors to update
the firmware and the architecture, protocols on line cards without interrupting service.
which are both generated by SDNets
Second Quarter 2014 Xcell Journal 13
XCELLENCE IN WIRELESS
Xilinxs 20-nm
UltraScale Architecture
Advances Wireless
Radio Applications
U
pcoming 5G wireless communications sys-
tems will likely be required to support much
Next-generation 5G wider bandwidths (200 MHz and larger) than
the 4G systems used today, along with large
systems will be complex antenna arrays, enabled by higher carrier fre-
quencies, that will make it possible to build much smaller
to design. UltraScale antenna elements. These so-called massive MIMO applica-
tions, together with more stringent latency requirements,
devices have built-in will increase design complexity by an order of magnitude.
At the end of last year, Xilinx announced the 20-nano-
functionality that will meter UltraScale family and the first devices are now
shipping [1,2,3]. This new technology brings many ad-
make the job easier. vantages over the previous 28-nm 7 Series generation,
especially for wireless communications. Indeed, the com-
bination of this new silicon and the tools of the Xilinx
Vivado Design Suite [4, 5] is a perfect fit for high-perfor-
by Michel Pecot mance signal-processing designs such as next-generation
Wireless Systems Architect wireless radio applications.
Xilinx, Inc. Lets look at the benefits of the UltraScale devices for
michel.pecot@xilinx.com
such designs, with a focus on architectural aspects
specifically, the advantages of the enhancements brought
to the DSP48 slices and Block RAMs for the implementa-
tion of some of the most common functionalities used in
radio digital front-end (DFE) applications. The UltraScale
family offers much denser routing and clocking resources
compared with the 7 Series devices, enabling better device
utilization, especially for high-speed designs. However,
these features do not usually have a direct impact on de-
sign architectures, so we will not address them here.
OVERVIEW OF ENHANCEMENTS
TO ULTRASCALE FABRIC
Moving to 20 nm not only enables the higher integration ca-
pabilities, improved fabric performance and lower power
consumption that come with any geometry node migration,
but the UltraScale 20-nm architecture also includes several
new, greatly enhanced features that directly support DFE
applications. This is especially true for the UltraScale Kin-
tex devices, which Xilinx has highly tuned to the needs of
this type of design.
First, these devices contain up to 5,520 DSP48 slices.
Thats almost three times more than the maximum count
of 1,920 available on 7 Series FPGAs (2,020 for the Zynq-
7000 All Programmable SoC). Higher levels of integration
are therefore possible. For example, you can implement a
complete 8Tx/8Rx DFE system with instantaneous band-
width of 80 to 100 MHz in a single midrange UltraScale
FPGA, while a two-chip solution is necessary on the 7 Se-
ries architecture, with each chip effectively supporting a
4x4 system. For a detailed functional description of such
designs, read the Xilinx white paper WP445, Enabling
High-Speed Radio Designs with Xilinx All Programmable
FPGAs and SoCs [6].
Second Quarter 2014 Xcell Journal 15
XCELLENCE IN WIRELESS
* These signals are dedicated routing paths internal to the DSP48E2 column. They are not accessible via general-purpose routing resources.
Increasing the multiplier size from
(a) Detailed DSP48E2 architecture 25x18 to 27x18 has minimal impact
on the silicon area of the DSP48 slice,
A/B Mux on Squaring Mux Wide XOR but significantly improves the support
Pre-adder input
for floating-point arithmetic. First, it is
48-Bit Accumulator/Logic Unit
worth pointing out that the DSP48E2
can in effect support up to 28x18-bit or
B
27x19-bit signed multiplication. This is
achieved by using the C input to pro-
A cess the additional bit, as described in
Figure 2, which shows the multiplica-
+/ 27 x 18
Multiplier tion of a 28-bit operand, X, with an 18-
D Pre-adder bit operand, Y.
Round Cst
Pattern Detector
The 45 most significant bits (MSBs)
C of the 46-bit output are computed as:
Z[45:1] = X[27:1]*Y[17:0] + X[0]*Y[17:1]
Rounding
Constant
Sign
Extension
DSP48E2
Figure 2 A 28x18-bit signed multiplication with convergent rounding capability of the output
rounding of the result can still be sup- has been recently added to the IEEE compared with a true single-precision
ported through the W-mux. floating-point standard [8]. Basically, it floating-point implementation, but with
A double-precision floating-point mul- consists of building the floating-point reduced latency and footprint. The
tiplication involves the integer product operation A*B+C, without explicitly UltraScale architecture is again well
of the 53-bit unsigned mantissas of both rounding, normalizing and de-normal- adapted for this purpose, since it takes
operators. Although a 52-bit value (m) izing the data between the multiplier only two DSP48 slices to build a sin-
is stored in the double-precision float- and the adder. These functions are in- gle-precision fused multiplier, whereas
ing-point representation, it describes the deed very costly when using traditional A/B
three are Muxon
required on7 Series Squaring
devices
Mu
fractional part of the unsigned mantissa, floating-point arithmetic and account with Pre-adder input
additional fabric logic.
and it is actually the normalized 1+m val- for the greatest part of the latency. This The pre-adder, integrated within the
ues, which need to be multiplied togeth- concept may be generalized to build DSP48 slice in front of the multiplier,
er; hence the additional bit required by sum-of-products operators, which are provides an efficient way to implement
the multiplication. Taking into account common in linear algebra (matrix B prod- symmetric filters that are commonly
the fact that the MSBs of both 53-bit op- uct, Cholesky decomposition). Con- used in DFE designs to realize the dig-
erands are equal to 1, and appropriately sequently, such an approach is quite ital upconverter (DUC) and downcon-
splitting the multiplication to optimally efficient for applications where cost or verter (DDC) functionality. For an N-tap
A
exploit the DSP48E2 26x17-bit unsigned latency are critical, while still requiring symmetric filter, the output samples are
multiplier and its improved capabilities the accuracy and dynamic range of the +/
computed as follows: 27 x 18
(e.g., the true three-input 48-bit adder floating-point representation. This is Multiplie
enabled by the W-mux), it can be shown the case in radio DFE applications D for Pre-adder
that the 53x53-bit unsigned multiplica- which the digital predistortion func- Round Cst
tion can be built with only six DSP48E2 tionality usually requires some hard-
slices and a minimal amount of external ware-acceleration support to improveC
logic. It is out of the scope of this arti- the update rate of the nonlinear filter where x(n) represents the input signal
cle to provide all the details of such an coefficients. You can then build one or and h(n) the filter impulse response,
implementation, but a similar approach more floating-point MAC engines in the with h(n)= h(N1 n).
would require 10 DSP48E1 slices on the FPGA fabric to assist the coefficient-es- Increased
previous-generation 7 Series devices; timation algorithm running in software Multiplier
Pairs of input samples Width
are therefore
hence there is a 40 percent gain brought (e.g. on one of the ARM Cortex-A9 fed into the pre-adder and the output is
by the UltraScale architecture. cores of the Zynq SoC). further multiplied with the appropriate
The 27x18 multiplier of the DSP48E2 For such arithmetic structures, it filter coefficient. On the 7 Series archi-
is also very useful for applications has been shown that a slight increase tecture, the pre-adder must use the 30-
based on fused data paths. The con- of the mantissa width from 23 to 26 bit input (A) of the DSP48E1, together
cept of a fused multiply-add operator bits can provide even better accuracy with the 25-bit input (D), and its output
18 Xcell Journal Second Quarter 2014
XCELLENCE IN WIRELESS
filter bit growth, N being the total num- decimation. Figure 3 shows a simpli- clock cycle. It is well known that you
ber of coefficients. Normal practice is fied block diagram for both 7 Series can rewrite the equation of a complex
therefore to implement the accumulator and UltraScale implementations. It product, PI + j.PQ = (AI + j.AQ).(BI + j.BQ),
within a DSP48 slice to ensure support clearly highlights the advantage of the so as to use only three real multiplica-
for the highest clock rate while minimiz- UltraScale solution, since the phase ac- tions, according to:
ing footprint and power. cumulator is absorbed by the last DSP48
PI = P1 + AI.(BI - BQ)
It should be noted that semi-par- slice thanks to the W-mux capability.
PQ = P1 + AQ.(BI + BQ)
allel architectures can be derived for Lets now consider the implemen-
any type of filter: single-rate, integer tation of a fully parallel complex MAC where P1 = BQ.(AI - AQ).
or fractional-rate interpolation and operator generating one output every
Data Data
MEM/SRL MEM/SRL
+ + +
h0 h3 h6
h1
h2
x h4
h5
x h7
h8
x
+ + + + DOUT
Data Data
MEM/SRL MEM/SRL
h0 A B D PCOUT PCIN
h1 M
+ + +h2
A B D PCOUT PCIN
DSP48E2
DSP48E2
DSP48E2
h0 h3 h6
h1
h2
x h4
h5
x h7
h8
x
+ + + DOUT
functionality, commonly integrated with 2. Xilinx white paper WP435, Xilinx Ul-
the baseband CPRI interfacing of DFE traScale: The Next Generation Architec-
ture for Your Next Generation Architec- All Programmable
systems. Figure 7 shows the high-level
ture, July 8, 2013 FPGA and SoC modules
switching architecture, which essential-
ly consists of an NxM memory array. 3. Xilinx datasheet DS890, UltraScale 4
The successive data on the N ingress Architecture and Product Overview, form
x factor
CTRL Bus
1R1W 1R1W
BRAM BRAM
ADDR
ADDR
DATA
DATA
DATA
DATA
Design Services
MUX MUX Module customization
Carrier board customization
Ingress 2
Custom project development
1R1W 1R1W
ADDR
ADDR
BRAM BRAM
DATA
DATA
DATA
DATA
MUX MUX
Egress 1 Egress 2
difference by design
Angle Measurement
Made Easy with
Xilinx FPGAs and a
Resolver-to-Digital
Converter
When properly paired with an FPGA,
angle transducers can help engineers
create ever-more-remarkable machinery.
E
ver since humans invented the wheel, we have
wanted to know, with varying degrees of accu-
by N. N. Murty racy, how to make wheels turn more efficient-
Scientist F ly. Over the course of the last few centuries,
scientists and engineers have studied and de-
S. B. Gayen vised numerous ways to accomplish this goal, as the basic
Scientist F principles of the wheel-and-axle system have been applied
to virtually every mechanical system, from cars to stereo
Manish Nalamwar knobs to cogs in all forms of machinery, to the humble
Scientist D wheelbarrow [1].
nalamwar.manishkumar@rcilab.in Over these many eras, it turns out the most essential el-
ement in making a wheel turn efficiently is not the wheel
K. Jhansi Lakshmi, itself (why reinvent it?) but the shaft angle of the wheel.
Technical Officer C And the most effective way to measure and optimize a shaft
Radar Seeker Laboratory angle today is through the use of angle transducers. There
Research Centre IMARAT are many types of angle transducers that help optimize
Defense Research Development Organization wheel efficiency through axle monitoring and refinements,
Hyderabad, India but by applying FPGAs to the task you can achieve remark-
able results and improve axle/wheel efficiencies in a broad
number of applications.
Before we get into the details of how engineers are doing
this optimally with Xilinx FPGAs, lets briefly review some
basic principles of angle transducers. Today there are two
widely used varieties: encoders and resolvers.
RESOLVER WINDINGS
R2 S4
ROTOR STATOR
R4 S2
S1
STATOR
S3
nated calibration point before beginning operation. Incre- log device whose output voltage is uniquely related to the input
mental counters are also susceptible to electrical interfer- shaft angle it is monitoring. It is an absolute-position transduc-
ence, which can result in inaccuracies in pulses they send er with 0o to 360o of rotation that connects directly to the axle
to the system and thus in rotation counts. Moreover, many and reports speed and positioning. Resolvers have a number
incremental encoders are photoelectric devices, which of advantages over encoders. They are robust devices that can
precludes their use in radiation-hazardous areas, if that is withstand harsh environments marked by dust, oil, temperature
a concern for your targeted application. extremes, shock and radiation. Being a transformer, a resolver
Absolute encoders are sensor systems that monitor provides signal isolation and a natural common-mode rejection
the rotation count and direction of an axle. In an abso- of electrical interference. In addition to these features, resolvers
lute-encoder-based system, users typically attach a wheel require only four wires for the angular data transmission, which
to an axle that has an electrical contact or photoelectric suits them for everything from heavy manufacturing to minia-
reference. When the axle is in operation, the absolute-en- ture systems to those used in the aerospace industry.
coder-based system records the rotation and direction of A further refinement is the brushless resolver, which
the operation and generates a parallel digital output that is does not require slip-ring connections to the rotor. This
easily translated into code, most commonly binary or Gray type of resolver is therefore even more reliable and has a
code. Absolute encoders are useful in that they need to be longer life cycle.
calibrated only oncetypically in the factoryand not be- Resolvers use two methods to obtain output voltages
fore every use. Moreover, they are typically more reliable related to the shaft angle. In the first method, the rotor
than other encoders. That said, absolute encoders are typ- winding, as shown in Figure 1, is excited by an alternating
ically expensive, and they are not great with parallel data signal and the output is taken from the two stator wind-
transmission, especially if the encoder is located far away ings. As the stator windings are mechanically positioned at
from the electronic system measuring its readings. right angles, the output signal amplitude is related by the
A resolver, for its part, is a rotary transformeran ana- trigonometric sine and cosine of the shaft angle. Both the
INTEGRATOR
UP /DOWN
VCO
COUNTER
VELOCITY
= DIGITAL ANGLE
LATCHES
WHEN ERROR = 0,
= 1 LSB
sine and cosine signals have the same phase as the original angle, . The converter seeks to adjust the digital angle, ,
excitation signal; only their amplitudes are modulated by continuously to become equal to and track , the analog an-
sine and cosine as the shaft rotates. gle being measured.
In the second method, a stator winding is excited with The stator output voltage of the resolver is:
the alternating signals, which are in phase quadrature to
V1= V sint sin Eq. 1
each other. Then a voltage induces in the rotor winding.
V2= V sint cos Eq. 2
The windings amplitude and frequency are fixed, but its
phase shift varies with the shaft angle. where is the angle of the resolvers rotor. The digital angle
The resolver can be positioned where the angle needs is applied to the cosine multiplier and its cosine is multiplied
to be measured [2]. The electronics, generally a resolv- by V1 to produce the term:
er-to-digital converter (RDC), can be positioned where the
V sint sin cos. Eq. 3
digital output needs to be measured. Analog output from
the resolver, which contains the angular position informa- The digital angle is also applied to the sine multiplier
tion of the shaft, is then transformed in digital form using and multiplied by V2 to produce the term:
the RDC. V sint cos sin. Eq. 4
FUNCTIONALITY OF THE TYPICAL RDC These two signals are subtracted from each other by the
In general, the two outputs of a resolver are applied to the error amplifier to yield an ac error signal of the form:
sine and cosine multiplier of the RDC [3]. These multipli- (V sint sincos V sint cos sin) Eq. 5
ers incorporate sine and cosine lookup tables and func-
tion as multiplying digital-to-analog converters. Figure 2 V sint (sin cos- cos sin) Eq. 6
shows their functionality. From trigonometric identity, this reduces to:
Let us assume, in the beginning, that the current state of
the up/down counter is a digital number representing a trial V sint [sin ( -)] Eq. 7
RH RL BIT
R
REFERENCE CONDITIONER
BIT
DETECTOR
C1
LOS S OPTION
SYNTHESIZED REFERENCE
ERROR
S1
R1
S2 CONTROL
INPUT OPTION GAIN DEMODULATOR VEL
S3 TRANSFORMER
S4 B
A HYSTERESIS INTEGRATOR
E
14/16 BIT
+5 V +5 V UP/DOWN VCO & TIMING
COUNTER
DC/DC
CONVERTER
-5 V
FILTER (-4.2 V typical) DATA LATCHES
47 f
external
capacitor
8 8
EM DATA EL INH A B
G = +2.8
Cext
C1
+15 VDC
PA IN -
Rext PA OUT
GND QUAD OSCILLATOR +
REF
-15 VDC OUT
5350
The detector synchronously demodulates this ac error A single +5 V is required for device operation. The con-
signal, using the resolvers rotor voltage as a reference. verter has velocity outputs (VEL A, VEL B) of a voltage
This results in a dc error signal proportional to sin ( -). range of 4 V with respect to analog ground, which can be
The dc error signal feeds an integrator, the output of which used to replace a tachometer. Two built-in test outputs are
drives a voltage-controlled oscillator. The VCO, in turn, causes provided for two channels (/BIT A and /BIT B) to indicate
the up/down counter to count in the proper direction to cause: loss of signal (LOS).
This converter has three main sections: an input front
sin ( -)0. Eq. 8 end, an error processor and a digital interface. The front end
When this result is achieved, differs for synchro, resolver and direct inputs. An electron-
ic Scott-T is used for synchro inputs, a resolver conditioner
-0, Eq. 9 for resolver inputs and a sine-and-cosine voltage follower
and therefore for direct inputs. These amplifiers feed the high-accuracy
control transformer (CT). The other input of the CT is a 16-
= Eq. 10 bit digital angle and the output is an analog error angle,
in one count. Hence, the counters digital output, , represents or a difference angle, between the two inputs. The CT per-
the angle . The latches make it possible to transfer this data forms the ratiometric trigonometric computation of SIN
externally without interrupting the loops tracking. COS - COS SIN = Sin(-) using amplifiers, switches,
This circuit is equivalent to a Type 2 servo loop, as it has logic and capacitors in precision ratios.
in effect two integrators. One is the counter, which accu- Compared with a conventional precision resistor, these
mulates pulses; the other is the integrator at the output of capacitors are used in precision ratios to get enhanced ac-
the detector. In a Type 2 servo loop with a constant-rota- curacy. Further, these capacitors (which are used with an
tional-velocity input, the output digital word continuously op amp as a computing element) sample at high rates to
follows, or tracks, the input, without needing externally de- eliminate drift and op-amp offsets.
rived conversion. The DC error processing is integrated, yielding a velocity
voltage that drives a voltage-controlled oscillator. This VCO
TYPICAL EXAMPLE OF AN RDC: THE SD-14621 is an incremental integrator when it is combined with the
The SD-14621 is a small, low-cost, RDC from Data Device velocity integrator: a Type 2 servo feedback loop.
Corp. (DDC). It has two channels with programmable res-
olution control. Resolution programming allows selection REFERENCE OSCILLATOR
of 10-, 12-, 14- or 16-bit modes [4]. This feature allows low The power oscillator in our design, also from DDC, is the
resolution for fast tracking or higher resolution for higher OSC-15802. This device is suitable for RDC, synchro, LVDT,
accuracy. Thanks to its size, cost, accuracy and versatili- RVDT and inductosyn applications [5]. The frequency and
ty, this converter is suitable for high-performance military, amplitude outputs are programmable with capacitors and re-
commercial and position-control systems. sistors, respectively. The output frequency range is 400 Hz to
28 Xcell Journal Second Quarter 2014
XCELLENCE IN INDUSTRIAL
D0-15 D0-15 S1
S1 A RESOLVER
FD16-31 TRANSCEIVER BIT1-16 S3
S3 A
S2
S2 A
INH A S4
S3 A
FPGA RES CON A
(VIRTEX-5 REF SIGNAL
/ENABLE A
FX30T)
RDC
POWER
REF
TRANSCEIVER
/BIT A (SD-14621-DS) SIGNAL OSCILLATOR
(OSC-15802)
VEL CH A
SIGNAL
CHAIN SCALED VEL CH A
DATA DATA
VALID msb_val=msb_val & 0xff00;
XGpio_WriteReg(XPAR_INHIBIT_CH_A_BA-
SEADDR,XGPIO_DATA_OFFSET,0X01);
EM OR EL for(i=0;i<=5;i++);
Hit your target by advertising your product or service in the Xilinx Xcell Journal,
youll reach thousands of qualified engineers, designers, and engineering managers worldwide.
Motor Drives
Migrate to Zynq
SoC with Help
from MATLAB
by Tom Hill
Senior Manager, DSP Solutions
Xilinx, Inc.
tom.hill@xilinx.com
S
ince the 1990s, developers of motor
drives have been using a multichip ar-
chitecture to implement their motor
control and processing requirements.
In this architecture, a discrete digital signal-pro-
cessing (DSP) chip executes motor control algo-
rithms, an FPGA implements high-speed I/O and
networking protocols, and a discrete processor
handles executive control. With the advent of
the Xilinx Zynq-7000 All Programmable SoC,
however, designers have the means to consoli-
date these functions into a single device while
integrating additional processing tasks. The re-
duction in parts count and complexity makes it
possible to lower system cost while improving
performance and reliability.
But how can drive developers evolve their
established design practices to leverage the
Zynq SoC?
Industrial designers have long embraced
model-based design for the development of
custom motor algorithms on DSP chips through
the use of simulation and C-code generation.
Now, a new workflow from MathWorksde-
veloped in conjunction with Xilinxextends
model-based design to the processing system
and programmable logic available with the
Zynq-7000 All Programmable SoC.
production system, meaning that de- Ethernet industrial networking connec- zoidal controller. The controllers main
ploying new controllers can consume tivity (http://www.xilinx.com/products/ components are as follows:
their limited time and can make the boards-and-kits/1-490M1P.htm).
Hall-effect sensor detects the
deployment a protracted process. It comes with an Avnet ZedBoard
motor position
7020 baseboard; Analog Devices
Reduces dependency on equip-
AD-FMCMOTCON1-EBZ module, which V
elocity estimator computes rotor
ment availability The production
is capable of driving brushless DC and velocity based on the sensor signal
environment itself may not be avail-
stepper motors with a 24-volt external
able, such as in cases where custom Six-step commutator computes
power supply (included with the kit);
drive electronics or electric motors the phase voltages and inverter en-
and a 24-V BLDC motor rated for 4,000
are under development or are not lo- able signals based on rotor position
RPM and equipped with Hall-effect sen-
cated where control system designers and velocity
sors and a 1,250-CPR indexed encoder.
can access them.
Also included are a Zynq SoC reference P
ulse-width modulation (PWM)
Given these factors, simulation pro- design of field-oriented control and An- drives the controller outputs out
vides an excellent alternative to testing alog Devices Ubuntu Linux framework through the drive circuitry
on production hardware. Simulation including drivers, application software
environments such as Simulink provide and source code. We start by using a behavioral, con-
a framework for creating plant models trol-loop model of the system suited
from preexisting libraries of building EXAMPLE: TRAPEZOIDAL to control-loop analysis. First we will
blocks of electromechanical components MOTOR CONTROL evaluate the model in simulation by
for the evaluation of new control system Lets apply this workflow to the trape- subjecting it to a pulse test, command-
architectures against plant models. zoidal motor control system in Figure ing a rotational rate of 150 radians per
Risk to the schedule is further reduced 1 using simulation in Simulink to evalu- second for 2 seconds and then returning
by linking the system model to a rap- ate a controller with a simulated plant, to a stop. Through tuning of the control
id-prototyping environment as well as the then prototype the controller using the loops proportional-integral (PI) control-
final production system. The rapid-proto- Intelligent Drives Kit. As a final step, we ler gains, we can achieve a settling time
typing flow enables algorithm developers will validate the Simulink model using of 1.2 seconds with negligible overshoot
to prototype without having to depend results from hardware testing. (control-loop simulation results appear
on hardware designers. Instead they use In this example, we will use the kit as the purple-shaded signal in Figure 3;
a platform-specific support package in a to drive an inertial load in the form of details on this example are available at
highly automated process that deploys an aluminum disc, with a basic trape- mathworks.com/zidk).
the hardware and software components
of the system to a design template that
can be compiled to a specific hardware
development platform. The hardware
and software design teams can reuse
these same hardware and software com-
ponents in the final production systems
without modification to accelerate devel-
opment and reduce errors.
Figure 3 Hardware and software simulation models used to validate against hardware results
With the control-loop gains set, we With the bitstream loaded into the
now can test on a more accurate system
model for the controller. In contrast to
programmable logic and the executable
running on an ARM core, we can run a
FPGA
the control-loop model, the system mod- hardware-in-the-loop test. For this test, Boards & Modules
el incorporates more detailed models we use a modified Simulink testbench
of the drive electronics and, more sig- model from which we have removed the
nificantly, it includes detailed models models for the drive electronics, motor
that specify the implementation of the and sensors, since we are using hard- EFM-02
controller and peripherals, including ware-in-the-loop in place of simulated FPGA module with USB 3.0
interface. Ideal for Custom
timing-accurate models for PWM and plant models. To help us check the out- Cameras & ImageProcessing.
Hall-effect sensor processing. come of the testand compare it with
We have partitioned the controller for our simulation resultswe can set up
the Zynq SoC, with the velocity control- the Zynq SoC to store motor shaft veloc-
ler and velocity estimator running on an ity measurements and other data in the
ARM core at 1 kHz and the commutator, memory of an ARM core (the black-shad-
Hall sensor and PWM running on the ed signal in Figure 3 shows results from
Zynq SoCs programmable logic. hardware testing). Doing so enables us
We can compare simulation results to upload the results to a MATLAB ses- Xilinx Spartan-6 FPGA
for the control-loop and system models sion for processing and visualization at XC6SLX45(150)-3FGG484I
(system model results appear as the red the conclusion of the test by applying a USB 3.0 Superspeed interface
signal in Figure 3). In general we get very pulse input in the testbench. In this way, Cypress FX-3 controller
good agreement between the waveforms, we can exactly repeat in hardware the On-board memory
except as the motors rate approaches test weve done in simulation. The re- 2 Gb DDR2 SDRAM
zero. At such points the coarseness of sults from prototyping align very closely
the Hall sensor, with only six index puls- with our simulation results, including Samtec Q-strip connectors
es per revolution of the motors shaft, the discontinuity in the measured motor 191 (95 differential) user IO
becomes evident. This high-fidelity sys- velocity due to the Hall sensor.
tem model runs the 4-second simulation This brief overview illustrates how
in 7 minutes, compared with a run-time the MathWorks workflow for the Zynq EFM-01
of only 7 seconds for the lower-fidelity SoC enables model-based design for use Low-cost FPGA module for
control-loop model. For control system in simulation and prototyping. To con- general applications.
designers, the takeaway here is that tinue on into production, you can im-
these simulation results give us more port the generated C and HDL code into
confidence that the control-loop model the Vivado Design Suite, where you
is sufficiently accurate for further eval- can integrate them with executive rou-
uation of controller alternatives, which tines, networking IP and other design
can be validated before hardware testing components required for the complete
using the system model. system implementation. Xilinx Spartan-3E FPGA
Armed with these findings, we are To download the models shown in XC3S500E-4CPG132C
prepared to prototype the controller this article and learn more about how USB 2.0 Highspeed interface
on the Intelligent Drives Kit. Through to use model-based design with the Cypress FX-2 controller
the Zynq SoC guided workflow, we can Zynq-7000 All Programmable SoC / An- On-board memory
generate C and HDL code from Simulink alog Devices Intelligent Drives Kit from
models that have been partitioned into Avnet, visit mathworks.com/zidk. From Standard 0.1 pin header
subsystems targeting an ARM core and this page, you can also browse a Sim- 50 user IO
programmable logic (Figure 4). ulink model that implements a com-
With this workflow, we use the HDL plete field-oriented control model on a
Coder from MathWorks to generate an Zynq SoC device and view videos show-
IP core that will run in the Zynq SoC ing this example in greater detail.
devices programmable logic to build an For information on how MathWorks
executable running on an ARM core and products support the Xilinx Zynq-7000 Hardware Software HDL-Design
to establish the interfaces between core All Programmable SoC family, visit www.cesys.com
and executable over the AXI bus. mathworks.com/zynq.
Second Quarter 2014 Xcell Journal 37
X P L A N AT I O N : F P G A 1 0 1
How to Use
Interrupts on
the Zynq SoC
by Adam P. Taylor
Head of Engineering Systems
e2v Technologies
aptaylor@theiet.org
I
n embedded processing, an interrupt is
a signal that temporarily halts the pro-
Real-time computing often cessors current activities. The proces-
sor saves its current state and executes
requires interrupts to respond an interrupt service routine to address
the reason for the interrupt. An interrupt can
quickly to events. Its not hard come from one of the three following places:
Resets
Clock
Generation
4 5 6 7 Processing System (PS)
0 1 2 3
troller (GIC), as shown in Figure 1, to process interrupts. 2. The processor stops executing the current thread.
The GIC handles interrupts from the following sources:
3. The processor saves the state of the thread in the stack
Software-generated interrupts There are 16 such inter- to allow processing to continue once it has handled
rupts for each processor. They can interrupt one or both the interrupt.
of the Zynq SoCs ARM Cortex-A9 processor cores.
4. The processor executes the interrupt service routine,
Shared peripheral interrupts Numbering 60 in total, which defines how the interrupt is to be handled.
these interrupts can come from the I/O peripherals, or to
5. The processor resumes operation of the interrupted
and from the programmable logic (PL) side of the device.
thread after restoring it from the stack.
They are shared between the Zynq SoCs two CPUs.
Because interrupts are asynchronous events, it is pos-
Private peripheral interrupts The five interrupts in
sible for multiple interrupts to occur at the same time. To
this category are private to each CPUfor example
address this issue, the processor prioritizes interrupts such
CPU timer, CPU watchdog timer and dedicated
that it can service the highest-priority interrupt pending first.
PL-to-CPU interrupt.
To implement this interrupt structure correctly, we will
need to write two functions: an interrupt service routine
The shared peripheral interrupts are very interesting, as
to define the actions that will take place when the inter-
they are very flexible. They can be routed to either CPU from
rupt occurs, and an interrupt setup to configure the inter-
the I/O peripherals (44 interrupts in total) or from the FPGA
rupt. The interrupt setup is a reusable routine that allows
logic (16 interrupts in total). However, it is also possible to
for constructing different interrupts. Generic for all inter-
route interrupts from the I/O peripherals to the programmable
rupts within a system, the routine will set up and enable
logic side of the device, as shown in Figure 2.
the interrupts for the general-purpose I/O (GPIO).
PROCESSING THE INTERRUPTS ON THE ZYNQ SOC
USING INTERRUPTS IN SDK
When an interrupt occurs within the Zynq SoC, the pro-
Interrupts are supported and can be implemented on a
cessor will take the following actions:
bare-metal system using the standalone board support
1. The interrupt is shown as pending. package (BSP) within the Xilinx Software Development Kit
40 Xcell Journal Second Quarter 2014
XPLANANTION: FPGA 101
(SDK). The BSP contains a number of functions that greatly push. To set up the interrupt, we will need two static global
ease this task of creating an interrupt-driven system. They variables and the interrupt ID defined above to make the
are provided within the following header files: following:
Xparameters.h This file contains the processors static XScuGic Intc; // Interrupt Controller Driver
static XGpioPs Gpio; //GPIO Device
address space and the device IDs.
Within the interrupt setup function, we will need to ini-
Xscugic.h This file holds the drivers for the configuration
tialize the Zynq SoCs exceptions; configure and initialize
and use of the GIC.
the GIC; and connect the GIC to the interrupt-handling hard-
Xil_exception.h This file contains exception functions ware. The Xil_exception.h and Xscugic.h files provide the
for the Cortex-A9. functions we need to accomplish this task. The result is the
following code:
To address a hardware peripheral, we need to know the
address range and the device ID for the devices we wish //GIC config
XScuGic_Config *IntcConfig;
to usein other words, the GIC, which is provided mostly Xil_ExceptionInit();
within the BSP header file xparameters. However, the inter-
rupt ID is provided from xparameters_ps.h (there is no need //initialize the GIC
to declare this header file within your source code as it is IntcConfig = XScuGic_LookupConfig(INTC_DEVICE_ID);
included in the xparameters.h file). We can use this interrupt XScuGic_CfgInitialize(GicInstancePtr, IntcConfig,
labeled ID (its the GPIO_Interrupt_ID) within our source IntcConfig->CpuBaseAddress);
file as shown below:
//connect to the hardware
#define GPIO_DEVICE_ID XPAR_XGPIOPS_0_DEVICE_ID Xil_ExceptionRegisterHandler(XIL_EXCEPTION_ID_IN-
#define INTC_DEVICE_ID XPAR_SCUGIC_SINGLE_DEVICE_ID T,(Xil_ExceptionHandler)XScuGic_InterruptHandler,
#define GPIO_INTERRUPT_ID XPS_GPIO_INT_ID GicInstancePtr);
For this simple example, we will be configuring the Zynq When it comes to configuring the GPIO to function as an
SoCs GPIO to generate an interrupt following a button interrupt within the same interrupt configuration routine,
Figure 2 These are the interrupts available between the processing system and the programmable logic.
we can configure either a bank or an individual pin. This task PRIVATE TIMER EXAMPLE
can be achieved using functions provided within xgpiops.h, The Zynq SoC has a number of timers and watchdogs avail-
for example: able. These are either private to a CPU or a shared resource
voi
d XGpioPs_IntrEnable(XGpioPs *InstancePtr, u8 available to both CPUs. Interrupts are required if you are to
Bank, u32 Mask); use these components efficiently in your design. The timers
voi
d XGpioPs_IntrEnablePin(XGpioPs *InstancePtr, and watchdogs include the following:
int Pin);
CPU 32-bit timer (SCUTIMER), clocked at half the CPU
Naturally, you will also need to configure the interrupt
frequency
correctly. For instance, do you wish it to be edge triggered or
level triggered? If so, which edge and level can be achieved CPU 32-bit watchdog (SCUWDT), clocked at half the CPU
using the function? frequency
void XGpioPs_SetIntrTypePin(XGpioPs *InstancePtr, Shared 64-bit global timer (GT), clocked at half the CPU
int Pin, u8 IrqType);
frequency (each CPU has its own 64-bit comparator; it is
where the IrqType is defined by one of the five definitions used with the GT, which drives a private interrupt for
within xgpiops.h. They are: each CPU)
#defi
ne XGPIOPS_IRQ_TYPE_EDGE_RISING 0 /**< System watchdog timer (WDT), which can be clocked
Interrupt on Rising edge */
#defi
ne XGPIOPS_IRQ_TYPE_EDGE_FALLING 1 /**< from the CPU clock or an external source
Interrupt Falling edge */
A pair of triple timer counters (TTCs), each containing
#defi
ne XGPIOPS_IRQ_TYPE_EDGE_BOTH 2 /**<
Interrupt on both edges */ three independent timers. The TTCs can be clocked by the
#defi
ne XGPIOPS_IRQ_TYPE_LEVEL_HIGH 3 /**< CPU clock or by means of an external source from the
Interrupt on high level */ MIO or EMIO in the programmable logic.
#defi
ne XGPIOPS_IRQ_TYPE_LEVEL_LOW 4 /**<
Interrupt on low level */ To gain the maximum benefit from the available timers
If you decide to use the bank enable, you need to know which and watchdogs, we need to be able to make use of the Zynq
bank the pin or pins you wish to enable interrupts are on. The SoCs interrupts. The simplest of these to configure is the pri-
Zynq SoC supports a maximum of 118 GPIOs. In this configura- vate timer. Like most of the Zynq SoCs peripherals, this tim-
tion, all of the MIOs (54 pins) are being used as GPIO along with er comes with a number of predefined functions and macros
the EMIOs (64 pins). We can break this configuration into four to help you use the resource efficiently. They are contained
banks, with each bank containing up to 32 pins. within the following:
This setup function will also define the interrupt service #include xscutimer.h
routine, which is to be called when the interrupt occurs that
This file contains functions (macros) that will provide a
uses the function:
number of capabilities, including initialization and self-test.
XGpioPs_SetCallbackHandler(Gpio, The functions within this file will also start and stop the tim-
(void *)Gpio, IntrHandler);
er, and manage the timer (restart it; check to see if it has
The interrupt service routine can be as simple or as com- expired; load the timer; enable/disable auto loading). Anoth-
plicated as the application defines. For this example, it will er of their jobs is to set up, enable, disable, clear and man-
toggle the status of an LED on and off each time a button is age the timer interrupts. Finally, these functions also get and
pressed. The interrupt service routine will also print out a then set the prescaler.
message to the console each time the button is pressed. The timer itself is controlled via the following four
static void IntrHandler(void *CallBackRef, int registers:
Bank, u32 Status)
{ Private Timer Load Register This register is used in auto
int delay; reload mode. It contains the value that is reloaded into the
XGpioPs *Gpioint = (XGpioPs *)
Private Timer Counter Register when auto reload is enabled.
CallBackRef;
XGpioPs_IntrClearPin(Gpioint, pbsw);
Private Timer Counter Register This is the actual
printf(****button pressed****\n\r); counter itself. When enabled, once this register reaches
toggle = !toggle; zero the interrupt event flag is set.
XGpioPs_WritePin(Gpioint, ledpin, toggle);
for( delay = 0; delay < LED_DELAY; delay++) Private Timer Control Register The control register
//wait enables or disables the timer, auto reload mode and
{} interrupt generation. It also contains the prescaler for
} the timer.
42 Xcell Journal Second Quarter 2014
XPLANANTION: FPGA 101
Private Timer Interrupt Status Register This register //enable interrupt on the timer
contains the private timer interrupt status event flag. XScuTimer_EnableInterrupt(TimerInstancePtr);
As for using the GPIO, the timer device ID and timer in- Where TimerIntrHandler is the name of the function that
terrupt ID that are needed to set up the timer are contained is called when the interrupt occurs, the timer interrupt must
within the XParameters.h file. Our example will use the be enabled on the GIC and within the timer itself.
pushbutton interrupt that we developed previously. When The timer interrupt service routine is very simple. All it
the button is pressed, the timer will load and start to run does is to clear the pending interrupt and write out a message
(not in auto reload mode). Upon expiration of the timer, over the STDOUT, as follows:
an interrupt will be generated that will write a message out static void TimerIntrHandler(void *CallBackRef)
over the STDOUT. The interrupt will then be cleared to wait {
until the next time the button is pressed. This example will XScuTimer *TimerInstancePtr =
always load the same value into the counter; hence with the (XScuTimer *) CallBackRef;
declarations at the top of the file, the timer count value is XScuTimer_ClearInterruptStatus(TimerInstancePtr);
printf(****Timer Event!!!!!!!!!!!!!****\n\r);
declared, as follows:
#define TIMER_LOAD_VALUE 0xFFFFFFFF With this action complete, the final thing to do is to mod-
The next stage is to configure and initialize the private ify the GPIO interrupt service routine to start the timer each
timer and load the timer count value into it. time the button is pushed, as such:
//timer initialisation //load timer
TMRConfigPtr = XScuTimer_LookupConfig XScuTimer_LoadTimer(&Timer, TIMER_LOAD_VALUE);
(TIMER_DEVICE_ID); //start timer
XScuTimer_CfgInitialize(&Timer, XScuTimer_Start(&Timer);
TMRConfigPtr,TMRConfigPtr->BaseAddr);
//load the timer To do this we first load the timer value into the timer
XScuTimer_LoadTimer(&Timer, TIMER_LOAD_VALUE);
and then call the timer start function. Now we can again
We also need to update the interrupt setup subroutine to clear the pushbutton interrupt and resume processing, as
connect the timer interrupts to the GIC and enable the timer seen in Figure 3.
interrupt. Many engineers initially approach an interrupt-driv-
//set up the timer interrupt en system design with trepidation. However, the Zynq
XScuGic_Connect(GicInstancePtr, TimerIntrId, SoCs architecture, with the Generic Interrupt Controller
(Xil_ExceptionHandler)TimerIntrHandler, coupled with the drivers provided with the SDK, enables
(void *)TimerInstancePtr);
you to get an interrupt-driven system up and running very
//enable the interrupt for the Timer at GIC
XScuGic_Enable(GicInstancePtr, TimerIntrId); quickly and efficiently.
Figure 3 This screen shows an example of the GPIO and timer interrupt event outputs.
Calculating
Mathematically
Complex Functions
by Adam P. Taylor
Head of Engineering Systems
e2v Technologies
aptaylor@theiet.org
T
hanks to their flexibility and per-
formance, FPGAs have found
One of the great benefits their way into a number of indus-
trial, science, military and other
800,000
700,000
600,000
Temperature
500,000
400,000
300,000
200,000
100,000
0.0000
100 150 200 250 300 350 400
Resistance
culate the temperature using the rear- curve with linear interpolation between Engineers often call the DSP slices
ranged equation below. the points within the LUT. DSP48s, due to the 48-bit accumulator
This approach may fit the bill, depend- they provide. However, these slices
ing upon the accuracy requirement and also supply 25 x 18-bit-wide multipliers
number of elements stored within the and addition/subtraction capabilities,
lookup table. However, you will still need among many other faculties. It is these
Implementing this equation within to include a linear interpolator function internal RAM structures and DSP slices
an FPGA may be daunting for even a within the design. This function can be that you can use to implement transfer
seasoned FPGA engineer. Plotting the mathematically sophisticated and will functions with greater ease.
obtained resistance against temperature often include a non-power-of-two divide,
results in a graph as shown in Figure 1. which adds to the complexity. POLYNOMIAL APPROXIMATION
From the graph, you can clearly see the One method that utilizes the DSP- and
nonlinearity of the response. CAPITALIZE ON FPGA RESOURCES RAM-rich architecture of the FPGA
Implementing the rearranged trans- Instead, there is another method you can is polynomial approximation. To use
fer function in an FPGA directly could use to implement these types of transfer this technique, you must first plot
be a significant challenge, both in terms functionsone that capitalizes upon the the mathematical function, covering
of actual design effort required and very nature of the FPGA. Modern FPGAs the input value range in a mathemati-
then in validation (ensuring accuracy like the Xilinx Spartan-6 and the 7 se- cal program such as MATLAB or Ex-
and function across boundary and cor- ries Artix, Kintex and Virtex lines con- cel. You can then add a polynomial
ner-case conditions). Many engineers tain much more than just the traditional trend line to the data set in question,
will look for different methods to imple- lookup tables and flip-flops. They also such that the equation for the trend
ment the function in a way that will re- come with built-in DSP slices, Block RAM line can then be implemented within
duce design and validation effort so as and distributed RAM, along with many the FPGA in place of the mathemati-
to protect project time scales. One pos- advanced hard IP cores such as PCIe cally complex function, provided the
sible approach would be using a lookup and Ethernet endpoints, high-speed serial trend-line equation meets the accu-
table to store a number of points on the links and so on. racy requirements.
Most mathematical programs capable Having obtained the polynomial using just one polynomial equation,
of adding a polynomial trend line allow fit for the transfer function we want depending upon the range of mea-
you to select the order, or number of to implement, we can then dou- surements and the transfer function
polynomial terms. The larger the order, ble-check for accuracy against the you are implementing. How can we
2.403x -251.26
ing this process for the transfer function
example we are using in Microsoft Ex-
ture is being monitored, it may be
that the end measurement has to be
SELECTED BY INPUT VALUE
Should one polynomial equation not
cel, we obtained the trend line and equa- accurate to +/-1oC, not a particular- provide sufficient accuracy over the
tion seen in Figure 2. This example used ly demanding accuracy requirement. entire transfer function input range,
the polynomial order of four. Still, it may prove difficult to achieve just add more. It is still possible to
800,000
700,000
600,000
Temperature
500,000
400,000
300,000
200,000
100,000
0.0000
100 150 200 250 300 350 400
Resistance
Figure 2 Trend line and polynomial equation for the temperature transfer function
290,000
Temperature
280,000
270,000
260,000
250,000
199 201 203 205 207 209 211
Resistance
Figure 3 Plotting between 269oC and 300oC to provide a more accurate result
use this approach so long as you gen- lated range that corresponds to 268oC. HOW DOES IT COMPARE?
erate a number of polynomial con- Above that range, the second set of Of course, as I mentioned earlier
stants for use across the input range. constants is used to maintain the accu- there are other methods you can use
Thus, once the input value goes out- racy requirements. to implement transfer functions. The
side specific bounds, a new set of In this way, you can break a transfer four most commonly used methods
constants is loaded. function into a number of segments aside from polynomial approxima-
Continuing with the tempera- to achieve the desired accuracy. You tion are software routines, lookup
ture example, the first polynomial may opt to space these segments uni- tables, a lookup table with interpola-
equation provides +/- 1oC of accura- formly across the transfer function tion and CORDIC.
cy between 0o and 268oC. For many that is, split them into 10 segments of The use of software to calculate the
applications this will be more than equal value of X. Alternatively, you transfer function complicates the sys-
sufficient. Suppose that we require can make them nonuniform and seg- tem architecture due to the need to
an extended operating range and tol- ment them as required to achieve the add a processor (with an associated in-
erance to 300oC, which would mean desired accuracy, focused upon areas crease in design complexity, BOM cost
our initial approach did not meet the of the transfer function where accura- and so on). Even if the design team mit-
design requirements. Using a seg- cy is harder to obtain. igates this drawback by using a system-
mented approach, we can address Among the trade-offs to consider on-chip such as the Xilinx Zynq-7000
this problem by plotting the range be- when deciding upon your implementa- All Programmable SoC, challenges still
tween 269oC and 300oC and obtaining tion, keep in mind that a uniform ap- remain. For starters, the time it takes to
a different polynomial equation that proach may require a larger memory calculate the transfer function in soft-
will provide more accuracy for this footprint than a nonuniform approach. ware would be much longer than can
output range (see Figure 3). Depending upon the transfer function be achieved in logic, reducing the sys-
In short, the implementation uses you are implementing, going with a tem response time. In fact, calculation
the first polynomial constants until nonuniform approach could result in a of transfer functions such the one used
the input value goes above a precalcu- considerable saving. in our sample design is a classic exam-
methods of implementing
USB 2.0
Gigabit Ethernet
85,120 LUT4-eq
108 user I/Os
2. MERCURY KX1
a good trade-off in
Kintex-7 FPGA Module
Xilinx Kintex-7 FPGA
Up to 2 GB DDR3L SDRAM
16 MB QSPI flash
performance, accuracy
PCIe x4 endpoint
4 6.6/10.3125 Gbps MGT
USB 3.0 Device
2 Gigabit Ethernet
Up to 406,720 LUT4-eq
and footprint.
178 user I/Os
5-15 V single supply
72 54 mm
3. FPGA MANAGER
ple of where the processing should be tions such as sine, cosine, multipli- IP Solution
offloaded to the programmable logic cation, division, square root and so
side of the Zynq SoC. on. Therefore, it is possible to imple- C/C++
The efficacy of the second approach ment transfer functions exactly using C#/.NET
MATLAB
using a lookup table containing precalcu- a combination of CORDIC algorithms LabVIEW
lated values for the inputcan vary de- and basic mathematical blocks. The
pending upon the range and width of the technique can result in higher pre- USB 3.0
PCIe Gen2
input values. At times, the result will very cision. However, for a complicated Gigabit Ethernet
quickly be a very large LUT that requires transfer function, this gain in preci-
a lot of RAM within the FPGA. Depend- sion will come at the cost of increased
ing upon the FPGA, this approach could design and verification time. There Streaming, FPGA
require more resources than are avail- will of course be an impact on the op- made simple.
able, or it might cause conflicts with re- erating frequency of a device imple-
One tool for all FPGA communications.
quirements for other modules within the mented in this way. Stream data between FPGA and host over
USB 3.0, PCIe, or Gigabit Ethernet all with
design. On the plus side, of course, this Polynomial approximation there- one simple API.
method will result in a very fast calcula- fore presents a middle ground among
tion of the result. the four alternative options, offering a 4. PROFINET IP Core
The third potential approacha good trade-off in performance, accura-
lookup table with interpolationis cy and implementation footprint. Optimized for Xilinx FPGA and Zynq SoC
IRT cycle times as low as 31.25 s
one we explored earlier and is an at- Hardware IEEE 1588 PTP implementation
tempt to reduce the number of mem- EASE OF IMPLEMENTATION Isochronous traffic can bypass the software stack
ory locations needed with a full LUT Every engineer wants to produce an
approach. This technique does re- FPGA that has an optimal utilization of We speak FPGA.
quire the engineer to write a linear in- device resources. Polynomial approxi-
terpolation function within the FPGA, mation allows you to benefit from the
which can be a little involved. It is, multiplier- and RAM-rich environment Design Center FPGA Modules
however, still much simpler than the provided by the FPGAs, and to use these Base Boards IP Cores
final option: CORDIC. resources to easily implement what at We speak FPGA.
A CORDIC algorithm is capable of first might appear to be a very complex
implementing transcendental func- mathematical transfer function.
ARCHITECTURAL CONCERNS
by David C. Black As I started to think about this trans-
Senior Member of Technical Staff formation from a software perspective,
Doulos I grew concerned about the software
david.black@doulos.com interface. After all, HLS creates hard-
ware dedicated to processing hardware
interfaces. I needed something easy to
access, like a coprocessor or hardware
accelerator, to make the software go
faster. Also, I didnt want to write a
new compiler. To make it easy to ex-
change data with the rest of the soft-
ware, the interface needed to look like
simple memory locations where we
could place the inputs and later read
back the results.
Then I made a discovery. Vivado HLS
Have you ever written some software supports the idea of creating an AXI
that, despite your best coding efforts, slave with relatively little effort. This
didnt run as fast as desired? I have. capability started me thinking an accel-
Have you thought, If only there were erator might not be so difficult to create
an easy way to put some of the code into after all. Thus, I found myself coding up
multiple custom processors or custom a simple example to explore the possi-
hardware that wasnt so expensive? Af- bilities. I was pleasantly surprised with
ter all, your application is one of many, how it turned out.
and custom hardware takes time and Lets take a walk through the ap-
money to create. Or does it? proach I took and consider the results.
I began rethinking this proposition For my example, I chose to model a
recently when I heard about the Xilinx set of simple matrix operations such as
high-level synthesis tool, Vivado HLS. add and multiply. I didnt want it to be
In combination with the Zynq-7000 All constrained to a fixed size, so I would
Programmable SoC, which combines a have to provide both the input arrays
dual-core ARM Cortex-A9 processor and their respective sizes. An ideal in-
with an FPGA fabric, high-level synthe- terface would put all the values as sim-
sis opens up new possibilities in design. ple arguments to a function, such as the
This class of tools creates highly tuned code in Figure 1.
RTL from C, C++ or SystemC source The interface to the hardware would
code. Many purveyors of this technol- need to have a simple way to map the
ogy exist, and the rate of adoption has function arguments to memory loca-
been increasing in recent years. tions. Figure 2 shows a memory layout
So, how hard would it be to migrate to support this mapping. The registers
some of that slow code into hardware, would hold information about how
if indeed I could simply use Vivado HLS matrices were laid out and what the
to do the more demanding computa- desired operations would be. The com-
tions? After all, I usually wrote my code mand register would indicate which
in C++, and Vivado HLS used C/C++ operation to do. This would allow me
as an input. The ARM processor cores to combine several simple operations
meant I could run the bulk of my soft- into one piece of hardware. The status
ware in a conventional environment. In register would simply be a way to know
fact, Xilinx has even made available a if the operation was in progress or had
software development kit (SDK) and finished successfully. Ideally, the de-
PetaLinux for this purpose. vice would also support an interrupt.
Going back to the hardware design, I Assuming the ability to synthesize and conveniently PetaLinux provides
learned that Vivado HLS allows for array the AXI slave, how would this fit a mechanism known as the User I/O
arguments to specify small memories. with the software? My normal coding device. UIO allows a simple approach
Thus, the functionality would be described environment assumes Linux. Fortu- to mapping the new hardware into
with a function such as Figure 3 shows. nately, Xilinx provides PetaLinux, user memory space, and provides the
Xilinx Opens
a Tcl Store
by Greg Daughtry
Director of Product Marketing
Xilinx, Inc.
greg.daughtry@xilinx.com
O
ver the last five years formed and drive the tools to provide
Xilinx has had a stra- optimal implementations.
Figure 1 The Tcl Store dialog box in the Vivado IDE allows installation of apps and browsing of the commands.
in Vivado Design Suite commands, INSTALLATION AND USE Design Suite it loads automaticallythere
right down to the help infrastructure. Designers can access the Xilinx Tcl is no need to install the app each time you
Vivado Design Suite supports differ- Store by means of an icon on the Get- start a new session.
ent versions of apps using standard ting Started page when you first launch Procs have a naming convention
package facilities of Tcl, so if a newer the Vivado IDE. Alternatively, you may that uses a facility in Tcl called name-
version is released you can choose to also go to the Tools Menu and select spaces. The names of the commands
upgrade with a single mouse click. the Xilinx Tcl Store menu option. This may seem a little more complex than
The Xilinx Tcl Store is intended to will bring up the repository dialog box, normal Tcl commands, and have ::
make it easier to find and use well-craft- which will give you a list of apps avail- characters embedded in them. For
ed Tcl scripts developed and supported by able to install (Figure 1). example, xilinx::ultrafast::check_pll_
the user community, in the same manner As you browse the list of apps, with- connectivity runs some connectivity
as Linux development. Tcl scripting is a in each app there is a list of commands checks on the clock-modifying blocks
little more advanced than selecting IDE (called procs in Tcl) that are available in Xilinx devices. The naming conven-
buttons. However, it is easy to learn. Docu- for execution. You will see a description of tions serve to make sure the Tcl code is
mentation and user guides provide details each app, and of each proc within the app, unique and that a proc in one app does
on specific commands from the Tcl API to get an idea of what it does. Click on the not conflict with another proc by the
and can be found on xilinx.com/support. install button to install the app and register same name in another app. Namespaces
Lets take a closer look at the infra- it so that it now shows up like native Viva- are a standard feature of Tcl.
structure for installing and using Tcl do Design Suite commands. Once an app To execute an app command, type in
apps from the Xilinx Tcl Store. is installed, each time you start the Vivado the fully qualified name of the proc in-
Contributor Gatekeeper
3. Send pull request
(needs permission to send)
6. Approve/reject 4. Assign
App Owner
5. Review/test
Figure 2 Workflow of the Xilinx Tcl Store submission process passes through several discrete steps.
PCs. Linux machines typically already code and generate a catalog.xml file. 7.
Go to GitHub.com using a Web
have it or can install it through stan- This is one of three files that you will browser and issue a pull request.
dard packages. GitHub provides tuto- need. The others are a package index This formally initiates the process
rials to help you get started with Git. and a Tcl index. of merging your contributions into
Once you have a GitHub account, the repository. Work with the gate-
here are the steps for contributing to 4.
Open Vivado Design Suite in an- keeper and app owner as appropri-
the Tcl Store repository: other location, point to the local ate to resolve any issues through
repository and test your apps. Run GitHub and e-mail.
1.
Clone the Xilinx Tcl Store master the linter and your local tests un-
til you are comfortable that all is 8. Congratulations! It feels good to help
repository. This creates a local copy
working correctly. your fellow designers.
that is your sandbox and it allows
you to develop locally and test with- Figure 2 shows a basic diagram of
5. Commit your changes and provide a
out impacting others. the workflow showing the submis-
message briefly documenting them.
sion process.
2. Place your new code in the correct
6. Send an e-mail requesting permission
directory, following the established
to contribute to tclstore@xilinx.com. THE FINE PRINT
guidelines based on app name and
Indicate whether youd like to create The Xilinx Tcl Store is open source, and
company or GitHub username. Use
a new app and what youd like to call there is no facility to monetize or charge
standard Git add commands.
it. If youd like to modify or contrib- for contributions. Apps contributed to
3. Use Vivado Design Suite in the local ute to an existing app, please indicate the Tcl Store are made freely available for
repository, and call a few commands that; you will need permission from derivative works through a BSD license
that are necessary for registering the the app owner. commonly used in open-source projects.
VIVADO DESIGN SUITE 2014.1 RELEASE HIGHLIGHTS custom netlist changes and integrate
Vivado Design Suite 2014.1 increases your productivity with faster run-times, im- with third-party electronic design au-
proved quality of results, automation of the UltraFast Design Methodology and tomation (EDA) tools such as simu-
hardware acceleration of OpenCL kernels through Vivado high-level synthesis (HLS). lation, synthesis, timing and power
analysis, and linting tools. Natively ac-
DEVICE SUPPORT cessed from the Vivado Integrated De-
sign Environment (IDE), the Tcl Store
Production Ready:
enables users to select and install col-
Artix-7 XC7A35T and XC7A50T
lections of Tcl scripts called apps
XA Artix-7 XA7A50T, XA7A35T and XA7A75T
directly from within the tool. Once
Zynq-7000 XC7Z015
installed, these apps have commands
General Access: that appear just like built-in Vivado
Kintex UltraScale: Design Suite commands.
XC KU035, XC KU040, XC KU060 and XC KU075 To learn more about the new Xilinx
Tcl Store, watch the QuickTake video at
Early Access (contact local sales rep): http://www.xilinx.com/training/vivado/
Kintex UltraScale SSI devices: introduction-to-the-xilinx-tcl-store.htm.
XC KU100 and XC KU115
Virtex UltraScale devices: VIVADO DESIGN SUITE:
XC VU065, XC VU080, XC VU095, XC VU125, XC VU145 and XC VU160 DESIGN EDITION UPDATES
Vivado Implementation Tools
XILINX Tcl STORE these scripts can extend the consider-
Performance and run-time improvements:
able core functionality of the Vivado
Xilinx is taking another large step for- Design Suite, enhancing productivity Average 25 percent faster overall
ward in designer productivity by hosting and ease of use. The Tcl Store is open to implementation run-time compared
an open-source repository for sharing the user community to contribute to the to 2013.4
Tool Command Language (Tcl) code. greater good of all designers by publish-
Average 2.5 percent better Fmax on
This repository, called the Xilinx Tcl ing Tcl code that others might find useful.
7 Series SSIT devices
Store, will make it a lot easier to find and The Xilinx Tcl Store provides exam-
share Tcl scripts that other engineers ples of how to write custom reports, Average 5 percent improvement in
have developed. With the power of Tcl, control specific tool behavior, make Fmax across all devices.
VIVADO DESIGN SUITE: There is no need to edit any IP file. Vivado Training
SYSTEM EDITION UPDATES For instructor-led training on the Viva-
Additional new key IP available for Ul- do Design Suite, please visit www.xil-
Vivado High-Level Synthesis traScale devices in 2014.1 includes the inx.com/training.
Vivado HLS now offers early-access HSSIO Wizard, System Management
Wizard, SGMII over LVDS, Aurora 8B10B Download Vivado Design Suite 2014.1
support of OpenCL kernels. OpenCL and 64B66B, CPRI and Serial RapidIO. today at http://www.xilinx.com/down-
provides a framework and language for load.
writing kernels that execute across het-
Second Quarter 2014 Xcell Journal 61
X P E D I TE
T
he Xilinx Alliance Program is a worldwide eco- JPEG2000 compression and Transport
system of qualified companies that collaborate Stream solutions with Xilinx SMPTE
2022 cores on a single Kintex-7 device.
with Xilinx to further the development of All Pro- Fully compliant with the Video Services
grammable technologies. Xilinx has built this eco- Forum (VSF) technical recommenda-
system, which leverages open platforms and standards, tion, this design ensures interopera-
to meet customer needs and is committed to its long-term bility with major broadcast industry
players as it was demonstrated at the
success. Alliance membersincluding IP providers, EDA
VidTrans 2014 meeting in Arlington,
vendors, embedded software providers, system integra- Va., in February and again at NAB 2014
tors and hardware suppliershelp accelerate your de- in April. This new version of the refer-
sign productivity while minimizing risk. Here are some ence design combines JPEG2000 com-
highlights of recent alliance activities. pression and flexible MPEG2-TS cores
from Barco-Silex with the SMPTE2022
1-2 IP core from Xilinx on a Kintex-7
XYLON AND NORTHWEST April at the NAB 2014 show in Las Ve- FPGA to carry a 4K video stream over
LOGIC DELIVER MIPI CSI-2 gas, visitors to the Xilinx booth saw the a 1G network. The 4K video is captured
CAMERA INTERFACE DEMO demo system based on a Xilinx ZC706 via Quad-SDI, and is encoded by the
AT NAB 2014 Zynq SoC Evaluation Kit developed high-quality Barco-Silex JPEG2000 en-
by Premier Xilinx Alliance Program coder core. The integrated Barco-Silex
https://www.youtube.com/
members Xylon and Northwest Logic. Transport Stream solution combined
watch?v=aKbkB7WN8CE
Please click on the link above to view a with the Xilinx SMPTE2022 1-2 core
short video explaining the demo. provides interoperability thanks to its
The MIPI Display Serial Interface (DSI)
and Camera Serial Interface 2 (CSI-2) proven compliance with the VSF techni-
are becoming key, low-cost industry cal recommendations.
BARCO-SILEX
standards for connecting video displays
DEMONSTRATES 4K
and cameras to a wide variety of em-
VIDEO-OVER-IP WITH TOPIC SHOWCASES DYPLO
bedded systems. You can now leverage
JPEG2000 ON A SINGLE SYSTEM AT EMBEDDED
a MIPI-compatible interface implement-
KINTEX-7 AT NAB 2014 WORLD 2014
ed on the Xilinx platform. This low-cost
way to interface multiple cameras and http://www.barco-silex.com/ip-cores/ https://www.youtube.com/watch?v=8S-
displays greatly enhances the use of jpeg-2000 1GOcL-t4o
Xilinx All Programmable devices at the
heart of diverse Smarter Vision systems Barco-Silex upgraded to 4K its well- Premier Alliance member TOPIC Embed-
for all sorts of markets including auto- known video-over-Internet Protocol ded Products has developed an operating
motive, industrial and medical. In early (VoIP) reference design that integrates system that will significantly reduce the
Application Notes
If you want to do a bit more reading about how our
FPGAs lend themselves to a broad number of applications,
we recommend these application notes.
XAPP1177: DESIGNING WITH SR-IOV CAPABILITY The application note demonstrates several key features
OF XILINX VIRTEX-7 PCI EXPRESS GEN3 of the Vivado Design Suite and the IP cores used in the
INTEGRATED BLOCK design, starting with generating a block diagram subsys-
http://www.xilinx.com/support/documentation/applica- tem using Vivados Tcl commands and scripting. Other
tion_notes/xapp1177-pcie-gen3-sriov.pdf areas covered include PCI Express endpoint configura-
tion; DMA-initiated data transfers over PCI Express; and
Evaluating single-root I/O virtualization (SR-IOV) capabil-
achieving high throughput into the Zynq SoC processing
ity can be a complex process, with many variations seen
system through the high-performance AXI interface. The
among different operating systems and system platforms.
author also explores dynamic address translation be-
In demonstrating the SR-IOV capability of the Xilinx
tween a 64-bit root complex (host) address space and a
Virtex-7 FPGA PCI Express Gen3 integrated block, this
32-bit FPGA (AXI) address space, and outlines a meth-
application note establishes a baseline system configura-
odology to perform DMA scatter-gather operations using
tion and provides the necessary software to quickly bring
dynamic address translation.
up and evaluate the SR-IOV features of the Virtex-7 FPGA
PCIe Gen3 integrated block.
Author Vivek Surabhi explains the key concepts of SR- XAPP1180: REFERENCE SYSTEM:
IOV and details how to configure the SR-IOV capability. KINTEX-7 MICROBLAZE SYSTEM
The document shows how to create a PCI Express x8 Gen3 SIMULATION USING IP INTEGRATOR
endpoint design configured for two physical functions and http://www.xilinx.com/support/documentation/applica-
six virtual functions. The reference design, which targets tion_notes/xapp1180.pdf
a Virtex-7 FPGA VC709 Connectivity Kit, has been hard-
ware-validated on a system with SR-IOV capability. This application note and reference system demonstrate
the functionality of a MicroBlaze processor system
on the Kintex-7 device architecture using the Xilinx IP
XAPP1171: PCI EXPRESS ENDPOINT-DMA Integrator tool in simulation and in hardware. The sys-
INITIATOR SUBSYSTEM tem includes common peripherals such as main memory
http://www.xilinx.com/support/documentation/applica- as well as RS232 communications. Author James Luce-
tion_notes/xapp1171-pcie-central-dma-subsystem.pdf ro provides several standalone software applications to
verify the functionality of the peripherals. Applications
This application note by Brian Martin demonstrates a
include hello_uart and hello_mem.
Vivado Design Suite subsystem for endpoint-initiated
Lucero also explains how to set up the simulation en-
direct memory access (DMA) data transfers through PCI
vironment for the system, execute the simulation using
Express. The provided subsystems target the Zynq-7000
either the Vivado simulator or Mentor Graphics Model-
All Programmable SoC ZC706 and Kintex-7 KC705 de-
Sim environments, and run the design on hardware. The
vice to initiate data transfers between DDR3 memory and
document describes running the design in simulation and
an externally connected PCI Express Root Complex. You
hardware, targeting the KC705 board that contains the
could modify the subsystem for use in other devices or
Kintex-7 XC7K410TFFG900-2 FPGA.
applications that require data transactions to be initiated
from within the FPGA logic.
XAPP742: AXI VDMA REFERENCE DESIGN XAPP1199: SMPTE 2022-5/6 HIGH-BIT-RATE MEDIA
http://www.xilinx.com/support/documentation/applica- TRANSPORT OVER IP NETWORKS WITH FORWARD
tion_notes/xapp742-axi-vdma-reference-design.pdf ERROR CORRECTION
http://www.xilinx.com/support/documentation/applica-
If youve ever contemplated building a video system using
tion_notes/xapp1199-smpte2022-56-over-ip.pdf
Xilinx native video IP cores to process configurable frame
rates and resolutions in Kintex-7 FPGAs, this application Heres a video-over-IP network system that leverages the
note will show you how. The reference design focuses on performance features of the LogiCORE IP SMPTE 2022-5/6
run-time configuration of an onboard clock generator for a video-over-IP transmitter and receiver cores. The reference
video pixel clock, and on using video IP cores such as AXI design by authors Gilbert Magnaye, Josh Poh, Myo Tun Aung
Video Direct Memory Access (VDMA), Video Timing Con- and Tom Sun focuses on high-bit-rate, native media transport
troller (VTC), test pattern generator (TPG) and the DDR3 over 10-Gbps Ethernet with a built-in forward error correc-
memory controller for running selected combinations of tion (FEC) engine. The design is able to support up to three
video resolution and frame rate. Authors Pankaj Kumbhare SD/HD/3G-SDI streams.
and Vamsi Krishna discuss the configuration of each video The transmitter platform uses three SMPTE SDI cores
IP in detail, helping designers make effective use of these to receive the incoming SDI video streams. The received
cores. Each video IP block is configured dynamically to pro- SDI streams are multiplexed and encapsulated into fixed-
cess various combinations of frame rate and resolution. size datagram packets by the SMPTE 2022-5/6 video-over-
IP transmitter core and sent using the 10-Gigabit Ethernet
MAC core. The 10-Gbit link is supported by a 10-Gigabit
XAPP1200: KINTEX-7 FPGA TRANSCEIVER Ethernet PCS/PMA core using an optical cable connected
WIZARD EXAMPLE DESIGN to the receiver end. On the receiver platform, the 10-Giga-
http://www.xilinx.com/support/documentation/applica- bit Ethernet MAC collects the Ethernet datagram packets.
tion_notes/xapp1200-k7-xcvr-wiz-example-design.pdf The SMPTE 2022-5/6 video-over-IP receiver core filters the
The KC705 Evaluation Kit provides a comprehensive, datagram packets, and de-encapsulates and demultiplexes
high-performance development and demonstration plat- the datagrams into individual streams, then outputs those
form using the Kintex-7 FPGA family for high-bandwidth streams through the SMPTE SDI cores. The Ethernet da-
and high-performance applications in multiple market seg- tagram packets are buffered in DDR3 SDRAM for both the
ments. This application note by Dinesh Kumar and Thupalli transmitter and receiver.
Ramachandra uses the KC705 kit and the GTX Transceiver The DDR traffic passes through the AXI4 interconnect to
Wizard to demonstrate a transceiver example design run- the 7 series AXI memory controller. A MicroBlaze processor
ning on Kintex-7 FPGA hardware. The authors have fully initializes the cores and reads the status. The reference design
verified the reference design and tested it on hardware. targets the Xilinx Kintex-7 FPGA KC705 Evaluation Kit.
Xpress Yourself
in Our Caption Contest
PAUL McFARTHING, development
engineer at Smith & Nephew
(Andover, Mass.), won a shiny new
Digilent Zynq Zybo board with this
caption for the tattooed engineer
in Issue 86 of Xcell Journal:
DANIEL GUIDERA
S
pring is in the air, even if your garden grows indoors. Xercise your about the skin effect.
funny bone as you tend your posies by submitting an engineering-
or technology-related caption for this cartoon showing a couple of
engineers watering their plants. The image might inspire a caption like
Congratulations as well to
Once they cut static power and dynamic power in their design, Ernie and
our two runners-up:
Frank decided to experiment with Flower Power.
Send your entries to xcell@xilinx.com. Include your name, job title, compa- Wearable computing is so pass.
ny affiliation and location, and indicate that you have read the contest rules at This is the real future of
www.xilinx.com/xcellcontest. After due deliberation, we will print the submis- embedded computing.
sions we like the best in the next issue of Xcell Journal. The winner will receive
Chris Lee, technical leader,
a Digilent Zynq Zybo board, featuring the Xilinx Zynq-7000 All Programma-
Cisco Systems (San Jose, Calif.)
ble SoC (http://www.xilinx.com/products/boards-and-kits/1-4AZFTE.htm).
Two runners-up will gain notoriety, fame and a cool, Xilinx-branded gift from
our swag closet. Looks great, huh? Only three
The contest begins at 12:01 a.m. Pacific Time on April 16, 2014. All entries ECOs, too. New personal record.
must be received by the sponsor by 5 p.m. PT on June 30, 2014. David Riley, principal hardware
So, put down your trowel and get writing! engineer, Mantaro Networks
(Philadelphia)
NO PURCHASE NECESSARY. You must be 18 or older and a resident of the fifty United States, the District of Columbia, or Canada (excluding Quebec) to enter. Entries must be entirely original. Contest begins on
April 16, 2014. Entries must be received by 5:00 pm Pacific Time (PT) on June 30, 2014. Official rules are available online at www.xilinx.com/xcellcontest. Sponsored by Xilinx, Inc. 2100 Logic Drive, San Jose, CA 95124.