Visit www.mathworks.com/academia
MATLAB Releases
Visit http://www.mathworks.com/products/matlab/whatsnew.html
Learning Resources
Interactive Video Tutorials
Students learn the basics outside of the classroom with self-guided tutorials provided by The MathWorks.
Topics: MATLAB, Simulink, Signal Processing, Control Systems, Computational Math
Visit www.mathworks.com/academia/student_center/tutorials
Self-Paced Training
Immediate access (90 days)
Learn at your own pace
Use features and workflows that are considered best practices
Hands-on MATLAB experience
Online discussion boards and live trainer chats
Available on demand:
MATLAB Central
File Exchange
Newsgroup
Blogs
Visit www.mathworks.com/matlabcentral
Student Version
Further Information
Contact Information:
Peter Sheridan peter.sheridan@mathworks.com 508-647-7176
Academic Community:
www.mathworks.com/academia
Visit www.mathworks.com/
Communications Seminar
Houman Zarrinkoub, PhD
Product Manager, LTE & Communications Systems
houmanz@mathworks.com
© 2014 The MathWorks, Inc.
Agenda
Another resource
[Figure: channel frequency response H(f) versus frequency f, 802.11ad]
[Figure: receiver block diagram. The channel output plus noise enters the receiver, which applies the OFDM receiver, channel estimation, MIMO receiver (equalizer), demodulation, and channel decoding to produce the output bits]
OFDM
Multicarrier transmission scheme
Subdivides a wideband channel into multiple narrowband orthogonal sub-channels (subcarriers)
Enables flexible transmission bandwidths (BW)
Robust to multipath fading
High spectral efficiency
Low-complexity implementation
Works with MIMO transmission
OFDM implementation
Modulated symbols are organized in a time-frequency grid
The grid is composed of data, pilots, and control signals
Pilots are predetermined samples used for channel estimation
In frequency, symbols align with subcarriers
Subcarrier frequencies are multiples of the frequency spacing
OFDM modulation is essentially an inverse fast Fourier transform (IFFT) plus cyclic prefix (CP) insertion
CP insertion: append the last N samples of each symbol to its beginning
CP insertion ensures orthogonality among frames at the receiver
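The IFFT-plus-CP description above can be sketched in a few lines of base MATLAB. This is a minimal illustration, not the seminar's model: the grid size (Nfft), CP length (Ncp), symbol count, and QPSK mapping are all assumed values, and the channel is ideal.

```matlab
% Minimal OFDM modulator/demodulator sketch (base MATLAB only).
% Nfft, Ncp, Nsym and the QPSK mapping are illustrative assumptions.
Nfft = 64;                 % number of subcarriers (IFFT size)
Ncp  = 16;                 % cyclic prefix length in samples
Nsym = 10;                 % number of OFDM symbols

% Random QPSK symbols arranged on a subcarrier-by-symbol grid
bits   = rand(2*Nfft, Nsym) > 0.5;
txGrid = ((1 - 2*bits(1:2:end, :)) + 1j*(1 - 2*bits(2:2:end, :))) / sqrt(2);

% OFDM modulation: IFFT along the subcarrier dimension
txTime = ifft(txGrid, Nfft, 1);

% CP insertion: copy the last Ncp samples of each symbol to its start
txCp = [txTime(end-Ncp+1:end, :); txTime];

% Receiver over an ideal channel: drop the CP, then apply the FFT
rxTime = txCp(Ncp+1:end, :);
rxGrid = fft(rxTime, Nfft, 1);

% Round-trip error between the transmitted and recovered grids
maxErr = max(abs(rxGrid(:) - txGrid(:)));
```

Each column of txCp is one OFDM symbol of Nfft + Ncp samples. Over an ideal channel the recovered grid matches the transmitted one to within floating-point rounding.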
[Figure: time-frequency grid of subcarriers and pilots, indexed by OFDM symbol n]
MIMO
Class of Multi-antenna techniques
Advantages of MIMO techniques: boosting overall data rates; increasing reliability of the communication link
Various types of MIMO/SIMO: receive diversity, transmit diversity, beamforming, spatial multiplexing
MIMO implementation
MIMO layer mapping in
Essentially composed of 2 steps
Layer mapping: splits the modulated symbol stream into multiple substreams
Precoding: transforms (scales) the substreams
In beamforming, precoding provides beamsteering
In transmit diversity, it creates orthogonal codes
In spatial multiplexing, each substream is steered to a different direction
[Figure: layer mapping and precoding for beamforming, transmit diversity, and spatial multiplexing]
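The two steps can be illustrated with a small base-MATLAB sketch. The layer count, antenna count, and the DFT-based precoding matrix W are assumptions chosen for illustration; they are not taken from any particular standard's codebook.

```matlab
% Layer mapping and precoding sketch (base MATLAB only).
% numLayers, numTx and the DFT-based precoder W are illustrative assumptions.
numLayers = 2;             % substreams (layers)
numTx     = 4;             % transmit antennas
numSym    = 12;            % modulated symbols (a multiple of numLayers)

% Random QPSK symbol stream (row vector)
b = rand(2, numSym) > 0.5;
symbols = ((1 - 2*b(1, :)) + 1j*(1 - 2*b(2, :))) / sqrt(2);

% Layer mapping: deal symbols out round-robin, so layer k carries
% symbols k, k+numLayers, k+2*numLayers, ...
layers = reshape(symbols, numLayers, []);

% Precoding: numTx-by-numLayers matrix with orthonormal columns
% (first numLayers columns of a unitary DFT matrix)
F = fft(eye(numTx)) / sqrt(numTx);
W = F(:, 1:numLayers);

% One row of txSignal per transmit antenna
txSignal = W * layers;

% With an ideal (identity) channel, W' recovers the layers
recovered = W' * txSignal;
maxErr = max(abs(recovered(:) - layers(:)));
```

Because the columns of W are orthonormal, the substreams remain separable at the receiver in this idealized setting. A real system would choose W from channel knowledge or a codebook.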
[Figure: path loss]
Fading channels
Interfering signals
Non-linearities in front-end receivers
Phase noise, frequency offset, timing mismatch, IQ imbalance
Channel estimation & equalization
Antenna arrays & directional propagation
Beamforming & beamsteering
Easy-to-follow end-to-end simulation
Graphical test bench
Adjustment of channel characteristics on the fly
Introduce OFDM transmission of 802.11x
Transceiver with modulation, coding, scrambling, and OFDM transmitter & receiver
Channel: interferer + path loss (no multipath fading yet)
Introduce receive-side beamforming
Transceiver with modulation, coding, scrambling, OFDM transmitter & receiver, and receive diversity
SIMO (receiver beamforming)
Channel with interferer + path loss (no multipath fading yet)
Receiver has multiple antennas (1 to 8)
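As a rough illustration of why a multi-antenna receiver helps against an interferer, the sketch below forms a conventional (delay-and-sum) beam with a uniform linear array. The half-wavelength spacing, the two arrival angles, and the choice of conventional weights are all assumptions for illustration; the seminar model may use a different array and beamformer.

```matlab
% Receive-side (SIMO) beamforming sketch for a uniform linear array.
% Half-wavelength spacing, the arrival angles and the conventional
% (delay-and-sum) weights are illustrative assumptions.
numAnt   = 8;              % receive antennas (the slides use 1 to 8)
thetaSig = 0;              % desired signal direction, degrees
thetaInt = 40;             % interferer direction, degrees

% Steering vector for element spacing d = lambda/2
steer = @(thetaDeg) exp(1j*pi*(0:numAnt-1).' * sind(thetaDeg));

aSig = steer(thetaSig);
aInt = steer(thetaInt);

% Conventional beamformer: phase-align the desired direction
w = aSig / numAnt;

gainSig = abs(w' * aSig)^2;     % power gain toward the desired signal
gainInt = abs(w' * aInt)^2;     % power gain toward the interferer
rejectionDb = 10*log10(gainSig / gainInt);
```

The desired direction sees unit gain while the interferer falls into the sidelobes of the beam pattern. Adaptive weights could place a null on the interferer, at the cost of needing an estimate of the interference statistics.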
Interference Source (I)
Introduce receive-side beamforming with multipath fading
Transceiver with modulation, coding, scrambling, OFDM transmitter & receiver, and receive diversity
SIMO (receiver beamforming)
Channel with interferer + path loss + fading
Receiver has multiple antennas (1 to 8)
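The benefit of receive diversity under fading can be sketched with a short Monte Carlo run. This is an assumed, simplified setup (BPSK, flat Rayleigh fading that is independent per antenna, perfect channel knowledge, maximal-ratio combining, no interferer), not the seminar's 802.11 model.

```matlab
% Receive diversity over flat Rayleigh fading with maximal-ratio
% combining (MRC). Base MATLAB only; BPSK, the SNR and the fading
% model are illustrative assumptions.
rng(1);                              % repeatable run
numBits = 1e5;
snrDb   = 5;                         % per-antenna SNR
sigma   = sqrt(10^(-snrDb/10) / 2);  % noise std per real dimension

bits = rand(1, numBits) > 0.5;
s    = 1 - 2*bits;                   % BPSK: 0 -> +1, 1 -> -1

antennaCounts = [1 2 4 8];
ber = zeros(size(antennaCounts));
for k = 1:numel(antennaCounts)
    n = antennaCounts(k);
    % Independent unit-power Rayleigh coefficient per antenna and bit
    h = (randn(n, numBits) + 1j*randn(n, numBits)) / sqrt(2);
    noise = sigma * (randn(n, numBits) + 1j*randn(n, numBits));
    r = h .* repmat(s, n, 1) + noise;
    % MRC: weight each branch by conj(h), then sum across antennas
    z = sum(conj(h) .* r, 1);
    ber(k) = mean((real(z) < 0) ~= bits);
end
```

The bit-error rate in ber drops as antennas are added, because it becomes less likely that all branches fade at the same time.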
Introduce transmit-side beamforming with multipath fading
Transceiver with modulation, coding, scrambling, and MIMO-OFDM transmitter & receiver
Channel with interferer + path loss + fading
Transmitter has multiple antennas (1 to 8)
Introduce steered transmission(s) in the base station (OFDMA)
Transceiver with modulation, coding, scrambling, and MIMO-OFDM transmitter & receiver
Channel with interferer + path loss + fading
Transmitter has multiple antennas (1 to 8)
Introduce MIMO beamforming (both Tx-side and Rx-side)
Transceiver with modulation, coding, scrambling, and MIMO-OFDM transmitter & receiver
Channel with interferer + path loss + fading
Transmitter has multiple antennas (1 to 8)
MATLAB
Live interactive MATLAB testbenches
Need to reduce:
Simulation time during design
Simulation time for large-scale testing during prototyping
Application
LTE Physical Downlink Control Channel (PDCCH)
Workflow
Start with a baseline algorithm
Profile it to introduce a performance yardstick
Introduce the following optimizations:
1. Better MATLAB serial programming techniques
2. Using System objects
3. MATLAB to C code generation (MEX)
4. Parallel Computing
5. GPU-optimized System objects
6. Rapid Accelerator mode of simulation in Simulink
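One simple way to set up such a yardstick is to time the baseline with timeit and compare every later variant against that number. In the sketch below, baselineFcn and candidateFcn are hypothetical stand-ins for the algorithm under study; the MATLAB Profiler (profile on, then profile viewer) gives the per-line breakdown when that is needed.

```matlab
% Establishing a timing yardstick before optimizing (base MATLAB only).
% baselineFcn and candidateFcn are hypothetical stand-ins for the
% algorithm under study and an optimized variant of it.
baselineFcn  = @(x) cumsum(x.^2);
candidateFcn = @(x) cumsum(x .* x);

x = randn(1e6, 1);

% timeit calls the function several times and returns a typical run time
tBaseline = timeit(@() baselineFcn(x));
fprintf('Baseline:  %.4f s per call\n', tBaseline);

% Judge each candidate against the same yardstick, and confirm that it
% still produces the same numerical result
tCandidate = timeit(@() candidateFcn(x));
maxDiff = max(abs(baselineFcn(x) - candidateFcn(x)));
fprintf('Candidate: %.4f s per call (%.2fx), max difference %.2g\n', ...
    tCandidate, tBaseline / tCandidate, maxDiff);
```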
Parallel Computing
GPU processing
Pre-allocation
Initialize an array using its final size. This helps avoid dynamically resizing arrays in a loop.
Vectorization
Convert code from using scalar loops to using matrix/vector operations. This helps MATLAB leverage processor-optimized libraries for vector processing.
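The two techniques can be compared side by side on a toy computation. The function being evaluated and the array size are arbitrary choices for illustration, and the measured times depend on the machine and MATLAB release.

```matlab
% Pre-allocation and vectorization on a toy computation (base MATLAB).
% The function and array size are arbitrary illustrative choices.
n = 1e5;
x = linspace(0, 1, n);

% (a) Growing the array inside the loop: resized on every iteration
tic;
yGrow = [];
for k = 1:n
    yGrow(end+1) = x(k)^2 + 3*x(k); %#ok<SAGROW>
end
tGrow = toc;

% (b) Pre-allocation: the array has its final size before the loop
tic;
yPre = zeros(1, n);
for k = 1:n
    yPre(k) = x(k)^2 + 3*x(k);
end
tPre = toc;

% (c) Vectorization: one expression over the whole array
tic;
yVec = x.^2 + 3*x;
tVec = toc;

fprintf('grow: %.4f s, pre-allocated: %.4f s, vectorized: %.4f s\n', ...
    tGrow, tPre, tVec);
```

All three variants compute the same values; only the time taken differs.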
Automatically generate a MEX function
Call the generated MEX file within the testbench
Verify that the numerical results are the same
Assess the baseline function and the generated MEX function for speed
[Figure: Tasks 1 to 4 distributed across MATLAB workers, with toolboxes and blocksets, shown along a time axis]
>> Demo
Summary
Use matlabpool to open the available workers (parpool in R2013b and later)
No modification of the algorithm
Use a parfor loop instead of a for loop
Parallel computation or simulation leads to further acceleration
More cores = more speed
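A typical candidate for parfor is a sweep whose iterations are independent, such as one Monte Carlo run per SNR point. The sketch below is an assumed example (BPSK over AWGN, with an arbitrary SNR grid and bit count), not the seminar's PDCCH model. Without Parallel Computing Toolbox, the parfor loop simply runs serially.

```matlab
% parfor sketch: independent Monte Carlo BER points (BPSK over AWGN).
% The SNR grid and bit count are illustrative assumptions. Without
% Parallel Computing Toolbox the loop runs as an ordinary serial loop.
ebnoDb  = 0:2:6;
numBits = 1e5;
ber     = zeros(size(ebnoDb));

parfor k = 1:numel(ebnoDb)
    ebno = 10^(ebnoDb(k)/10);
    bits = rand(numBits, 1) > 0.5;
    s    = 1 - 2*bits;                             % BPSK mapping
    r    = s + sqrt(1/(2*ebno)) * randn(numBits, 1);
    ber(k) = mean((r < 0) ~= bits);                % sliced output variable
end

% Closed-form BPSK error rate, for comparison
berTheory = 0.5 * erfc(sqrt(10.^(ebnoDb/10)));
```

The loop body is unchanged from the serial version. Only the keyword differs, which works here because no iteration depends on another.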
User's Code
System objects
MATLAB to C
Parallel Computing
GPU processing
Originally for graphics acceleration, now also used for scientific calculations
Massively parallel array of integer and floating-point processors
Typically hundreds of processors per card
GPU cores complement CPU cores
Dedicated high-speed memory
Ease of Use
Alternative implementations of many System objects take advantage of GPU processing
Use Parallel Computing Toolbox to execute many communications algorithms directly on the GPU
Easy-to-use syntax
Dramatically accelerate simulations
Impressive coding gain
High computational complexity
Bit-error rate performance as a function of number of iterations
= comm.TurboDecoder( NumIterations, numIter,
       Simulation time    Acceleration
CPU    8 hours            1.0
       40 minutes         12.0
       11 minutes         43.0
GPU Version 1
% Turbo Encoder
hTEnc = comm.TurboEncoder('TrellisStructure', poly2trellis(4, [13 15], 13), ...
    'InterleaverIndices', intrlvrIndices);
% AWG Noise
hAWGN = comm.AWGNChannel('NoiseMethod', 'Variance');
% BER measurement
hBER = comm.ErrorRate;
% Turbo Decoder
hTDec = comm.gpu.TurboDecoder('TrellisStructure', poly2trellis(4, [13 15], 13), ...
    'InterleaverIndices', intrlvrIndices, 'NumIterations', numIter);

ber = zeros(3,1); % initialize BER output: [error rate; number of errors; number of bits]
%% Processing loop
% comm.ErrorRate returns the error count in ber(2) and the bit count in ber(3)
while (ber(2) < MaxNumErrs && ber(3) < MaxNumBits)
    data = randn(blkLength, 1) > 0.5;
    % Encode random data bits
    yEnc = step(hTEnc, data);
    % Modulate, add noise to real bipolar data
    modout = 1 - 2*yEnc;
    rData = step(hAWGN, modout);
    % Convert to log-likelihood ratios for decoding
    llrData = (-2/noiseVar) .* rData;
    % Turbo decode
    decData = step(hTDec, llrData);
    % Calculate errors
    ber = step(hBER, data, decData);
end
GPU Version 1: profiler times per statement
Object construction: turbo encoder <0.01, AWGN channel <0.01, error rate <0.01, GPU turbo decoder 0.02
Processing loop statement     Baseline    GPU Version 1
Initialize BER output         <0.01       <0.01
Generate random data bits     0.30        0.28
Turbo encode                  2.33        2.38
Modulate                      0.05        0.05
Add noise                     1.50        1.45
Convert to LLRs               0.03        0.04
Turbo decode                  330.54      98.18
Calculate errors              0.17        0.17
GPU Version 2
% Turbo Encoder
hTEnc = comm.TurboEncoder('TrellisStructure', poly2trellis(4, [13 15], 13), ...
    'InterleaverIndices', intrlvrIndices);
% AWG Noise
hAWGN = comm.gpu.AWGNChannel('NoiseMethod', 'Variance');
% BER measurement
hBER = comm.ErrorRate;
% Turbo Decoder - setup for multi-frame or multi-user processing
numFrames = 30;
hTDec = comm.gpu.TurboDecoder('TrellisStructure', poly2trellis(4, [13 15], 13), ...
    'InterleaverIndices', intrlvrIndices, 'NumIterations', numIter, ...
    'NumFrames', numFrames);

ber = zeros(3,1); % initialize BER output: [error rate; number of errors; number of bits]
%% Processing loop
% comm.ErrorRate returns the error count in ber(2) and the bit count in ber(3)
while (ber(2) < MaxNumErrs && ber(3) < MaxNumBits)
    data = randn(numFrames*blkLength, 1) > 0.5;
    % Encode random data bits
    yEnc = gpuArray(multiframeStep(hTEnc, data, numFrames));
    % Modulate, add noise to real bipolar data
    modout = 1 - 2*yEnc;
    rData = step(hAWGN, modout);
    % Convert to log-likelihood ratios for decoding
    llrData = (-2/noiseVar) .* rData;
    % Turbo decode
    decData = step(hTDec, llrData);
    % Calculate errors
    ber = step(hBER, data, gather(decData));
end
GPU Version 2: profiler times per statement
Object construction: turbo encoder <0.01, GPU AWGN channel 0.03, error rate <0.01, numFrames 0.01, GPU turbo decoder 0.01
Processing loop statement     Baseline    GPU Version 2
Generate random data bits     0.30        0.22
Turbo encode                  2.33        2.45
Modulate                      0.05        0.02
Add noise                     1.50        0.31
Convert to LLRs               0.03        0.01
Turbo decode                  330.54      20.89
Calculate errors              0.17        0.09
Minimize data transfer between CPU and GPU.
Using the GPU only makes sense if the data size is large.
Some functions in MATLAB are optimized and can be faster than the GPU equivalent (e.g. FFT).
Use arrayfun to explicitly specify elementwise operations.
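A small sketch of the first and last tips: move the data to the GPU once, apply an elementwise function with arrayfun, and gather the result once. The polynomial is an arbitrary example. The sketch assumes R2019b or later for canUseGPU, and the GPU branch additionally needs Parallel Computing Toolbox and a supported GPU; otherwise the code falls back to the CPU.

```matlab
% Elementwise GPU work with a single transfer in each direction.
% The polynomial is an arbitrary example. canUseGPU assumes R2019b or
% later; the GPU branch needs Parallel Computing Toolbox and a GPU.
x = rand(1e6, 1, 'single');

if canUseGPU
    xg = gpuArray(x);                          % one transfer to the GPU
    yg = arrayfun(@(v) v*v + 2*v + 1, xg);     % elementwise kernel on the GPU
    y  = gather(yg);                           % one transfer back
else
    y = x.*x + 2*x + 1;                        % CPU fallback, same arithmetic
end
```

Keeping intermediate results on the GPU, and gathering only the final output, avoids paying the transfer cost at every step.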
Summary
Acceleration methodologies in MATLAB & Simulink
1. Best Practices in Programming
Vectorization & pre-allocation
Environment tools (e.g. Profiler, Code Analyzer)
2. Better Algorithms
Ideal environment for algorithm exploration
Rich set of functionality (e.g. System objects)
Technology / Product for 1 and 2: MATLAB, Toolboxes, System Toolboxes
3. More Processors or Cores
High-level parallel constructs (e.g. parfor, matlabpool)
Utilize clusters, clouds, and grids
Technology / Product for 3: Parallel Computing Toolbox, MATLAB Distributed Computing Server
Thank You
Q&A