
STORAGE AREA NETWORK

Buffer Credit Recovery in Fabric OS 6.1
Buffer-to-Buffer Credit Recovery was introduced in Fabric OS 6.1.0 on the
Brocade DCX Backbone and Brocade 5300, 5100, and 300 Switches,
three new 8 Gbit/sec platforms.
STORAGE AREA NETWORK Technical Brief

CONTENTS
Buffer Credit Recovery in Fabric OS 6.1

Contents

Background

Buffer Credit Recovery Mechanism

    Fibre Channel Technologies
    Link Reset Protocol
    Frames Lost in Transit
    R_RDYs Lost in Transit
    BB_SCs or BB_SCr Lost in Transit
    Example

Summary


BACKGROUND
During normal Storage Area Network (SAN) operation, Fibre Channel (FC) frames, R_RDYs, or VC_RDYs may
become corrupted in transport, which could be in the form of one or more bits in error. Bit errors can be
caused by optics failing, bad cables, optical budgets not within tolerance, intermittent hardware
malfunctions, long distance connections, and so on.

NOTE: VC_RDY is a Brocade® proprietary version of R_RDY used with Virtual Channels (VC) technology. In
this document, the term “R_RDY” is used to refer to either R_RDY or VC_RDY.

During Inter-Switch Link (ISL) formation, the Exchange Link Parameters (ELPs) process grants each side of
an FC link a certain number of frame buffers. Frame buffers are referred to as Buffer-to-Buffer (BB) credits.
The sender port keeps track of the number of buffers available on the receiving side by subtracting 1 when
sending a frame and adding 1 when receiving an R_RDY. A difference between R_RDY and Virtual Channels
worth mentioning is that VCs maintain BB credits and counts for each VC. (A few basic concepts of flow
control are explained in this paper, but it is beyond the scope of the paper to explain FC flow control in
detail.)

If the receiving side cannot recognize the Start Of Frame (SOF) at the start of the incoming frame, then it
will not respond with the appropriate R_RDY. Because the sender has decremented its available buffer
count by 1 and does not receive the corresponding R_RDY, the sender and receiver are now out of
synchronization by the missing R_RDY. When this condition occurs, no explicit error is generated,
and it will not resolve itself without some form of automated or manual recovery mechanism. Not receiving
an R_RDY is not considered an error condition. A corrupted SOF or R_RDY violates the Cyclic Redundancy
Check (CRC) and generates an error and/or increments a statistical counter; however, this is not
specific to BB credit or R_RDY loss and will not initiate BB recovery or adjust the associated counters.

Another scenario occurs when the R_RDY returned by the receiver is corrupted. The result is the same as in
the previous scenario. The sender will have an outstanding buffer count of 1 less than what is actually
available on the receiving side.
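The two loss scenarios above can be sketched as a small simulation of the sender's credit count. This is a simplified illustration, not Brocade's implementation: the class and function names are hypothetical, and each frame's R_RDY is assumed to return before the next frame is sent.

```python
class SenderPort:
    """Tracks the sender's view of free receive buffers on the far side."""

    def __init__(self, credits: int) -> None:
        self.credits = credits

    def send_frame(self) -> None:
        if self.credits == 0:
            raise RuntimeError("no BB credit available; frame must wait")
        self.credits -= 1  # one remote buffer is now assumed to be in use

    def receive_r_rdy(self) -> None:
        self.credits += 1  # remote side reports a freed buffer


def run(port, frames, lost_frames=frozenset(), lost_r_rdys=frozenset()):
    """Send `frames` frames; indexes in the two sets mark corrupted items."""
    for i in range(frames):
        port.send_frame()
        if i in lost_frames:
            continue  # SOF not recognized: the receiver never sends an R_RDY
        if i in lost_r_rdys:
            continue  # R_RDY corrupted on the way back to the sender
        port.receive_r_rdy()
    return port.credits


print(run(SenderPort(120), 1000))                                   # 120
print(run(SenderPort(120), 1000, lost_frames={5}, lost_r_rdys={9}))  # 118
```

Neither loss raises an error in this model; the sender simply ends up with fewer usable credits than the receiver has buffers, which is the silent drift that the recovery mechanism is meant to detect.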

If a condition exists that corrupts enough bits during transmission, the number of SOFs
and R_RDYs affected will eventually cause the number of BB credits to diminish. Overall, only a very small
percentage of a large quantity of sent frames will be associated with BB credit loss. At some point, however,
diminished BB credits start to cause noticeable performance problems. Because no frame can be sent without
a BB credit, which represents an available buffer on the receiver side, an insufficient number of BB credits
causes periods of idle time on the link. Queued frames cannot be sent and must wait for R_RDYs to be
received, which indicate that buffer space has been freed on the receiver side. Over time, transmissions
degrade for no apparent reason and without excessive errors. If the extended link spans a metro area, the
aggregate idle time becomes dramatic and adversely affects performance. In a FICON environment this
situation can quickly turn critical. It is possible for a BB credit count to diminish all the way to zero, causing
the link to stop transmitting altogether.


BUFFER CREDIT RECOVERY MECHANISM


The Link Reset (LR) primitive sequence is common to Fibre Channel products. An LR initiated on Brocade®
8 Gbit/sec products is an effective way to reload the exchange credit model with minimal disruption of data
flows. This paper explores in further depth how the LR protocol works and when it should be initiated to
recover BB credits.

First, the Brocade BB Credit Recovery mechanism is activated by default when a port on a Brocade switch
that supports 8 Gbit/sec FC is configured for LE, LS, or LD mode. (Mode L0 does not utilize BB Credit
Recovery.) In an extended link mode without QoS enabled, only a single Virtual Channel (VC) is
utilized, so only one VC has to be monitored for BB credit loss. In VC_RDY mode it is VC2, and in
R_RDY mode it is VC0.

NOTE: EX_Ports are not supported and FCIP links do not use BB credits, and so this topic does not apply to
the following port types: EX_Ports, VE_Ports, and VEX_Ports. If QoS is enabled, the additional VCs that are
utilized (VC8 through VC14) are not protected by BB Credit Recovery.

The goal of BB Credit Recovery is to detect missing credits and restore them, even in the event of a failed
Control Processor (CP). This feature is backward compatible with older versions of Fabric OS® (FOS). Start by
reviewing a few basic FC technologies.

Fibre Channel Technologies


Ordered sets are transmission words consisting of four 8b/10b encoded characters. Special characters are
denoted by a “K” and data characters are denoted by a “D.” The only special character used in Fibre Channel
is K28.5. In this notation, the number before the decimal point is the decimal value of 5 of the 8 bits being
encoded (bits 0 – 4), and the number after the point is the decimal value of the remaining 3 bits (bits 5 – 7).
After all 256 8-bit patterns (00 – FF) have been assigned to 10-bit characters in the 8b/10b encoding, valid
10-bit characters remain available for special purposes, such as Frame Delimiters and Primitive Signals. Ten
bits can represent 1,024 strings; however, too many consecutive 1s or 0s in the pattern prevent many from
being used. The number of 1s and 0s must either be equal (5 and 5) or differ by no more than two
(four 0s and six 1s, or six 0s and four 1s).
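As a rough illustration of the naming convention and the balance rule described above, the following sketch derives a character name from an 8-bit value and checks the 1s/0s balance of a 10-bit string. The function names are hypothetical. The balance check covers only the counting rule; it does not test for runs of consecutive bits, so it is a necessary condition rather than a full validity test.

```python
def char_name(byte: int, special: bool = False) -> str:
    """Name an 8-bit value in the Dxx.y / Kxx.y notation."""
    low5 = byte & 0x1F          # bits 0-4: the number before the point
    high3 = (byte >> 5) & 0x07  # bits 5-7: the number after the point
    return f"{'K' if special else 'D'}{low5:02d}.{high3}"


def ones_balance_ok(code10: int) -> bool:
    """True if a 10-bit string has 4, 5, or 6 one-bits."""
    ones = bin(code10 & 0x3FF).count("1")
    return ones in (4, 5, 6)


print(char_name(0xBC, special=True))  # K28.5
print(char_name(0xB5))                # D21.5
print(char_name(0x4A))                # D10.2

# How many of the 1,024 possible 10-bit strings pass the balance rule alone:
print(sum(ones_balance_ok(c) for c in range(1024)))  # 672
```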

Each FC frame starts with a transmission word in the category of Frame Delimiter. A Frame Delimiter
demarcates either the start or end of a frame, designated as SOF and EOF respectively. SOF and EOF
delimiters carry status information. For example:

• SOFi3 word indicates the start of a sequence for class 3 traffic (“i” for initial).
• SOFn3 is for non-initial frames and SOFf is used for class F.
• EOFt indicates that it is the last frame of the sequence.
• EOFn means it is not the last frame of the sequence.
• EOFni is used when an error occurs (“i” stands for invalid in this case). If possible, a fabric will continue
to forward FC frames containing errors to the final destination using this special EOFni, indicating that
an error was detected. EOFni is used even if it is the last frame of a sequence; there is no EOFti.


Primitive Signals represent events on the sending port. Primitive Signals include IDLEs referred to as “Fill
Words” and R_RDYs referred to as “Non-Fill Words.” They are a single transmission word and start with
K28.5. There are two more that pertain directly to BB Credit Recovery, BB_SCs and BB_SCr.

Ordered Set    Transmission Word (4 x 10 bits)

IDLE           K28.5    D21.4                  D21.5    D21.5
R_RDY          K28.5    D21.4                  D10.2    D10.2
BB_SCs         K28.5    D21.4                  D22.4    D22.4
BB_SCr         K28.5    D21.4                  D22.6    D22.6
SOFi3          K28.5    D21.5                  D22.2    D22.2
SOFn3          K28.5    D21.5                  D22.1    D22.1
SOFf           K28.5    D21.5                  D24.2    D24.2
EOFn           K28.5    * -D21.4 or +D21.5     D21.6    D21.6
EOFt           K28.5    * -D21.4 or +D21.5     D21.3    D21.3
EOFni          K28.5    * -D10.4 or +D10.5     D21.6    D21.6
LR             K28.5    D09.2                  D31.5    D09.2
LRR            K28.5    D21.1                  D31.5    D09.2

* EOF delimiters have a negative and a positive transmission word to maintain running disparity.

In a fabric, when two E_Ports are connected they will perform an Exchange Link Parameters (ELP) and send
an Internal Link Services (ILS) frame carrying the BB_SC_N value. If the two ports have different values,
then the larger of the two is used by both ports.

NOTE: If either value is 0 (“zero”), it indicates that BB recovery is not supported by that switch and the
feature is disabled.
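The selection rule described above is simple enough to state as a few lines of code. This is a sketch of the rule as the text gives it, with a hypothetical function name; it does not model the ELP exchange itself.

```python
def negotiate_bb_sc_n(local: int, remote: int) -> int:
    """Return the BB_SC_N both ports use, or 0 if recovery is disabled."""
    if local == 0 or remote == 0:
        return 0  # one side does not support BB credit recovery
    return max(local, remote)  # otherwise the larger value wins


print(negotiate_bb_sc_n(5, 8))  # 8
print(negotiate_bb_sc_n(5, 0))  # 0
```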

Dynamic Mode (LD). LD calculates BB credits based on the distance measured during port initialization.
Brocade switches use a proprietary algorithm to estimate distance across an ISL. The estimated distance is
used to determine the BB credits required in LD (Dynamic) extended link mode based on a maximum FC
payload size of 2,112 bytes. The user places an upper limit on the calculation by providing a
“desired_distance” value. FOS confines the user's entry to no more than the distance it has estimated. When
the measured distance is more than desired_distance, the desired_distance (the smaller value) is used in
the calculation.

Static Long-Distance Mode (LS). LS calculates a static number of BB credits based solely on a
user-defined desired_distance value. For both the LD and LS methods, the following formula can be used for
an approximation of the calculated number of BB credits:

BB credits = roundup nearest integer [desired_distance in km * (data rate / 2.125)] (1)

For LD, the estimated distance in km is the smaller of the distance measured during port initialization
and the entered desired_distance. Note that it is best practice to use LS rather than LD, because the
assumption of FC payloads consistently being 2,112 bytes is not realistic in practice. To obtain the proper
number of BB credits using LS mode, first there must be enough BB credits available in the pool, because
FOS checks before accepting a value. Each 8 Gbit/sec-capable ASIC has 1,420 BB credits available after all
other ports have been assigned their default allocations. There are 2,048 BB credits total per ASIC.
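Equation 1 and the LD/LS distinction can be sketched as follows. The function names and the dictionary keyed by nominal port speed are illustrative choices, not FOS identifiers; the data rate constants and the 1,420-credit pool figure are the ones quoted in this paper. The result is the approximation from equation 1, not necessarily the exact value FOS allocates.

```python
import math

# Data rate values used in equation 1, keyed by nominal speed in Gbit/sec.
DATA_RATE = {1: 1.0625, 2: 2.125, 4: 4.25, 8: 8.5}

AVAILABLE_POOL = 1420  # BB credits left per 8 Gbit/sec ASIC after defaults


def bb_credits(distance_km: float, speed_gbit: int) -> int:
    """Equation 1: approximate BB credits for a given distance and speed."""
    return math.ceil(distance_km * (DATA_RATE[speed_gbit] / 2.125))


def ld_distance(measured_km: float, desired_km: float) -> float:
    """LD mode uses the smaller of the measured and desired distances."""
    return min(measured_km, desired_km)


ls = bb_credits(100, 8)                    # LS: desired_distance taken as entered
ld = bb_credits(ld_distance(60, 100), 8)   # LD: capped by the measured 60 km
print(ls, ld, ls <= AVAILABLE_POOL)        # 400 240 True
```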


Second, determine how many BB credits you want across your connection using the payload size that is
common in your network, for example, 1,024 bytes is often reasonable. Use the equation below to calculate
the pseudo desired_distance needed for FOS to calculate the number of BB credits you actually want.

Pseudo “desired_distance” = roundup to nearest integer [(real distance * 2112) / payload] (2)

Example: Calculate the pseudo desired_distance to be entered as the desired_distance parameter of the
PortCfgLongDistance command:

Known:
• Port speed: 8 Gbit/sec

• Average payload size: 1,024 bytes

• Real estimated distance: 100 km

Pseudo “desired_distance” = roundup [(100 * 2112) / 1024] = 207

In other words, when configuring LS mode with PortCfgLongDistance, the user should enter a
desired_distance value of 207 for an actual 100 km link connected to an 8 Gbit/sec E_Port. This causes
FOS to apportion the correct number of BB credits, as determined by equation 1 above and
shown below:

BB credits = roundup [(207 * 8.5) / 2.125] = 828

This will not work with LD mode because LD mode checks the distance and limits the estimated distance to
the real value of 100 km. LS mode allows for the necessary desired_distance based on payload size to be
entered, regardless of the actual distance.
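The worked example can be reproduced with a short calculation. This sketch only evaluates equation 2 and then equation 1 for an 8 Gbit/sec port; the function name is hypothetical, and no switch command is issued.

```python
def pseudo_desired_distance(real_km: int, payload_bytes: int) -> int:
    """Equation 2, using integer ceiling division to avoid float rounding."""
    return -(-(real_km * 2112) // payload_bytes)


pseudo = pseudo_desired_distance(100, 1024)
credits_8g = pseudo * 4  # equation 1 at 8 Gbit/sec: 8.5 / 2.125 = 4

print(pseudo)      # 207
print(credits_8g)  # 828
```

With a full-size 2,112-byte payload the pseudo distance equals the real distance, so the adjustment only matters when typical frames are smaller than the maximum.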

The “data rate” is one of:

• 1.0625 for 1 Gbit/sec

• 2.125 for 2 Gbit/sec

• 4.25 for 4 Gbit/sec

• 8.5 for 8 Gbit/sec

Considering that the switch knows the desired_distance and needs to determine a BB_SC_N value to send
to the other side, the following equation is used:

BB_SC_N = Round up to nearest integer ( log2(Estimated Distance) ) (3)

Here is an example: if you have a 250 km link, and it is determined to be 250 km by the E_Port, BB_SC_N is
computed as follows:

BB_SC_N = round-up to nearest integer ( log2 (250) ) = 8
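Equation 3 is easy to check numerically. The sketch below is a direct transcription with a hypothetical function name; it does not enforce the BB_SC_N range that Fabric OS supports.

```python
import math


def bb_sc_n(estimated_distance_km: float) -> int:
    """Equation 3: round log2 of the estimated distance up to an integer."""
    return math.ceil(math.log2(estimated_distance_km))


print(bb_sc_n(250))  # 8
print(bb_sc_n(100))  # 7
```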


Fabric OS supports a BB_SC_N range of 1 to 15. A BB_SC_N of 15 corresponds to an estimated distance of
32,768 ($2^{15}$), which is considerably higher than the available BB credits, so FOS permits only lower
values of desired_distance. Because the user-entered desired_distance is limited by the number of BB credits
available in the pool, as determined by the calculations above, the distance supported by BB Credit
Recovery comfortably covers all possible connections.


From the BB_SC_N value, you can compute F as follows:

$$F = 2^{\text{BB\_SC\_N}} \tag{4}$$

The two ports communicating must periodically send state information to detect lost frames or lost R_RDYs.
F is the number of frames between each BB_SCs and the number of R_RDYs between each BB_SCr. A
BB_SCs is sent after every F FC frames, a BB_SCr is sent after every F R_RDYs, and these primitives
establish checkpoints.

Between the end of an FC frame (EOF) and the start of a new FC frame (SOF), there must be at least six
primitive signals, which are a combination of IDLEs, BB_SCs, and/or R_RDYs. More can be sent, especially
if there are no FC frames to send. In FC something is always being sent.

From the completion of the Link Reset protocol and between each successive occurrence of a BB_SCs,
each side must maintain a count of the number of frames and R_RDYs sent. The counter never goes higher
than F; instead, it resets to zero upon reaching F. The idea is that upon receipt of a BB_SCs, the counter
should be at zero, because a BB_SCs is sent every F frames and the counter resets every F frames.
If the counter is not at zero when the BB_SCs arrives, one or more frames were lost and a
Link Reset is issued. BB credits and counters are synchronized at Link Reset, which uses the Link
Reset protocol.

Link Reset Protocol


To perform a Link Reset (LR), the initiating port starts to transmit the LR primitive sequence instead of
IDLEs. (See the ordered set table in the Fibre Channel Technologies section for the primitive transmission
words.) The switch does not send FC frames during an LR. Frames are not discarded; they are buffered until
the LR concludes. If the buffers fill, flow control will apply back-pressure to prevent frame loss. The
receiving port enters the Link Reset Response (LRR) substate and starts sending LRR primitive sequences
instead of IDLEs. The port that initiated the LR completes the LR process by sending IDLEs again. Receipt
of an IDLE on the remote side transitions that E_Port out of LRR and back to IDLEs again.

Figure 1. Link Reset protocol
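The handshake described above can be modeled as a tiny two-port state machine. This is a deliberately simplified sketch: the state names ("ACTIVE", "LR_SENT", "LRR") and the class are hypothetical, each step exchanges exactly one primitive in each direction, and frame buffering, timeouts, and error paths are not modeled.

```python
class Port:
    """Minimal model of one end of a link during a Link Reset."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.state = "ACTIVE"
        self.tx = "IDLE"  # primitive currently being transmitted

    def start_link_reset(self) -> None:
        self.state, self.tx = "LR_SENT", "LR"

    def on_receive(self, primitive: str) -> None:
        if primitive == "LR":
            self.state, self.tx = "LRR", "LRR"      # respond with LRR
        elif primitive == "LRR" and self.state == "LR_SENT":
            self.state, self.tx = "ACTIVE", "IDLE"  # initiator resumes IDLEs
        elif primitive == "IDLE" and self.state == "LRR":
            self.state, self.tx = "ACTIVE", "IDLE"  # responder leaves LRR


a, b = Port("A"), Port("B")
a.start_link_reset()
trace = [(a.tx, b.tx)]
for _ in range(3):
    b.on_receive(a.tx)
    a.on_receive(b.tx)
    trace.append((a.tx, b.tx))

print(trace)
# [('LR', 'IDLE'), ('IDLE', 'LRR'), ('IDLE', 'IDLE'), ('IDLE', 'IDLE')]
```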


Frames Lost in Transit


When a frame is lost in transit, the sender has already spent a BB credit on its transmission, and no R_RDY
is returned to free up that credit. A BB_SCs is sent after F frames. The receiving side detects a discrepancy
in its counters, because the BB_SCs arrived when the count was not zero, and takes corrective
action by initiating an LR. In preparation for the next interval, the counters are reset to zero (0).

The number of lost frames is calculated on the receiving side, that is, the side opposite the one that actually
lost the BB credits in question. The number of lost frames is calculated using the following equation:

$$\text{BB Credits Lost by Lost Frames} = \left(2^{\text{BB\_SC\_N}} - \text{Frames Received}\right) \bmod 2^{\text{BB\_SC\_N}} \tag{5}$$
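The receiver-side bookkeeping behind equation 5 might look like the following sketch. The class and method names are hypothetical, and the Link Reset itself is not modeled; the sketch only shows the wrapping counter and the calculation made when a BB_SCs arrives.

```python
class FrameCheckpoint:
    """Receiver-side frame counter checked at each BB_SCs."""

    def __init__(self, bb_sc_n: int) -> None:
        self.f = 2 ** bb_sc_n  # checkpoint interval F
        self.count = 0

    def on_frame(self) -> None:
        self.count = (self.count + 1) % self.f  # wraps to zero at F

    def on_bb_scs(self) -> int:
        """Return the number of lost frames; nonzero calls for a Link Reset."""
        lost = (self.f - self.count) % self.f
        self.count = 0  # counters restart for the next interval
        return lost


rx = FrameCheckpoint(bb_sc_n=5)  # F = 32
for _ in range(31):              # one of the 32 frames never arrived
    rx.on_frame()
print(rx.on_bb_scs())            # 1

for _ in range(32):              # a clean interval
    rx.on_frame()
print(rx.on_bb_scs())            # 0
```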

R_RDYs Lost in Transit


The next condition to examine is R_RDYs lost in transit. If all the frames make it successfully to the
receiving side, then no discrepancies in BB credits should be detected. When R_RDYs are lost in transit,
the same situation occurs, that is, diminished BB credits.

In the same way as with frames, a count is maintained for R_RDYs by each port. This count is established at
ELP and between each successive interval of F R_RDYs. Again, the idea is that upon receipt of a BB_SCr,
the counter should be at zero, because a BB_SCr is sent every F R_RDYs and the counter resets every
F R_RDYs. If the counter is not at zero when the BB_SCr arrives, then one or more R_RDYs were lost.

When a BB_SCr arrives and the count is not zero, the receiving side detects a discrepancy. It will in turn
take a corrective action and initiate an LR. In preparation for the next interval, the counters are reset to
zero (0).

The number of lost R_RDYs is calculated on the receiving side, the side that actually lost the BB credits in
question. The number of lost R_RDYs is calculated using the following equation:

$$\text{BB Credits Lost by Lost R\_RDYs} = \left(2^{\text{BB\_SC\_N}} - \text{R\_RDYs Received}\right) \bmod 2^{\text{BB\_SC\_N}} \tag{6}$$

BB_SCs or BB_SCr Lost in Transit


The final condition that must be considered is when the BB_SCs or BB_SCr ordered set is lost in transit due
to a transmission error. In equations 5 and 6 in the previous sections, notice that modulo F is used. Modulo
is a mathematical method for wrapping numbers around after they reach a certain value, in this case, the
value of F. This way, when a BB_SC is lost, the calculation at the end of the second interval still produces
the correct value to recover all the BB credits. The result of modulo F for a number that is an exact multiple
of F is zero, which indicates no loss, even when a BB_SC has been lost. (Remember that a BB_SC is a
primitive signal and not an FC frame, so its own loss does not change the frame count.)
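A quick numeric check shows why the modulo makes the calculation tolerant of a missed checkpoint. The function below is a hypothetical helper that evaluates equations 5 and 6 on a raw count of frames or R_RDYs received since the last checkpoint that was actually seen.

```python
def lost_credits(received: int, bb_sc_n: int) -> int:
    """Equations 5 and 6: (F - received) mod F, with F = 2 ** BB_SC_N."""
    f = 2 ** bb_sc_n
    # Python's % returns a non-negative result for a positive modulus,
    # so the expression also works when `received` exceeds F.
    return (f - received) % f


# F = 32. One BB_SC was lost, so two intervals pass before the next check.
print(lost_credits(64, 5))  # 0: all 64 items arrived, nothing to recover
print(lost_credits(63, 5))  # 1: one item was lost across the two intervals
print(lost_credits(31, 5))  # 1: the normal single-interval case
```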

When an LR is required to reset BB credits and associated BB_SC counters, the event must be logged. The
event is sent to the Raslog with the following message:

Raslog Error Message:


2008/03/12-15:30:33:396033, [C2-5856], 430239/0,, ERROR, Neptune, S3,P33(73): Credit
Recovery: cause:0x2 link reset to recover credit(0x1)/frame(0x0), OID:0x43328809,
c2_buf.c, line: 2799, comp:swapper, ltime:2001/08/12-15:30:33:396030
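For monitoring purposes, a message of this shape could be picked apart with a regular expression. The sketch below is written only against the single sample above, so the pattern is an assumption about the general format; in particular, reading the credit and frame fields as the recovered R_RDY and frame counts is an interpretation, not something the message states.

```python
import re

SAMPLE = (
    "2008/03/12-15:30:33:396033, [C2-5856], 430239/0,, ERROR, Neptune, "
    "S3,P33(73): Credit Recovery: cause:0x2 link reset to recover "
    "credit(0x1)/frame(0x0), OID:0x43328809, c2_buf.c, line: 2799, "
    "comp:swapper, ltime:2001/08/12-15:30:33:396030"
)

# Pattern derived from the sample message only (hypothetical generalization).
PATTERN = re.compile(
    r"\[(?P<msg_id>[A-Z0-9]+-\d+)\].*?"
    r"cause:(?P<cause>0x[0-9a-fA-F]+) link reset to recover "
    r"credit\((?P<credit>0x[0-9a-fA-F]+)\)/frame\((?P<frame>0x[0-9a-fA-F]+)\)"
)


def parse_credit_recovery(line: str):
    """Return the message ID and hex fields as a dict, or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return {
        "msg_id": m.group("msg_id"),
        "cause": int(m.group("cause"), 16),
        "credit": int(m.group("credit"), 16),
        "frame": int(m.group("frame"), 16),
    }


print(parse_credit_recovery(SAMPLE))
# {'msg_id': 'C2-5856', 'cause': 2, 'credit': 1, 'frame': 0}
```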


Example
In the first example, illustrated in Figure 2, one frame is corrupted in transit such that the SOF could not be
identified. The sending side starts with 120 BB credits. Because the frame could not be identified as a
frame by the receiver, no R_RDY was sent in reply. Now the sender has only 119 usable BB credits until the
lost credit is recovered.

Figure 2. Lost frame results in usable BB credit reduction

The computed F value is 32 ($2^5$); therefore, after every 32 frames a BB_SCs is sent, and after every
32 R_RDYs a BB_SCr is sent. In this situation, a BB_SCs is sent and the frame counter on the receiving
side is not at zero. The receiver calculates that 1 (one) frame was lost using the following equation:

$$\text{Lost BB Credits due to Lost Frames} = \left(2^{\text{BB\_SC\_N}} - \text{Frames Received}\right) \bmod 2^{\text{BB\_SC\_N}} = (32 - 31) \bmod 32 = 1$$

The receiver initiates an LR to reset the BB credits and lost BB credit counter.

In the second example, illustrated in Figure 3, an R_RDY primitive signal is corrupted and cannot be
interpreted by the side that sent the original frame. The sending side starts with 120 BB credits. Because
the R_RDY does not reach its intended destination, 1 BB credit is lost until it is recovered. After the
receiver sends 32 R_RDYs, it sends a BB_SCr. When the sender gets the BB_SCr, it determines that an
R_RDY was lost because the R_RDY counter is not at zero. The following calculation determines how many
R_RDYs were lost:

$$\text{Lost BB Credits due to Lost R\_RDYs} = \left(2^{\text{BB\_SC\_N}} - \text{R\_RDYs Received}\right) \bmod 2^{\text{BB\_SC\_N}} = (32 - 31) \bmod 32 = 1$$

In this case, the Sender initiates an LR to reset the BB credits and the lost BB credit counter.

Figure 3. Lost R_RDY results in usable BB credit reduction


SUMMARY
The Fabric OS 6.1 release on the Brocade DCX Backbone and the Brocade 300, 5100, and 5300 8 Gbit/sec
switches supports an elegant Buffer-to-Buffer Credit Recovery mechanism (available on extended E_Ports
only). This hardware-enabled feature is backward compatible with older versions of FOS.

A scaling value (BB_SC_N) is exchanged during Exchange Link Parameters (ELP). BB_SCr and BB_SCs
primitives are used as periodic milestones. By maintaining counters and exchanging state information, each
side can keep track of the number of sent frames and received R_RDYs. Simple modulo calculations are
performed to determine whether frames or R_RDYs have been lost. Lost frames or R_RDYs in turn reduce the
available BB credits. The LR protocol is used to reset BB credits and BB_SC counters when indicated.

Brocade platforms provide extreme reliability for both FICON and Open Systems architectures—BB Credit
Recovery is another way Brocade meets these stringent requirements.

© 2008 Brocade Communications Systems, Inc. All Rights Reserved. 05/08 GA-TB-077-00

Brocade, Fabric OS, File Lifecycle Manager, MyView, and StorageX are registered trademarks and the Brocade B-wing
symbol, DCX, and SAN Health are trademarks of Brocade Communications Systems, Inc., in the United States and/or
in other countries. All other brands, products, or service names are or may be trademarks or service marks of, and
are used to identify, products or services of their respective owners.

Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied,
concerning any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the
right to make changes to this document at any time, without notice, and assumes no responsibility for its use. This
informational document describes features that may not be currently available. Contact a Brocade sales office for
information on feature and product availability. Export of technical data contained in this document may require an
export license from the United States government.
