SANs with Brocade Switches
Summary
version 1.5
Data Sharing
Resource Sharing
in its simplest form, a storage farm is shared among machines and access to the storage is
defined statically; ownership does not change often
resource partitioning can be done at the switch level (zoning), at the storage level (LUN
masking), at the HBA level (LUN masking), or through virtualization
Volume Sharing
sharing a LUN between hosts can cause corruption at the SCSI block level
software is needed to coordinate access; clustered hosts have this software built in
FC-0
specifies how light is transmitted
FC-1
encoding layer: 8b/10b encodes every 8 bits as a 10-bit character, which supports error checking
FC-0 and FC-1 together are considered the signaling interface
bits are encoded into two kinds of characters K and D
all primitives (LIP, SOF, OPN, CLS, IDLE) are delimited by K characters
D characters (data characters) are used to provide all other 8 bit values
FC-2
framing and flow control
relies on primitives encoded from the FC-1 layer followed by 3 data characters (D
characters)
primitives drive loop initialization and arbitration
FC-2 handles flow control by sending the correct primitives to initiate transfers.
Topologies
Point-to-point
Arbitrated Loop
Switched
Arbitrated Loop
all devices are connected in a loop and arbitrate for communication
each device receives an Arbitrated Loop Physical Address (AL_PA) of 8 bits as an address
up to 127 devices can attach
Switched Topology
F_Port fabric port (on the switch)
FL_Port fabric loop ports (on the switch) loop devices connect to these ports
Nodes are assigned a 24 bit address xxyyzz
o xx is the domain
o yy is the area (port on the switch)
o zz is the al_pa (00 for point to point)
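The xxyyzz layout above can be illustrated with a few shifts and masks. This is a minimal sketch; the function name split_fc_address and the sample address are made up for illustration.

```python
def split_fc_address(address: int) -> dict:
    """Split a 24-bit Fibre Channel address into its three 8-bit fields."""
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("a Fibre Channel address is 24 bits wide")
    return {
        "domain": (address >> 16) & 0xFF,  # xx: the switch domain
        "area": (address >> 8) & 0xFF,     # yy: the area (port on the switch)
        "al_pa": address & 0xFF,           # zz: the AL_PA (00 for point to point)
    }


# Hypothetical address used only to exercise the function.
print(split_fc_address(0x0104EF))
```

The well-known Name Server address FFFFFC does not follow the domain/area/AL_PA meaning, so a helper like this is only meaningful for node addresses.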
Name Server
FFFFFC
gets information from a port login (PLOGI) at registration and from subsequent registration
frames
a common request is Register FC-4 Types (RFT_ID), which registers which FC-4 (layer 4)
protocols the device can handle
Management Server
provides a single access point for managing the fabric as well as three services
o Fabric configuration server: information to discover topology
o Unzoned name server: access to name server for nodes within all zones
o Fabric zone server: allows management entities to control zone participation
Multimode Fiber
Managed Hub
perform more advanced services
provide frame switching between initiators and targets for throughput enhancement
can isolate an initiator that is having problems by bypassing its port, which keeps the rest of the loop working
typical capabilities
o LIP isolation (prevent LIP from affecting entire loop)
o automatic port bypass (if initiator is having problems)
o signal retiming
o loop zoning
o web interface
o telnet
o port-event logging
o snmp support
Class of Service
most support Class 3 and F
few support Class 2 or even Class 1
Buffer Credits
buffer credits per port are crucial because they determine how many frames can be sent before
the receiver must return a credit. This is especially critical for long distance applications
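To see why distance matters, a rough estimate can compare the round-trip time of the link with the time it takes to transmit one frame. The sketch below is not a vendor sizing rule. It assumes a 1.0625 Gbaud link (106.25 MB/s after 8b/10b), full-size 2148-byte frames, and about 5 microseconds of propagation delay per kilometer of fiber. All parameter names are illustrative.

```python
import math


def credits_for_distance(km, rate_mb_s=106.25, frame_bytes=2148, us_per_km=5.0):
    """Estimate the buffer credits needed to keep a link of the given length busy.

    Assumptions (not from the notes): the sender must cover one full round
    trip with frames in flight before the first credit comes back.
    """
    frame_time_us = frame_bytes / rate_mb_s   # bytes / (MB/s) gives microseconds
    round_trip_us = 2 * km * us_per_km        # out and back over the fiber
    return max(1, math.ceil(round_trip_us / frame_time_us))


for distance in (1, 10, 50, 100):
    print(distance, "km ->", credits_for_distance(distance), "credits")
```

Smaller frames take less time on the wire, so the same distance needs more credits when traffic is not made of full-size frames.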
Persistent Binding
(LUN mapping) is the mapping of a Fibre Channel device into an operating system at a
specific device location.
important for some applications that use the SCSI address to address a device (raw
volume accessed by Oracle)
Remote Boot
allows a host to boot off a volume on the SAN
a binding between a specific WWN and LUN must be configured for this to work
2400 (8 port)
1U full fabric switch
ports can be E, F, or FL ports (they start as G_Ports)
Extended Fabrics
allows switch to support the rigors of long distance I/O operations in such instances as
DWDM.
Fabric Watch
switch watches for faults and alerts based on thresholds
Performance Gathering
Windows NT: use the diskmon feature (from the resource kit) or perfmon
Solaris: use the iostat utility
[Figure: performance Groups 1 to 4 with Servers 1 and 2; SAN B with Groups 1 to 4 and Servers 3 and 4]
if you are able to make relatively small performance groups, your SAN will benefit greatly
from applying the principle of locality
The amount of locality determines the number of ports needed for ISLs: high locality
= low ISL count, low locality = high ISL count
Complex considerations
if you have distance considerations, add two ISLs per switch
if you have high performance and little locality, add two ISLs per switch
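The relationship between locality and ISL count can be sketched as a small calculation. The base formula below is an assumed sizing model, not one given in these notes: it treats the non-local share of device ports as traffic that must cross ISLs and lets a chosen number of device ports share each ISL. Only the two "add two ISLs per switch" adjustments come from the notes. The function and parameter names are hypothetical.

```python
import math


def isls_per_switch(device_ports, locality, ports_per_isl,
                    long_distance=False, high_perf_low_locality=False):
    """Estimate ISLs for one switch.

    locality is the fraction (0.0 to 1.0) of traffic that stays on the switch.
    ports_per_isl is an assumed sharing ratio of device ports per ISL.
    """
    if not 0.0 <= locality <= 1.0:
        raise ValueError("locality must be between 0.0 and 1.0")
    non_local_ports = device_ports * (1.0 - locality)
    isls = math.ceil(non_local_ports / ports_per_isl)
    if long_distance:
        isls += 2   # distance considerations: add two ISLs per switch
    if high_perf_low_locality:
        isls += 2   # high performance, little locality: add two ISLs per switch
    return isls


print(isls_per_switch(12, 0.75, 3))                      # high locality
print(isls_per_switch(12, 0.0, 3))                       # no locality
print(isls_per_switch(12, 0.75, 3, long_distance=True))  # with distance
```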
[Figure: disk array connected to Fabric A and Fabric B]
Storage Consolidation
A major advantage of a SAN is that storage can come online without having to take down the
host and reconfigure the SCSI bus.
When storage is directly attached to servers, it is difficult to reallocate space
when storage is put on a SAN, it can be brought online and allocated without rebooting (W2K, Solaris)
Problem
Windows NT will assume that it owns any storage it encounters and write a signature to the
disk. If that storage is owned by a UNIX host, the data on it
could become corrupt.
use Brocade zoning to ensure that hosts don't step on other hosts' storage
to truly add storage on the fly without taking the file system offline, another layer of file
system functionality, such as Veritas Volume Manager, is needed
16e x 4c x 1i
16 edge switches, 4 core switches, 1 ISL (per edge)
Scalability
there are two metrics for measuring scalability: the size that the topology can scale to (in
terms of port count and switch count) and the ease of performing this process.
Cascade Topology
a line of switches in which the end switches are not connected to each other
inexpensive and easy to deploy, but limited scalability
Best for situations where most if not all traffic can be localized onto individual switches,
and the ISLs are used primarily for management traffic.
Limit of Scalability: 114 ports / 8 switches
Ring topology
like the cascaded fabric but with the ends connected
superior reliability since traffic can get around any one ISL failure
Best when localization is high
Good when implementing SAN over MAN or WAN where ring topology is already
dictated.
good for starting small and staying small
ISLs used more for management than data
Limit of Scalability: 112 ports / 8 switches
NOTE: This topology does NOT replace a redundant fabric SAN. For true resilience two of
these fabrics are needed running in parallel. Therefore, maintenance on this fabric will not cause
downtime since the other fabric will remain operational.
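The two scalability limits quoted for the cascade and ring topologies can be reproduced with simple port arithmetic. This sketch assumes 16-port switches and a single ISL between neighboring switches. Neither assumption is stated in the notes, but together they yield the same 114 and 112 figures.

```python
def cascade_device_ports(switches, ports_per_switch=16):
    """Ports left for devices in a cascade: n - 1 ISLs, each using 2 ports."""
    return switches * ports_per_switch - 2 * (switches - 1)


def ring_device_ports(switches, ports_per_switch=16):
    """Ports left for devices in a ring: n ISLs, each using 2 ports."""
    return switches * ports_per_switch - 2 * switches


print(cascade_device_ports(8))  # 114
print(ring_device_ports(8))     # 112
```

Closing the ring costs only two device ports compared with the cascade, which is the price of surviving any single ISL failure.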
Now physically connect the cables and issue the portEnable command on the first port
(port 0). The fabric will reconfigure itself. Issue the fabricshow command to ensure
that the switch is successfully integrated into the fabric.
Step one: Physically install the switch and configure elements including
switch name, Domain ID, and IP address. Make sure there is no zoning information in
the switch. Disable the entire switch with the switchDisable command. Now telnet
into one of the existing core switches and issue the switchDisable command. Ensure
that all traffic is successfully crossing the remaining core switch (assuming two core
switches). Remove the old core switch and cable the new switch to all existing edge
switches.
Step two: Issue the switchEnable command on the new core switch and allow the
fabric to reconfigure itself. Issue the fabricShow command to ensure that the new
switch is successfully integrated into the fabric.
Step three: Repeat the above steps on all core switches that must be upgraded.
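The three steps above can be written down as an ordered checklist, which makes it harder to skip the traffic check before an old core switch is removed. This is only a sketch: it prints a plan and does not talk to any switch. The function name and the switch names are hypothetical. The only commands used are the ones named in the steps (switchDisable, switchEnable, fabricShow).

```python
def core_upgrade_plan(core_pairs):
    """Build an ordered checklist for replacing core switches one at a time.

    core_pairs is a list of (old_core, new_core) switch names.
    """
    plan = []
    for old_core, new_core in core_pairs:
        plan += [
            (new_core, "install; set switch name, Domain ID, IP address; verify no zoning"),
            (new_core, "switchDisable"),
            (old_core, "switchDisable"),
            ("fabric", "verify all traffic crosses the remaining core switch"),
            (old_core, "remove; cable the new core to all existing edge switches"),
            (new_core, "switchEnable"),
            (new_core, "fabricShow"),
        ]
    return plan


# Hypothetical switch names, for illustration only.
for target, action in core_upgrade_plan([("core1", "core1-new"), ("core2", "core2-new")]):
    print(f"{target}: {action}")
```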
[Figure: tiered SAN with servers in the Host Tier and disk arrays in the Storage Tier, connected through SAN A and SAN B]
The benefit of tiered SANs is that they do not need ISLs or bandwidth
optimization between switches on the same tier.
You can also have three tiers, with the third being a core tier. Again, ISL
optimization between switches at the core tier is not important.
Tiered SANs aid in administration: when you need to add more hosts, you just
add a host switch; when you need more storage, you just add another storage
switch.
Exploiting Locality
You can attain the best performance in any network by localizing. Localizing means
putting ports that need to communicate closer together.
The more locality within a SAN fabric, the fewer ISLs are needed for data communication.
While smaller SANs will not benefit as greatly from the use of locality as large SANs will,
all SANs will benefit somewhat. However, for low bandwidth applications, the
management benefits of organizing your edge devices in a tiered fashion are significant,
and zero percent locality can be acceptable.
Port LED                        Description
no light                        no light or signal; no GBIC module or cable
steady yellow                   receiving light but not yet online
slow yellow (flashes 2 secs)    disabled due to switchDisable or portDisable
fast yellow (fast flashes)      error, fault with port
steady green                    online (connected with external device over cable)
slow green (flashes 2 secs)     online, but segmented
fast green (fast flashes)       internal loopback
flickering green                online and frames being forwarded
Switch Diagnostics
diagHelp             list of diagnostic commands
ramTest              system DRAM diagnostic
portRegTest          port register diagnostic
centralMemoryTest    central memory diagnostic
cmiTest              CMI bus connection diagnostic
camTest              QuickLoop CAM diagnostic
portLoopbackTest     port internal loopback diagnostic
sramRetentionTest    SRAM data retention diagnostic
cmemRetentionTest    central memory data retention diagnostic
crossPortTest        cross-connected port diagnostic
spinSilk             cross-connected line-speed exerciser
diagClearError       clear diag error on specified port
diagDisablePost      disable POST on reboot
diagEnablePost       enable POST on reboot
setGbicMode          enable tests only on ports with GBICs
setSplbMode          enable 0=dual, 1=single port LB mode
supportShow          print version, error, portLog, etc.
parityCheck          DRAM parity, 0=disabled 1=enabled
spinFab              ISL link diagnostic
loopPortTest         L_Port cable loopback diagnostic
Helpful commands
generally, help <command> will give a man page for the command
errShow command
holds up to 64 logged errors; the list is cleared upon reboot
also logs environmental errors
consider using the syslog facilities of the switch for persistent storage of errors:
syslogdIpAdd, syslogdIpRemove, and syslogdIpShow
switchShow command
when troubleshooting issues that involve fabric services or the switch's ability to participate in
the fabric, the important parts of switchShow are:
o switchState (online, offline, testing, faulty)
o switchRole (principal, subordinate, disabled)
o switchDomain
if running in a fabric switchState should be online
there should only be one principal switch in a fabric
there should be no duplicated switchDomains
o 1000 series 0-31
o 2000 series 1-239
switchID is the 24 bit address of the switch in the fabric
switchType which model the switch is
o 1: 1000 series
o 2: 2800
o 3: 2400
o 4: 20x0
o 5: 22x0
the terms upstream and downstream indicate a switch's position relative to the principal
switch
switchName: brocade2b
switchType: 2.4
switchState: Online
switchRole: Principal
switchDomain: 1
switchId: fffc01
switchWwn: 10:00:00:60:69:10:5e:a7
port 0: id No_Light
port 1: id No_Light
port 2: id No_Light
port 3: id No_Light
port 4: id No_Light
port 5: id No_Light
port 6: id No_Light
port 7: id No_Light
port 8: id No_Light
port 9: id No_Light
port 10: id No_Light
port 11: -- No_Module
port 12: -- No_Module
port 13: -- No_Module
port 14: id No_Light
port 15: id No_Light
brocade2b:admin>
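When many switches need checking, the fields called out above (switchState, switchRole, switchDomain) can be pulled out of captured switchShow text. The parser below is a sketch written against the sample output shown above and may not cover other firmware versions. The dictionary keys "media" and "state" are names chosen for illustration. The check that switchId ends in the domain ID is an observation from the sample (fffc01 with domain 1).

```python
import re

# Abbreviated copy of the sample switchShow output.
SAMPLE = """\
switchName: brocade2b
switchType: 2.4
switchState: Online
switchRole: Principal
switchDomain: 1
switchId: fffc01
switchWwn: 10:00:00:60:69:10:5e:a7
port 0: id No_Light
port 11: -- No_Module
brocade2b:admin>
"""

PORT_LINE = re.compile(r"port\s+(\d+):\s+(\S+)\s+(\S+)")


def parse_switchshow(text):
    """Return (header fields, per-port info) from switchShow text."""
    header, ports = {}, {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.endswith(">"):   # skip blanks and the CLI prompt
            continue
        port = PORT_LINE.match(line)
        if port:
            ports[int(port.group(1))] = {"media": port.group(2), "state": port.group(3)}
        elif ":" in line:
            key, value = line.split(":", 1)  # split once so the WWN stays whole
            header[key.strip()] = value.strip()
    return header, ports


def sanity_problems(header):
    """List deviations from what the notes expect of a switch in a fabric."""
    problems = []
    if header.get("switchState") != "Online":
        problems.append("switchState is not Online")
    if (int(header["switchId"], 16) & 0xFF) != int(header["switchDomain"]):
        problems.append("switchId does not end in the domain ID")
    return problems


header, ports = parse_switchshow(SAMPLE)
print(header["switchRole"], ports[11]["state"], sanity_problems(header))
```

Checking for exactly one principal switch or for duplicate domains would need the parsed headers from every switch in the fabric, collected into one list.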
topologyShow command
displays the fabric topology as seen by the local switch
lists all the switches in the fabric and all the possible paths to reach those switches
Port addressing:
XX 1Y ZZ
XX is value between 0x1 and 0xef indicates the domain id of switch
1 will always exist (native mode) with 2000 series switches
Y is the port number
ZZ is the AL_PA for a loop device or 00 for F_Port
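The XX 1Y ZZ layout can be checked by building an address from its parts. The sketch below follows the description above literally, with a fixed 1 in the high nibble of the middle byte and the port number in the low nibble, so it only covers ports 0 to 15. The function name is hypothetical.

```python
def port_address(domain, port, al_pa=0x00):
    """Build a 24-bit address in XX 1Y ZZ form for a 2000 series switch port."""
    if not 0x01 <= domain <= 0xEF:
        raise ValueError("domain ID must be between 0x1 and 0xef")
    if not 0x0 <= port <= 0xF:
        raise ValueError("port number must fit in one hex digit")
    if not 0x00 <= al_pa <= 0xFF:
        raise ValueError("AL_PA must fit in one byte")
    area = 0x10 | port                    # the leading 1 is always present
    return (domain << 16) | (area << 8) | al_pa


print(format(port_address(1, 4), "06x"))         # F_Port: ZZ is 00
print(format(port_address(1, 4, 0xEF), "06x"))   # loop device with an AL_PA
```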
Timeouts
if the SAN experiences a reconfiguration, a host that is online might time out. Most
hosts will retry, but verify with the specific host.
if there are genuine timeouts, you can increase the Resource Allocation Time Out Value
(R_A_TOV) or the Error-Detect Time Out Value (E_D_TOV)
Zone Conflict
A fabric will segment when zoning configuration is not consistent. In most cases it is
easier to clear the configuration of one switch (most likely the new one) and absorb the
existing configuration.
o Multiple zoning configurations enabled will create a zoning conflict: only one
configuration may be active at one time.
o Zone definition type conflict happens when a configuration is defined but the
definition types (alias, zone) are in conflict. Example: cfg1 has red as an alias
where cfg2 has red as a zone
o Zone definition content conflict happens when a configuration is defined and there are no
type conflicts, but the content of the configuration is in conflict (different port definitions
in an alias, for example)
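The three kinds of conflict can be illustrated by comparing the zoning data of two switches before they are joined. This sketch assumes the data has already been exported into plain Python dictionaries; it does not read anything from a switch. The data layout and function name are hypothetical.

```python
def zoning_conflicts(enabled_a, enabled_b, defs_a, defs_b):
    """Classify conflicts between two switches' zoning data.

    enabled_a / enabled_b: name of the enabled configuration, or None.
    defs_a / defs_b: {name: (kind, members)} where kind is "alias" or "zone".
    """
    conflicts = []
    if enabled_a and enabled_b and enabled_a != enabled_b:
        conflicts.append(("enabled", f"{enabled_a} vs {enabled_b}"))
    for name in sorted(defs_a.keys() & defs_b.keys()):
        kind_a, members_a = defs_a[name]
        kind_b, members_b = defs_b[name]
        if kind_a != kind_b:
            conflicts.append(("type", name))      # same name, alias vs zone
        elif set(members_a) != set(members_b):
            conflicts.append(("content", name))   # same type, different members
    return conflicts


# Illustrative data: "red" differs in type, "blue" differs in content.
switch_a = {"red": ("alias", {"1,2"}), "blue": ("alias", {"1,3"})}
switch_b = {"red": ("zone", {"1,2"}), "blue": ("alias", {"1,4"})}
print(zoning_conflicts("cfg1", "cfg2", switch_a, switch_b))
```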
These can be remedied by:
o cfgClear <configuration you want to delete> followed by a
o cfgDisable <active configuration you want to delete>
keep in mind that all elements in the configure command must be identical (except
domain ID of course) otherwise fabric will segment.
if there is a domain conflict:
o switchDisable followed by a switchEnable to gain a new domain ID
LIP count
a port (on a loop) with a larger Lip_in than a Lip_out count indicates that the
associated device is guilty of the LIP activity.
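The Lip_in versus Lip_out rule is easy to apply once the counters have been collected per port. The sketch below assumes the counts have already been gathered into a dictionary, and the sample numbers are made up.

```python
def suspect_lip_ports(counters):
    """Return loop ports whose Lip_in count exceeds their Lip_out count.

    counters maps a port number to a (lip_in, lip_out) pair.
    """
    return sorted(port for port, (lip_in, lip_out) in counters.items()
                  if lip_in > lip_out)


# Made-up counters: only port 1 receives more LIPs than it sends out.
print(suspect_lip_ports({0: (3, 3), 1: (12, 2), 2: (0, 5)}))
```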