
Broadening EVA Repair Techniques: maintenance and repair techniques, excluding non-released systems.

This paper is written by Bert Martens, with significant input from EVA engineering (both HW and SW), FC Disk engineering, and JGO support engineering. Some information contained in these notes relates to new functionality included in a V2002 patch kit. The intent is to provide troubleshooting information as quickly as possible.

It is time to pull together the ideas and rules for servicing EVA systems. I have a few ideas that may help bring the field and support engineers up to speed in the shortest amount of time. I think the traditional model of service is causing problems with the EVA system. The storage systems we have used over the last 20+ years are very simple in data flow and structure, so their error paths and failure modes are relatively simple. The EVA has changed that world, and service engineers and support groups must now change to meet the new paradigm. With the EVA, it is necessary to collect and understand more event data before an analysis is feasible. The EVA provides a lot of data, and proper analysis is not realistic with only one event or event type. In most cases it is necessary to review all entries for the time frame associated with the events, and then view the events in the context of all errors and events.

The first new idea is that the EVA is a system, and it must be viewed as both HW and SW. Since the EVA is tightly coupled between HW and SW operation, failures must be viewed in both contexts. HW failures can have an immediate and significant impact on the SW, and that can result in a significant impact on user access and user data. This point can not be stressed enough: many times people have tried to repair or service the EVA without fully understanding the impact a simple activity can have.

Bert Martens V0.3

Page 1 HP Confidential

10/6/2003

Hardware
  Hardware box
    Host port enable
    Controller device ports
  Disk Enclosure
    Disk Enclosure Temperature warning and shutdown operation
    I/O module
    EMU
Software Section
  Boot
  HSV110 Queue depth
  VRAID0, VRAID5 and VRAID1
  Meltdown
  RSS introduction
    Another RSS description
    And here is another explanation on the RSS topic
  Disk Error Handling
    Disk Events
  Loops
    Quiesce Time for Failed Drive Loop
    Override of Failed Drive Loop Quiesce
    Hard Addressing of Drives
  Troubleshooting tips
  Events
    Correlating data from event logs
  SSSU
  Link Error counters
  SWMA Event log data
    SWMA files
  Serial line use
  Field Service Page Commands
  What new terms mean


Hardware
The hardware is divided into several sections: first the hardware box, second the host ports, next the disk enclosure (with I/O modules and EMUs), then the disk drives, and last the device loops on the back end of the controller.

Hardware box
The hardware box has several areas that will impact operation. These include the simple event reports, like power supply, fans, and the LCD display.

The batteries provide backup power to the cache to maintain user data for up to 96 hours. While the actual hold up time (HUT) can exceed 96 hours, 96 hours is the claimed time. The batteries are 2 separate units, each containing 3 cells. The batteries are charged and status is reported one cell at a time, and then a status report for the unit is logged. This will produce up to 8 entries in the log at boot time. During operation, the batteries are tested one cell at a time. The cell is discharged and then charged to verify that each battery unit can maintain a charge. The cell test takes about 10.5 hours to discharge and charge, and there are about 30 days between each cell test. The total sequence can take 24 weeks to complete. The timer for battery load testing is reset each time the controllers are booted.

If one battery fails, the entire controller will stop presenting units and all LUNs will fail over to the other controller. This is because it is difficult to guarantee the hold up time, and it is safer to use the remaining controller. Since the controllers operate in write-back cache mode, customer data can not be assured if the batteries are not fully functional. The cache operates in write-back mode only (non-failure mode), and the user can select mirrored or non-mirrored. Non-mirrored may be quicker, but it reduces availability of user data on failures. If the batteries are installed but not fully charged, the cache will operate in write-through mode. If the batteries are not installed, LUN access will be disabled on this controller.

There is another important point on LUN access and failover. If a user selects non-mirrored cache and a controller fails, the LUNs can not fail over to the other controller. Since the cached data is not copied to both controllers' caches, the LUN would lose data.
If a user reports that some LUNs are not accessible and one controller is shut down, the cause may be the cache setting.

Memory errors can be misleading. There is a case where the controller will log a memory parity error that is actually an indicator of a low memory access error. It is important to fully read the description of the controller event. The event for a low memory access is identical to a memory parity error; only the cause is different. A low memory access error can be induced by software, where a real memory parity error would be caused by a hardware failure. If both controllers fail with the same error, a software cause should be investigated. There are additional details on memory errors included in the termination event logs. While the log may indicate a true memory error, the failure could still be software induced.
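The battery cell-test cadence described earlier (6 cells, about 10.5 hours per test, roughly a month between tests) can be sanity-checked with quick arithmetic. The 28-day spacing below is an assumption chosen to show how the quoted 24-week figure arises; the actual scheduler internals are not documented here.

```python
# Sketch of the battery cell-test cadence; figures are from the text above.
HOURS_PER_CELL_TEST = 10.5       # discharge + recharge of one cell
CELLS = 2 * 3                    # 2 battery units x 3 cells each
DAYS_BETWEEN_TESTS = 28          # "about 30 days"; ~28 yields the quoted total

def full_cycle_weeks():
    """Approximate time to exercise every cell once."""
    days = CELLS * (DAYS_BETWEEN_TESTS + HOURS_PER_CELL_TEST / 24)
    return days / 7

print(round(full_cycle_weeks()))   # roughly the 24 weeks quoted above
```

Note that because the test timer resets on every controller boot, a system that is rebooted frequently may never complete this full cycle.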


Host port and device port transceivers (2Gb) are hardware identical but operationally different. The mirror port is a point-to-point connection, the host port is switched fabric, and the device ports operate as FC-AL. The host ports are connected to fabric switches that can provide significant error and operational data on link stability. Any time a host port is suspect, the switch should be used to collect data. Remember this is not the full picture, only one side: it is possible that the switch is operating without error but the controller port is still faulty. The controller event log provides data on host port operation; these entries, combined with switch data, should be adequate to isolate faulty components. During boot it is normal to see FP1 and FP2 log excessive link errors on each port. The cause of these FP event entries is related to the 2Gb rate negotiations.

Host port enable
The host ports will be enabled after the storage system finds a valid storage system configuration and uses the WWN from the metadata. If no valid storage system is found in the metadata, the system will prompt for a WWN at the LCD display. If the storage system finds a partial configuration, it may request confirmation before it proceeds. The controller will prompt for user verification on multiple storage systems, unprotected metadata, and WWN input. The LED for each host port indicates that the controller has logged in to the switch and at least one host has logged in to the controller. If the host that logged in to the controller stops operating, the LED continues to be illuminated (green). The host port login can be validated from the switch, and the Emulex utility can be used to validate that the controller is visible from the SWMA. If you can see the EVA system from the SWMA, it is logged in to the switch.

Controller device ports
The device ports provide connection to the physical devices that provide the capacity used in the EVA storage system. The device ports are FC-AL, and each loop pair has 2 controllers and up to 9 disk enclosure shelves. Since it is a loop, there is no start and no end; all devices reside on the loop and the data passes through each device.

Device ports are more complex to isolate. The port can operate in several states providing some indication of stability. The port LED will be green when it is receiving light from another port, amber when no light is detected, and off when the port is disabled. The device port LED is off when the port is in bypass mode, where the port is just passing frames and characters. This is done when the controller detects an unstable loop. On V1 the port will stop responding to frames for about 10 minutes; on V2 the controller device ports will be disabled until the port is stable. I have noticed that if you resolve the loop issue, the controller will enable the port (LED) and it will start to operate. During the time that the port LED is off, the controller will not log any errors or events on the port. This can cause some confusion, since it appears that a port stopped logging errors; but the cause is that the port stopped monitoring the loop, so it can not log errors. Since it is possible this controller is causing the stability problem, this allows the loop to operate from the other controller.

VCS V2001 code will place the port in bypass mode for 5 minutes and then check the status of the port. If the port is still unstable, it will wait for another 5 minutes. There is some special code to isolate failures in configurations that contain loop switches on the backend; this code will try to recover the loop in less than 5 minutes. Starting in V2002, there are new rules for controller device port operation; see the section below on Quiesce Time for Failed Drive Loop. It is important to review the controller event logs and ensure device port operation and VCS versions are understood.

A new pattern is being added to the device port LEDs. There are now 4 LED patterns: Off, Green, Yellow, and On/Off. Green indicates the port is operating normally; Yellow indicates the port does not detect any laser light. If the port LED is off, it indicates the port did not complete loop initialization and is disabled. If the port is alternating between on and off, it indicates the port completed initialization and is set to failed. There is a command in V2002 that will allow a port to be enabled when it is set to failed. Press the right button on the controller OCP to enter the port menu, select enable, then select the port that you would like to enable. When that port is selected, it will require that you confirm the selection; when you press the left button, it will enable that port on the loop.
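The V2001 quiesce behavior just described (place the port in bypass, then re-check every 5 minutes until the loop is stable) can be sketched as follows. The function arguments are hypothetical stand-ins for controller internals, not a real VCS interface.

```python
import time

BYPASS_INTERVAL_S = 5 * 60   # V2001 re-check period quoted above

def quiesce_unstable_port(port_is_stable, enable_port, wait=time.sleep):
    """Keep an unstable device port in bypass, re-checking every 5 minutes.
    While bypassed, the port passes frames but logs no errors, because it
    is not monitoring the loop."""
    while not port_is_stable():
        wait(BYPASS_INTERVAL_S)   # port stays in bypass for another interval
    enable_port()                 # loop stable again: re-enable, LED green
```

This also illustrates the troubleshooting point above: during the bypass intervals the port produces no log entries at all, which can look like a port that "stopped logging errors."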


Disk Enclosure
The disk enclosure provides power and signal for up to 14 2Gb dual-port disk drives. The drives are installed from the left side, with drive 1 on the left and drive 14 on the right. The signal is always routed from drive 1 to drive 14. The enclosure will operate with one power supply operational and/or one blower operational (both must be installed for air flow reasons). The enclosure will power off if a power supply is removed for more than 7 minutes. The FC-AL loops are separate, with the I/O module on the right providing loop A and the I/O module on the left providing loop B signals (rear view). The disk enclosure also includes the I2C bus, which provides communication between the EMU and the power supplies and I/O modules, and supports the ESI bus, which provides communication between the EMU and the disk drives. The EMU collects the FC-AL link rate from the I/O modules on the I2C bus and from the disk drives on the ESI bus. If there is a difference, the EMU will keep the disk off the loop using the bypass circuit in the I/O module.

Loop IDs are assigned on a shelf and loop basis. It is expected that AL-PAs in an enclosure are close to other AL-PAs in the same enclosure. The controller doesn't assign AL-PAs, so there is no correlation. Assignment is done by soft addressing in V1 and V2001. In V2002 and later builds, AL-PA addressing is done with hard addressing, using the EMU to assign a base address for the loop. A device becomes the shelf master. The list of taken AL-PAs is passed around the loop in signal order, and a device will pick an available AL-PA (usually the highest). There are several passes around the loop. The controllers use "fabric addressing" and will select AL-PA 01 and 02. Then, if drives already had AL-PAs, they reclaim them on the next pass. Then the rest of the drives pick AL-PAs. Starting with V2002 it is not supported to install disks in shelf 9, bays 13 and 14, on each loop.
These bays are assigned the AL-PA addresses the controllers will select.

Disk Enclosure Temperature warning and shutdown operation
The EMU gives warnings that should show up in EVA logs and event manager logs whenever a high temperature limit is exceeded. These levels are at least 4 degrees C lower than any failure temperature, so there should be adequate warning before the shelf gets close to shutting down. The EMU has multiple copies of three different classes of temperature sensors: drives, power supplies, and the EMU's local sensor. If a majority of the sensors in two of the three classes report temperatures exceeding the critical limit (higher than the warnings mentioned), the EMU starts a seven-minute shutdown timer and informs the EVA. If the EVA is down (crashed), the shelf will power itself off in seven minutes. If the controller is working, then when any one shelf signals that it will power off, the controller should begin a shutdown of the entire system to prevent the "single shelf went down" type of problem. I do not know about flushing cache or other things that might delay this, but once the controller starts the power-down sequence, the controllers power down almost instantly and all of the storage shelves will power off between 5 and 90 seconds later. The system should never let a single shelf power off with V2 code in place.

The trigger points for the drives are based on vendor specifications and are designed to trip before the point beyond which the vendor can no longer guarantee that a short exposure causes no damage to the drive. MTBF figures may be shortened at these levels. The trigger points for the power supplies and the EMU are based on measured temperature rises within the shelves running under load in thermal chambers. These are then de-rated somewhat to allow for part variations and measurement errors.
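The shutdown rule above (a majority of sensors in at least two of the three classes over their critical limit) can be sketched as follows. All names, and the example temperatures and limits, are made up for illustration; this is not the EMU firmware.

```python
def shelf_should_shut_down(readings, critical_limits):
    """Return True when a majority of the sensors in at least two of the
    three classes (drives, power supplies, EMU) exceed their critical
    limit -- the condition that starts the 7-minute shutdown timer."""
    classes_over = 0
    for cls, temps in readings.items():
        limit = critical_limits[cls]
        over = sum(1 for t in temps if t > limit)
        if over > len(temps) / 2:        # majority of this class's sensors
            classes_over += 1
    return classes_over >= 2             # 2 of 3 classes over -> shut down

# Illustrative readings: drives and PSUs both have a majority over limit.
readings = {"drives": [58, 61, 62], "psus": [70, 72], "emu": [40]}
limits   = {"drives": 60, "psus": 68, "emu": 55}
print(shelf_should_shut_down(readings, limits))   # True
```

The two-of-three-classes requirement means a single failed or miscalibrated sensor class can not power off a shelf on its own.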

I/O module
The I/O modules appear to be identical, but are unique and can not be interchanged. The A module is installed on the right side of the rear, and the B module on the left side. The logic cards of the A and B modules are identical; the difference is the carrier. The I/O module has 2 LC ports and 3 LEDs. The LC connectors are both in and out ports. There is no indication of loop direction, and it can change if there is a change to a controller or I/O module. Since the I/O module can support data flow from either port to the disk drives, it is not possible to determine the current direction of loop data flow. The data may enter the top port, be routed to the disk drives, and exit the bottom port; or it may enter the bottom port, be routed to the disk drives, and exit the top port. If the data is routed from the bottom port to the disk drives and exits the top port, the return path is from the top port directly to the bottom port.

The module LEDs provide power indication (center LED) and port status (top and bottom). The port indicators show whether a signal is being received on the port. If there is no valid light, the LED is off; if there is light but no valid frames, the I/O module will provide fill characters to ensure the loop remains operational. The I/O module also contains the loop bypass chips. There are 4 bypass chips per I/O module: two for the devices and two for the optical ports.

If you notice the I/O module power LEDs (center LED) on both I/O modules flashing for about 1 minute after power on, it indicates that the I/O module can not communicate with the EMU (EMU busy, missing, or a bus problem). The I/O module will wait about 4 seconds after power on for the EMU to provide a heartbeat. If the heartbeat is not seen, the I/O module will use a default set of rules: it will spin up the disk drives and enable all disks on the loops.


EMU
The EMU provides monitoring of the elements in the disk enclosure and also provides active operation for disk spin-up and loop bypass. The ESI bus is used for all communication between the EMU and other components installed in the enclosure. The base sequence is that the EMU will try to talk with the I/O modules and power supplies. The EMU will obtain the link rate supplied by the I/O modules, send spin-up commands to all disks, and request the link rate from each disk drive. If the disk drive and I/O module both report the same link rate (should be 2Gb), the disks are enabled on the loop. If the I/O modules are not communicating with the EMU, the EMU will use a default link rate to determine if the disks are at the correct link rate. The default used in V1 EMU code is 1Gb, and the disks are all 2Gb; this will prevent the disks from being enabled on the loops by the EMU. However, the I/O module will enable the disks to the loops if it detects no EMU heartbeat. Since the EMU will only use the default if it can not communicate with the I/O module, there is a very short timeframe in which the disks are in bypass mode. The default link rate used in the EMU for V2 is 2Gb.

The firmware will automatically upgrade any EMU that is detected with a different FW revision or an incomplete code load. The EMU image is on the card and always available, so the firmware load can occur at any time, not just when the controller firmware code load is done. The check is done every time a map occurs. A typical scenario is that an EMU that requires replacement is missing or dead. There is no indication that a new EMU is inserted into the shelf. Whenever there are unlocated devices (like the above missing/dead EMU), the controller will do a remap every 8 minutes or so, hoping a live EMU can be found. If a technician inserts a new working EMU (version 42 is currently on replacement EMUs), the controller will remap the loop with the missing drives every 8 minutes (or sooner if a LIP occurs).
When the remap happens, the new EMU will be recognized and marked as down-rev. Once mapping completes, the EMU will be code loaded automatically. The EMU code load will also flash the LED on the front of the disk enclosure while the code is being loaded. This allows a person to stay in front of the system and have a visual indicator of the EMU code load.


Software Section
The software section will cover multiple areas of VCS operation. It will not be a full factual description of the VCS operation. The intent is to provide a service level understanding of the system operation, not a design level discussion.

Boot
The boot process starts at power up, then controller HW self-tests, then the VCS starts running. The VCS startup steps are: diagnostics complete, FC services bring the loops up, and discovery of devices; then initialization of all known devices and reading of ID blocks. The system primary controller is selected (if both are booted). If the other controller does not respond to primary selection within 30 seconds, the primary will force a crash of the other controller.

The primary controller will then find the quorum disks. If more than one quorum disk matches the system ID held in cache, the primary controller will use those disks for the system state. This applies only if the controllers are rebooted without a power cycle; a power cycle will clear any old system state data in cache. If only one quorum disk is found that matches the system state from cache, the OCP will display "unprotected metadata, continue?" If the user answers yes, the primary controller will resync and use the one quorum disk as the quorum set. On power up and boot of the controllers, the quorum search will look for one set of quorum disks, and if it finds one set, it will use those disks as the quorum set.

If the system finds more than one quorum disk and they are different, it gets more complex. The system will display "multiple systems detected, use prevalent?" on the OCP. If the user selects yes, it will use the quorum set with the most members. Note: if the system has intrinsic replacement mode set, all devices from the other quorum disks will be added to the selected system at boot. If the replacement mode is extrinsic, it would be possible to remove the unused quorum disks and they would still be valid for use. The system will stop on boot until all quorum disks match, so it is necessary to either remove the other disks or add them to the booted system. You can add the disks to existing disk groups or create a new disk group.
If no quorum disks are found, no system has been detected, and it is handled as if the system has not been initialized. The system will then check all devices on the loops for disk groups and virtual disks. It will verify that all disks for each disk group exist. After all disks are found, the "scanning for disks" message is replaced by "activating stsys" while the system makes sure all quorum members are consistent (enough members) and completes the creation of data structures. When the storage system has completed all data structures, verified the quorum disks, and checked all disk group and virtual disk metadata and structures, it will change the OCP display, and the WWN and storage system name will be displayed.
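The quorum-set decisions in the boot sequence above can be summarized in a short sketch. `found_sets` and `ask_user` are illustrative names standing in for the quorum scan results and the OCP prompt; this is not the real VCS logic, just the decision tree as described.

```python
def choose_quorum_set(found_sets, ask_user):
    """Pick a quorum set at boot, per the behavior described above.
    found_sets: one list of quorum disks per distinct storage system found.
    ask_user:   stand-in for the OCP yes/no prompt."""
    if not found_sets:
        return None                      # no system: treated as uninitialized
    if len(found_sets) == 1:
        return found_sets[0]             # one set found: use it
    # Multiple different systems found on the loops.
    if ask_user("multiple systems detected, use prevalent?"):
        return max(found_sets, key=len)  # set with the most members wins
    return None                          # boot stops until the disks match
```

For example, with two candidate sets of 1 and 2 disks and a "yes" at the prompt, the 2-disk set is selected as the prevalent one.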


HSV110 Queue depth
Each EVA subsystem has 2 controller modules, each controller module has two host ports, and each host port can accept up to 2000 simultaneous I/O requests. So each EVA subsystem can have 8000 commands queued to it. Each controller module has its own set of LUNs, so no one LUN can have more than 4000 commands queued to it. No one host is allowed to have more than 1500 commands outstanding. If multiple hosts try to exceed 4000 commands to one controller module, a throttling mechanism kicks in to balance the allowed commands queued per host. We want to be fair to all host connections, and we prevent one host from using an inequitable number of queue entries by using a queue-full mechanism.

VRAID0, VRAID5 and VRAID1
VRAID0 is striping with no redundancy: lose one disk in the disk group and all VRAID0 data is lost. VRAID5 is striped parity; it uses a 4+1 data/parity model. The data and parity are maintained within an RSS, which helps reduce the risk of data loss. VRAID1 is mirrored data: the data is maintained on 2 separate disks kept in a mirrored pair relationship, and both members would have to fail to impact customer data. This is the highest level of redundancy.

Meltdown
A meltdown is a condition where the disk group can not present LUNs to the hosts. This condition can be induced when an excessive number of disks have either failed or been removed from the disk group. Each disk group operates independently and will melt down independently. While redundancy is maintained at the virtual disk level, meltdowns occur at the disk group level. That is, each virtual disk maintains its own redundancy; but if a disk group loses a single disk, all VRAID0 is lost; if it loses 2 disks in the same RSS, all VRAID5 is lost; and if it loses both members of a mirror pair, all VRAID1 is lost.
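The queue-depth limits quoted at the top of this section can be sketched numerically. The constants are the figures from the text; the helper function is an illustration of where the limits bite, not the firmware's actual admission logic (in particular, the per-host throttling near the 4000-command mark is not modeled).

```python
# HSV110 queue-depth limits, per the figures above.
PER_PORT_MAX   = 2000                          # per host port
PORTS_PER_CTRL = 2
CTRL_MAX       = PER_PORT_MAX * PORTS_PER_CTRL # 4000 per controller module
SUBSYSTEM_MAX  = CTRL_MAX * 2                  # 8000 per EVA subsystem
PER_HOST_MAX   = 1500                          # outstanding commands per host

def may_queue(ctrl_outstanding, host_outstanding):
    """True if one more command can be accepted without exceeding the
    per-controller or per-host limits (illustrative check only)."""
    return ctrl_outstanding < CTRL_MAX and host_outstanding < PER_HOST_MAX

print(SUBSYSTEM_MAX)            # 8000
print(may_queue(3999, 1499))    # True
print(may_queue(4000, 10))      # False: controller queue is full
```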


RSS introduction
RSS (redundant storage set) is a method used in the EVA to reduce the risk that a multi-disk failure will also result in a loss of data for the customer. While this is a topic that I have avoided in other papers and discussions, I will now provide an introduction to allow a better understanding of how and why failures can impact EVA operation. The RSS discussion in this paper may not provide complete operational details, but it will provide a basic understanding of failures in disk groups. It is expected that additional information on RSS and its operation will be available from the engineering groups at a later time.

The RSS is used to reduce the risk that a double disk failure in a single disk group will cause customer data loss. Suppose a disk group has 50 disks and provides VRAID0, VRAID5 and VRAID1. It can be expected that a single disk failure will cause VRAID0 units to go inoperative (if the data can not be safely removed before the disk is unavailable). This would be expected for any data distribution model, since VRAID0 is basic striped data without redundancy. The main point is that a disk group has 3 levels of failure. The first level is when a single disk fails quickly and solid: all VRAID0 units are inoperative. The second is when more than 1 disk fails and induces a VRAID5 failure; this is explained next.

Each disk group over 11 disk drives is divided into multiple RSSs. An RSS contains from 6 to 11 physical disks, with a desired number of 8 disks per RSS. The allocation of disks to RSSs is always even for all RSSs except the last; the last RSS in a disk group contains the disks remaining after all other RSSs have been allocated with even numbers. The reason for an even number of disks in an RSS is to support VRAID1 pairing. All redundancy for a VRAID level is maintained within an RSS.
If one disk fails in an RSS, the remaining disks can provide enough data for reconstruction of the failed data. This is true for VRAID5 and VRAID1 (with VRAID0, the data is lost for the entire disk group). If the disk group contains 50 physical disks, it will be subdivided into 6 RSSs: 5 RSSs with 8 members and 1 RSS with 10 members. A disk group with 50 physical disks and 6 RSSs could sustain a maximum of 6 separate disk failures before loss of customer data. If there were only VRAID1 virtual disks in the 50-member disk group, it could sustain up to 25 separate disk failures; however, it is possible that other sections of the storage system metadata would have failed, so it is not advisable to expect that a disk group would maintain operation after a large number of disk failures. The above description will help in understanding how multiple disks can fail while the system continues to serve data to the user. If there is no user activity and the system is stable, when a disk is removed the other members of the RSS will start to reconstruct the data. When that phase is complete, the data is then leveled across the entire disk group. This can also be demonstrated by monitoring the occupancy level of the disk drives.
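The 50-disk example above can be reproduced with a small sketch of the stated sizing rule: RSSs of the target size (8), with the leftover disks folded into the last RSS, which may grow up to 11. This is my reading of the rule as described, not the actual VCS allocation algorithm.

```python
def rss_layout(n_disks, target=8, max_size=11):
    """Split a disk group into RSSs per the rule described above: all RSSs
    get the target size except the last, which absorbs the remainder."""
    full, rem = divmod(n_disks, target)
    if rem == 0:
        return [target] * full
    if full and target + rem <= max_size:
        # fold the remainder into the last RSS (e.g. 8 + 2 = 10 for 50 disks)
        return [target] * (full - 1) + [target + rem]
    return [target] * full + [rem]

print(rss_layout(50))   # [8, 8, 8, 8, 8, 10] -> 6 RSSs, as in the text
print(rss_layout(48))   # [8, 8, 8, 8, 8, 8]
```

With 6 RSSs, one disk per RSS can fail (6 total) before a second failure must land in an already-degraded RSS.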


Another RSS description
To word the topic in another format, here is what one engineer has used to describe RSS in a few words. Since this is an important topic, I think it is necessary to provide different wording of the same topic.

An RSS is a sub-grouping of drives within a disk group for the purpose of failure separation at the redundancy level. The target size of an RSS is 8 drives, but it can vary from 6 to 11. Although a virtual disk will span all drives in a disk group, and therefore all RSSs in a disk group, the smaller pieces of redundant storage that make up the virtual disk can not span RSSs. A single drive failure in a disk group will cause all VRAID0 virtual disks in that disk group to become inoperative. However, 2 drive failures in a disk group will only cause VRAID5 virtual disks to become inoperative if the 2 drives are in the same RSS. A disk group with 48 drives will have 6 RSSs with 8 drives each; in this disk group it is possible to have 6 drives fail (1 in each RSS) without VRAID5 virtual disks becoming inoperative. Further, drives in an RSS are paired for VRAID1 storage, such that more than 1 drive can fail in an RSS without necessarily causing VRAID1 virtual disks to become inoperative. For VRAID1 virtual disks to become inoperative, a drive must fail in an RSS, and then a specific second drive in that RSS must fail (the VRAID1 partner of the first failed drive), and this second failure would have to occur prior to the reconstruction of the data from the first drive failure.

And here is another explanation on the RSS topic
Just a few points to emphasize: User data is segmented into 8MB redundant stores. Physical data is allocated for the redundant store based on RAID type: 8MB for VRAID0, 10MB for VRAID5, 16MB for VRAID1. The physical storage for a given redundant store is fully contained within a single RSS.
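Using the redundant-store figures just given, the physical cost of each VRAID type follows directly: VRAID5's 10MB reflects the 4+1 model (8MB x 5/4), and VRAID1's 16MB is a full mirror. The helper below is illustrative; rounding up to whole stores is a simplification for the sketch.

```python
import math

STORE_MB = 8                                       # one redundant store of user data
PHYSICAL_MB = {"vraid0": 8, "vraid5": 10, "vraid1": 16}

def physical_for(user_mb, raid):
    """Physical capacity consumed for user_mb of data at the given VRAID
    level, rounded up to whole redundant stores (a simplification)."""
    stores = math.ceil(user_mb / STORE_MB)
    return stores * PHYSICAL_MB[raid]

print(physical_for(100, "vraid5"))   # 13 stores x 10 MB = 130 MB
print(physical_for(100, "vraid1"))   # 13 stores x 16 MB = 208 MB
```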
Relocation of data may occur, but it will always be within a single RSS (data may be completely re-allocated/copied to a different RSS in the disk group). RSS sizes can vary from 1 to 16 members, but sizes outside the range of 6 to 11 are transient. The controller will migrate physical drives between RSSs (potentially creating and removing RSSs in the process) to effect a size between 6 and 11. When an RSS falls below 6 members it will be merged into another RSS if possible. When an RSS exceeds 11 members it will be split into 2 RSSs of at least 6 members each. RSS migration operations occur sequentially, one at a time (i.e., a merge may be followed by a split). Starting with V2, RSS member migration will be performed in order to form RAID1 pairs from unpaired drives. These operations can occur within a single RSS or between 2 RSSs. It is also worth mentioning that the controller attempts to enhance redundancy by distributing RSS members across shelves whenever possible.

While we are on the subject of best practices, certain disk group/RSS configurations are better than others from the standpoint of space utilization. This is due to the methods used for leveling. Optimum disk group sizes are 8 times a factor of 400 (i.e., 8, 16, 32, 40, 64, 80, 128, 160 and 200), assuming drives of equal size. The rules are: 1) the number of drives in each RSS is a factor of 400 between 6 and 11; 2) the number of RSSs is a factor of 400. There is no real harm in breaking the rules; it is just that you maximize available space utilization by following them.
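The two leveling rules above can be sketched as a membership test. This is my reading of the rules, not controller code; note that an RSS size of 10 (also a factor of 400 between 6 and 11) admits a few sizes beyond the multiples of 8 listed above.

```python
def divides_400(n):
    return n > 0 and 400 % n == 0

def optimal_group_size(disk_count):
    """True if the group can be split into RSSs satisfying both rules:
    1) the RSS size is a factor of 400 between 6 and 11 (i.e. 8 or 10), and
    2) the number of RSSs is a factor of 400.
    An illustrative reading of the rules above, not controller code."""
    for rss_size in (8, 10):
        if disk_count % rss_size == 0 and divides_400(disk_count // rss_size):
            return True
    return False

# Enumerate the qualifying sizes up to 200 drives.
print([n for n in range(6, 201) if optimal_group_size(n)])
```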

There is a quorum disk in each disk group, with a minimum of 5 quorum disks in total. The default disk group will have the most quorum disks, to ensure there are at least 5 when there are fewer than 5 separate disk groups. These metadata segments are what is required to allow a storage system to boot and operate with minimal resources. It should be noted that moving an entire disk group from one storage system to another can produce undesired results. If the entire disk group is placed in a new storage system without other disks or configuration data, the moved disk group will boot, and the new system will appear as a single disk group using the WWN and storage system name of the other storage system. This can create problems with the HSV EM and the fabric, since the fabric will now have two storage systems with the same WWN. If the disk group is installed in an operating configuration, it will create a failure and the OCP will display "multiple systems detected Proceed ?". If this is answered yes, it will destroy the data and configuration on the moved disk group (it should select the existing disk group as the primary group and delete the new disk group).


Disk Error Handling


ARRE Criterion For Auto-reallocation
Not all drive errors qualify for auto-reallocation. Errors related to media defects are selected as candidates for reallocation. Any unrecoverable read error is not considered for auto-reallocation. During the reallocation and re-write stages, the firmware will protect the media against corruption due to loss of power or SCSI reset. If one of these events occurs, the re-write and/or reallocation stage will be completed following restoration of power or following reset recovery.

BBR
The REASSIGN BLOCKS command requests the target to reassign the physical locations of defective logical blocks to another area on the medium, known as spares, that is set aside for this purpose. In addition, as an added feature, the target will also attempt to recover the data from each logical block before reassigning the physical location, and then move that logical block data to the new physical location. An error is posted only if the physical location of the logical block could not be reassigned.

SMART
SMART (Self-Monitoring Analysis and Reporting Technology) reported events should be viewed as predictive failures of components that wear out, not predictive of all failure types. SMART:
- Monitors the wear-out components of the drive: the moving components (spindle, arms, heads, platters). On/off failures are not predictable.
- Protects against running out of spare blocks.
- Predicts failure before data is lost; the goal is 24 hours before failure.
- Is used for failure analysis when possible; a failure may not be a SMART trip.
- Collects data; histograms are maintained for attributes and non-attributes alike, along with a history of recovered errors.


Disk Events
Disks have several fault conditions that are seen in the EVA system. The first is the basic SMART trip. Disks also log errors related to media, either recoverable or unrecoverable, as well as port interface errors. And then it is necessary to understand how the controller handles a disk that is reporting events. I will try to cover these topics in this section.

Disk errors are either media, mechanical or interface events. While a disk could have a power or physical problem, I will focus on the more common events for this discussion. With media errors, the disk may report either recoverable or unrecoverable errors. If the data transfer was recoverable, it is simple: the disk will just continue to operate as before. If the error is unrecoverable, the controller will regenerate the data from that LBA and write it to the reassigned LBA. If it is VRAID0, the controller will write the LBA with an error indicator (forced error flag). In many cases, if the disk is degrading, the drive will report a SMART trip event. The SMART trip will cause the controller code to start moving the data from the disk to the spare capacity on the disk group. It can use both reserved capacity and free capacity in the disk group while moving the data.

Disk errors that are mechanical are normally servo or positioning errors, and recovery is the same as for media errors. If the data can be recovered, operation will continue. If the data cannot be read from the disk, the system will recreate the data from the remaining disks in the group. Many times this will also be reflected in a SMART trip event.

Disk errors that are related to loop failures are more complex and can impact other disks and controller operation. Failures that impact one port will also impact other disks and be detected and reported by any device downstream from the problem.
So if a disk in enclosure 2 bay 4 is causing loop errors, you can see errors on disks 5-14 in shelf 2 and see errors on either shelf 3 or shelf 1; that depends on the direction of frame flow in the loop. It is not possible to guarantee the direction of data flow outside the disk enclosure. The controllers may also report events, since a controller will have trouble communicating with devices that are after the fault, and devices will have trouble communicating with the controllers if their frames must pass through the faulty device. Loop errors can also be reported by one disk in greater quantities than by other disks. If the loop is not busy with data transfers, the ratio of data frames to idle characters will be low, and for any percentage of errors, the majority will be detected on non-data frames. These errors will be detected by a disk, and in many cases will be fixed (by inserting idles). So if you see one disk with a large number of errors (bad words), and the loop is not busy, you should not expect to see an equal number of errors on drives downstream.


If a disk drive was marked failed due to a loop or other (non-disk) event, it is possible to change the failed status of the disk, provided you have confirmed that the disk drive is not faulty and was marked failed by the storage system. A special command issued from the Field Service web page on the HSV EM will allow the change from failed to good. This command should not be issued to a disk that failed for valid disk reasons; it will only help in cases where the disk was marked failed but is in fact good. This command is valid in the V2 code. From the field service command line page, enter an opcode of 3, and 0 (zero) for the first parameter. Do this once for every drive that is flashing red. This command will attempt to fix the first "declared broken" drive encountered in the internal array. After the drive is fixed it will be included in the disk group of which it was a member. To access the field service page on the SWMA, the address is http://hostname:2301/ResEltCpqFusion/fieldservice. You must use the full URL, since there are no links to this page. If the storage system is totally down, this command cannot be used. An option would be to remove all physical disks from the suspected disk group that is melted down and then boot the controllers. Then install the failed disks and issue the command for each drive. Then install all remaining disks in the system.


Loops
Collecting disk error counters will be covered in a separate paper that will explain the fields and how to read the output. There are 3 basic commands: FCS_L, FCS_C and FCS_D. FCS_L provides 4 tables; each table covers all drives on one port. FCS_C will clear the current counters so the FCS_D command can be used to determine the delta between when the error counters were cleared and the execution of the FCS_D command.

AL-PA IDs are assigned in a sequential manner based on the devices on the loop. Currently the EVA uses soft AL-PA addressing. This may change in future releases, but it is correct for V1 and V2001. There are some base requirements for this model. The storage system must be powered up with both power supplies active. This ensures the drives are powered on at the same time. If only one power supply is active, the drives will power on in sequence (8, then 4, then 2). This can result in a different AL-PA scheme than if all drives were powered on at one time. The drives will start taking AL-PA IDs from the lowest priority (EF) to the highest (01). The controllers will take AL-PA 01 and 02. It should be noted that "sequential" for AL-PA IDs is a little different than normal counting from EF to 01. There is a special requirement in FC-AL that restricts AL-PA IDs to a limited set of values. An example of the values is: EF, E8, E4, E2, E1, E0, DC, DA, D9, D6, D5, D4, D3, D2, D1, CE. A complete list can be found in the FC-AL standard. I list these few AL-PAs only to ensure everyone understands that AL-PA addresses are not sequential.

Disks can be identified as failed by the controller, and the action taken will depend on the type of failure and the ability of the disk to complete basic read and write commands. For SMART trips and other events that allow the disk to be read and written, the controller will read all the data from the drive while it places the data in the spare capacity of the disk group.
When the data is all safely moved, the controller will write to the disk a special identifier marking the disk as failed. This identifier is checked when the controller boots or when the disk is re-inserted in the storage system. If the drive is marked as failed, the controllers will not allow the disk to be used in the system. The disk may also be marked as failed for communication errors, so disk drives that are having problems communicating with the controllers may be marked as failed. If the disk is not the source of the errors, but just reporting the errors, it may still be identified as failed. At one time there was no simple method to change the failed status of a disk after the controller marked it; there is now a new procedure to change the failed state of the disk, see the disk error section of this paper.

The direction of the frame flow on a loop is dependent on the I/O modules and which port first detects FC light and signal. Even if both controllers are powered on and boot at the same time, the frames will not necessarily flow from the top shelf to the bottom shelf (or bottom to top). It is possible (and expected) that in some loops the frames will start at the top controller and then enter shelf 1. At shelf 1 the frames will be routed to the disk in bay 1, then continue to bay 14 and out the top port of the I/O module. Then the frames may be routed through shelf 2 and shelf 3 to the bottom controller. From the bottom controller the frames will then go to shelf 3 and be routed to bay 1. After the frames leave shelf 3 they will go to shelf 2 and be routed to bay 1. The frames will continue through bay 14 and then go to shelf 1, where they will just be passed through to the top controller. If you think this is hard to read, it was worse to write.

If the failure is a short burst of events on loop A and then a short burst on loop B, it can indicate that the device is being accessed on one loop, then being moved by the controller to the other loop. The controller will try a device on both loops if there are errors accessing the device. The pattern of loop errors can provide a lot of insight into the location of the fault source. If the errors are equally distributed between loop A and loop B, the cause is likely a disk device. If the errors are all on one loop, it indicates an I/O module. If the errors are mostly on one loop, it could be a disk or an I/O module. It is expected that error recovery on one loop will induce events on the other loop; that is, if there are significant errors on loop 1A, it can be expected to find events on loop 1B as the controller resets the loops or devices. A problem we encountered was trying to find all the SCSI parity errors and other disk-detected loop errors, and then identifying these drives on the loop.

How the controller identifies a loop switch on the device loops: The controller starts pumping out ARB(FD,FD) primitives on the loop. FD is an invalid AL_PA but has neutral disparity. When the loop switch sees an ARB(FD,FD) it replaces it with an OLS primitive. OLS primitives are only used on pt-to-pt connections, so the controller knows that if it receives an OLS there is a loop switch on that loop.
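The FCS_C/FCS_D baseline-and-delta workflow described at the start of this section amounts to subtracting two per-drive counter snapshots. The sketch below is generic; the data layout and counter names are hypothetical, and the real values come from the FCS_L/FCS_D output.

```python
# Hypothetical snapshots of per-drive link error counters, keyed by
# (shelf, bay); in practice the numbers come from FCS_L / FCS_D output.
baseline = {(2, 4): {"bad_words": 120, "link_fail": 3}}
current  = {(2, 4): {"bad_words": 980, "link_fail": 3}}

def delta(base, now):
    """Per-drive counter growth since the baseline (FCS_C) was taken."""
    out = {}
    for drive, counters in now.items():
        prev = base.get(drive, {})
        diff = {k: v - prev.get(k, 0) for k, v in counters.items()}
        if any(diff.values()):          # keep only drives that changed
            out[drive] = diff
    return out

print(delta(baseline, current))  # {(2, 4): {'bad_words': 860, 'link_fail': 0}}
```

A drive whose counters keep growing between snapshots is worth a closer look, but remember the point made above: the reporting drive may only be the messenger.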


Quiesce Time for Failed Drive Loop
When a drive loop is failed by a controller, it will be placed in a 24 hour quiesce state during which it will not be accessed by the controller, and any traffic seen on the loop by the controller will be ignored. The purpose of the 24 hour quiesce is to avoid using a problematic loop for an extended period of time, the use of which may result in severe degradation of controller performance. The only exception to the 24 hour quiesce period is a loop failed due to no light being received by the controller port attached to that loop, which will instead result in a quiesce period of 5 minutes. For either quiesce period, once the period has ended a LIP will be issued on the loop to force the loop to be retried, for the purpose of determining whether the loop has been healed.

Override of Failed Drive Loop Quiesce
The quiesce period for a failed loop may be manually overridden by using the Port Config OCP function. The override will only execute on the controller on which the Port Config function was issued. When a manual override is desired, it is highly recommended that the override be executed from both controllers as close together in time as possible, in order to avoid dual-port accessing of drives.

Hard Addressing of Drives
An attempt will be made through software to force each drive enclosure to use hard addressing when obtaining AL_PAs on each loop. In the absence of any hardware faults or system misconfigurations, the following addressing will result, with AL_PAs listed from bay 1 to bay 14:

ENCLOSURE #    AL_PA
1, 8           36-26
2, 9           51-39
3, 10          6A-52
4, 11          80-6B
5, 12          A5-81
6, 13          B6-A6
15, 22         D1-B9
16, 23         EF-D2
17, 24         25-01

Drive bays 13 and 14 of drive enclosures 17 and 24 will always be soft addressed due to overlap with the controller AL_PAs, 02 and 01. These drive bays should not be populated with drives under any condition.


Troubleshooting tips
If the exact disk is not known, migrate the data from the suspected disk before removal. On drives that we are confident are bad, the data should be migrated off the disk. But on drives that are only suspected, the migration can take a while and affect the customer, so maybe the best answer is to just pull the disk and let the system recover the data. If the customer has VRAID0, then disk removal will impact system operation. Of course, once the disk is reinstalled, the data will be available to the user.

For troubleshooting, it is possible to locate and remove an entire disk group. The system will then boot with the remaining disk groups, and you can reinsert the drives to allow single drive fault isolation. The system will boot with any disk group, if it contains all functional disks. Each disk group contains at least one storage system level metadata disk (quorum disk), all the metadata for the disk group, and the metadata for the virtual disks. If the system finds valid quorum disks but cannot build at least one valid disk group, the storage system will be marked multi-disk failure and be inoperative.

This can provide some insight into disk selection, but there are many factors in the selection. Have you read the disk selection paper posted in this forum? That will provide the base rules for disk group creation. Also, when increasing the number of disks in a disk group, it is suggested that you add all the disks at one time. There are internal rules on how drives are added to internal structures; the controller disk group rules are optimized for additions of a large number of drives.

Adding disks that have been used in one EVA to another EVA that is a valid, running storage system can be problematic. The disks will just sit there if you have the addition policy set to extrinsic. If the policy is set to intrinsic, the disks will be added to the existing disk groups automatically.
If the storage system is shut down (but not powered off), the drives will appear when the system is booted, and they will follow the addition policy after the boot. If the system is powered off when the disks are inserted, at power-on boot the storage system may detect multiple storage systems (if the new disks contain metadata disks). The user will have to remove the new disks, and the system will continue to boot. The new disks can then be installed.

The uninitialize command on the controller OCP and the one on the HSV EM page are a little different in function. While the user visible results are similar, they handle the uninitialize in different manners. The OCP uninitialize command will delete more of the on-disk metadata, but will not clean up any of the system state information on the SWMA. The HSV EM uninitialize command will clean up all the old storage system files and then just delete the base system state from the EVA on-disk metadata. If the disks from a system that was uninitialized using the GUI are moved to another EVA, the disks will still contain some system metadata from the old EVA. There is no single best method to uninitialize the system; select the model that works best for the customer.


On a resync (which is done after codeload), the controller logs back in to the fabric, causing all adapters to also re-login. The (adapter) logins set up power fail unit attentions to all presented units. This can cause units that are mounted on an OpenVMS system to enter mount verify if they are accessed through the same HBAs that are connected to the EVA.

TECHBB can be found on TECHPORT, the NAPSE web page containing technical information. Just click on the TECHBB tab and then look for the storage forums. URL: http://techport.tay.cpqcorp.net/

If you have the serial line connected and the verbose debug flag set, you may see output like the example below. It indicates that the EMU was busy and the controller had to retry a command. The controller will poll the EMU every 45 seconds; if the EMU is busy, the serial line will output messages like these. They are not failures and do not require corrective action. The status of 0802 indicates the busy state.
FCS: prt 2 Short status 08020000 al_pa 66 - shelf 10, slot 4
FCS: prt 0 Short status 08020000 al_pa 4c - shelf 2, slot 4****
FCS: prt 0 Short status 08020000 al_pa 6a - shelf 3, slot 1
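The short-status lines above follow a fixed pattern, so they can be pulled out of a serial log mechanically. This is a sketch: the regular expression matches the example lines shown, and treating the upper 16 bits of the status word as the 0802 busy code is an assumption based on that example.

```python
import re

# Matches lines of the form shown in the example capture above.
LINE = re.compile(
    r"FCS: prt (\d+) Short status ([0-9a-fA-F]{8}) al_pa ([0-9a-fA-F]+)"
    r" - shelf (\d+), slot (\d+)")

EMU_BUSY = 0x0802  # busy status shown in the examples above

def parse_short_status(line):
    """Extract port, status word (upper 16 bits), AL_PA, shelf and slot."""
    m = LINE.search(line)
    if not m:
        return None
    port, status, al_pa, shelf, slot = m.groups()
    return {"port": int(port), "status": int(status, 16) >> 16,
            "al_pa": int(al_pa, 16), "shelf": int(shelf), "slot": int(slot)}

rec = parse_short_status(
    "FCS: prt 2 Short status 08020000 al_pa 66 - shelf 10, slot 4")
print(rec["status"] == EMU_BUSY, rec["shelf"], rec["slot"])
```

Run over a saved serial log, this makes it easy to confirm that repeated retries all carry the benign 0802 busy status rather than something that needs action.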


Events
The controller event logs are formatted in different structures depending on the type of error. All error entries contain a few basic structures. In each entry you will find the title bar with the date, time, controller ID, and event code. Below the title bar is the description of the event, then the event log packet event specific data (including the time state and event sequence numbers), and then the controller state, including event number, controller software version, controller model, reporting controller and event time. The next parts depend on the type of error and what data is included. For events that are controller detected, the event log will contain the reporting NSC (network storage controller). For events that are disk drive specific, the data will contain the disk drive specific UUID. For each piece of data, there can be a couple of different methods to format the data. These are identified as unions, where the data is formatted in at least 2 different structures. The data presented is identical, and in many cases the different unions provide the same definition but with a different format. The end of the controller event packet normally contains the error codes or error type count. These packets contain several important pieces of data, including the controller that logs the event, the controller that reports the event (some entries), the device that is associated with the event, and of course the event that occurred that required the entry.

Correlating data from event logs. The most complex work on EVA systems is how to handle the large amount of data, and how to take the data and make a logical flow of what events are expected (due to user activity) and what events are abnormal (caused by failures or an unexpected result of normal activity). This complex task can be reduced by using the tools available that reformat the data. The tools that I use today are Chrono and YAPP.
Chrono uses the HSVET output file and provides a condensed chronological output showing the events logged by each controller. YAPP uses the HSVET output file and provides multiple files, each with a different structure. I use the spreadsheet (main), the device map (devmap) and the event file (eventcode) files. Other folks use the timestamp file, the full HTML file and the brief file. Each person will have to decide how they prefer the data formatted.

If an EVA is logging link and loop errors and they are not detected by any disk drives, the failure would be either an I/O module or the controller. We have seen systems in which a replaced I/O module appeared to correct the issue, but errors were logged again after a few hours or days. With loop errors it is important to remember that a frame sent from a controller to a disk device will pass through all cables, I/O modules and disk drives between the source of the command and the disk destination, and the response will travel through all devices from the disk drive back to the controller (and it is not possible to verify the exact path through the shelves). This is a different operational model than parallel SCSI and older StorageWorks products. With command and data frames flowing through all these devices, it is more difficult to isolate the point at which bad characters are introduced. By reviewing the controller event log for controller detected bad characters, controller detected command timeouts to disks (or the other controller), and disk reported bad loop characters, it is possible to isolate the source that is inducing the bad characters on the loop. Remember that a device reporting bad characters may not be the source of the problem, just the messenger.

It is important not to get lost on any one specific error or event. When reviewing the logs, look at all entries associated with the timeframe or customer issue. Try to understand how the different events occurred, to create the big picture. When the big picture is understood, then looking at the detailed logs can fill any holes.


SSSU
From SSSU, a "sho disk" command will provide the following information. This can help in verifying the operating state of the disk drives, including the operating state on each loop, the migration state, and the predictive failure state. This is the same information available from the HSV EM disk page, but with the SSSU command you can collect all disks in one step, then use an editor or search tool to verify the different operating states.

\Disk Groups\Default Disk Group\Disk 040 information:
Identification:
  Name               : \Disk Groups\Default Disk Group\Disk 040
  Loop_Pair          : LoopPair1
  Node_World_Wide_ID : 2000-0004-CF2F-2AE3
  Loop A:
    Port_World_Wide_ID : 2000-0004-CF2F-2AE3
    Assigned_Lun       : 0
    Loop_ID            : 13
  Loop B:
    Port_World_Wide_ID : 2000-0004-CF2F-2AE3
    Assigned_Lun       : 0
    Loop_ID            : 4
Condition State:
  Operational_State  : Normal
  Migration_State    : Not migrating
  Failure_Prediction : No
  Media_Accessible   : Yes
  Loop_A_State       : Normal
  Loop_B_State       : Normal
Physical:
  Type               : Fibre Channel Disk
  Manufacturer       : COMPAQ
  Model_Number       : BD03654499
  Firmware_Revision  : 3BE3
  Formatted_Capacity : 33.92 GB
System:
  Requested_Usage    : Member of StorageCell
  Actual_Usage       : Member of StorageCell
  Disk_Group         : \Disk Groups\Default Disk Group
  Occupancy          : 4.00 GB
  Comments           :
  ID                 : 2108071004000020e32a2fcf0000000000000000
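Since SSSU collects every disk in one capture, a small filter can flag any disk whose state fields are not nominal, instead of paging through the output by hand. This is a sketch: the field names come from the output above, and the set of "nominal" values is an assumption based on that example.

```python
# Flag non-nominal state lines in a saved "sho disk" capture.  "Nominal"
# values are assumed from the example output shown above.
NOMINAL = {"Operational_State": "Normal", "Migration_State": "Not migrating",
           "Failure_Prediction": "No", "Media_Accessible": "Yes",
           "Loop_A_State": "Normal", "Loop_B_State": "Normal"}

def abnormal_states(sssu_text):
    """Return (disk header, field, value) for every non-nominal field."""
    findings = []
    disk = None
    for line in sssu_text.splitlines():
        line = line.strip()
        if line.endswith("information:"):
            disk = line                       # remember which disk we are in
        elif " : " in line:
            field, value = (p.strip() for p in line.split(" : ", 1))
            if field in NOMINAL and value != NOMINAL[field]:
                findings.append((disk, field, value))
    return findings

sample = r"""\Disk Groups\Default Disk Group\Disk 040 information:
Operational_State : Normal
Migration_State : Migrating"""
print(abnormal_states(sample))
```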


Link Error counters


Excessive link errors are reported in the controller event log. These error limits apply to all FC ports (host fabric, device and mirror ports). The Tachyon chips in the controller are checked for error counts every 60 seconds. If any error counter exceeds its threshold, an event is logged. The thresholds for the counters are:

FCS_DP_MAX_LOSS_OF_SIGNAL  25
FCS_DP_MAX_BAD_RX_CHAR     (0xff * 10)
FCS_DP_MAX_LOSS_OF_SYNC    25
FCS_DP_MAX_LINK_FAIL       25
FCS_DP_MAX_RX_EOFA         25
FCS_DP_MAX_DIS_FRM         25
FCS_DP_MAX_BAD_CRC         25
FCS_DP_MAX_PROTO_ERR       25
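The check amounts to comparing each counter against the table above once per 60 second interval. A sketch only; whether the firmware uses a strict greater-than comparison is an assumption.

```python
# Thresholds from the table above; note 0xFF * 10 = 2550 for bad RX chars.
THRESHOLDS = {
    "LOSS_OF_SIGNAL": 25, "BAD_RX_CHAR": 0xFF * 10, "LOSS_OF_SYNC": 25,
    "LINK_FAIL": 25, "RX_EOFA": 25, "DIS_FRM": 25, "BAD_CRC": 25,
    "PROTO_ERR": 25,
}

def exceeded(counters):
    """Counters over threshold in the 60 s window (sketch; the exact
    comparison the firmware uses is an assumption)."""
    return {name: count for name, count in counters.items()
            if count > THRESHOLDS.get(name, float("inf"))}

print(exceeded({"BAD_RX_CHAR": 3000, "LOSS_OF_SYNC": 12}))  # {'BAD_RX_CHAR': 3000}
```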

Please use the field service page on the HSV EM to collect and manage the storage system. The details on the use of that page will be available in another paper.


SWMA Event log data


EVA related event files from the HSV EM point of view. The HSV EM provides a GUI interface to these logs, but it can be important to be able to collect this data when the HSV EM cannot access the desired controller. There are 3 types of events available on the HSV EM related to the storage system:

Management Agent Events (Management Events)
Controller Events
Controller Termination Events

All files are found in the c:\hsvmafiles directory on the appliance. Do not use the files in this directory for normal troubleshooting procedures. These files may not contain all the errors related to recent conditions. They are only to be used in cases where other procedures are not available or do not provide the information necessary to resolve a problem. Do not delete or rename these files. They can be copied to other directories or different systems for analysis.

File Names: Controller and termination events are placed in files whose names start with the node WWN of the controller pair.

WWN.sceventfile.ascii - Example: 5000-1FE1-0015-4690.sceventfile.ascii
WWN.termeventfile.txt - Example: 5000-1FE1-0015-4690.termeventfile.txt

Management events are placed in a file whose name starts with a numeric representation of the UUID of the storage system.

uuid.bridgeLog.txt - Example: 8.7.16.1610942644.83329.40960.152502272.bridgeLog.txt

The UUID of the storage system is associated with the WWN of the controller pair on which it resides; this pairing can be found in a file named wwn2handlemap.txt. Example contents of the wwn2handlemap.txt file:

5000-1FE1-0015-4690 8.7.16.1610942644.83329.40960.152502272
5000-1FE1-0015-4680 8.7.16.1610942644.83311.69632.14286848
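The WWN-to-UUID pairing in wwn2handlemap.txt is a simple two-column text file, so matching an event file to its storage system can be scripted. A sketch using the example contents above (the function name is mine):

```python
# Pair each controller WWN with its Storage System UUID, using the
# two-column format of wwn2handlemap.txt shown above.
def load_wwn_map(text):
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:               # skip blank or malformed lines
            wwn, uuid = parts
            mapping[wwn] = uuid
    return mapping

sample = """5000-1FE1-0015-4690 8.7.16.1610942644.83329.40960.152502272
5000-1FE1-0015-4680 8.7.16.1610942644.83311.69632.14286848"""

wwn_map = load_wwn_map(sample)
print(wwn_map["5000-1FE1-0015-4690"])  # 8.7.16.1610942644.83329.40960.152502272
```

With this mapping in hand, a WWN.sceventfile.ascii can be tied to the matching uuid.bridgeLog.txt when collecting all three event types for one storage system.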

All of the log files are downloadable through the GUI.

Caveat for the V2 Bridge: The controller events are placed into the file during the Management Agent's monitoring cycles. When the file reaches 15K events, the entire file is moved to a "last" file, WWN.last.sceventfile.ascii, and the original file begins filling up again. When viewing events, the GUI is populated with the contents of the .ascii file. The GUI will show fewer events when the "rollover" of the file happens, and will only show events generated since the 15K events were saved in the "last" file. When downloading the event file through the GUI, only the WWN.sceventfile.ascii will be sent. In the event of a file rollover, the WWN.last.sceventfile.ascii should be collected off the appliance if necessary. If a storage system is uninitialized using the HSV EM GUI, the contents of these files are removed. If event data needs to be saved, copy the log files before the storage system is uninitialized. If the storage system configuration is deleted using the controller OCP, the files in hsvmafiles are not affected.

SWMA files
Using the hsvmafiles area for understanding problems with EVA systems: check to see if the directory contains .ascii files for the storage systems managed by this appliance. It can also help to check the dates and find the last time they were modified; this can be an indication of the last time the appliance had a valid connection to the storage system.

If the SWMA will not connect to the EVA or display the HSV EM, there are several procedures that can help isolate the failure. If you have an SWMA that will not display the HSV EM and you know it is installed, you can disconnect the appliance from the fabric and see if a reboot will present the EM. I have found that this helps isolate whether the issue is in the appliance or on the fabric. If the HSV EM will load but not present the EVA, you can check the Emulex utility to verify that the controller's WWN is logged in to the page. I have noted that sometimes the SWMA will have a connection to the EVA, but the HSV EM will not connect and display the storage system. Another idea: if the SWMA reports that the EVA is not initialized (when the EVA is initialized), the issue may be related to the SWMA only talking to the secondary controller and not having access to the primary controller. Change the fabric and see if it works.

The MLD is part of the quorum disks (5 disks), and quorum disks are always in the system disk group (the Default disk group). The system disk group is created when the storage system is created and is always present as long as the storage system is present. If someone deletes this system disk group, the system picks another disk group to be the system disk group before it is deleted. Note that the SCMI log file entries have a date that is a month behind the current date.


Serial line use


The serial line on the rear of each controller provides a last resort path for information on the controller operation. It is cryptic and non-user friendly and will impact normal system operating performance. It should not be used unless other methods have been exhausted. To connect, attach a terminal (or terminal emulator) to the top 9-pin connector on each controller. The port is labeled UART; do not connect to the UPS port. The settings are 19.2K, 8N1, no flow control; VT100 mode works fine. There will not be any output on the terminal and it will not accept commands at this point. To enable the serial line, type ctl-h then ctl-r on each serial line. This will cause each controller to reboot and stop at the boot prompt. After the controllers reach the boot prompt, enter p, then f, then g. It is important to use a lower case g and not upper case; an upper case G will disable console output. After each letter is typed the controller will output a new screen. When the g is entered it will start the boot sequence. It will output a screen requesting input for the print flags; type 1fff or 0 and then enter. If the boot flags are not displayed, it will be necessary to use ctl-j as the string terminator. Once the boot flags are displayed, type 9 or 0 or 2000009 and enter (or ctl-j) and the controllers will finish booting. It should be noted that the default for the print and boot flags is 0, and if you just hit enter, the flags will be set to 0. Starting with VCS v2002, there is a new debug flag, 2xxxxxx. This flag sets verbose mode on the serial line and will output all important data regardless of print flag settings. If the verbose debug flag is used and the print flags are all enabled, it will output more data than is useful. The print flags should be 0 if the verbose debug flag is used, to prevent the system from being consumed by serial line output. Once the flags are set, each line will start to output data.
When the flags are set, each time a controller is rebooted it will require user input to complete the boot process: it is necessary to enter a g at the boot prompt. It is important that the flags be set identically on both controllers, or they will not operate correctly. It is also necessary to have serial lines connected and logging processes running on each line, and the flags must be set back to 0 (zero) for customer operation. The amount of data output will depend on the number of events being detected: if the system is running without any errors there will be little output, but if significant events and errors are being detected, each line will output a large amount of data. It is suggested that these lines be logged to a file for review and easier analysis; trying to analyze complex data scrolling by on a video terminal is almost impossible. (I will not include any information on how to read the output in this paper.) For analysis of the output, contact your support group or engineering. When done collecting data and resolving the issue, make sure you reset the print and debug switches to 0, or the system will not reboot for the customer.

Bert Martens V0.3

Page 28 HP Confidential

10/6/2003

PROMPT_FOR_GO          0x00000001
MAKE_ILF_DISK          0x00000002
ENABLE_PARITY_CHECK    0x00000004
PRINTF_TO_CONSOLE      0x00000008
PRINTF_TO_ILFLOG       0x00000010
ENABLE_MDU_BGTASKS     0x00000020
ENABLE_CACHE_DUMP      0x00000040
DISABLE_FM_TERMINATE   0x00000080
LOOP_ON_DIAGS          0x00000100
ENABLE_MONITOR         0x00000200
PERF_LOGGING           0x00000400
TRIGGER_ANALYZER       0x00000800
TACH_HOOD_LOOP         0x00001000
USE_FAKE_LCD           0x00002000
RESYNC_DISCARD_CMDS    0x00004000
ENABLE_1_GIG_DEVICE    0x00008000
CPLD_CRASH_ALWAYS      0x00400000
VERBOSE_MODE           0x02000000

Enter hex debug flags (just digits, no leading '0x') [2000009]:
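The debug flag value entered at the prompt is a bitmask of the names above. As an illustration (the flag values come from the table above; the decode helper is mine, not part of any EVA tool), the commonly used value 2000009 is VERBOSE_MODE plus PRINTF_TO_CONSOLE plus PROMPT_FOR_GO:

```python
# Debug flag bits as listed in the boot prompt table above.
DEBUG_FLAGS = {
    "PROMPT_FOR_GO":        0x00000001,
    "MAKE_ILF_DISK":        0x00000002,
    "ENABLE_PARITY_CHECK":  0x00000004,
    "PRINTF_TO_CONSOLE":    0x00000008,
    "PRINTF_TO_ILFLOG":     0x00000010,
    "ENABLE_MDU_BGTASKS":   0x00000020,
    "ENABLE_CACHE_DUMP":    0x00000040,
    "DISABLE_FM_TERMINATE": 0x00000080,
    "LOOP_ON_DIAGS":        0x00000100,
    "ENABLE_MONITOR":       0x00000200,
    "PERF_LOGGING":         0x00000400,
    "TRIGGER_ANALYZER":     0x00000800,
    "TACH_HOOD_LOOP":       0x00001000,
    "USE_FAKE_LCD":         0x00002000,
    "RESYNC_DISCARD_CMDS":  0x00004000,
    "ENABLE_1_GIG_DEVICE":  0x00008000,
    "CPLD_CRASH_ALWAYS":    0x00400000,
    "VERBOSE_MODE":         0x02000000,
}

def decode_debug_flags(value):
    """Return the names of the flag bits set in a debug flag value."""
    return [name for name, bit in DEBUG_FLAGS.items() if value & bit]

# The value is typed at the prompt without a leading '0x', so parse as hex.
print(decode_debug_flags(int("2000009", 16)))
# ['PROMPT_FOR_GO', 'PRINTF_TO_CONSOLE', 'VERBOSE_MODE']
```

This also makes clear why 9 is the usual print flag value: it is PROMPT_FOR_GO plus PRINTF_TO_CONSOLE.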


Field Service Page Commands


Here are the current commands that can be entered using the field service page. These are expert commands and may change or be removed from the product. There is little support for these commands, but they can help in some critical situations.

Hex  Command Name        Command Description
03   Set disk drive OK   P1=0 for first bad drive, 0xFFF for all drives, or a NOID
05   scs_show_config     Displays volume, pstore, rss, and scell information.
16   Crash/reboot Master
17   Crash/reboot Slave
22   Locate drive        P1=(0=Off, 1=On) P2=NOID
23   Locate Quorum       P1=(0=Off, 1=On)
24   Locate ILF          P1=(0=Off, 1=On)
25   Locate ALPA         P1=(0=Off, 1=On) P2=ALPA
26   Locate Disk Group   P1=(0=Off, 1=On) P2=NOID of disk group
29   fcs_show_config     Displays loop state and drive information.
2A   Resync              Causes the controllers to resynchronize.
34   fcs_link_errors     Displays Fibre Channel link errors reported by drives.
35   fcs_clear_links     Sets a new baseline for fcs_delta_links. There is no display for this function.
36   fcs_delta_links     Displays the difference in FC link errors between now and the baseline. The default baseline is the controller boot or reboot/resync.

To use some commands, it is necessary to have a terminal connected to the serial line for data output, and this requires that the controller debug flags be set to output data to the serial line. Use a debug flag of 8 for serial line output.
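The fcs_clear_links / fcs_delta_links pair work as a snapshot-and-diff: clear stores the current counters as the baseline, and delta reports only what has changed since. A minimal sketch of that idea in Python (the counter names and data layout here are illustrative examples, not the controller's actual structures):

```python
# Illustrative per-drive Fibre Channel link error counters. The drive and
# counter names are generic examples, not the controller's actual set.
baseline = {}

def fcs_clear_links(current):
    """Store the current counters as the new baseline (no display)."""
    global baseline
    baseline = {drive: dict(counts) for drive, counts in current.items()}

def fcs_delta_links(current):
    """Return only the counters that changed since the baseline."""
    delta = {}
    for drive, counts in current.items():
        base = baseline.get(drive, {})
        changed = {name: n - base.get(name, 0)
                   for name, n in counts.items()
                   if n != base.get(name, 0)}
        if changed:
            delta[drive] = changed
    return delta

now = {"drive_03": {"link_fail": 2, "loss_of_sync": 5}}
fcs_clear_links(now)          # set a new baseline, as command 35 does
now = {"drive_03": {"link_fail": 2, "loss_of_sync": 9}}
print(fcs_delta_links(now))   # {'drive_03': {'loss_of_sync': 4}}
```

The point of the delta view is noise reduction: on a loop with long-running counters, only the errors accumulated during the current troubleshooting window are shown.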


What new terms mean


Completing - This previously inaccessible volume has become accessible; data migration is being completed.
Creation in progress - The unit is being created.
Degraded - The virtual disk is operating without parity redundancy.
Derived Unit (DU) - Supplies the I/O protocol (non-data) behavior for Virtual Disks.
Failed - The volume is not being used in the disk group; disk errors are preventing normal usage.
Grouped --> Reserved - The volume is currently in a disk group, but it is being removed.
Leveling - The moving of PSEGs between volumes in a disk group to provide equal distribution of the capacity of each virtual disk across all members of the disk group. During the leveling process the VRAID protection must be maintained, and when PSEGs are moved from one disk to another, the RSS integrity must not be violated.
Missing - The volume is inaccessible.
Normal - The volume is present and operating normally.
Migrating - Data from this volume is being moved to other storage in this disk group.
Presented Unit (PU) - An association between a host or group of hosts and a Derived Unit.
PSEG - The smallest allocated physical capacity (2MB) of a physical disk drive that is used to build a virtual disk.
Quorum Disk flag - The state of the Quorum Disk flag for the volume identified in the handle field has changed.
Reconstructing - The volume is inaccessible and redundant data is being regenerated and moved to other storage in this disk group.

Redundancy None.Inoperative - The identified disk group had a disk failure and all VRAID0 units are inoperative.


Redundancy Parity.Inoperative - The identified disk group had multiple disk failures and all VRAID5 units are inoperative.
Redundancy Mirrored.Inoperative - The identified disk group had multiple disk failures and all VRAID1 units are inoperative; this implies the entire disk group is inoperative.
Reserved --> Grouped - The volume is identified and is being added to the disk group.
RStore (Redundant Store) - A contiguous 8MB of user-addressable space within a logical disk created by the user. The RStore has the redundancy characteristics required to support the virtual disk redundancy the user requested.
RSS (Redundant Storage Set) - A collection of RStores that span 6 to 11 volumes.
Reverting - This previously inaccessible volume has become accessible; data is being regenerated on this volume. This is seen when a disk is not accessible and the data is being regenerated on the remaining members of the disk group; when the disk becomes accessible again, the data is moved back to the original volume.
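The PSEG and RStore definitions above imply some simple capacity arithmetic: an 8MB RStore is built from four 2MB data PSEGs, plus whatever overhead the VRAID level adds. A hedged sketch follows; the per-level overheads used here (mirroring every PSEG for VRAID1, one parity PSEG per RStore for VRAID5) are my assumptions for illustration, not figures stated in this glossary.

```python
PSEG_MB = 2    # smallest allocated physical capacity, per the glossary
RSTORE_MB = 8  # contiguous user-addressable space per RStore

def psegs_per_rstore(vraid):
    """Physical 2MB PSEGs consumed by one 8MB RStore, by VRAID level.

    Assumed overheads: VRAID0 has none, VRAID1 mirrors every data PSEG,
    and VRAID5 adds one parity PSEG per RStore (a 4+1 layout).
    """
    data = RSTORE_MB // PSEG_MB  # 4 data PSEGs per RStore
    if vraid == 0:
        return data
    if vraid == 1:
        return data * 2          # a mirrored copy of each data PSEG
    if vraid == 5:
        return data + 1          # one parity PSEG per RStore (assumed)
    raise ValueError("unknown VRAID level")

for level in (0, 1, 5):
    used = psegs_per_rstore(level)
    print("VRAID%d: %d PSEGs (%d MB physical) per 8MB RStore"
          % (level, used, used * PSEG_MB))
```

This is why a VRAID0 disk group can become inoperative after a single disk failure while VRAID1 and VRAID5 groups survive it: only the redundant levels carry extra PSEGs to regenerate from.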

Virtual Disks are the data-bearing objects.

Regards, Bert Martens (with help from many on this paper)
