
Nestlé European Information Technology Operations Center (ITOC) S.A.

Windows Cluster Operational Guidelines


Author : Steve Rosa
Authoriser : Fabian Geurts

Creation Date : 11-May-2004


Save Date : 03-Nov-2005
Print Date : 16-Feb-2005

Synopsis : This document is about the Windows clusters in Z-EUR.

Revision History

Revision  Date      Summary of Changes                    Changed by (Acronym)
1.0       11/05/04  First draft                           SRO
1.1       09/08/04  Few updates; monitoring added         SRO
1.2       14/09/04  PDBnet; restoring; various additions  SRO
1.3       31/01/05  PDBnet information updated            SRO
1.4       16/02/05  Veritas Support Information           SRO

File: 8909694.doc Version: 1.4


Author: Steve Rosa Page 1 of 28
Saved on 03/11/2005 04:51:00 AM
Last printed 16/02/2005 11:34:00 AM

Table Of Contents

1. Introduction
2. Hardware / Software requirements
3. MS Cluster architecture
4. GeoCluster Architecture
5. Tools installed on Windows Clusters
6. Supported Applications
7. Active/Passive configuration vs. Active/Active configuration
   7.1 Active/Passive configuration
   7.2 Active/Active configuration
8. Inventory of cluster nodes
9. Daily management of clusters
   9.1 The tool: Cluster Administrator
   9.2 Manually fail resources over to the other node
   9.3 Manually take resources offline
   9.4 Manually bring resources online
   9.5 Manually reboot the two nodes of the cluster
   9.6 Recovering the service after a HW failure has occurred (BETA section)
      9.6.1 Shared data and / or quorum drive lost / corrupt
      9.6.2 One node crashed
10. Monitoring
11. Backup
12. Troubleshooting
   12.1 Event log
   12.2 Log file
13. Important points, things to pay attention to
14. References and further reading
15. Appendix A – List of Veritas Volume Manager events monitored by Tivoli
16. Appendix B – Veritas Volume Manager Support Information


1. Introduction
As the Globe solution is being deployed, there is a rising need for highly available applications. For
instance, intranet websites need to be available to the users 24 hours a day, 7 days a week.

A server cluster provides failover support for applications and services that require high availability, scalability and reliability. With clustering, organizations can make applications and data available on multiple servers linked together in a cluster configuration. Back-end applications and services, such as those provided by database servers, are ideal candidates for server clustering.

Microsoft cluster technologies guard against three specific types of failure:

• Application/service failure affects application software and essential services.


• System/hardware failure affects hardware components (for example, CPUs, drives,
memory, network adapters, power supplies, etc.).
• Site failure could be caused by natural disaster, power outages or connectivity outages
(covered by the GeoCluster technology).

The concept of a cluster involves taking two or more computers and organizing them to work
together to provide higher availability, reliability and scalability than can be obtained by using a
single system. When failure occurs in a cluster, resources can be redirected and the workload can
be redistributed. Typically, the end user experiences a limited failure, and may only have to refresh
the browser or reconnect to an application to begin working again.

The Windows 2K Development Team located in Vevey has developed two possible
implementations of Windows clustering: simple cluster and geographically dispersed cluster (also
referred to as geocluster).

2. Hardware / Software requirements

The solution provided by the Dev Team has been certified on the following hardware:

• IBM xSeries 360 connected to an IBM ESS 800 SAN (simple clusters and geoclusters,
  only in regional datacenters)
• Dell PowerEdge connected to a Dell PowerVault 220S configured in cluster mode
  (simple clusters only)

Concerning software requirements, the only version of Oasis 2 that is supported for clustering is
2226.

This document will focus on the IBM implementation, as it is the only one used in Z-EUR for the
moment.


3. MS Cluster architecture

We will now look at the architecture of a “simple” cluster.

[Diagram: SERVER1 and SERVER2 connected to one ESS holding the Q: drive (quorum) and the F: drive (data); clients see a single virtual server (SERVERNAME / IP address).]

This cluster consists of two different servers located in the same site and connected to the same
SAN.

Each of the servers (but never both at the same time) has access to two SAN volumes¹:
- one volume, called the Quorum, holds the information on the cluster, such as the name of the
node that owns the resources;
- the other volume holds the actual data.

Access to the volumes is managed by the cluster service in such a way that, at any given time,
only one node has access to each volume.

The two nodes of the cluster have at least two network connections:
- one connection is for the production network, allowing the clients/applications to connect to the
cluster (it is called the Public Network Connection);
- one connection is for the internal communication (such as the heartbeat) between the two nodes
(it is called the Private Network Connection): in the simple cluster implementation, a crossover
network cable is used.

The cluster mechanism presents a virtual server with a virtual IP address to the clients or
applications. If the node currently owning the cluster resource group fails, the other node will take
over all resources.

¹ These volumes are configured with special settings, in order to allow the two servers to recognize the same data.
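The failover behaviour described above can be sketched as a toy model. This is only an illustration with made-up names; the real ownership arbitration is performed by the Microsoft cluster service, not by code like this:

```python
# Toy model of a two-node cluster: one node owns the resource group (and
# thus the Q: and F: volumes) at any given time; if that node fails,
# ownership moves to the surviving node, while clients keep addressing
# the same virtual server name.

class TwoNodeCluster:
    def __init__(self, node1, node2, virtual_server):
        self.nodes = {node1: True, node2: True}   # node name -> is_up
        self.owner = node1                        # node owning the group
        self.virtual_server = virtual_server

    def node_failed(self, node):
        """Mark a node as down; fail the group over if it was the owner."""
        self.nodes[node] = False
        if node == self.owner:
            survivors = [n for n, up in self.nodes.items() if up]
            if not survivors:
                raise RuntimeError("total cluster failure")
            self.owner = survivors[0]             # the other node takes over

    def connect(self):
        """Clients always address the virtual server, never a physical node."""
        return f"{self.virtual_server} -> served by {self.owner}"

cluster = TwoNodeCluster("SERVER1", "SERVER2", "SERVERNAME")
print(cluster.connect())          # SERVERNAME -> served by SERVER1
cluster.node_failed("SERVER1")
print(cluster.connect())          # SERVERNAME -> served by SERVER2
```

Note how the client-facing name never changes: after the failover, only the node behind the virtual server differs, which is why end users typically just reconnect and continue working.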


4. GeoCluster Architecture

Before starting, note that geocluster is only supported in the Regional DataCenters (Frankfurt and
Mainz in our case).

Here is the architecture for the geocluster:

[Diagram: SERVER1 in Datacenter 1 and SERVER2 in Datacenter 2, each connected to the ESS in both datacenters; each ESS holds a Q: drive (quorum) and an F: drive (data); clients see a single virtual server (SERVERNAME / IP address).]

The differences between the simple cluster and the geocluster are explained below.

Each node is located in a specific datacenter and connects to both ESS SANs: the one in its own
site and the one in the remote site.
As with the simple cluster, two SAN volumes are accessible by each node (not at the same time).
One of the major differences between the simple cluster and the geocluster concerns the storage.
In the geocluster, each disk volume used by the cluster consists of 3 LUNs forming a
concatenated mirror²: 1 on the local SAN and 2 on the remote SAN. This allows the cluster to
survive the failure of the whole SAN in the local site.

² The FlashSnap feature of Veritas Volume Manager achieves synchronisation between the LUNs.
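The survivability claim above can be checked with a minimal sketch. It models each cluster volume as a mirror of two plexes — one LUN on the local ESS and two concatenated LUNs on the remote ESS — which is one plausible reading of “mirror concatenated”; the real layout is managed by Veritas Volume Manager, and this model is only illustrative:

```python
# Sketch of the geocluster volume layout: a mirrored volume stays usable
# as long as at least one of its plexes has all of its LUNs on a site
# that is still alive. Each plex is listed as the sites of its LUNs.

def volume_usable(plexes, failed_site):
    """True if at least one plex survives the failure of an entire site."""
    return any(all(site != failed_site for site in plex) for plex in plexes)

# F: drive: one plex = 1 local LUN, other plex = 2 concatenated remote LUNs.
f_drive = [["local"], ["remote", "remote"]]

print(volume_usable(f_drive, failed_site="local"))   # True: remote plex survives
print(volume_usable(f_drive, failed_site="remote"))  # True: local plex survives
```

Losing the whole local SAN disables only the local plex, so the volume (and therefore the cluster) keeps running on the remote mirror half, which is exactly the independence from a whole-SAN failure described above.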


Another major difference with the simple cluster is the existence of an extra heartbeat connection.
The two nodes of the cluster have at least three network connections:

- one connection is for the production network, allowing the clients/applications to connect to the
cluster (it is called the Public Network Connection)³;
- two connections are used for the internal communication (such as the heartbeat):
  - one goes via the MAN connection between the two datacenters;
  - one goes via a dedicated leased line; this enables the cluster to survive a MAN failure.

5. Tools installed on Windows Clusters

The following add-ons are installed on the nodes of the cluster:

- Tivoli Storage Manager client: used for the backup of the physical nodes (C$, D$ and system
objects) (especially in the RDC);
- Tivoli Data Protection agent for SQL Server: used for the backup of SQL Server (if the cluster
runs SQL Server 2000) (in the future);
- Tivoli EndPoint: a physical endpoint is installed on each of the nodes and a logical endpoint is
installed on the cluster;
- Veritas Volume Manager with the FlashSnap option: used for the synchronisation between the
LUNs on the SAN (if it is a geocluster);
- Veritas Backup Exec: used to back up data in the markets (this solution is still in development).

³ Note that with the geocluster, the public network connection relies on the teaming of two different NICs.


6. Supported Applications

Application              Supported mode by BTC
File Server              Active / Passive
Print Server             Active / Active
DHCP Server              Active / Passive
E2K (Exchange 2000)      Not yet supported
SQL Server 2000          Active / Passive, or Active / Active (simple cluster only)
DC                       Not supported
DNS Server               Not supported

SQL Server 2000 is the only application supported with the geocluster.


7. Active/Passive configuration vs. Active/Active configuration

The cluster can be configured in two different ways, depending on the needs.

We will take SQL server as an example, but the explanation is valid for any other cluster-aware
application.

7.1 Active/Passive configuration

An active/passive configuration allows you to have a single instance of SQL Server running on one
of the physical servers in the cluster. The other nodes in the cluster are in standby mode until a
failure on the active node or a manual failover during maintenance occurs. Only one SQL Server
2000 virtual server is installed on an active/passive SQL Server cluster environment.

7.2 Active/Active configuration

An active/active configuration allows you to have multiple instances of SQL Server running across
both nodes of a cluster. If one of the nodes in the cluster fails, its instances of SQL Server
automatically fail over to the other node. This means that all the instances of SQL Server end up
running on one physical server instead of two. This can degrade performance, but it is much better
than a complete outage.


8. Inventory of cluster nodes

All Windows 2000 servers are scanned by Tivoli in order to detect their roles. This detection occurs
during the mifgen process of the inventory (every 3 days).

This role information is gathered and inserted into the PDBnet application.

A cluster node is detected as having the mscs role, as shown in the following screenshot:

So, in case of a failure, when you have to recover a server, you can directly see in the inventory
that this server is a cluster node.

In addition to this, the PDBnet application has been amended in order to show the partnership
between servers.

For instance, if you look for DESDB017, you can see the following comment:


This clearly explains which server is the counterpart, which framework is using the cluster and the
names of the virtual servers. The first one is always the name of the MS cluster, while the following
names are the names of the virtual SQL cluster instances.

From there on, you can open one of the virtual servers, e.g. DESDB523:

So, from a physical node, you can obtain the names of the counterpart and the virtual servers.
From a logical node or virtual server, you can obtain the names of the two physical nodes.


9. Daily management of clusters


9.1 The tool: Cluster Administrator
The tool used for the administration of the clusters is the Cluster Administrator. It is located in the
Administrative Tools program group:

Once you have started it, it may ask which cluster you want to manage. If you are logged on to a
node of the cluster, simply enter a dot (.) as the cluster name⁴:

You will then see a screen similar to this one:

We will now see the things that can be done with this console.

⁴ You can enter the name of the cluster or the name of one of the nodes.


The different resources of the cluster are grouped together. If you expand the Groups folder in the
left pane, you can see the groups present on this cluster:

In the right pane, you can see the resources that are members of the selected group.

9.2 Manually fail resources over to the other node

Because of the dependencies between resources, you cannot fail individual resources over: you
must always fail a whole group over. To do so, right-click on a group and select Move Group:

Pay attention: the software will not ask for confirmation!

9.3 Manually take resources offline

To take a resource offline, locate the resource, right-click on it and select Take Offline⁵:

Note that the resource holding the quorum disk (usually the Q: drive) cannot be taken offline.

Pay attention: the software will not ask for confirmation!

⁵ Note that doing this will also take offline the resources that depend on the resource you take offline.


9.4 Manually bring resources online

To bring a resource online, locate the resource, right-click on it and select Bring Online:

Pay attention: the software will not ask for confirmation!
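The three manual operations above can also be driven from the command line with cluster.exe instead of the Cluster Administrator GUI. The sketch below only builds the command strings as a dry run — it executes nothing — and the group, resource and node names are made-up examples; verify the exact syntax with `cluster /?` on your Oasis build before relying on it:

```python
# Build cluster.exe command lines for the three manual operations above.
# Dry-run only: commands are returned as strings, never executed here.
# All names are illustrative; check "cluster /?" for the exact syntax.

def move_group(group, node=None):
    """Fail a whole group over (optionally to a specific node)."""
    base = f'cluster group "{group}" /move'
    return base + (f"to:{node}" if node else "")

def take_offline(resource):
    """Take a single resource (and its dependents) offline."""
    return f'cluster resource "{resource}" /offline'

def bring_online(resource):
    """Bring a single resource back online."""
    return f'cluster resource "{resource}" /online'

print(move_group("SQL Group", "SERVER2"))
# cluster group "SQL Group" /moveto:SERVER2
print(take_offline("SQL IP Address"))
# cluster resource "SQL IP Address" /offline
print(bring_online("SQL IP Address"))
# cluster resource "SQL IP Address" /online
```

As with the GUI, none of these operations ask for confirmation, so double-check the group or resource name before running the generated command.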

9.5 Manually reboot the two nodes of the cluster

When planned maintenance has to be performed on the nodes (upgrade, service pack, hotfix, …),
follow these steps:

1. Make sure that all resources are owned by one single node.
2. Perform the maintenance on the standby node (i.e. the node not owning any resources).
3. Reboot the standby node if needed.
4. When it is back, move all resource groups to the standby node.
5. Wait for the application to come online.
6. Get confirmation from the users or application owners that the application is still working as
expected.
7. Perform the maintenance on the other node.
8. Reboot the other node if needed.
9. When it is back, move all resource groups back to it.
10. Get confirmation from the users or application owners that the application is still working as
expected.
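The maintenance sequence above can be made explicit with a minimal simulation: the key invariant is that the node currently owning the resources is never serviced, and the groups only move once the freshly serviced node is back. Node names are hypothetical:

```python
# Simulate the rolling-maintenance order described above: service the
# standby node first, move the groups only after it is back, then repeat
# on the old owner. The application always has a serving node.

def rolling_maintenance(owner, standby):
    log = []
    log.append(f"all groups owned by {owner}")
    log.append(f"maintain + reboot {standby}")       # standby serviced first
    log.append(f"move groups {owner} -> {standby}")  # only once it is back
    log.append("verify application with owners")
    log.append(f"maintain + reboot {owner}")         # now the old owner
    log.append(f"move groups {standby} -> {owner}")
    log.append("verify application with owners")
    return log

for step in rolling_maintenance("SERVER1", "SERVER2"):
    print(step)
```

Reading the log top to bottom shows why the two nodes must never be rebooted together: at every step, exactly one node that is not under maintenance owns all the groups.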

9.6 Recovering the service after a HW failure has occurred (BETA section)

You can be faced with different cases:

9.6.1 Shared data and / or quorum drive lost / corrupt

This is the situation where you have to restore the shared data (either on a SAN or on a shared
PV) before being able to restart the application.
1. Make sure that the disk subsystem has been repaired and that the partitions are created
and visible by both nodes (see the GLOBE ISIT OASIS2 Cluster Guide).
2. Shut down one node of the cluster in order to make sure that only one node is running.
3. Make sure that the application is stopped.
4. Restore the quorum partition and the shared data partition(s) (F:, G:, …)⁶.
5. Restart the application via the Cluster Administrator tool.
6. Have the functionality of the application checked by the application responsible.
7. Restart the second node of the cluster.

⁶ Note that in the RDC, this has to be done with the help of the AIX Support team.


8. Fail the application over to the second node. Attention: this causes an application outage,
so check with the application responsible beforehand.
9. Have the functionality of the application checked by the application responsible.

9.6.2 One node crashed

In this situation, there should be little or no impact on the users, as the application should have
failed over thanks to the clustering technology. So, there is no need to work overnight to recover
the server, as long as the application keeps running.
The only thing to do is to restore the server from a known good backup. This does not differ from
recovering a usual W2K server.

1. Have the HW of the server repaired.


2. Have Oasis level 1 installed onto the server.
3. Restore the C: partition and the System State by following the usual backup/restore
documentation.
4. Reboot the server.
5. When possible (i.e. when the application responsible gives the green light), fail the
application over to the restored node, in order to make sure that everything is OK.


10. Monitoring

Both nodes of a cluster are monitored, as standard member servers.

Furthermore, on GeoClusters, the Veritas Volume Manager software is monitored as well. Please
refer to Appendix A – List of Veritas Volume Manager events monitored by Tivoli for the list of the
monitored events and Appendix B – Veritas Volume Manager Support Information for support
information.

In addition, there is a logical endpoint created for the monitoring of the application itself. The name
of the logical endpoint is VirtualServerName_ClusterGroupName-log
(e.g.: DESDB537_DESDB537\PSWWW50-log).

The configuration and the logfiles are stored on the shared drive used by the application (F: or G:)
at the following location: SharedDrive:\Program Files\Tivoli\lcfX, where X is 2 or 3,
depending on which cluster group is monitored by this endpoint.

Note that the logical endpoint is a cluster resource and is a member of the cluster group of the
application:
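The naming and path rules above can be captured in a small helper. The DESDB537 example comes from the text; everything else (function names, the drive letter) is illustrative:

```python
# Build the logical endpoint name and its configuration path from the
# rules above: VirtualServerName_ClusterGroupName-log, with the files
# stored under SharedDrive:\Program Files\Tivoli\lcfX (X = 2 or 3).

def endpoint_name(virtual_server, cluster_group):
    return f"{virtual_server}_{cluster_group}-log"

def endpoint_path(shared_drive, x):
    assert x in (2, 3), "X is 2 or 3, depending on the monitored group"
    return f"{shared_drive}:\\Program Files\\Tivoli\\lcf{x}"

print(endpoint_name("DESDB537", "DESDB537\\PSWWW50"))
# DESDB537_DESDB537\PSWWW50-log
print(endpoint_path("F", 2))
# F:\Program Files\Tivoli\lcf2
```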


11. Backup

As with any server located in the Regional Data Center, each local resource of the server is backed
up by TSM. So, all the local partitions (C:, D:) and the system objects (registry, WMI repository,
system files, …) are protected.

The cluster resources (usually SAN partitions) are backed up via TSM as well. To achieve this, we
install an additional TSM scheduler service. This scheduler is an extra resource in the cluster group:

Servers located in the markets are backed up by a special component of Backup Exec
(development of this solution is in progress).


12. Troubleshooting

The cluster service is relatively verbose; there are several ways to troubleshoot potential issues.

12.1 Event log

You can find a lot of information in the System event log.


To have better visibility, you can filter the events on the source ClusSvc.

12.2 Log file

You can find valuable information (especially during startup of the cluster and when nodes join) in
the following file:

C:\WINNT\cluster\Cluster.log
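When working through the two sources above, a few lines of scripting can speed things up. The sketch below applies the same filter that section 12.1 suggests in Event Viewer — keeping only entries whose source is ClusSvc — to an exported event list; the tab-separated export format and the sample lines are made up for illustration:

```python
# Filter exported event-log lines on the "ClusSvc" source, mirroring the
# Event Viewer filter suggested above. The simple tab-separated export
# format (date, source, event id, message) is an assumption.

def clussvc_events(lines):
    events = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 4 and fields[1] == "ClusSvc":
            events.append(fields)
    return events

sample = [
    "16/02/2005 10:34\tClusSvc\t1122\tThe node joined the cluster.",
    "16/02/2005 10:35\tDisk\t7\tThe device has a bad block.",
    "16/02/2005 10:36\tClusSvc\t1069\tCluster resource failed.",
]
for event in clussvc_events(sample):
    print(event[2], event[3])
# 1122 The node joined the cluster.
# 1069 Cluster resource failed.
```

The same approach works on C:\WINNT\cluster\Cluster.log, although that file has its own line format, so the splitting logic would need to be adapted.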


13. Important points, things to pay attention to

Always remember that users and/or applications rely on the applications hosted on the
cluster. Each action you perform can have serious consequences. Some applications need to be
restarted after the database has been restarted or failed over. In summary: always discuss with
the application responsible before doing anything on the database side.

Never reboot the two cluster nodes at the same time. Always reboot the first node, wait until it
is back, and then reboot the second node.

One of the most sensitive parts of a cluster is the communication between the two nodes. If the
nodes do not communicate with each other, the resources will start on both nodes, causing
damage to the application data (this is referred to as the split-brain situation). Note that the only
way to recover from this situation is to take the application offline and restore the data from the
last backup tape.


14. References and further reading

Globe documents:
GLOBE ISIT OASIS2 Cluster Guide.doc
GLOBE ISIT OASIS2 GeoCluster Architecture.doc
GLOBE ISIT OASIS2 Geo Cluster Installation Document.doc
GLOBE ISIT OASIS2 TSM Backup and restore guidelines.doc
GLOBE ISIT SQL 2000 Operational guideline.doc
M250 Windows Naming Conventions.doc

Website:
www.microsoft.com


15. Appendix A – List of Veritas Volume Manager events monitored by Tivoli

This list is sorted by Peregrine priority:

ID Priority Description

23 P2 Volume Manager disabled due to excessive disk I/O error

583 P2 Could not find the cluster dynamic disk group.

584 P2 No disk was found for the cluster dynamic disk group.

585 P2 Failed to reserve a majority of disks in cluster dynamic disk group.


1 P3 %1 provider reported physical disk %2 failure.
2 P3 An unexpected error has occurred.

3 P3 %1 provider reported physical disk %2 was offlined.

4 P3 %1 provider reported volume failure. Volume Name: %2


6 P3 %1 provider reports Volume %2 is failing.
12 P3 %1 provider reports virtual disk %2 failed.

25 P3 INTERNAL Error - Unexpected kernel error in configuration update

47 P3 INTERNAL Error - Communication failure with Volume Manager kernel


55 P3 INTERNAL Error - Volboot file not loaded

115 P3 INTERNAL Error - Root dynamic disk group is not enabled


142 P3 Volume Manager configuration disk read error
143 P3 Volume Manager configuration disk write error

150 P3 INTERNAL Error - No valid disk found containing dynamic disk group

166 P3 INTERNAL Error - Disk has inconsistent disk headers


167 P3 INTERNAL Error - Disk header not found

168 P3 INTERNAL Error - Disk private region contents are invalid


183 P3 The specified disk cannot be located
211 P3 Operation aborted due to disk I/O error
307 P3 Volume not found.

541 P3 Dynamic disk group not found. Failed to start SCSI reservation thread.


542 P3 Unable to reserve a majority of dynamic disk group members. Failed to start SCSI reservation thread.

543 P3 Failed to start SCSI reservation thread for dynamic disk group.
544 P3 Failed to import dynamic disk group.

545 P3 Failed to stop SCSI reservation thread for dynamic disk group.
546 P3 Failed to release dynamic disk group reservations.

547 P3 Dynamic disk group not found. Failed to update SCSI reservation thread.

548 P3 Failed to obtain SCSI reservations for all members of dynamic disk group.

549 P3 Failed to update SCSI reservation thread for dynamic disk group.
580 P3 Failed to Recover the DiskGroup.
581 P3 Failed to lock the volume.

586 P3 Starting reservation thread on the cluster dynamic disk group failed.
589 P3 Import the cluster dynamic disk group failed.
740 P3 Dynamic disk group could not be found.
791 P3 Failed to resynchronize volume %1.
809 P3 Volume capacity reached error condition
813 P3 Capacity critical error on %1.

814 P3 The volume free space has reached the user defined critical condition.

815 P3 Volume capacity reached user defined critical condition

818 P3 Volume capacity reached user defined error condition


1020 P3 Volume Disabled
1021 P3 Volume Disabled
1030 P3 Volume Degraded
1031 P3 Volume Degraded
1110 P3 Disk Failing
1111 P3 Disk Failing
7710 P3 Failed to import dynamic disk group.
7712 P3 Import dynamic disk group failed
7713 P3 dynamic disk and volume class
7910 P3 Failed to resynchronize volume.

7970 P3 Outcome of attempt to automatically relocate a failed subdisk.


7972 P3 Hot Relocation/Spare failed
8010 P3 Failed to recover dynamic disk group.
8012 P3 Recover dynamic disk group failed

8016 P3 Immediately back up your data and replace your hard disk. A failure may be imminent.

8020 P3 Failed to start SCSI reservation thread for dynamic disk group.
8022 P3 SCSI Reservation Thread Start Failure

8030 P3 Failed to stop reservation thread for dynamic disk group.


8032 P3 SCSI Reservation Thread Stop Failure


8050 P3 The SCSI reservation thread update failed for dynamic disk group.
8052 P3 SCSI Reservation Thread Update Failure
8094 P3 Failed to reactivate Harddisk%1.
10231 P3 Import cluster dynamic disk group failed

3 P5 INTERNAL Error - Protocol error with configuration daemon


4 P5 INTERNAL Error - Commit status lost
5 P5 %1 provider reported volume %2 lost redundancy.
5 P5 INTERNAL Error - Operation would block
10 P5 INTERNAL Error - Bad message format in request

11 P5 INTERNAL Error - Configuration daemon can not speak protocol version


12 P5 Operation failed due to lack of memory
13 P5 INTERNAL Error - Configuration request too large
20 P5 INTERNAL Error - Operation requires transaction
21 P5 INTERNAL Error - Transaction locks timed out
24 P5 INTERNAL Error - Configuration daemon error
26 P5 Operation failed due to lack of memory
27 P5 INTERNAL Error - Failed to create logging daemon

42 P5 INTERNAL Error - Error in root dynamic disk group logs


43 P5 INTERNAL Error - Cannot create portal
44 P5 INTERNAL Error - dmconfig is currently enabled
45 P5 INTERNAL Error - dmconfig is currently disabled

46 P5 INTERNAL Error - dmconfig is currently in boot mode

48 P5 INTERNAL Error - No convergence between dynamic disk group and disk list

50 P5 INTERNAL Error - Expected registry entry not found

52 P5 INTERNAL Error - String format errors in registry entry


53 P5 INTERNAL Error - Out of space in registry update
54 P5 INTERNAL Error - Short read in volboot file

61 P5 INTERNAL Error - Disabled/unimplemented feature


62 P5 Requested operation is not supported
63 P5 INTERNAL Error - Could not obtain requested lock

64 P5 INTERNAL Error - Required lock not held in transaction

65 P5 INTERNAL Error - Required data lock not held in transaction


67 P5 The specified object no longer exists

69 P5 INTERNAL Error - Unexpected failure in search operation


72 P5 INTERNAL Error - Negative count field in request
73 P5 INTERNAL Error - Negative length, width, or offset

74 P5 INTERNAL Error - Operation not supported on portal

78 P5 INTERNAL Error - Operation requires an associated record


82 P5 INTERNAL Error - Plex is not compact
86 P5 INTERNAL Error - Invalid record type for operation

95 P5 INTERNAL Error - Too many plexes

101 P5 INTERNAL Error - Operation overflows maximum offsets

102 P5 INTERNAL Error - Log subdisk is too small for volume

103 P5 INTERNAL Error - Log length too small for logging type

104 P5 INTERNAL Error - Log length too large for logging type
106 P5 INTERNAL Error - Field value is out of range
111 P5 The specified disk is not ready or usable
112 P5 INTERNAL Error - Volume is unusable
113 P5 The specified disk is not ready or usable
114 P5 INTERNAL Error - Device node not block special

116 P5 INTERNAL Error - Disk access type not recognized


118 P5 INTERNAL Error - Record store size is too large

124 P5 INTERNAL Error - Configuration too large for dynamic disk group log

125 P5 INTERNAL Error - Disk log too small for dynamic disk group configuration
127 P5 INTERNAL Error - Error in configuration record

128 P5 INTERNAL Error - Configuration too large for configuration copies

129 P5 INTERNAL Error - Dynamic disk group creation not complete

130 P5 INTERNAL Error - No valid log copies in dynamic disk group


136 P5 INTERNAL Error - Database file not found
144 P5 INTERNAL Error - Association not resolved
145 P5 INTERNAL Error - Association count is incorrect

146 P5 INTERNAL Error - Too many minor numbers for volume or plex
147 P5 INTERNAL Error - Invalid block number
148 P5 INTERNAL Error - Invalid magic number
149 P5 INTERNAL Error - Invalid block number

151 P5 INTERNAL Error - Duplicate record in configuration

152 P5 INTERNAL Error - Configuration records are inconsistent

153 P5 INTERNAL Error - No dynamic disk group record in configuration

154 P5 INTERNAL Error - Temp and perm configurations do not match

155 P5 INTERNAL Error - Configuration changed during recovery

156 P5 INTERNAL Error - Rootdg dynamic disk group has no configuration copies

157 P5 INTERNAL Error - Rootdg dynamic disk group has no log copies

158 P5 INTERNAL Error - Expected record not found in kernel

159 P5 INTERNAL Error - Record in kernel not in configuration



160 P5 INTERNAL Error - Configuration record does not match kernel

161 P5 INTERNAL Error - Kernel and on-disk configurations do not match
162 P5 INTERNAL Error - Disk public region is too small
163 P5 INTERNAL Error - Disk private region is too small
164 P5 INTERNAL Error - Disk private region is full

165 P5 INTERNAL Error - Format error in disk private region

169 P5 INTERNAL Error - Disk private region version not supported

170 P5 INTERNAL Error - Disks for dynamic disk group are inconsistent

171 P5 INTERNAL Error - Attribute cannot be changed with a reinit

173 P5 INTERNAL Error - Disk header indicates aliased partitions


176 P5 INTERNAL Error - Duplicate device

177 P5 INTERNAL Error - Disk VTOC does not list public partition

178 P5 INTERNAL Error - Disk VTOC does not list private partition

179 P5 INTERNAL Error - Disk VTOC has duplicate public partition

180 P5 INTERNAL Error - Disk VTOC has duplicate private partition

181 P5 Disk sector size is not supported by Volume Manager

182 P5 INTERNAL Error - Dynamic disk group has no valid configuration copies

184 P5 INTERNAL Error - Disk for dynamic disk group in other dynamic disk group

185 P5 INTERNAL Error - Stripe column number too large for plex

186 P5 INTERNAL Error - Volume does not have a RAID read policy
187 P5 INTERNAL Error - Volume has a RAID read policy

189 P5 INTERNAL Error - Volume has the storage attribute

190 P5 INTERNAL Error - License has expired or is not available for operation

191 P5 INTERNAL Error - Could not create license file or directory

192 P5 INTERNAL Error - Volume does not have the storage attribute
203 P5 INTERNAL Error - Overlapping partitions detected

204 P5 INTERNAL Error - Record is subsumed by a multipath disk


209 P5 Operation failed due to NT system error
213 P5 This operation requires a non-failed volume

224 P5 INTERNAL Error - Subdisk is not detached but needs to be


227 P5 Partition was not found on the disk
228 P5 Could not lock partition

229 P5 Could not dismount partition


230 P5 Could not determine partition type
231 P5 Could not associate a drive letter
232 P5 Query of an ft object failed
233 P5 There is no ft object on the disk
234 P5 No space for private region on disk
239 P5 Cannot boot from an FT volume

240 P5 Disk(s) cannot be upgraded as there are too many partitions.

243 P5 The disk configuration appears to be changed by another system. Use Merge Foreign Disk to correct the problem.
244 P5 Disk upgrade aborted

264 P5 BOOT.INI should be modified because the partition number of a bootable volume might have been changed.
306 P5 Snap record not found.

323 P5 Operation failed. Diskgroup requires recovery. Please recover the diskgroup and retry this operation.
500 P5 Could not open a handle to vxload.sys
501 P5 Call to start vxio failed.
502 P5 Could not load vxconfig.dll.
503 P5 Could not get a proc address in vmconfig.
504 P5 DmConfig is not loaded.
505 P5 Vxio driver not started.
507 P5 Unable to read disk layout information.
509 P5 Disk partition not found.

510 P5 Unexpected Volume Manager database inconsistency.

526 P5 The cluster dynamic disk group online operation failed.

527 P5 The cluster dynamic disk group offline operation failed.

528 P5 Unable to start or modify cluster dynamic disk group SCSI reservation thread.

8015 P5 S.M.A.R.T. monitoring predicts an impending device failure.


8017 P5 S.M.A.R.T. predicts failure on a device
8018 P5 disk class
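The codes above (all priority P5) can be turned into readable text when they appear in alerts or event-log entries. The following is a minimal illustrative sketch, not part of any Veritas tooling: the code/description pairs are copied from a few rows of the table above, and the helper function is a hypothetical convenience for operators.

```python
# Small lookup table for a subset of the Veritas Volume Manager (P5)
# error codes listed above. Illustrative sketch only; extend the dict
# with further rows from the table as needed.
VXVM_ERRORS = {
    50: "INTERNAL Error - Expected registry entry not found",
    111: "The specified disk is not ready or usable",
    190: "INTERNAL Error - License has expired or is not available for operation",
    209: "Operation failed due to NT system error",
    502: "Could not load vxconfig.dll.",
    8015: "S.M.A.R.T. monitoring predicts an impending device failure.",
}

def describe(code: int) -> str:
    """Return the known description for a code, or a fallback string."""
    return VXVM_ERRORS.get(code, "Unknown Volume Manager error code %d" % code)

print(describe(111))
print(describe(9999))
```

Running the sketch prints the table description for code 111 and the fallback text for an unlisted code.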


16. Appendix B – Veritas Volume Manager Support Information

The phone number to call is: +32 2 713 15 14

Please specify the severity you want to assign to the case:


• Severity 1: a "system down" or product-inoperative condition that impacts your production or business-critical operations
• Severity 2: severely affects or restricts major functionality
• Severity 3: an issue with no major effect on business systems
• Severity 4: a minor condition or documentation error

The support contract information is as follows:
