
INTERNAL USE ONLY

SUBJECT: Windows Server 2003 Cluster Installation & DR Instructions
DEPARTMENT: Windows Service Team
AMENDMENT NO: 001
ISSUE DATE: 04/03/06
PREVIOUS DATE ISSUED:

Objective
This document provides detailed instructions for installing a Microsoft Windows Server 2003
two-node cluster attached to shared disk. It also explains how to manually fail over a node and
how to recover from a failed node or Quorum. The information in this document applies to:
• Microsoft Clustered Server
• Windows Server 2003 Enterprise Edition
• TSM V5.3.0.3

Prerequisites
An understanding of the Auto Install process and EMC/shared disk is required, along with
knowledge of Virtual Center if you are creating a cluster on virtual hardware.

Overview
This document will take you through the steps needed to install and configure a Windows
Server 2003 cluster on physical or virtual hardware. Also included are instructions on how to
create a cluster file share and how to manually fail over an active node to a non-active node
in order to complete maintenance tasks such as applying patches or installing new software
or hardware. In the event of a single node, multiple node, or Quorum drive (cluster database)
crash, this document also provides instructions to restore those components to a server with
identical hardware. In the event of a Quorum failure, a short cluster outage will occur during
the restore. You will need administrator rights to perform all of the above.

Responsibilities
Windows Service Team

Table of Contents

Before You Start
Checklist for Cluster Server Installation
    Cluster User Account
    Server Hardware Requirements
        Single Site
        Multi-Site
        RFID
    Network Requirements
        Single Site
        Multi-Site
        RFID
    EMC / Shared Disk Requirements
        Single Site
        Multi-Site
        RFID
Hardware Setup for VMware Clusters
    Create & Install Virtual Servers
    Add Components to New Servers
        Add NIC #2
        Add Shared Disk
Installation Overview
Configuring the Heartbeat Network Adapter
Setting up Shared Disks
    Configuring Shared Disks
    Assigning Drive Letters
Configuring the First Node
Configuring the Second Node
Configuring the Cluster Groups and Verifying Installation
    Test failover to verify that the cluster is working properly
    Verify cluster network communications
Configuring a Cluster File Share
Failing Over an Active Cluster Node
Windows Server 2003 Cluster Restore Instructions
    Before You Begin
Single Node Restore Instructions
    Restoring the Server from Backup
        Windows Service Team
        Storage Management Team
        Windows Service Team
Multiple Node Restore Instructions
    Restoring the Server from Backup
        Windows Service Team
        Storage Management Team
        Windows Service Team
Quorum Drive Restore Instructions
    Restoring the Cluster Database (Quorum Drive)
        Windows Service Team
Amendment Listing

Before You Start


Before you can begin installing and configuring your Windows Server 2003 cluster, you will
need to have two Windows 2003 Enterprise Edition servers built and connected to the
network. The servers must have teamed NICs so that a single network failure will not cause
the cluster to fail over. You will also need HBAs installed in each server, and you will need to
request EMC disk to be configured. The EMC disk needs to be presented to both servers.
Besides the EMC disk needed for the application running on the cluster, you will need an
additional “Quorum” EMC disk configured. This disk needs to be a minimum of 500MB. The
Quorum disk holds the cluster configuration files and should not be used to store anything
else.

Besides the NetBIOS name and IP address for each of the servers (nodes) in the cluster, the
cluster itself needs a NetBIOS name and IP address. To make the installation less confusing,
rename the network connection to be used for the heartbeat network (the non-teamed NIC)
on both servers to Heartbeat.
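If you prefer to rename the connection from a command prompt instead of through Network Connections, the following netsh sketch is one way to do it. The existing connection name "Local Area Connection 2" is only an example and will differ on your servers, so confirm it first with the show command.

    rem List the current connection names, then rename the non-teamed NIC to Heartbeat.
    netsh interface show interface
    netsh interface set interface name="Local Area Connection 2" newname="Heartbeat"

Run the same rename on both nodes so the connection is called Heartbeat everywhere.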

Also, a domain user account is needed for the cluster. You need to request this account
from Computer Security by submitting the Global – Non-User ID Request form in Outlook.
Make sure that this account has logon rights to only the two servers in the cluster and that
the account password does not expire. Please note that this account cannot be requested
until the two servers are actually built. Also, the password must be a secure one (minimum
of 14 characters, etc.) and it needs to be changed every 180 days. This account is to be used
for the cluster service only; no application should use this ID. The owner should be
listed as Bryan Miller (B27021S).

Note: Multi-site clusters with an odd cluster number (ex. USTCCA001) should have their
“A” node at TCC. Multi-site clusters with an even cluster number (ex. USTCCA002) should
have their “A” node at TCC-West.

Checklist for Cluster Server Installation


Cluster User Account
• Request a domain user account from Computer Security after the two cluster servers
are built.
• The account should only have logon rights to the two servers in the cluster.
• The password for this account should never expire.
• The password MUST be a secure one (minimum of 14 characters etc.) and must be
changed every 180 days.
• The account should be used for the cluster service ONLY; no applications on the
cluster should use the account.
• Bryan Miller (B27021S) should be the owner.
• It MUST be entered into the Windows Service team firefight account list.
• After you receive the account from Computer Security please verify that the account
is configured as stated above.
• NOTE: A cluster user account is required for any/all of the three scenarios listed
below (single site, multi-site, RFID).

Server Hardware Requirements


Single Site
• 2 Windows 2003 Enterprise Edition servers
• 3 Ethernet ports per server (2 for a teamed NIC and 1 for the heartbeat network)
• 2 single-port Emulex HBAs (or one dual-port HBA) per server

Multi-Site
• 2 Windows 2003 Enterprise Edition servers
• 3 Ethernet ports per server (2 for a teamed NIC and 1 for the heartbeat network)
• 2 single-port Emulex HBAs (or one dual-port HBA) per server


RFID
• 2 Proliant DL380 G4 servers with 4 - 3.4GHZ CPUs and 3.5GB RAM, installed with
Windows 2003 Server Enterprise Edition
• HP NC7771 NIC installed in slot 1
• 642 Smart Array controllers installed in slots 2 and 3 (part of the HP StorageWorks
Modular Smart Array 500 62 High Availability Kit)

Network Requirements
Single Site
• 1 Cluster IP and NetBios name
• 1 IP and NetBios name per server (For the teamed NIC)
• 1 IP per server for the heartbeat network. For clusters running at TCC, request the
heartbeat IP from Network Operations; servers not running at TCC use
192.168.1.249 for node A and 192.168.1.250 for node B.
• 1 Crossover patch cord (Servers at TCC don’t need this)
• 2 Network ports on the production network per server

Multi-Site
• 1 Cluster IP and NetBIOS name. Request an IP address from Network Operations
using the IP request form and select “TCC/TCC West Spanned Vlan – 165.28.94” or
“TCC/TCC West Spanned Vlan – 165.28.111” from the “Site” drop-down list.
• 1 IP and NetBIOS name per server (for the teamed NIC). Request an IP address
from NetOps using the IP request form and select “TCC/TCC West Spanned Vlan –
165.28.94” or “TCC/TCC West Spanned Vlan – 165.28.111” from the “Site” drop-
down list.
• Note: All three IPs for the cluster must be on the same VLAN.
• 1 IP per server for the heartbeat network. Request a heartbeat IP from Network
Operations using the IP request form and select “USTC- UNIX Heartbeat
172.12.123” from the “Site” drop-down list.
• 2 Network ports on the production network per server.

RFID
• 1 Cluster IP and NetBIOS name
• 1 IP and NetBIOS name per server (For the teamed NIC)
• 1 IP per server for the heartbeat network (use 192.168.1.249 for node A and
192.168.1.250 for node B)
• 1 Crossover patch cord
• 2 Network ports on the production network

EMC / Shared Disk requirements


Single Site
• 1 510MB Quorum disk minimum (must be a minimum of 500MB after formatting)
• Whatever disk configuration you need for the application you are running.
• The above disk needs to be seen by both servers.


Multi-Site
• 1 510MB Quorum disk minimum (must be a minimum of 500MB after formatting)
• Whatever disk configuration you need for the application you are running.
• The above disk needs to be seen by both servers.
• Note: When requesting disk inform the DSM team that this cluster is split
between TCC and TCC West.

RFID
• 1 HP StorageWorks Modular Smart Array 500 G2
• 1 HP StorageWorks Modular Smart Array 500 62 High Availability Kit
• 6 HP 72GB 10K Ultra320 SCSI HDD
• Configure 2 logical disks
o 1 510MB for the Quorum and the remainder of the disk as R: for SQL data.

Hardware Setup for VMware Clusters


If you don’t have the physical hardware to create a cluster, you can use VMware to create a
cluster using virtual servers and virtual shared disks. However, when doing this, there are
several steps to complete in order to properly replicate a true hardware cluster.

Create & Install Virtual Servers


1. Open Virtual Center (Start/All Programs/VMware/VMware Virtual Center). If
Virtual Center is not installed on your desktop, please install it from
\\usnca001\share\ESMSoftware\VMWare\Virtual Center\1.3.0-16701
2. Right click on the VM host where the guest servers will be created. Select
New Virtual Machine.

NOTE: Both servers to be used in the cluster should be located on the same
ESX host.


3. Click Next at the window that appears.

4. Select Typical from the two options and then click Next.

5. Select the VM group where the new server will be located and click Next.

6. Select Microsoft Windows as the OS and Windows Server 2003


Enterprise Edition as the version.


7. Enter the name of the new server (must be lower case) and choose the
datastore location for the c:\ drive. DO NOT choose vol_local. Click Next.

8. Choose vlan_04 as the NIC and vlance as the adapter type. Be sure to
check connect at power on as well. Click Next.


9. Select the disk size for the c:\ drive of the new server. Click Next.

10. Click Finish at the next window. The new virtual server should show up in
the correct VM group in about 10 seconds. Repeat this task for the second
server (node) which will be included in the cluster.
After both servers are created in Virtual Center, continue with the install of the
new servers using script builder and the auto-install process. Choose to
install just the c:\ drive from script builder and be sure to specify that it is an
ESX guest.
11. Once the new servers are built power them both off using Virtual Center.

NOTE: Both servers to be used in the cluster should be located on the same
ESX host.

Add Components to New Servers


Add NIC #2
12. After both servers are off, right-click the first server (node A) and click
Properties.


13. Click Add at the window below. Click Next when the next window appears.

14. Highlight Ethernet Adapter and then click Next.

15. Choose heartbeat 1 for the NIC and be sure connect at power on is
checked. Click Finish.


16. Verify that Adapter Type for both NIC1 & NIC2 is set to vmxnet. Click OK.

17. Repeat steps 12-16 for Node B.

Add Shared Disk


18. Click Add again. Click Next when the next window appears.
19. This time, choose Hard Disk from the list and then click Next.
20. Select Create New Virtual Disk, then click Next.

21. Select the size and location of the disk you are creating, then click Next. If at all
possible, this disk should be located on the same datastore as the first disk you
created when building the server. Also, this disk should be the quorum disk, so it
needs to be at least 500MB after formatting.


22. Set this new disk to SCSI 1:0, then click Finish.

23. Repeat steps 18-22 for any other shared disks needed. Be sure to attach
each one to the next available port on SCSI Controller 1, not 0 (ex. SCSI 1:1).
24. Once all shared disks are created, click OK to exit the virtual machine
properties.
25. Re-open the virtual machine properties again and verify that the SCSI Bus
Sharing for SCSI Controller 1 is set to Virtual and that all shared disks were
created successfully. Click OK.

26. For Node B, repeat steps 18 & 19. Then select Use An Existing Virtual
Disk and click Next.


27. Select the correct datastore where you created the shared disk on Node A,
then click Browse.

28. Choose the first shared disk file that was created with Node A and select
Open. You may need to review the properties for Node A to be sure you are
selecting the same disk file. Click Next at the next window that appears.

29. Be sure to add this new shared disk to the same SCSI controller and port that
it is using on Node A.


30. Repeat this process for the remaining shared disks that were created on
Node A. In the end, you want both nodes to have the exact same
configuration. See below for an example of two servers using the same
shared drives on the same host.

Once you have finished setting up both nodes you may power on Node A, but only Node A at
this time. Follow the steps below to set up the heartbeat NIC, the shared disks, and then the
first node in the cluster. Once you have completed all those steps, you may power on Node B
and continue with adding it to the cluster.

Installation Overview
During the installation process, some nodes will be shut down and some nodes will be
rebooted. These steps are necessary to guarantee that the data on disks that are attached
to the shared storage bus is not lost or corrupted. This can happen when multiple nodes try
to simultaneously write to the same disk that is not yet protected by the cluster software.
Use Table 1 below to determine which nodes and storage devices should be powered on
during each step.

Table 1. Power Sequencing for Cluster Installation


Step                           Node A   Node B   Storage   Comments
Setting Up Networks            On       On       Off       Disconnect Fiber from the HBAs on both nodes.
Setting up Shared Disks        On       Off      On        Shutdown both nodes. Connect Fiber to the HBAs on both nodes, then power on node A.
Verifying Disk Configuration   Off      On       On        Shut down node A, power on node B.
Configuring the First Node     On       Off      On        Shutdown both nodes; power on node A.
Configuring the Second Node    On       On       On        Power on node B after node A was successfully configured.
Post-installation              On       On       On        At this point all nodes should be on.

Configuring the Heartbeat Network Adapter


To avoid possible data corruption, and as a Microsoft best practice, make sure at this point
that the fiber to the Emulex HBA is disconnected on both servers. Perform the following
steps on both nodes of the cluster. At this point the crossover patch cord should be
connected.

1. Right-click My Network Places and then click Properties.


2. Locate the connection named Heartbeat.
3. Right-click Heartbeat, click Properties, and then click Configure.
4. Click Advanced.
5. Set Speed & Duplex to 100Mb Full (Gigabit Full at TCC), then click OK.
6. Click Transmission Control Protocol/Internet Protocol (TCP/IP).
7. Click Properties.
8. Click the Use the following IP address radio button and type in the address
supplied by Network Operations. If the cluster is not running at TCC, use
192.168.1.249 for node A and 192.168.1.250 for node B.
9. Type in a subnet mask of 255.255.255.0. (Leave the default gateway blank.)
10. Click the Advanced button and select the WINS tab. Select Disable
NetBIOS over TCP/IP. Click Yes at the “This connection has an empty primary WINS
address” prompt. Click OK, then OK again to exit. Perform these steps for the
heartbeat network adapter only.
11. Repeat steps 1-10 on node B

The window should now look like this.
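As an alternative to steps 6 through 9, the heartbeat address can also be assigned from a command prompt. This is a minimal sketch that assumes the connection has already been renamed to Heartbeat and uses the non-TCC default addresses from step 8; disabling NetBIOS over TCP/IP (step 10) still has to be done through the adapter's WINS tab.

    rem Node A - assign the static heartbeat address with no default gateway.
    netsh interface ip set address name="Heartbeat" static 192.168.1.249 255.255.255.0

    rem Node B - same command with the node B address.
    netsh interface ip set address name="Heartbeat" static 192.168.1.250 255.255.255.0

    rem Verify the result on each node.
    netsh interface ip show config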


Setting up Shared Disks


Power off node B. Make sure that the Fiber cable from the SAN is connected to the HBA on
node A.

Configuring Shared Disks

1. Right click My Computer, click Manage, and click Storage.


2. Double-click Disk Management. (Cancel Write Signature and Upgrade Disk
wizard)
3. Verify that all shared disks are formatted as NTFS and are designated as Basic. If
you connect a new drive, the Write Signature and Upgrade Disk Wizard starts
automatically. If this happens, click Next to go through the wizard. The wizard sets
the disk to dynamic. To reset the disk to Basic, right-click Disk # (where # specifies
the disk you are working with) and click Revert to Basic Disk.

NOTE: For SAN-connected devices the Storage team must write the disk signatures
and use the Diskpar command to set the proper partition starting offset (sector alignment).

4. Right-click unallocated disk space.


5. Click Create Partition…
6. The Create Partition Wizard begins. Click Next twice.
7. Enter the desired partition size in MB and click Next.
8. Accept the default drive letter assignment by clicking Next.
9. Click Next to format and create partition.

Assigning Drive Letters

After the disks and partitions have been configured, drive letters must be assigned to each
partition on each clustered disk. The Quorum disk (this disk holds the cluster information and
will usually be 1GB or less in size) should be assigned drive letter Q:.

1. Right-click the partition and select Change Drive Letter and Path.
2. Click Change and select a new drive letter, then click OK and Yes.


3. Right-click the partition and select Properties. Assign the disk label using the cluster
name (example: XXTCCA004-D), then click OK.
4. Repeat steps 1 through 3 for each shared disk.

5. When finished, the Computer Management window should look like the figure above.
Close the Computer Management window.
6. Reboot the server.
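If you would rather script the drive letter and label assignment than use the Disk Management GUI, a small diskpart script plus the label command achieves the same result. This is a sketch only; the volume number below is an assumption, so run list volume inside diskpart first and substitute the number of your Quorum partition, and adjust the label to your own cluster name.

Contents of an example script file, assignq.txt:

    select volume 1
    assign letter=Q

Then from a command prompt:

    rem Apply the diskpart script, then set the disk label using the cluster name (see step 3).
    diskpart /s assignq.txt
    label Q: XXTCCA004-Q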

Configuring the First Node


Note: During the configuration of the Cluster service on node A, node B must be powered off.

1. Logon to node A with your admin account


2. Click Start, All Programs, Administrative Tools, Cluster Administrator

3. At the “Open Connection to Cluster” screen select “Create New Cluster” from the
drop-down menu. Then click OK.


4. At the “Welcome to The New Server Cluster Wizard” screen click Next.
5. At the “Cluster Name and Domain” screen select the domain the cluster will be in
from the drop down menu and enter the cluster name in the Cluster name field.
Then click Next.

6. At the “Select Computer” screen verify that the Computer name is that of node A.
Then click Next.


7. The “Analyzing Configuration” screen will appear and verify the server
configuration. If the setup is acceptable, the Next button will be available; click Next
to continue. If not, check the log and troubleshoot.


8. Enter the IP address for the Cluster then click Next.

9. At the “Cluster Service Account” screen enter the cluster server account name and
password and verify that the domain is correct then click Next.


10. At the “Proposed Cluster Configuration” screen verify that all is correct. Take a
moment to click on the Quorum button and also verify that the Q:\ drive is selected
to be used as the Quorum disk. Click Next once configuration is verified.

11. At the “Creating Cluster” screen check for errors; if there are none, click Next.

12. At the “Completing the New Server Cluster Wizard” screen click Finish.


Configuring the Second Node


1. Power up node B and wait for it to fully load Windows.
2. From node A click Start, All Programs, Administrative Tools, Cluster
Administrator
3. At the “Cluster Administrator” right click on the cluster then go to New then click
Node.

4. At the “Welcome to the Add Nodes Wizard” screen click Next.


5. At the “Select Computers” screen enter the server name for node B then click Add,
then Next.


6. The “Analyzing Configuration” screen will appear and verify the server
configuration. If the setup is acceptable, the Next button will be available; click Next
to continue. If not, check the log and troubleshoot.


7. At the “Cluster Service Account” screen enter the cluster user account password
then click Next.

8. At the “Proposed Cluster Configuration” screen, verify the configuration and if it is


correct click Next.


9. At the “Adding Node to Cluster” screen click Next.

10. At the “Completing the Add Nodes Wizard” screen click Finish.

Configuring the Cluster Groups and Verifying Installation


The cluster groups need to be renamed. Rename the default group Cluster Group to
<Cluster Name> (if the name of the cluster is USTCCA001, the group Cluster Group is
renamed to USTCCA001). Rename each Disk Group # to the Cluster Name with the
application being installed on the cluster appended; for example, SQL# becomes
USTCCA001SQL1 and SAP# becomes USTCCA001SAP1. Naming convention
standards can be found at http://esm.kcc.com/serverfacts.aspx

1. Click Start, click Administrative Tools, and click Cluster Administrator.


2. Right-click the Cluster Group, click Rename, and enter the name <Cluster
Name>.

3. Right-click the Disk Group #, click Rename, and enter the Cluster Name with the
application you are installing on the cluster appended (e.g., USTCCA001SQL1).
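The same renames can be done with the cluster.exe command-line utility if you prefer. A minimal sketch, assuming the cluster is named USTCCA001, the disk group carries the default name "Disk Group 1", and the application is SQL instance 1 (all example names):

    rem Rename the default group to the cluster name.
    cluster group "Cluster Group" /rename:USTCCA001

    rem Rename the disk group to <cluster name><application>.
    cluster group "Disk Group 1" /rename:USTCCA001SQL1

    rem Confirm the new group names and their current owners.
    cluster group /status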

Test failover to verify that the cluster is working properly

1. To test failover, right-click the Cluster Group and select Move Group. The
group and all its resources will be moved to node B. After a short period of time
the resources will be brought online on node B. If you watch the screen, you
will see this shift.
2. Move the Cluster Group back to node A by repeating step 1.

Verify cluster network communications

1. You’ll need to verify the cluster network communications settings as a final step in
the cluster configuration. Click Start, then Administrative Tools, then Cluster
Administrator.
2. Right click on the cluster name, then choose Properties.


3. Click on the Network Priority tab from the window that appears.

4. Verify that the Heartbeat network is listed first and the Team network is listed
second.
5. Select the Heartbeat network from the list and click on Properties. Verify that
the Heartbeat network is set to Internal Cluster Communications Only (private
network) and then click OK.

6. Select the Team network from the list and click on Properties. Verify that the
Team network is set to All Communications (mixed network) and then click OK.

7. Click OK to exit the Network Priority screen to return to the Cluster
Administrator GUI.
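If you want to double-check the same settings from a command prompt, cluster.exe can list the networks and their roles. A sketch, assuming the cluster networks are named Heartbeat and Team as above; in the Role property of the output, a value of 1 corresponds to internal cluster communications only and 3 to all communications.

    rem List the cluster networks and their state.
    cluster network /status

    rem Show the properties (including Role) of each network.
    cluster network "Heartbeat" /prop
    cluster network "Team" /prop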

Congratulations, you have successfully installed and configured a Microsoft Server
Cluster. Don’t forget to add the Cluster account to the ESM firefight database.
Additionally, please update Asset Management with all the nodes, including the cluster
name itself (ex. ustcca008, ustcca008a, ustcca008b).

Firefight – http://ws.kcc.com/AccountMgmt/

Asset Mgmt - http://ws.kcc.com/Asset/

Example of an Asset Mgmt entry with the cluster name and two nodes listed separately.

When entering the cluster name into Asset Mgmt the following fields should be filled in as
follows:

• Serial # – Name of the cluster node (ex. USTCCA008)
• Description – Something that designates that this is the cluster node and not a
physical server (ex. Cluster node for servers USTCCA008A and USTCCA008B)
• Asset Tag – Name of cluster node (ex. USTCCA008)
• Build Type – Cluster Node Name
• Image Version – Cluster Node Name
• Hardware Description – _Cluster Name


Configuring a Cluster File Share


Summary

To create a cluster file share you can use either the Cluster Administrator GUI or the Cluster
command-line utility (a command-line example appears at the end of this section). The
following steps will take you through creating a file share resource using the Cluster
Administrator GUI.

Configuring a Cluster File Share

1. Logon to one of the cluster nodes with your administrator account


2. Start the Cluster Administrator GUI: click Start\Administrative Tools\Cluster
Administrator. Note: if this is the first time you are running this utility, you will be
prompted to enter the cluster name. Enter the cluster name and click Open.
3. Under Groups, right-click the cluster group that contains the drive you need to
create the share on, then go to New\Resource.

4. The New Resource window will pop up. Fill in the following fields:
Name: enter the name of the share
Description: enter the path to the folder you are sharing
Resource Type: select File Share from the drop-down list (note: the folder to
be shared must already exist; if it does not, create it now)
Click Next.

5. In the Possible Owners window make sure that all nodes of the cluster are listed in
the possible owners box, then click Next.

6. In the Dependencies window, highlight the disk resource that the folder is on,
then click Add, then Next.


7. In the File Share Parameters window fill in the following fields:

Share name: enter the name of the share

Path: enter the path to the folder you are sharing

Then click Finish and then OK.

8. The newly created share will be offline and you will need to bring it online. Right-click
the share you just created, then click Bring Online.


9. To set security on the share, right-click the share resource, select Properties,
click the Parameters tab, click the Permissions button, and add the security
rights you need.
10. You can verify that the new share is online by accessing it from your desktop:
click Start/Run and type \\<cluster name>\<share name>
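As noted in the summary, the same file share resource can also be created with the Cluster command-line utility. A minimal sketch, assuming the group is USTCCA001SQL1, the disk resource is named "Disk R:", and the share is called Apps on R:\Apps; all of these names are examples, so substitute your own.

    rem Create the File Share resource in the group that owns the disk.
    cluster resource "Apps" /create /group:"USTCCA001SQL1" /type:"File Share"

    rem Set the share name and the path to the folder being shared.
    cluster resource "Apps" /priv ShareName="Apps" Path="R:\Apps"

    rem Make the share depend on the disk resource it lives on.
    cluster resource "Apps" /adddep:"Disk R:"

    rem Bring the new resource online and check its state.
    cluster resource "Apps" /online
    cluster resource "Apps" /status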

Failing Over an Active Cluster Node


The following instructions will explain how to manually failover a cluster node using the
Cluster Administrator GUI.

1. Logon to one of the cluster nodes with your administrator account.

2. Start the Cluster Administrator GUI: click Start\All Programs\Administrative
Tools\Cluster Administrator. Note: if this is the first time you are running this utility,
you will be prompted to enter the cluster name. Enter the cluster name and click OK.

3. Under Groups click on the cluster group name (XXTCCA004 in the screen shot
example below). The right pane will display information about all of the resources in
the group. The Owner column will list the node that is currently running the cluster.
Make note of the owner as well as the state of the resources. In most cases, all of the
resources will be online and should be online again after you fail over the cluster.

4. To fail over the active node, under Groups right-click XXTCCA004, then click
Move Group. If more than two nodes exist in the cluster, you can choose a
specific node or choose Best Possible. If Best Possible is chosen, the group will
move to one of the available nodes based on the preferred ownership designation
for that specific resource group.


5. You will notice that the state of the resources will go to offline pending, then to offline,
then to online pending, and finally to online. The owner will also switch at this time,
and this whole process will take about 10 to 15 seconds to complete. After it is
completed, all the resources that were online before will be online again and the owner
will have changed.

Repeat steps 3-5 for all resource groups listed under the Groups heading.
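The same failover can also be driven from a command prompt with cluster.exe, which is handy when scripting maintenance windows. A sketch, assuming a group named XXTCCA004 and a second node named XXTCCA004B (example names):

    rem Check the current owner and state of every group before moving anything.
    cluster group /status

    rem Move one group to a specific node (omit :nodename to let the cluster choose).
    cluster group "XXTCCA004" /moveto:XXTCCA004B

    rem Verify that the resources came back online on the new owner.
    cluster group "XXTCCA004" /status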


Windows Server 2003 Cluster Restore Instructions


Before You Begin
Before the process is started, you will need to gather some information about the server you
are restoring:
1. Server name and IP information (i.e. IP address, default gateway, subnet mask)
2. The service pack that was installed prior to the crash
3. Cluster Name
4. Cluster account and password
5. Blank 1.44MB floppy diskette
6. Compare hardware – Document any differences

Single Node Restore Instructions


The instructions below will detail how to restore a single node of a failed Windows Server
2003 cluster.

Restoring the Server from Backup


Windows Service Team

1. These instructions assume you are restoring the server to identical hardware.

2. Create an install script from the following location
(http://esm.kcc.com/ScriptBuilder/Win2000Server/Default.aspx?OS=WS03) using the
same server name and IP address. Select Disaster Recovery as the “Application type”
and select the same service pack version that the crashed server had installed. Make
sure you select the C: drive only.

3. Make sure that the fiber cable to the SAN is disconnected. For VMWare clusters, make
sure the shared disks are removed from the server properties in Virtual Center before
installing the DR build.

4. Install Windows 2003 using the Auto Install Floppy Diskette


http://esm.kcc.com/ScriptBuilder/AutoInstallDocs.aspx

5. Logon to the server with the following credentials: user name is administrator and
password is Admin1. These are case sensitive. Note: After the DR auto-install
completes, the server will probably auto-logon the first time with the administrator
account.

6. Make sure the Network Card is set to communicate at 100/Full (If the switch can handle
100/Full). In other words, make sure the Network card is configured the same as the
switch port. Do not leave it set to Auto/Auto.

7. Browse to c:\windows\system32\ and copy HAL.DLL, NTKRNLPA.EXE, and


NTOSKRNL.EXE to c:\temp

8. Perform a search for the three files mentioned above. Take note of the paths to all
instances found. Disregard those paths leading to *.cab files. You will have to copy
these files back to those locations later.


9. Install the TSM client on the server by mapping a drive to \\<distributionserver>\smspkgd$
and then running \N040000e\Script\DR.bat (see the example commands after this list).

10. Enable remote desktop connections on server if not already enabled (right click on
server name icon on desktop, Properties, Remote tab, Enable Remote Desktop on this
computer, OK)

11. Install the correct service pack that was on the server before it crashed, most likely SP1.
Reboot the server.

12. Log back into the server. Check the three files from step 7; if newer files exist after the
service pack is installed, copy them to c:\temp and overwrite the previously copied files.

13. Contact the Storage Management team to restore the required information.
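For step 9, a typical command sequence looks like the sketch below; <distributionserver> is the placeholder used above and the T: drive letter is only an example.

    rem Map a drive to the distribution share and run the TSM DR install script.
    net use T: \\<distributionserver>\smspkgd$
    T:\N040000e\Script\DR.bat

    rem Remove the mapping once the install has finished.
    net use T: /delete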

Storage Management Team

1. On the desktop, double-click the “TSM Backup Client” icon.

2. Click the Restore button.

3. Click on the “+” next to “File Level”. You will see:


\\<ServerName>\c$ (C:)

4. Select the folders needed to complete the restore.

5. Click the Options button.

6. From the “Action for Files That Already Exist” dropdown box, select “Replace”.
Click the check box next to “Replace file even if read-only/locked” (you may still be
prompted to make a decision; always select Replace).

7. Click OK.

8. Click Restore.

9. Verify the radio button for “Original location” is selected. Click Restore. If a message
asking to restart the machine pops up, always select No.

10. Click OK.

11. When prompted to reboot server, click NO.

12. Close the TSM Backup Client.
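The restore can also be run from the TSM command-line client (dsmc) instead of the GUI. This is a sketch only; it assumes you want everything backed up under C:\ put back to its original location and replaced without prompting, which matches the GUI options above, so adjust the file specification to the folders actually selected in step 4.

    rem Restore the backed-up C: drive contents to their original location,
    rem replacing files that already exist.
    dsmc restore "C:\*" -subdir=yes -replace=all

    rem Do not reboot yet; the system state restore in the next section still has to run.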

Windows Service Team

1) Now run NT Backup to restore the system state.


a) Browse to c:\sstate\ and double-click systemstate.bkf.
b) Ignore the wizard that pops up and instead choose Advanced Mode.
c) An NT backup window should appear. Go to the tools menu and select
options.


d) Click on the Restore tab and select Always replace the file on my
computer. Click OK.
e) Click on the Restore tab, highlight File, right-click, and choose Catalog
file…
f) Enter the path to the catalog file (c:\sstate\systemstate.bkf) and hit OK.
g) Click on the “+” sign next to file.
h) Click OK on Popup window if it appears.
i) Click on the “+” sign next to System State.bkf created… that has the most
recent date/time
j) Click OK on Popup window if it appears.
k) Make sure the path still points to c:\sstate\systemstate.bkf and click OK.
l) Check the box next to system state.
m) Make sure the dropdown menu restore files to: Original Location is
selected. Click Start Restore.
n) A warning message will appear stating “Restoring System State will always
overwrite current System State…”. Click OK.
o) A Confirm Restore message box will appear. Click OK.
p) Click OK on Popup window if it appears.
q) An Enter Backup File Name window may appear. Verify the path to be
c:\sstate\systemstate.bkf and click OK.
r) When the restore completes, click Close.
s) You’ll be prompted to reboot. Click on “NO”.
t) Close the NT Backup window.

2) Browse to c:\temp\ and copy HAL.DLL, NTKRNLPA.EXE, and NTOSKRNL.EXE back to


the locations you found them in earlier (should be c:\windows\system32)

3) Reboot the server.

4) On startup, press F8 and then select “Safe Mode with Networking” from the menu.

5) Logon with a KCUS “Administrative” account.

6) Continue to install any new hardware that is detected.

7) Open Device Manager. Under “Display Adapters” delete any/all adapters defined. Under
“System Devices”, if there is an “!” next to “Compaq … System Management Controller”,
delete the device. Close Device Manager and reboot.

8) Logon with a KCUS “Administrative” account.

9) Map T: to \\<distribution server>\SMSPKGD$

10) Reapply the service pack version of the crashed server, then restart.

11) Reboot and repeat steps 9 and 10.

12) Apply the Compaq Support pack. (N040037C\Shell37Cs.bat), Restart.

13) Logon with an “Administrative” account.

14) Verify that the server is functioning.


15) Reconnect the SAN fiber cable to the HBA. (Physical server only, not VMWare)

16) Reboot server

17) If the server is VMware, add the shared disks back to the server properties through Virtual
Center while the server is off. (This step is only for virtual servers.)

18) Logon with a KCUS “Administrative” account.

19) Verify that the cluster node is activated and functional. Items to check to be sure the
rebuild is successful:

• Verify local disk names


• Verify IP address settings
• Verify remote desktop connections
• Ping cluster name (virtual node)
• Schedule outage to test failing cluster groups between nodes (primary node to
secondary node and back again)

Multiple Node Restore Instructions


The instructions below will detail how to restore both nodes of a failed Windows Server 2003
cluster.

Restoring the Server from Backup


Windows Service Team

1. These instructions assume you are restoring the server to identical hardware.

2. Create an install script from the following location
(http://esm.kcc.com/ScriptBuilder/Win2000Server/Default.aspx?OS=WS03) using the
same server name and IP address. Select Disaster Recovery as the “Application type”
and select the same service pack version that the crashed server had installed. Make
sure you select the C: drive only.

3. Make sure that the fiber cable to the SAN is disconnected. For VMWare clusters,
make sure the shared disks are removed from the server properties in Virtual Center
before installing the DR build.

4. Install Windows 2003 using the Auto Install Floppy Diskette


http://esm.kcc.com/ScriptBuilder/AutoInstallDocs.aspx

5. Logon to the server with the following credentials: user name is administrator and
password is Admin1. These are case sensitive. Note: After the DR auto-install
completes, the server will probably auto-logon the first time with the administrator
account.

6. Make sure the Network Card is set to communicate at 100/Full (If the switch can
handle 100/Full). In other words, make sure the Network card is configured the same
as the switch port. Do not leave it set to Auto/Auto.


7. Browse to c:\windows\system32\ and copy HAL.DLL, NTKRNLPA.EXE, and


NTOSKRNL.EXE to c:\temp

8. Perform a search for the three files mentioned above. Take note of the paths to all
instances found. Disregard those paths leading to *.cab files. You will have to copy
these files back to those locations later.

9. Install the TSM client on the server by mapping a drive to
\\<distributionserver>\smspkgd$ and then running \N040000e\Script\DR.bat.

10. Enable remote desktop connections on server if not already enabled (right click on
server name icon on desktop, Properties, Remote tab, Enable Remote Desktop on
this computer, OK)

11. Install the correct service pack that was on the server before it crashed, most likely SP1.
Reboot the server.

12. Log back into the server. Check the three files from step 7; if newer files exist after
the service pack is installed, copy them to c:\temp and overwrite the previously copied
files.

13. Contact the Storage Management team to restore the required information.

Storage Management Team

1. On the desktop, double-click the “TSM Backup Client” icon.

2. Click the Restore button.

3. Click on the “+” next to “File Level”. You will see:


i. \\<ServerName>\c$ (C:)

4. Select the folders needed to complete the restore.

5. Click the Options button.

6. From the “Action for Files That Already Exist” dropdown box, select “Replace”.
Click the check box next to “Replace file even if read-only/locked” (you may
still be prompted to make a decision; always select Replace).

7. Click OK.

8. Click Restore.

9. Verify the radio button for “Original location” is selected. Click Restore. If a
message asking to restart the machine pops up, always select No.

10. Click OK.

11. When prompted to reboot server, click NO.

12. Close the TSM Backup Client.


Windows Service Team

1. Now run NT Backup to restore the system state.


a) Browse to c:\sstate\ and double-click systemstate.bkf.
b) Ignore the wizard that pops up and instead choose Advanced Mode.
c) An NT backup window should appear. Go to the tools menu and select
options.
d) Click on the Restore tab and select Always replace the file on my
computer. Click OK.
e) Click on the Restore tab, highlight File, right-click, and choose Catalog
file…
f) Enter the path to the catalog file (c:\sstate\systemstate.bkf) and hit OK.
g) Click on the “+” sign next to file.
h) Click OK on Popup window if it appears.
i) Click on the “+” sign next to System State.bkf created… that has the most
recent date/time
j) Click OK on Popup window if it appears.
k) Make sure the path still points to c:\sstate\systemstate.bkf and click OK.
l) Check the box next to system state.
m) Make sure the dropdown menu restore files to: Original Location is
selected. Click Start Restore.
n) A warning message will appear stating “Restoring System State will always
overwrite current System State…”. Click OK.
o) A Confirm Restore message box will appear. Click OK.
p) Click OK on Popup window if it appears.
q) An Enter Backup File Name window may appear. Verify the path to be
c:\sstate\systemstate.bkf and click OK.
r) When the restore completes, click Close.
s) You’ll be prompted to reboot. Click on “NO”.
t) Close the NT Backup window.

2. Browse to c:\temp\ and copy HAL.DLL, NTKRNLPA.EXE, and NTOSKRNL.EXE back to


the locations you found them in earlier (should be c:\windows\system32)

3. Reboot the server.

4. On startup, press F8 and then select “Safe Mode with Networking” from the menu.

5. Logon with a KCUS “Administrative” account.

6. Continue to install any new hardware that is detected.

7. Open Device Manager. Under “Display Adapters” delete any/all adapters defined. Under
“System Devices”, if there is an “!” next to “Compaq … System Management Controller”,
delete the device. Close Device Manager and reboot.

8. Logon with a KCUS “Administrative” account.

9. Map T: to \\<distribution server>\SMSPKGD$

10. Reapply the service pack version of the crashed server, then restart.

11. Reboot and repeat steps 9 and 10.



12. Apply the Compaq Support pack. (N040037C\Shell37Cs.bat), Restart.

13. Logon with an “Administrative” account.

14. Verify that the server is functioning.

15. Reconnect the SAN fiber cable to the HBA. (Physical server only, not VMWare)

16. Reboot server

17. If the server is VMware, add the shared disks back to the server properties through Virtual
Center while the server is off. (This step is only for virtual servers.)

18. Logon with a KCUS “Administrative” account.

19. Verify that the cluster node is activated and functional. Items to check to be sure the
rebuild is successful:

• Verify local disk names


• Verify IP address settings
• Verify remote desktop connections
• Ping cluster name (virtual node)
• Verify shared disks
• Launch cluster administrator to verify cluster and associated groups are operational

20) Restore the second node using the same process as the first node, following all steps above.

21) Verify that the second cluster node is active and functional.

22) Schedule outage to test cluster failover by moving cluster groups from primary to
secondary node and then back again.

Quorum Drive Restore Instructions


The instructions below will detail how to restore a failed/corrupted cluster database (Quorum
drive) for all nodes in a Windows Server 2003 cluster.

Restoring the Cluster Database (Quorum Drive)


Windows Service Team

1. Logon to the primary node and reconfigure the Quorum drive using Disk Manager
(right-click the computer name icon on the desktop, then Manage, then Disk Management).

2. Delete and recreate the Q:\ drive. Format it as NTFS. Log off the server.

3. Logon to the primary node and open Backup (Start, All Programs, Accessories,
System Tools, Backup)

4. Ignore the Backup/Restore wizard and click on “Advanced Mode” instead.

5. Click on the “Restore and Manage Media” tab.


6. Double-click on the SystemState.bkf file on the right hand pane with the most recent
date/time stamp.

7. Now put a check next to “System State” in the expanded view to the left.

8. Be sure “Restore Files to:” is set to Original Location

9. Click “Start Restore”, then “OK” at the pop-up window.

10. On the Confirm Restore window, click “Advanced”

11. Put a check next to “Restore the Cluster Registry to the quorum disk…” and then
click “OK”. Click “Yes” at the next pop-up window, then “OK” again.

12. If asked to verify the location of the SystemState.bkf file, please browse to c:\sstate
and locate the file, then the restore should begin. This process will stop the cluster
service on the primary node and will restore the cluster configuration for that node.

13. When the restore has completed, select YES to reboot the server. This will restart
the Cluster service on the primary node and will then stop the Cluster service on all
other nodes in the cluster. Backup will then copy the restored cluster database
information from the restarted primary node to the Quorum disk (Q:\) and all other
nodes in the cluster.

14. Once the primary node is restarted, logon and verify that the Cluster service is
running. Also, try to ping the cluster name to be sure it is running properly.

15. Now go to each of the other nodes and start the Cluster service manually. Verify that
the Cluster service is running after starting it on each node (see the example commands
after this list).

16. Schedule outage to test failing cluster groups between nodes (primary node to
secondary node and back again)
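For steps 14 and 15, the per-node checks can be done from a command prompt. A sketch, assuming the default Cluster service name (ClusSvc) and that the cluster answers to its NetBIOS name; <cluster name> is a placeholder.

    rem On each remaining node, start the Cluster service manually.
    net start clussvc

    rem Confirm the service is running and that every node shows as Up.
    sc query clussvc
    cluster node /status

    rem Make sure the cluster name still answers on the network.
    ping <cluster name>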


Amendment Listing

AMENDMENT NO.   DATE OF ISSUE   SUMMARY OF AMENDMENT   RECOMMENDED BY
001             04/03/06        First issue            Bryan Miller

This procedure has been prepared to comply with the requirements of Kimberly-Clark and
the Corporate Financial Instructions of Kimberly-Clark Corporation. It is important that no
deviation from the procedure occurs without prior reference to: Windows Services

