HP-UX Serviceguard Heartbeat Configuration Solutions

October 24, 2005

Executive Summary
Overview
What does this paper provide?
Interrupt Assignment Algorithm
Ensuring Heartbeat interfaces are optimally configured
Ensure that the heartbeat interface drivers are on different processors
If Your System has a Non-Link Aggregate Interface for Serviceguard
If Your System has a Link Aggregate Interface for Serviceguard
Managing Interrupts to Improve High Availability
Systems with more Processors than Heartbeat interfaces
Systems with more Heartbeat interfaces than Processors
Moving interrupts
References

Version 1.0
Executive Summary
The ability to correctly configure the heartbeat interfaces in a Serviceguard environment
helps to increase the High Availability (HA) “worthiness” of a customer solution. While some
aspects may be considered common sense, this paper reviews some of the more complicated
configurations and explains how to optimize them. Topics include:

• Overview
• Interrupt assignment algorithm
• Ensuring heartbeat interface drivers are optimally placed
  - Auto Port Aggregation (APA) configuration
  - Non-APA configuration
• Managing Interrupts to Improve High Availability

Overview
What does this paper provide?
I’ve just installed Serviceguard and set up my network heartbeat interfaces. I’ve followed the
instructions in the Serviceguard manual – what else do I need to know?

HP has identified additional rules and information to consider in assigning heartbeat network
cards.

The Serviceguard manual identifies key points to consider when installing the network heartbeat
cards. However, your configuration may have an additional requirement to consider. You may
need to read this document if you have:

• Multiple heartbeats in a Serviceguard environment

• High end systems with high I/O activity on high speed I/O interfaces

Under heavy I/O traffic, a Serviceguard configuration may encounter high CPU utilization
caused by the I/O cards (both network and storage), which may in turn cause the Serviceguard
cluster to fail over.

One additional consideration in installing heartbeat cards, especially under such circumstances,
is the assignment of the card/driver to a CPU. This is done automatically when the card is first
installed in the system. However, to optimize the configuration, you may need to review the
following data.

(If you have installed the heartbeat cards as part of a card On-Line Addition (OLA), you should
reboot your system. The interrupt assignments in your current environment may differ from
those made when the system is next rebooted.)

Note: Before you begin, you should have already:

• Installed the Serviceguard product
• Installed and set up network heartbeat cards as part of the Serviceguard instructions
• Installed the Interrupt migration tool (a separate product for 11iv1)

Interrupt Assignment Algorithm


Card interrupts are assigned to CPUs in a simple round-robin fashion, in the order in which
the cards are discovered. As the system queries the I/O system to look for cards and devices,
it assigns each interrupt to the next CPU in turn.

When cards are added into a system after boot (through card OLA) the next available CPU is
assigned. However, on the subsequent boot the CPU may be different because the discovery
order may be different than when the card was inserted.

Applications such as Processor Sets (PSets) will not affect the round robin order.

A system with more I/O cards than CPUs will overlap interrupts: multiple cards may be
assigned to a single CPU to handle the interrupts. This overlap may undermine customer
requirements for performance and High Availability with various devices.
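
The round-robin behavior described above can be illustrated with a small Python model. This is only a sketch of the idea, not the kernel's actual implementation: the function names, the card list, and the CPU numbering are assumptions chosen for illustration.

```python
from collections import defaultdict

def assign_round_robin(cards, cpu_ids):
    """Assign each card, in discovery order, to the next CPU in turn."""
    assignment = {}
    for index, card in enumerate(cards):
        assignment[card] = cpu_ids[index % len(cpu_ids)]
    return assignment

def overlaps(assignment):
    """Return {cpu: [cards]} for every CPU that services more than one card."""
    by_cpu = defaultdict(list)
    for card, cpu in assignment.items():
        by_cpu[cpu].append(card)
    return {cpu: cards for cpu, cards in by_cpu.items() if len(cards) > 1}

# Hypothetical system: six cards discovered in this order, but only four CPUs.
cards = ["0/0/0/0", "0/0/1/0", "0/0/2/0", "1/10/1", "1/12/2", "1/14/1"]
assignment = assign_round_robin(cards, cpu_ids=[0, 1, 2, 3])
shared = overlaps(assignment)
print(shared)
```

With more cards than CPUs the model wraps around, so the fifth and sixth cards land on CPUs that already service another card. Changing the discovery order changes which cards end up sharing a CPU, which is why an assignment made at card OLA time may not survive a reboot.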

In the case of the heartbeat NICs, HP recommends that the system administrator examine the
interrupt assignments to ensure that the primary and secondary NICs are configured correctly,
including in configurations with multiple primary and secondary interfaces.

The concepts detailed in this white paper are applicable to 11iv1 and 11iv2. The examples
themselves are derived from an 11iv1 system.
Consult the appropriate command man pages for your operating system for obtaining and
reading output.

3
Ensuring Heartbeat interfaces are optimally configured

Ensure that the heartbeat interface drivers are on different processors


Identify the interface(s) assigned to each of the primary and/or secondary heartbeat network
interface cards (NICs) for Serviceguard.

1. Enter the following command on the target system running HP Serviceguard:

/usr/sbin/cmgetconf

2. Look at the resulting output from /usr/sbin/cmgetconf. It should look something like this:

NETWORK_INTERFACE lan9
HEARTBEAT_IP 1.1.1.1
NETWORK_INTERFACE lan15
NETWORK_INTERFACE lan5
HEARTBEAT_IP 1.1.2.1
:
:
# Possible standby Network Interfaces for lan9:lan15.

3. In the example above, the network interfaces are lan9, lan15, and lan5. None of them has
an instance number of 900 or above, so this system does not have a Link Aggregate interface.
The lan9 and lan5 interfaces are being used for the primary heartbeats, and lan15 is the
standby for lan9.

Continue with the instructions in “If Your System has a Non-Link Aggregate Interface for
Serviceguard”.

4. Alternatively, the output from the /usr/sbin/cmgetconf command may look something like this:

NETWORK_INTERFACE lan903
HEARTBEAT_IP 1.1.1.1
NETWORK_INTERFACE lan905
NETWORK_INTERFACE lan900
HEARTBEAT_IP 1.1.1.1

In the example above, the primary network interfaces shown are lan903 and lan900
(and lan905 is the standby). If the numeric portion of the NETWORK_INTERFACE value
is 900 or above (900, 903 and 905 in this case), the system has a Link Aggregate
interface. Continue with the instructions in “If Your System has a Link Aggregate Interface
for Serviceguard”.
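
The classification in the steps above can be sketched in Python. The sketch assumes the cmgetconf output has been captured as text and that a NETWORK_INTERFACE entry followed by a HEARTBEAT_IP line carries a heartbeat while one without it is a standby, as in the examples; the function name and dictionary keys are illustrative only.

```python
import re

# Sample text copied from the link aggregate example above.
SAMPLE = """\
NETWORK_INTERFACE lan903
HEARTBEAT_IP 1.1.1.1
NETWORK_INTERFACE lan905
NETWORK_INTERFACE lan900
HEARTBEAT_IP 1.1.1.1
"""

INTERFACE_NAME = re.compile(r"lan(\d+)$")

def classify_interfaces(text):
    """Return one record per NETWORK_INTERFACE line, in file order."""
    interfaces = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        if fields[0] == "NETWORK_INTERFACE":
            match = INTERFACE_NAME.match(fields[1])
            if match is None:
                continue  # not a lanN name; ignored by this sketch
            interfaces.append({
                "name": fields[1],
                "heartbeat": False,
                # Instance numbers of 900 or above indicate a link aggregate.
                "link_aggregate": int(match.group(1)) >= 900,
            })
        elif fields[0] == "HEARTBEAT_IP" and interfaces:
            # A HEARTBEAT_IP line applies to the interface listed just before it.
            interfaces[-1]["heartbeat"] = True
    return interfaces

for entry in classify_interfaces(SAMPLE):
    print(entry)
```

If any record has link_aggregate set, the link aggregate section applies; otherwise the non-link aggregate section applies.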

If Your System has a Non-Link Aggregate Interface for Serviceguard

NOTE Be sure you have determined that your system has a NON-Link Aggregate
interface by completing the section “Ensure that the heartbeat interface drivers are on
different processors” before continuing here.

1. Determine the hardware path of the interfaces.

From the steps in the previous section, the Serviceguard information example showed

NETWORK_INTERFACE lan9
HEARTBEAT_IP 1.1.1.1
NETWORK_INTERFACE lan15
NETWORK_INTERFACE lan5
HEARTBEAT_IP 1.1.2.1
:
:
# Possible standby Network Interfaces for lan9:lan15.

a) Enter the following ioscan command to determine the hardware path number:
/usr/sbin/ioscan -kfC lan

Note: uppercase “C” in the above command

The output should look something like this:

Class  I   H/W Path  Driver  S/W State  H/W Type   Description
=======================================================================
lan    0   8/16/6    btlan   CLAIMED    INTERFACE  Built-in LAN
.
lan    5   1/10/1    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port
.
lan    9   1/12/2    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port
lan    15  1/14/1    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port

b) This output shows the LAN interfaces on your system. The “I” column indicates the
instance number. For the lan9 network interface (where “9” is the instance number), the
output shows a hardware path of 1/12/2.
c) Likewise, the output shows that the hardware path for lan15 is 1/14/1 and the hardware
path for lan5 is 1/10/1.

2. Identify the processors.


a) Obtain the hardware path value using the ioscan procedure shown in the previous
step.
b) Enter the following interrupt migration command: /usr/contrib/bin/intctl
c) The output should look something like this:

hw path   class    drv    card  cpu  cpu   intr  intr  card
                   name   cell  ID   cell  type  ID    description
======================================================================
0/0/0/0   lan      btlan  N/A   4    N/A   L     1     HP PCI 10/100Base-TX Core
0/0/1/0   ext_bus  c720   N/A   1    N/A   L     1     SCSI C895 Fast Wide LVD
0/0/2/0   ext_bus  c720   N/A   2    N/A   L     1     SCSI C87x Ultra Wide Single-Ended
.
1/10/1    lan      btlan  N/A   5    N/A   L     1     HP PCI 10/100Base-TX 4 port
.
1/12/2    lan      btlan  N/A   7    N/A   L     1     HP PCI 10/100Base-TX 4 port
1/14/1    lan      btlan  N/A   9    N/A   L     1     HP PCI 10/100Base-TX 4 port

d) In the output example above, look for the hardware path obtained in the previous
step. It will be listed under the column labeled “hw path”. For example, the
hardware path for lan9 was found to be 1/12/2, and the corresponding processor
is listed as 7 under the “cpu ID” column.
e) Likewise, the above output shows that the processor ID for lan15 (hardware path =
1/14/1) is 9, and the processor ID for lan5 (hardware path = 1/10/1) is 5.
f) In this example lan9, lan15, and lan5 are on different processors (7, 9, and 5). In
this case, no further steps are necessary.
However, if any heartbeat interfaces (primary or standby) share a CPU, follow the
instructions in the section “Managing Interrupts to Improve High Availability.”
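
The manual lookup above (interface name to hardware path to cpu ID) can be sketched as a small Python check. It assumes the ioscan and intctl output has been saved as text and that the whitespace-separated columns appear in the same order as in the samples shown in this section; the function names are illustrative and the parsing is not a general-purpose parser for either command.

```python
from collections import defaultdict

# Sample rows copied from the ioscan and intctl examples above.
IOSCAN = """\
lan 5 1/10/1 btlan CLAIMED INTERFACE HP PCI 10/100Base-TX 4 port
lan 9 1/12/2 btlan CLAIMED INTERFACE HP PCI 10/100Base-TX 4 port
lan 15 1/14/1 btlan CLAIMED INTERFACE HP PCI 10/100Base-TX 4 port
"""
INTCTL = """\
1/10/1 lan btlan N/A 5 N/A L 1 HP PCI 10/100Base-TX 4 port
1/12/2 lan btlan N/A 7 N/A L 1 HP PCI 10/100Base-TX 4 port
1/14/1 lan btlan N/A 9 N/A L 1 HP PCI 10/100Base-TX 4 port
"""

def hw_paths(ioscan_text):
    """Map interface name (e.g. lan9) to hardware path, using columns 2 and 3."""
    paths = {}
    for line in ioscan_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "lan" and fields[1].isdigit():
            paths["lan" + fields[1]] = fields[2]
    return paths

def cpu_ids(intctl_text):
    """Map hardware path to cpu ID, assumed to be the fifth column."""
    cpus = {}
    for line in intctl_text.splitlines():
        fields = line.split()
        if len(fields) >= 5 and "/" in fields[0] and fields[4].isdigit():
            cpus[fields[0]] = int(fields[4])
    return cpus

def shared_cpus(heartbeat_lans, ioscan_text, intctl_text):
    """Return {cpu: [interfaces]} for CPUs servicing more than one heartbeat NIC."""
    paths = hw_paths(ioscan_text)
    cpus = cpu_ids(intctl_text)
    by_cpu = defaultdict(list)
    for lan in heartbeat_lans:
        by_cpu[cpus[paths[lan]]].append(lan)
    return {cpu: lans for cpu, lans in by_cpu.items() if len(lans) > 1}

print(shared_cpus(["lan9", "lan15", "lan5"], IOSCAN, INTCTL))  # prints {}
```

An empty result corresponds to the "no further steps are necessary" case; any entry in the result names a CPU whose heartbeat interfaces are candidates for interrupt migration.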

If Your System has a Link Aggregate Interface for Serviceguard

NOTE Be sure you have determined that your system has a link aggregate interface by
completing “Ensure that the heartbeat interface drivers are on different processors” before
continuing here.

1. Enter the following command on the target system running HP Serviceguard:


/usr/sbin/cmgetconf

2. Look at the resulting output from /usr/sbin/cmgetconf. It should look something like
this:

NETWORK_INTERFACE lan903
HEARTBEAT_IP 1.1.1.1
NETWORK_INTERFACE lan905
NETWORK_INTERFACE lan900
HEARTBEAT_IP 1.1.1.1
.
.
#Link Aggregate lan900 contains the following port(s):lan9, lan10
#Link Aggregate lan903 contains the following port(s):lan15, lan16
#Link Aggregate lan905 contains the following port(s):lan18, lan19
#Possible standby Network Interfaces for lan900:lan905

In the example above, the primary heartbeat interfaces are link aggregates lan900 and
lan903. lan900 is made up of 2 interfaces (lan9 and lan10), and lan903 is made up of 2
interfaces (lan15 and lan16). The standby heartbeat interface is link aggregate lan905,
which is made up of 2 interfaces (lan18 and lan19).

3. Determine the hardware paths of the interfaces.

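a) Determine the names of the individual interfaces (ports) in each link aggregate from the
“Link Aggregate” comment lines in the cmgetconf output.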

b) Enter the following ioscan command to determine the hardware path numbers:

/usr/sbin/ioscan -kfC lan

The output should look something like this:


Class  I   H/W Path  Driver  S/W State  H/W Type   Description
=======================================================================
lan    0   8/16/6    btlan   CLAIMED    INTERFACE  Built-in LAN
.
lan    9   1/12/2    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port
.
lan    10  1/13/4    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port
.
lan    15  1/14/1    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port
.
lan    16  1/15/1    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port
.
lan    18  1/17/1    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port
.
lan    19  1/18/1    btlan   CLAIMED    INTERFACE  HP PCI 10/100Base-TX 4 port

c) This output shows the LAN interfaces on your system. The “I” column indicates the
instance number. For example, instance number 9 corresponds to the lan9 network
interface, and the hardware path is 1/12/2 for this interface.

d) Likewise, the above output shows that the hardware path for the lan10 network
interface is 1/13/4.

e) For the standby link aggregate consisting of lan18 and lan19, lan18 (with an
instance of 18) has a hardware path of 1/17/1, and lan19 has a hardware path of
1/18/1.

4. Identify the processors.

a) Note the hardware path obtained in the ioscan procedure in the previous step.

b) Enter the following interrupt migration command: /usr/contrib/bin/intctl

The output should look something like this:

hw path   class    drv    card  cpu  cpu   intr  intr  card
                   name   cell  ID   cell  type  ID    description
======================================================================
0/0/0/0   lan      btlan  N/A   4    N/A   L     1     HP PCI 10/100Base-TX Core
0/0/1/0   ext_bus  c720   N/A   1    N/A   L     1     SCSI C895 Fast Wide LVD
0/0/2/0   ext_bus  c720   N/A   2    N/A   L     1     SCSI C87x Ultra Wide Single-Ended
.
.
1/12/2    lan      btlan  N/A   7    N/A   L     1     HP PCI 10/100Base-TX 4 port
1/13/4    lan      btlan  N/A   8    N/A   L     1     HP PCI 10/100Base-TX 4 port
1/14/1    lan      btlan  N/A   9    N/A   L     1     HP PCI 10/100Base-TX 4 port
1/15/1    lan      btlan  N/A   10   N/A   L     1     HP PCI 10/100Base-TX 4 port
.
.
1/17/1    lan      btlan  N/A   3    N/A   L     1     HP PCI 10/100Base-TX 4 port
1/18/1    lan      btlan  N/A   4    N/A   L     1     HP PCI 10/100Base-TX 4 port

c) In the output example above, look for the hardware paths obtained in step 3. They are
listed under the column labeled “hw path”. For example, the primary link aggregate lan900
consists of lan9 (hardware path = 1/12/2) and lan10 (hardware path = 1/13/4); the
processor ID for lan9 is 7 and the processor ID for lan10 is 8. For the other primary
heartbeat aggregate, lan903, the individual interfaces lan15 (hardware path = 1/14/1) and
lan16 (hardware path = 1/15/1) have processor IDs of 9 and 10, respectively.

d) The standby link aggregate lan905 consists of lan18 (hardware path = 1/17/1) and
lan19 (hardware path = 1/18/1); the processor ID for lan18 is 3 and the processor ID for
lan19 is 4.

e) In this example, lan9, lan10, lan15, lan16, lan18 and lan19 are all on different
processors, so no further steps are necessary.

However, if any heartbeat interfaces (primary or standby) share a CPU, follow the
instructions in the section “Managing Interrupts to Improve High Availability.”
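
For link aggregates, the check has to be made on the individual ports rather than on the aggregate names. The following Python sketch expands the “Link Aggregate” comment lines from the cmgetconf example into ports and reports any two ports that share a CPU. The comment-line wording is taken from the example above; the port-to-CPU table is entered by hand from the intctl sample, and all function names are illustrative.

```python
import re

# Comment lines copied from the cmgetconf example above.
CMGETCONF_COMMENTS = """\
#Link Aggregate lan900 contains the following port(s):lan9, lan10
#Link Aggregate lan903 contains the following port(s):lan15, lan16
#Link Aggregate lan905 contains the following port(s):lan18, lan19
"""

AGGREGATE_LINE = re.compile(
    r"#\s*Link Aggregate\s+(lan\d+)\s+contains the following port\(s\):\s*(.*)"
)

def aggregate_ports(text):
    """Map each link aggregate name to the list of ports it contains."""
    ports = {}
    for line in text.splitlines():
        match = AGGREGATE_LINE.match(line.strip())
        if match:
            names = [p.strip() for p in match.group(2).split(",")]
            ports[match.group(1)] = [p for p in names if p]
    return ports

# cpu ID per port, read from the intctl sample output above.
PORT_CPU = {"lan9": 7, "lan10": 8, "lan15": 9, "lan16": 10, "lan18": 3, "lan19": 4}

def ports_sharing_cpu(ports_by_aggregate, port_cpu):
    """Return (first_port, second_port, cpu) for every port that reuses a CPU."""
    seen = {}
    clashes = []
    for ports in ports_by_aggregate.values():
        for port in ports:
            cpu = port_cpu[port]
            if cpu in seen:
                clashes.append((seen[cpu], port, cpu))
            else:
                seen[cpu] = port
    return clashes

aggregates = aggregate_ports(CMGETCONF_COMMENTS)
print(ports_sharing_cpu(aggregates, PORT_CPU))  # prints []
```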

Managing Interrupts to Improve High Availability

NOTE If you have a configuration with APA (Auto Port Aggregation), an “interface” in the
following section refers to the individual ports within the link aggregation and not to the link
aggregate itself.

Systems with more Processors than Heartbeat interfaces


• If you have at least as many processors as Heartbeat interfaces (count both primary and
standby interfaces):

- Assign each primary Heartbeat interface to a different processor

- Assign each standby Heartbeat interface to a different processor

In the following picture (Figure 1), we have 2 heartbeat pairs: primary and standby
heartbeat #1 (P1 and S1), and primary and standby heartbeat #2 (P2 and S2). Since we have 4
interfaces and 4 processors, we can assign each interface to a processor without overlap as
shown.

Figure 1

Systems with more Heartbeat interfaces than Processors


• If you have more Heartbeat interfaces than processors (count both primary and standby
interfaces):

1) Assign each primary interface to a different processor until all processors have one
primary interface. Then, follow either (a) or (b).

a) If there are still some unassigned primary interfaces (e.g., you ran out of
processors to assign the primary interfaces), assign the remaining primary
interfaces to separate processors even if this creates overlap with those assigned
in (1) until all primary interfaces have been assigned to a processor. Attempt to
spread the assignments as evenly as possible across the processors.

OR

b) If you have assigned all the primary interfaces and still have some processors
available, then begin assigning standbys until every processor has been assigned
an interface (either a primary or standby).

2) Assign any remaining standby interface(s) to the same processor as its associated
primary interface.
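
The assignment rules above can be sketched as a short Python planner. This is one possible reading of the rules, not an HP tool: the function name, the pair representation, and the choice of which standby takes a free processor (the first ones in list order) are assumptions of the sketch. For APA configurations each pair would be an individual port, per the NOTE at the start of this section.

```python
def plan_assignment(pairs, cpus):
    """Plan CPU placement for heartbeat interfaces.

    pairs: list of (primary, standby) interface names.
    cpus:  list of CPU IDs available for interrupt handling.
    Returns {cpu: [interfaces]}.
    """
    plan = {cpu: [] for cpu in cpus}
    primary_cpu = {}

    # Rules 1 and 1(a): spread the primaries across the CPUs, wrapping
    # around (and overlapping evenly) if there are more primaries than CPUs.
    for index, (primary, _standby) in enumerate(pairs):
        cpu = cpus[index % len(cpus)]
        plan[cpu].append(primary)
        primary_cpu[primary] = cpu

    # Rule 1(b): CPUs that still have no interface each take one standby.
    free = [cpu for cpu in cpus if not plan[cpu]]
    for primary, standby in pairs:
        if free:
            cpu = free.pop(0)
        else:
            # Rule 2: remaining standbys share the CPU of their own primary.
            cpu = primary_cpu[primary]
        plan[cpu].append(standby)
    return plan

# Three heartbeat pairs on four CPUs (six interfaces, so overlap is unavoidable).
pairs = [("P1", "S1"), ("P2", "S2"), ("P3", "S3")]
print(plan_assignment(pairs, [1, 2, 3, 4]))
```

For three pairs on four CPUs the sketch places P1, P2 and P3 on CPUs 1 to 3, gives the one remaining CPU to S1, and puts S2 and S3 on the CPUs of their own primaries. With two pairs on four CPUs every interface gets its own CPU.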

In the following picture (Figure 2), we have 3 heartbeat pairs (non-Link Aggregate). There
are primary and standby #1 (P1 and S1), primary and standby heartbeat #2 (P2 and S2),
and primary and standby heartbeat #3 (P3 and S3). Since we have 6 interfaces but only 4
processors, we will have overlap. We assign the primaries to separate processors and as
many standbys to separate processors. When we begin to overlap, we assign the remaining
standbys to processors which have their primary already assigned as shown.

Number of interfaces greater than number of Processors

CPU 1    CPU 2    CPU 3    CPU 4
-----    -----    -----    -----
P1       P2       P3       S1
         S2       S3

Figure 2

In the next picture (Figure 3), we have multiple heartbeat pairs with APA. The configuration
consists of:

• APA aggregate primary heartbeat P1 contains interfaces P1A and P1B. The standby
of S1 contains a single interface S1.

• APA aggregate primary heartbeat P2 contains interfaces P2A and P2B. The standby
APA aggregate of S2 contains interfaces S2A and S2B.

• APA aggregate primary heartbeat P3 contains interfaces P3A and P3B. The standby
of S3 contains a single interface S3.

Since we have 10 interfaces but only 4 CPUs, we will have overlap. We assign the primaries
to separate CPUs and as many standbys to separate CPUs. When we begin to overlap, we
assign the remaining standbys to CPUs which have their primary already assigned as shown.

APA: number of interfaces greater than number of Processors

CPU 1    CPU 2    CPU 3    CPU 4
-----    -----    -----    -----
P1A      P1B      P2A      P2B
P3A      P3B
S3       S1       S2A      S2B

Figure 3

Moving interrupts
If you have determined that you need to move one or more interrupts as explained in the
previous section, perform the following steps:

1. Choose the target processor.

a) Use the intctl command to get a list of all interrupts and their associated
processors by entering:
/usr/contrib/bin/intctl

Use the intctl command to get a list of the interrupts on a particular processor by
entering:
/usr/contrib/bin/intctl -c <processor number>

Where <processor number> is the number of the processor you want to examine.

b) If there is a processor with no interrupts, select it and go to Step 2. Otherwise go to (c).

c) If there is a processor with fewer interrupts than other processors, select it and go to
Step 2. Otherwise go to (d).

d) Simply choose another appropriate processor with a similar number of interrupting
devices and go to Step 2.

IMPORTANT: You should take into account the fact that devices may vary in their
maximum bandwidth and processor consumption requirements.

For example, highly utilized interfaces will tend to consume more processor cycles than
lower utilized interfaces. Also, interfaces with high speeds (e.g., 1 Gbps vs 100 Mbps
interfaces) can consume more bandwidth and cycles due to the higher throughput of the
interface.

CPU consumption can be measured with tools such as top or GlancePlus.

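2. Move the interrupt.

Enter the following intctl command:

/usr/contrib/bin/intctl -M -H <hardware path> -I 1 -c <cpu ID>

Where <hardware path> is the hardware path of the device in the format #/#/#/#, and
<cpu ID> is the target processor to which the interrupt is moved. The value following -I
is the interrupt ID, as listed in the “intr ID” column of the intctl output (1 in the
examples in this paper).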

3. Save the current processor assignments after completing the interrupt migration for the
system by entering the following command:
/usr/contrib/bin/intctl -s /etc/interrupt_migration_conf

This information will be saved in the file /etc/interrupt_migration_conf.

IMPORTANT: You must run /usr/contrib/bin/intctl -r /etc/interrupt_migration_conf
after each subsequent bootup to ensure that the heartbeats are assigned as per your
selected changes.

Changes to I/O, such as adding or deleting cards, will probably affect the current
processor assignments so the /etc/interrupt_migration_conf file should be
regenerated after any I/O change.
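
The selection in Step 1 and the command in Step 2 can be sketched in Python as follows. The sketch only builds the command line as a string and does not run it. It counts interrupts per processor and ignores device bandwidth and utilization, which the IMPORTANT note above says should also be weighed; the function names and the example hardware-path-to-CPU table are assumptions for illustration.

```python
from collections import Counter

INTCTL = "/usr/contrib/bin/intctl"

def least_loaded_cpu(interrupt_cpus, all_cpus, exclude=()):
    """Pick the CPU servicing the fewest interrupts, skipping CPUs in `exclude`.

    interrupt_cpus: {hardware path: cpu ID}, e.g. read from intctl output.
    Ties are broken by the lowest CPU ID.
    """
    load = Counter(interrupt_cpus.values())
    candidates = [cpu for cpu in all_cpus if cpu not in exclude]
    return min(candidates, key=lambda cpu: (load[cpu], cpu))

def move_command(hw_path, cpu_id, intr_id=1):
    """Build the Step 2 command line for moving one interrupt."""
    return f"{INTCTL} -M -H {hw_path} -I {intr_id} -c {cpu_id}"

# Hypothetical starting point: two heartbeat NICs share CPU 7 and CPU 6 is idle.
interrupt_cpus = {"1/10/1": 5, "1/12/2": 7, "1/14/1": 7}
target = least_loaded_cpu(interrupt_cpus, all_cpus=[5, 6, 7], exclude={7})
command = move_command("1/14/1", target)
print(command)  # prints /usr/contrib/bin/intctl -M -H 1/14/1 -I 1 -c 6
```

After running the generated commands by hand, the assignments would still need to be saved with intctl -s and restored with intctl -r after each boot, as described in Step 3.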

References
http://docs.hp.com/en/ha.html
(HP-UX High Availability)
http://docs.hp.com/en/ha.html#Serviceguard
(HP-UX Serviceguard)
http://docs.hp.com/en/5969-4363/index.html
(HP-UX Interrupt Migration Product Note)

Legal Notices

© 2005 Hewlett-Packard Company, L.P. The information contained herein is subject to change
without notice.
The only warranties for HP products and services are set forth in the express warranty statements accompanying such
products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for
technical or editorial errors or omissions contained herein.

HP-UX® , Serviceguard®, and Superdome® are registered trademarks of the Hewlett-Packard Corporation. PCI is a
registered trademark of the PCI SIG.

All other trademarks and registered trademarks are the property of the respective corporations.
