
Backup and recovery best practices for Oracle 10g with NetBackup 6.0

White paper
Environment: Oracle 10g on Red Hat AS4, HP Integrity rx7620 server, HP ProLiant DL580 G2 server, using HP StorageWorks EVA8000 and EVA5000 storage arrays and HP StorageWorks EML E-Series 103e and VLS6510 libraries

Contents

Executive summary
Key findings
Overview
Components
Configuring the hardware
    Hardware statistics
    Partitioning the Integrity rx7620 server
        Configuring the Management Processor
        Creating nPars
    Defining the zones
    Configuring the EVA8000 storage array for primary storage
    Configuring the EVA5000 for disk backups
    Configuring the HP StorageWorks EML E-Series 103e Tape Library
    Configuring the HP StorageWorks 6510 Virtual Library System
Configuring the software
    Configuring the QLogic driver
        Setting up QLogic Dynamic Load Balancing
        EVA8000 and EVA5000: Active-Active/Active-Passive
    OCFS2
        OCFS disk configuration
        Setting up the OCFS Clustered File Systems
    Working with Benchmark Factory
        Oracle parameter changes
    OLTP workload results
Oracle backup and restore
    NetBackup policies
    Oracle templates
    Setting up the storage units
    Set the master server jobs global attribute
    Backup issues
    Setting up restores
Backup and restore performance results
    Backup methodologies
        Disk-to-disk backups: Backing up Oracle 10gR2 to EVA5000
        Disk-to-virtual tape backups: Backing up Oracle 10gR2 to VLS
        Disk-to-tape backups: Backing up Oracle 10gR2 to EML
    EVA performance results
        EVA5000 RAW performance characterization
        EVA8000 RAW performance characterization
    EVA5000 backup and restore results
    EML E-Series backup and restore performance results
    VLS6510 backup and restore performance results
Conclusions
    Oracle RMAN
    Disk-to-disk backup
    Disk-to-tape backup
    Disk-to-VLS backup
    Server configuration
Best practices
    Best practices for disk-to-disk backups on the EVA5000 storage array
    Best practices for disk-to-tape backups on the EML E-Series Tape Library
    Best practices for disk-to-virtual tape backups on the VLS Virtual Tape Library
    Best practices for using Oracle Recovery Manager (RMAN)
Appendix A. Bill of Materials
Appendix B. Configuring Oracle RMAN
Appendix C. Examples
Appendix D. Other issues
    Server OS hangs/crashes
    Oracle session hangs
    NetBackup catalog synchronization
    RMAN not backing up archive logs
    RMAN-specific syntax changes
    Imbalanced backups
    Poorly streaming backups
    RAC issues
    General Oracle changes
For more information
    HP
    Oracle
    Symantec NetBackup 6.0
    Quest Benchmark Factory 5.0
    Open Source Tools

Executive summary
Many businesses use Oracle databases to store critical data, and they need a reliable, robust, and efficient backup and recovery method. Backup and recovery of Oracle databases is a vital part of IT data protection strategies. Recovery times and backup windows are at the core of establishing recovery time objectives (RTOs) and recovery point objectives (RPOs). As data grows, a faster backup method is required to maintain these objectives and give administrators peace of mind that the implemented backup and recovery strategy remains viable.

The HP StorageWorks Customer Focused Testing Team constructed a Red Hat AS4, Oracle 10g environment with HP StorageWorks storage arrays to represent an enterprise environment. The purpose of the testing was to develop best practices for the backup and restore of an enterprise environment in which the HP StorageWorks 8000 Enterprise Virtual Array (EVA8000), HP StorageWorks 5000 Enterprise Virtual Array (EVA5000), HP StorageWorks 6000 Virtual Library System (VLS6000), and HP StorageWorks Enterprise Modular Library (EML) provide data protection and recovery operations for Oracle 10g incorporating NetBackup 6.

The objectives for the testing, based on actual customer input, included the following:
- Back up an approximately 2.5-TB Oracle database in 2.5 hours or better (approximately 1 TB/hr)
- Establish best practices for online backup and restore of Oracle databases
- Determine the impact to a transaction workload while backups are running
- Provide details of NetBackup 6.0 integration with RMAN

Key findings
Testing successfully provided the following high-level results:
- Limited the impact to transaction workloads while optimizing database backup and restore times:
    - HP Integrity RAC VLS backup: 923 GB/hr (approximately 2 hours, 30 minutes)
    - HP Integrity rx7620 RAC EML restore: 453 GB/hr (approximately 5 hours, 20 minutes)
    - HP ProLiant DL580 G2 VLS backup: 100 GB/hr (approximately 5 hours, 30 minutes)
    - HP ProLiant DL580 G2 EML restore: 60 GB/hr (approximately 7 hours, 40 minutes)
- Successfully exercised different EVA5000 configurations.
- Successfully determined maximum server workloads and capacities for the DL580 and Integrity rx7620 database servers.
- Successfully determined the best configuration for each backup methodology:
    - Disk-to-disk
    - Disk-to-tape
    - Disk staging

Important findings uncovered during the tests are documented in the Best practices section.

Overview
The main purpose of the project was to conduct backup and restore testing against various backup targets to determine the best ways to reduce Oracle database downtime and increase database availability using Symantec NetBackup 6 with RMAN. HP integrated and tested backup and recovery of different Oracle databases with the following objectives:
- Demonstrate best practices to back up an approximately 2.5-TB database within 2.5 hours
- Demonstrate NetBackup 6 integration with Oracle RMAN
- Determine best practices for backup and recovery to tape, virtual tape, and disk
- Characterize the impact of online backups on application performance

Testing included full backup of the databases, utilizing staged backups, and performing full and incremental restores. The backup testing included the database data, control files, and archive logs, with and without user load. Two different restore tests were performed. In the first test, the full database was restored after a simulated disaster, such as the loss of an entire storage array. In the second test, incremental restores were conducted. Each time a restore was conducted, the database was opened and checked for data integrity by running a simulated workload against the database and monitoring the test for errors.

Several options for backup and restore were evaluated, as were their impact on database recovery, complexity, and recovery speed:
- Integrity and ProLiant servers to HP StorageWorks EML E-Series Tape Library: This scenario utilized a standard RMAN backup method using an online database and Symantec NetBackup to spool the data directly to tape. The test goal was to measure the backup time and throughput for a quiesced and busy full database backup. Times for the full backups were recorded.
- Integrity and ProLiant servers to HP StorageWorks VLS6510 Virtual Tape Library: This scenario utilized a standard RMAN backup method using an online database and Symantec NetBackup to spool the data directly to a virtual tape library. The test goal was to measure the time taken and throughput for a quiesced and busy full database backup. Times for the full backups were recorded.
- Integrity and ProLiant servers to HP StorageWorks EVA5000 storage array: This scenario utilized a standard RMAN backup method using an online database and Symantec NetBackup to spool the data directly to another physical disk array. The test goal was to measure the time taken and throughput for a quiesced and busy full database backup. Times for the full backups were recorded.
- Full restore from HP StorageWorks EML E-Series, VLS6510, and EVA5000 to Integrity and ProLiant servers: This testing provided data for a full restore using each methodology with RMAN and Symantec NetBackup. The test goal was to measure the recovery time using each method with a mounted control file. Times for each restore were recorded.
- Incremental restore from HP StorageWorks EML E-Series, VLS6510, and EVA5000 to Integrity and ProLiant servers: This testing provided data for an incremental restore using each methodology with RMAN and Symantec NetBackup. The test goal was to measure the recovery time using each method with a mounted control file. Times for each restore were recorded.

Components
To run these tests, HP configured the system illustrated in Figure 1. The environment was based on input from customers and is representative of a typical Oracle database environment. The key components include the following:
- Oracle 10g: Benchmark Factory was used to generate load against the Oracle database, which was backed up and restored.
- HP rx7620 server: This server hosted the Oracle database. Two configurations were used: a single-instance database and a two-node RAC instance.
- DL580 server: This server hosted multiple database instances.
- EVA8000: The primary SAN-based storage array, which held the Oracle database, logs, and so on.
- EVA5000: This storage array was used as a disk-to-disk backup target to show how an older EVA may be redeployed within existing infrastructure.
- HP StorageWorks EML E-Series 103e Tape Library: The EML was configured as a primary tape backup and restore device.
- VLS6510: The VLS was configured as a primary virtual tape backup and restore device.
- Red Hat AS4 operating system: Enterprise Linux operating system used on both the rx7620 and DL580 servers.
- Symantec NetBackup 6.0: Used as the backup application.
- Benchmark Factory: Used to create the OLTP data in the databases and simulate 500- and 1,000-user workloads.

Note: At the time of this writing, the rx7620 server had been superseded by the rx7640 server. The best practices outlined in this document are still pertinent.

Configuring the hardware


HP constructed the enterprise configuration using Integrity rx7620 and DL580 servers with EVA8000 and EVA5000 storage arrays to best simulate an enterprise environment supporting different Oracle databases on Itanium- and Xeon-based servers. For the complete list of hardware, see Appendix A. Bill of Materials. Figure 1 shows the configuration of the Oracle database servers (IA64 RAC, IA64 Single, and IA32 Multi) and the Benchmark Factory, NetBackup, and Command View servers. The 2-Gb/s Fibre Channel (FC) links to each system are depicted by the red and blue lines. The white arrows show the disk-to-disk backup data flow, the orange arrows show the disk staging data flow, and the green arrows show the disk-to-tape data flow.

Hardware statistics
The hardware involved in the configuration of this environment includes:

Database Servers 1 and 2: HP Integrity rx7620 server
- 8x IA64 1.6-GHz processors in one partition
- 8x QLogic-based HP A6826A dual-port FC HBAs
- 64-GB RAM
- Two cells and two partitions

Database Server 3: HP ProLiant DL580 G3 server
- 4x Intel Xeon 3.0-GHz processors
- 2x QLogic-based HP FCA2214 dual-port FC HBAs
- 16-GB RAM

Storage
- EVA8000 (primary)
    - 144x 300-GB FC drives
    - Dual controllers (Active/Active)
    - One disk group (FC)
- EVA5000 (D2D, secondary)
    - 56x 250-GB FATA drives
    - Dual controllers (Active/Passive)
    - One disk group (FATA)

Tape devices
- EML E-Series 103e Tape Library
    - 4x LTO3 drives
    - 24x tapes at 400-GB native capacity
    - 2 FC paths
- VLS6510
    - 24x 250-GB SATA drives
    - Emulating 12x LTO3 drives with four FC paths
    - 49x tapes at 100-GB each

Figure 1. Environment configuration diagram

Partitioning the Integrity rx7620 server


Configuring the Management Processor

For this environment, the Integrity rx7620 server was partitioned into two nPars, or server partitions. So that partitions could be created from a remote system and the hardware could be managed remotely, the Management Processor (MP) was configured for remote access:
1. Connect a server running Microsoft Windows or Linux to the MP serial management port.
2. Start a terminal program, such as Windows HyperTerminal or Linux Minicom, configure the IP address with the lc command, and follow the on-screen prompts.


Note: You can configure the MP for a DHCP or static IP address. You may also enable or disable telnet, SSH, or HTTPS remote access.

Creating nPars

The system was prepared for partitioning and then partitioned:
1. If no partition exists, create a new complex with the cc command. Choose cell 0 and save the configuration.
2. If a single partition exists, reset the partition for reconfiguration:
   a. Use the rr command to reset the partition.
   b. Use the rs command to restart the partition.
   c. Create a new complex with the cc command.
   d. Choose cell 0 and save the configuration.
3. On a server installed with the nPar utilities, run the following commands:

   parcreate -P nextPartition -c 1::: -u Admin -h <IPADDRESS>
   parstatus -u Admin -h <IPADDRESS>

This created a partition consisting of a single cell, and an OS was loaded onto the system. After an OS is loaded, you can install the nPar command-line utilities and connect to the MP to create the second partition. Alternatively, you can install the nPar utilities on a server running Linux, HP-UX, or Windows and create the second partition from a remote host.

To speed up the process, one partition was created to provide access to all the on-board SCSI disks, and an OS was loaded on the first SCSI disk. The first SCSI disk was then duplicated to the remaining disks using the dd utility. Upon completion of the duplication, the system partitions were reset and the rx7620 server was repartitioned into two partitions consisting of one cell each. Each new server partition then had an identical bootable OS because of the disk duplication. Alternatively, a network install could have been performed on each partition if a PXE server had been set up. Since no PXE server was used, disk duplication was the simplest method of preparing the OS disks for each server partition.
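As an illustration, the duplication step amounts to a raw copy from the installed OS disk to each remaining internal disk. The device names below are hypothetical; verify them (for example, with fdisk -l) before copying, because dd overwrites the target disk:

# dd if=/dev/sda of=/dev/sdb bs=1M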

Defining the zones


Since multiple paths to each device can be mapped, zoning was needed to reduce the total number of paths. Only one path should be used for tape devices: they are not supported in multipath configurations and can cause issues with the backup application installation or configuration. Since the current release of Red Hat AS4 is limited to 256 buses, and multiple buses are generated for each path from a host port to an EVA port, bus exhaustion can occur if the EVA is not properly zoned. After zoning is introduced, it must be used in all cases for the devices to be visible to one another.

Each rx7620 server partition had four dual-port HP A6826A HBAs. Four ports per partition were explicitly zoned to the EVA8000 and EVA5000. The remaining four were zoned to the VLS6510, and two of those four ports were also zoned to the EML E-Series 103e Tape Library. The DL580 server had two dual-port HP FCA2214 HBAs; two ports were assigned to the EVA8000 and EVA5000, and the other two were assigned to the VLS6510 and the EML E-Series 103e Tape Library.

- rx7620 server SAN connectivity to EVA8000 storage array: Zones were established for four of the rx7620 server HBA ports and each of the EVA8000 host ports.
- DL580 server SAN connectivity to EVA8000 storage array: Zones were established for two of the four DL580 server HBA ports and each of the EVA8000 host ports.
- rx7620 server SAN connectivity to EVA5000 storage array for disk-based backups: Zones were established for four of the rx7620 server HBA ports and each of the EVA5000 host ports.
- DL580 server SAN connectivity to EVA5000 storage array for disk-based backups: Zones were established for two of the four DL580 server HBA ports and each of the EVA5000 host ports.
- rx7620 server SAN connectivity to VLS6510: Zones were established for four of the rx7620 server HBA ports and each of the VLS6510 host ports.
- DL580 server SAN connectivity to VLS6510: Zones were established for two of the four DL580 server HBA ports and each of the VLS6510 host ports.
- rx7620 server SAN connectivity to EML E-Series 103e Tape Library: Zones were established for two of the rx7620 server HBA ports and each of the EML E-Series 103e host ports.
- DL580 server SAN connectivity to EML E-Series 103e Tape Library: Zones were established for two of the four DL580 server HBA ports and each of the EML E-Series 103e host ports.

Configuring the EVA8000 storage array for primary storage


The EVA8000 configuration included the following:
- EVA VCS microcode: 5.110
- An EVA8000 controller pair
- 12 EVA disk shelves
- 144x 300-GB FC disks
- Three-phase 208-VAC redundant power

The Fibre Channel connections were made to two Brocade-based HP 2/16N SAN switches and two Brocade SilkWorm 3800s, configured as dual fabrics. The EVA8000 presented nine RAID1 virtual disks, all from a single disk group. The disk group included all 144 available FC disks and was configured for double disk failure protection. The virtual disks were presented to all host ports that were connected to any port of the EVA. The HP/QLogic load-balancing driver used the LRU policy for load balancing. Each host port in use was identified on the EVA, and the OS type was set to Linux. Each virtual disk on the EVA8000 had Preferred Path/Mode set to No Preference; the load-balancing driver then divided the load equally across the two controllers according to each host and LUN mapping.

Configuring the EVA5000 for disk backups


The EVA5000 configuration included the following:
- EVA VCS microcode: 4.001
- An EVA5000 controller pair
- Eight EVA disk shelves
- 56x 250-GB FATA disks
- Three-phase 208-VAC redundant power

The EVA configuration consisted of one HSV110 controller pair and eight disk enclosures, which were populated with 56 250-GB FATA drives. Firmware v4.001 (the latest release at the time of writing) and v3.028 (the previous release) were used for the EVA VCS microcode. The EVA5000 presented four RAID0 virtual disks, all from a single disk group. The disk group included all 56 available FATA disks and was configured for no disk failure protection. The virtual disks were presented to all host ports that were connected to the EVA. The HP/QLogic load-balancing driver was used with the least busy policy for load balancing. Each host port in use was identified on the EVA, and the OS type was set to Linux. Two virtual disks from the EVA had Preferred Path/Mode set to Path A - Failover Only and two were set to Path B - Failover Only, alternating. This equally divided the load across the two controllers according to each host and LUN mapping.


Configuring the HP StorageWorks EML E-Series 103e Tape Library


The HP StorageWorks EML E-Series 103e Tape Library was configured with four FC HP StorageWorks Ultrium 960 tape drives (LTO3). The e2400-FC interface controllers had six FC ports: four for the back-end tape devices and the remaining two for the SAN. All interfaces were 2 Gb/s, and each SAN port on the interface controller was connected to a separate fabric to distribute the load evenly across fabrics and HBAs. The tape library was managed from a dedicated SAN management server. HP StorageWorks Command View TL software was installed on the same SAN management server used for HP StorageWorks Command View EVA.

Symantec NetBackup 6.0 was used as the backup application, and Maintenance Pack 2 (MP2) was applied. For clarification, the global catalog server manages the backup images and the media where the images reside. Since there can be only one controlling host per robotic device, NetBackup elects one of the hosts to be the robotic control host. The robotic control host moves the media to the tape drives when backups or restores are activated. Each server in the environment was configured as a NetBackup media server and was responsible for writing data directly to the tape devices. This allows network backups to be avoided.

Configuring the HP StorageWorks 6510 Virtual Library System


The HP StorageWorks 6510 Virtual Library System (VLS6510) was configured as an Ultrium 960 tape library with 50 LTO2 tape slots and 12 tape drives. The VLS interface controllers had four FC ports and four SCSI ports. The four SCSI ports were for the back-end HP StorageWorks 20 Modular Smart Array (MSA20) disk devices, while the four FC ports were for SAN connectivity. All FC interfaces were 2 Gb/s, and each set of FC ports on the VLS interface controller was connected to a separate fabric to distribute the load evenly across fabrics and HBAs. The tape library was managed from a dedicated SAN management server. HP StorageWorks Command View TL software was installed on the same SAN management server used for HP StorageWorks Command View EVA.

Symantec NetBackup 6.0 was used as the backup application, and MP2 was applied. For clarification, the global catalog server manages the backup images and the media where the images reside. Since there can be only one controlling host per robotic device, NetBackup elects one of the media server hosts to be the robotic control host. The robotic control host moves the media to the tape drives when backups or restores are activated. Each server in the environment was configured as a NetBackup media server and was responsible for writing data directly to the tape devices. This allows network backups to be avoided.


Configuring the software


For the complete list of software, see Appendix A. Bill of Materials. Kernel tuning was applied to accommodate the Oracle databases running on the hosts. Table 1 lists the Linux kernel 2.6 parameters that were modified for this testing. These settings were used as best practices based on information from a previous project. The default values are included for convenience.
Table 1. Altered kernel parameters

Tunable                        Default            Used
net.core.rmem_default          110592             262144
net.core.rmem_max              131071             262144
kernel.sem                     250 32000 32 128   250 32000 100 128
kernel.shmall                  2097152            209715200
kernel.shmmax                  33554432           24064771072
kernel.shmmni                  4096               16384
fs.file-max                    232233             658576
fs.aio-max-nr                  65536              65535
net.ipv4.ip_local_port_range   32768 61000        1024 65000
vm.swappiness                  10                 30
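One way to apply the values in Table 1 persistently is through /etc/sysctl.conf, which is loaded at boot and can be reloaded on demand. A minimal sketch mirroring the "Used" column above:

# Excerpt from /etc/sysctl.conf reflecting the "Used" column of Table 1
net.core.rmem_default = 262144
net.core.rmem_max = 262144
kernel.sem = 250 32000 100 128
kernel.shmall = 209715200
kernel.shmmax = 24064771072
kernel.shmmni = 16384
fs.file-max = 658576
fs.aio-max-nr = 65535
net.ipv4.ip_local_port_range = 1024 65000
vm.swappiness = 30

Apply the settings without a reboot by running:

# sysctl -p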

Configuring the QLogic driver


Setting up QLogic Dynamic Load Balancing

The Linux QLogic driver supports an active-active configuration and an active-passive configuration. The EVA8000 storage array was configured to load balance across HBAs and controllers with Active-Active enabled. The EVA5000 required an active-passive configuration but supports load balancing across different ports of the same controller. The latest QLogic/HP driver can be obtained from the HP website under support and downloads.

EVA8000 and EVA5000: Active-Active/Active-Passive

When using the QLogic driver for Linux with the EVA product line, a set of configuration utilities is installed with the driver source. An initial ramdisk (initrd) is created as part of the post-installation tasks of the package. To manually configure the driver options, the hp_qla2300.conf file in /etc was modified. Figure 2 shows the changes made to the defaults.


The values for the load_balancing parameter are shown in Table 2.

Table 2. QLogic driver load_balancing parameter descriptions

Type         Policy                     Description
(0) Static   None                       Finds the first active path or the first active optimized path for each LUN.
(1) Static   Automatic                  Distributes commands across the active paths and available HBAs such that one path is used per LUN. Paths are automatically selected by drivers for supported storage systems.
(2) Dynamic  Least Recently Used (LRU)  Sends commands to the path with the lowest I/O count. Includes special commands such as path verification and normal I/O.
(3) Dynamic  Least Service Time (LST)   Sends commands to the path with the shortest execution time. Does not include special commands.

Figure 2. The /etc/hp_qla2300.conf file

qdepth = 16
port_down_retry_count = 30
login_retry_count = 30
failover = 1
load_balancing = 2
auto_restore = 0x80

A value of 2, Least Recently Used (LRU), was used. In this way, the selected paths were balanced across the HBAs and switches automatically for the EVA8000, and load balanced across HBAs to the same controller within the same switch for the EVA5000.
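Because the driver options are read from the initrd at boot, changes to /etc/hp_qla2300.conf generally require the ramdisk to be rebuilt and the host rebooted. The HP driver package normally handles this during installation; a manual sketch for RHEL4, assuming the standard mkinitrd workflow, is:

# vi /etc/hp_qla2300.conf            (set load_balancing = 2)
# mkinitrd -f /boot/initrd-$(uname -r).img $(uname -r)
# reboot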

OCFS2
OCFS is the Oracle Clustered File System; it was required for the RAC configuration, where a shared file system must be employed. OCFS provides a Distributed Lock Management (DLM) facility to coordinate writes to the files it contains. OCFS version 2 (OCFS2) was used for this environment and is currently the only supported version of OCFS.

Note: OCFS, ASM, and RAW devices can all be used with RAC, depending on platform and business requirements.


OCFS disk configuration

The virtual disks (Vdisks) are zoned so that both hosts have visibility to all LUNs for use with RAC. The single LUN containing u06 and u07 holds the flash recovery area and the Voting and CRS files, respectively. Each of the LUNs has a single partition, which is then formatted as OCFS. Figure 3 shows the hosts and LUN visibility. For OCFS format information, see Figure 4.

Figure 3. Shared OCFS2 LUN presentation to RAC nodes


Setting up the OCFS Clustered File Systems

Each of the five EVA virtual disks was formatted as an OCFS file system with the defaults listed in Figure 4.

Figure 4. Viewing OCFS-formatted devices in the OCFS2 console

The OCFS2 console simplifies the formatting and configuration of the OCFS file systems and helps set up lock management and the OCFS heartbeat. The cluster size was set at 64 K with a block size of 4 K. The command line remains valuable for mounting OCFS file systems. If this is a RAC and the file system will be used to store the Voting Disk file (CRS), Oracle Cluster Registry (OCR), data files, redo logs, archive logs, or control files, the file system should be mounted with the datavolume and nointr options. For example:

# mount -o _netdev,datavolume,nointr /dev/cciss/c0d7p1 /data

The default /etc/fstab entries were modified to include the _netdev and datavolume options:

LABEL=CFT106_DATA1 /u02 ocfs2 _netdev,datavolume,nointr 0 0

The datavolume option is necessary when putting data files on an OCFS volume. Prior versions of OCFS did not allow data files to be placed on OCFS volumes; the only allowed file type was the shared Oracle Home. The datavolume option was not added until OCFS version 2.

Note: All of the OCFS2 file systems need the _netdev option specified. This guarantees that the network is started before the file systems are mounted and that they are unmounted before the network is stopped.
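Before the shared volumes can be mounted on both RAC nodes, the O2CB cluster stack must be told about the nodes, and each LUN partition must be formatted with the block and cluster sizes noted above. A minimal sketch follows; the node names, IP addresses, and device name are examples, and the label matches the fstab entry shown earlier:

/etc/ocfs2/cluster.conf:

cluster:
        node_count = 2
        name = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.0.1
        number = 0
        name = racnode1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.0.2
        number = 1
        name = racnode2
        cluster = ocfs2

Enable the cluster stack, then format one of the LUN partitions:

# service o2cb enable
# mkfs.ocfs2 -b 4K -C 64K -N 2 -L CFT106_DATA1 /dev/sda1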


For more information, see the OCFS2 Users Guide at: http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_users_guide.pdf

Working with Benchmark Factory


Benchmark Factory 5.1 Beta3 was used for workload generation. Benchmark Factory is a generic workload-generation utility that can perform DSS and OLTP workloads; it can also perform a custom workload based on any trace generated by a database or any custom SQL. The workloads used for OLTP testing, in accordance with the industry standard, are defined in Table 3.

Table 3. Benchmark scale factors

Host         Size     User load
IA64 RAC     2.4 TB   1,000
IA64 Single  1.5 TB   500
IA32 Multi   750 GB   500

After creating the database, the Oracle spfile parameters were tuned to give the best backup performance during workloads. Refer to Table 5 for the specific Oracle parameters used during the testing; these are the optimal settings for this environment. Benchmark Factory scale factors are approximate and should not be used as absolute guides. For example, the scale factor shown in Figure 5 should create approximately 930 GB of data, when in fact it created 1.3 TB of data and 300 GB of indexes. The estimate can be exceeded because it does not account for indexes, whose size varies with the database block size.


Figure 5. Benchmark scale factors

As stated, the scale factor shown is an estimate. The actual size of the database must also include indexes, which may amount to as much as 33% of the total data size after the data has been completely generated. The generated data alone may be as much as 20% greater than the estimated size. Table 4a and Table 4b show the table and index parameters, respectively.
Table 4a. Benchmark Factory table settings

All nine tables (C_ORDER_LINE, C_STOCK, C_DISTRICT, C_CUSTOMER, C_HISTORY, C_ORDER, C_ITEM, C_WAREHOUSE, and C_NEW_ORDER) were created with the same parameters:

tablespace bmf parallel (degree default instances default) nologging cache monitoring

Table 4b. Benchmark Factory index settings

All nine indexes (C_STOCK_I1, C_WAREHOUSE_I1, C_NEW_ORDER_I1, C_CUSTOMER_I2, C_ORDER_LINE_I1, C_ORDER_I1, C_ITEM_I1, C_DISTRICT_I1, and C_CUSTOMER_I1) were created with the same parameters:

tablespace bmf parallel (degree default instances default) nologging compute statistics
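Because the generated data and index sizes drift from the scale-factor estimate, it is worth measuring the actual segment sizes after the load. A quick check, assuming the Benchmark Factory objects reside in the bmf tablespace as shown above:

# sqlplus -S "/ as sysdba" <<'EOF'
SELECT segment_type, ROUND(SUM(bytes)/1024/1024/1024) AS size_gb
  FROM dba_segments
 WHERE tablespace_name = 'BMF'
 GROUP BY segment_type;
EOF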

Oracle parameter changes

Table 5 lists the parameter changes that were made, along with the defaults. These changes were made to accommodate the maximum user load and maintain performance while workloads ran concurrently with backups.

Table 5. Changed Oracle parameters

Oracle parameter               Default   Used
sort_area_size                 65536     262144
parallel_max_servers           15        2048
parallel_threads_per_cpu       2         8
db_files                       200       1024
processes                      150       1250
dbwr_io_slaves                 0         4
backup_tape_io_slaves          FALSE     TRUE
db_file_multiblock_read_count  8         128
cursor_sharing                 EXACT     FORCE
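These values can be set in the spfile with ALTER SYSTEM; static parameters take effect after an instance restart. A brief sketch covering a few of the parameters in Table 5 (note that the tape slave parameter is spelled backup_tape_io_slaves in Oracle 10g):

# sqlplus -S "/ as sysdba" <<'EOF'
ALTER SYSTEM SET processes = 1250 SCOPE=SPFILE;
ALTER SYSTEM SET db_files = 1024 SCOPE=SPFILE;
ALTER SYSTEM SET dbwr_io_slaves = 4 SCOPE=SPFILE;
ALTER SYSTEM SET backup_tape_io_slaves = TRUE SCOPE=SPFILE;
ALTER SYSTEM SET db_file_multiblock_read_count = 128 SCOPE=SPFILE;
ALTER SYSTEM SET cursor_sharing = FORCE SCOPE=SPFILE;
EOF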

OLTP workload results


After the databases were populated with Benchmark Factory, the OLTP workloads were begun. The following results show the impact to the OLTP workload generated against each database by Benchmark Factory. This workload is an industry-standard workload, and the user levels exercised were 500 and 1,000 users, depending on the system. Table 6 shows the transaction rates without a backup running (OLTP baseline) and with each type of backup running.

Table 6. OLTP workload impact by backup method

Host type    Database size (GB)  OLTP baseline (TPS)  Disk backup (TPS)  VLS backup (TPS)  EML backup (TPS)  User levels
IA64 RAC     3090                49.5                 49.5               49.5              49.45             1,000
IA64 Single  1373                24.86                24.5               24.5              24.35             500
IA32 Multi   750                 22.38                16.7               21.83             21.8              500


The IA64 RAC system performed best, showing no negative impact to users regardless of backup activity. The IA64 Single system also showed almost no impact to users and operated at half the transaction rate of the entire RAC. The IA32 Multi system also showed a flat performance curve, with the exception of the disk backup, which dropped throughput by approximately 6 TPS. This was mainly due to the I/O wait incurred by the system, which impacted the TPS; it could be mitigated by adding more RAM or more HBAs. The next section outlines the backup and restore performance of each methodology in contrast to the OLTP workload.

Oracle backup and restore


The major goal of the project was to back up the Oracle databases from each server directly to the tape, virtual tape, and disk devices. The backups were conducted in two scenarios: the first without a simulated user load, and the second with load. During the first scenario, the database had no workload applied and the backup was performed to disk, tape, and virtual tape. During the second scenario, the database was put under a peak OLTP workload and then backed up to disk, tape, and virtual tape. This was done to observe the interaction between the workload and the backup, to understand what impact the backup would have on the client experience, and vice versa.

A total of 200 data files were backed up using RMAN through NetBackup, for a combined total of approximately 5 TB backed up from each server using each of the methodologies in this paper. Data was sent to multiple drives in multiple streams, where media allowed, to achieve the highest possible throughput for each configuration. One channel, or stream, was configured per tape device, and two streams per disk device (also known as multiplexing) were used during the backup. Restores from each backup device were conducted to observe the performance of each methodology. The most important metric captured was the actual speed of the restore as a factor of the overall Time-to-Recover (TTR), which allowed the possible performance within an environment for each of the backup technologies to be gauged.
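The channel arrangement described above maps naturally onto an RMAN run block. A hedged sketch for a four-drive tape target such as the EML follows; the actual templates used in the testing were generated by NetBackup (see Appendix C for examples):

run {
  allocate channel t1 type 'SBT_TAPE';
  allocate channel t2 type 'SBT_TAPE';
  allocate channel t3 type 'SBT_TAPE';
  allocate channel t4 type 'SBT_TAPE';
  backup database include current controlfile;
  backup archivelog all;
}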

NetBackup policies
NetBackup policies determine the way a backup is executed. While each of the backup policies has similar attributes, a template or script is the ultimate influence on how the backup is performed. Since the policy type used is Oracle, the EML, VLS, and disk backups all share common policy settings. The policy settings used include:
- Policy Type: Set to Oracle.
- Policy Storage Unit: Select the appropriately configured storage unit for the volume pool.
- Policy Volume Pool: Set to the volume pool associated with the storage unit device.
- Schedule: Two schedule types must exist if automated backups are to work, an Automatic Backup type and an Application Backup type. The defaults are Full and Default-Application-Backup.
- Clients: A list of clients to be backed up by this policy. At least one client must be defined on an active policy.
- Backup Selections: This can be a template or script.

Multiplexing cannot be set in the Oracle policy type, since RMAN already performs a level of multiplexing, which can be specified in the template.


Oracle templates
On the NetBackup media server, you can create a template using the Java NetBackup Admin Console as root or a defined administration user:

# /usr/openv/netbackup/bin/jnbSA

When the admin console starts, select the Backup Files tab and then select the checkbox next to the database name. A dialog box appears and asks for database connection information. When logged in, the Backup button displays, along with the rest of the database objects that can be backed up. Selecting the entire database implies a database backup, but granular objects can be selected as well. Clicking the Backup button starts the Backup Wizard. The following information must be provided:
- Authentication
    - Select System or Oracle
    - Select RMAN catalog if you are using a catalog
- Archived redo logs
    - Include archived redo logs and specify the range, if any
    - Delete archived logs after they are backed up, if desired
- Configuration
    - Default
    - Existing template
- Backup options
    - Backup file name format: Set the format you want to use for backup files
    - Backup set identifier: Set the identifier you want the template to employ, if any
- Database state
    - Online
    - Offline
- Configuration variables
    - Backup policy name: An Oracle policy name on the EMM server
    - Schedule name: A schedule name of type Application Backup on the EMM server
    - Server name: The EMM server
    - Client name: The media server or remote Oracle client system
- Backup limits
    - RMAN defaults: The currently configured RMAN defaults
    - Specify maximum limits
        - I/O limits: Read rate, size of backup piece, number of open files
        - Backup set limits: FilesPerSet, MaxSize, ArchivelogMaxSize
        - I/O output: Number of streams (channels) to configure for the backup

The last step is to run the backup, save the backup to a template, do both, or cancel. Figure 6 shows the initial screen after entering the database credentials.


Figure 6. Java NetBackup Admin Console
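Once a template-driven backup has run, the resulting Oracle backup images can be listed from the media server with bplist. A sketch assuming a default installation path and a hypothetical client name (the -t 4 flag selects the Oracle policy type):

# /usr/openv/netbackup/bin/bplist -C dbserver1 -t 4 -R /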

Setting up the storage units


In NetBackup, the storage unit defines the relationship between the host and its storage device. The storage unit is where each of the database servers was selected as the host for its backup devices. Each of the hosts shares the tape devices, and the storage units are modified per host and treated as separate storage units. The disk devices are presented explicitly to each host and are truly separate in every respect.

On the tape storage unit, select Maximum concurrent write drives and Maximum streams per drive. Maximum concurrent write drives tells the host how many drives may be used simultaneously, and Maximum streams per drive tells the host how many streams can be sent simultaneously to any one drive if multiplexing is desired. During these tests, only Maximum concurrent write drives was used, set to a value equal to the number of available drives per tape storage unit.

On the disk storage unit, set the path in Absolute pathname to directory and Maximum concurrent jobs for the device. You may also select Enable temporary Staging Area if you want to enable scheduled disk staging; you must also create a schedule for staging with the provided menu button. This storage unit is created when using the Device Configuration wizard within NetBackup. HP and Symantec recommend that the Device Configuration wizard always be used to create tape libraries and their associated devices.

Set the master server jobs global attribute


In the master server host properties, Maximum Jobs Per Client must be set to avoid job contention at the master server. This is the total number of active jobs that can run concurrently at any one time. Since the maximum number of devices configured for a library was 12, this value had to be set to a minimum of 12, but it could be set higher if required.


Backup issues
Since each environment is different, be aware that issues may arise that require patches or workarounds to allow proper backup execution. One example was encountered shortly after beginning tests: archive logs would not back up properly after full backups. Expiring all media and performing the backup again would allow the archive logs to be backed up once, but not afterward. Since this is not an optimal situation, applying MP2 was required to fix the issue. For other issues, see Appendix D. Other issues.
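For reference, the archive log portion of a backup of the kind affected by this issue looks roughly like the following RMAN fragment (a sketch; the delete input clause is optional and mirrors the wizard's "delete archived logs after backup" choice):

run {
  allocate channel t1 type 'SBT_TAPE';
  backup archivelog all delete input;
}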

Setting up restores
Three options exist for performing restores to the Oracle database servers in this environment:
- Create a restore job on the media server with the Java NetBackup Admin Console.
- Manually create a script on the media server and execute the restore with RMAN at the command line.
- Copy a backup template and modify the RMAN script portion to perform restores instead of backups. You can then create a backup policy but use the restore template.

For files backed up on one machine that are to be restored to another machine, a parameter must be set in the bp.conf file located in the /usr/openv/netbackup directory. The line to be added must be in the following format:

FORCE_RESTORE_MEDIA_SERVER = backup_server,restore_server

where backup_server is the host name of the server that performed the backup and restore_server is the host name of the server that needs to perform the restore.

RMAN and NetBackup automated the restore process of de-multiplexing files and mounting the required backup images, regardless of media type. This is especially helpful with disk backups and disk staging.
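A command-line restore (the second option above) can be driven entirely from the media server. A minimal sketch, assuming the NetBackup SBT library is already configured for the instance and the database is mounted:

# rman target / <<'EOF'
run {
  allocate channel t1 type 'SBT_TAPE';
  allocate channel t2 type 'SBT_TAPE';
  restore database;
  recover database;
}
EOF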
Note: For the restore to be successful, the control file from the last backup must be available or you must have an RMAN catalog configured.

Backup and restore performance results


The testing effort provided the following results:
- Successfully backed up at a rate of approximately 1 TB per hour
- Successfully recovered the database after simulated catastrophic data corruption


Backup methodologies
Disk-to-disk backups: Backing up Oracle 10gR2 to EVA5000
For this test, RMAN performed a tape backup, and NetBackup translated the data streams into files and wrote them to the defined disk storage units. The I/O load was balanced across HBAs and controller ports.

Disk-to-virtual tape backups: Backing up Oracle 10gR2 to VLS
For this test, RMAN performed a tape backup using 12 streams to the VLS. The I/O load was balanced across HBAs and controller ports.

Disk-to-tape backups: Backing up Oracle 10gR2 to EML
For this test, RMAN performed a tape backup using four streams to the EML. The I/O load was balanced across HBAs and controller ports.

EVA performance results


EVA5000 RAW performance characterization

Raw write testing of the EVA5000 was performed with the dd utility, a low-level UNIX tool that performs sequential I/O with a modifiable block size. This testing provided a performance baseline for the tested configuration and showed the raw performance of the EVA5000 as seen from a host. An example of a dd command is:

dd if=/dev/zero of=/backup1/test.fil bs=256K

The raw sequential large-block performance yielded approximately 190 MB/s for a single port on a single HSV110 controller. This performance doubled to 380 MB/s for two simultaneous dd commands to two different LUNs on different HSV110 controllers.

EVA8000 RAW performance characterization

Raw write testing of the EVA8000 was performed with the dd utility. An example of a dd command is:

dd if=/dev/zero of=/u02/test.fil bs=256K

The raw sequential large-block performance yielded approximately 190 MB/s for a single Vdisk using multiple HBAs and controllers. This performance doubled to 380 MB/s for two simultaneous dd commands to two different LUNs. The I/O load was balanced across HBAs and controller host ports to avoid I/O bottlenecks.
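The doubling to 380 MB/s was measured by running two dd streams in parallel against LUNs owned by different controllers. A sketch of that test follows; the second mount point is an example, and count=40960 writes 10 GB per stream at a 256-KB block size:

# dd if=/dev/zero of=/backup1/test1.fil bs=256K count=40960 &
# dd if=/dev/zero of=/backup2/test2.fil bs=256K count=40960 &
# wait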

EVA5000 backup and restore results


The following results show the speed of the disk-to-disk backups. These results were generated by performing a disk backup with a NetBackup block size of 256 KB, with and without a workload applied.

Table 7. Backup results for disk-to-disk backups using EVA5000, VCS 4.001, with OLTP load

Host type    Database size (GB)  Backup LUNs  Backup time (hrs:min)  Channels  Backup rate (GB/hr)
IA64 RAC     2400                2            7:06                   12        338.03
IA64 Single  1320                1            4:10                   12        316.98
IA32 Multi   600                 1            7:42                   8         78.43


At first these results might not mean much, but by examining the backup time you can see whether it meets the backup window requirements. The target of approximately 1 TB per hour was missed. For comparison, Table 10 shows the restore performance for this EVA5000 configuration.

Table 8. Backup results for disk-to-disk backups using EVA5000, VCS 4.001, without OLTP load

Host type    Database size (GB)  Backup LUNs  Backup time (hrs:min)  Channels  Backup rate (GB/hr)
IA64 RAC     2400                2            5:47                   12        415
IA64 Single  1320                1            3:19                   12        400
IA32 Multi   600                 1            4:15                   8         140

The IA64 systems performed best overall, but the target of approximately 1 TB/hr was still missed. When contrasted with the results during the OLTP workload, you can see that performing a backup during peak hours is not advised when trying to meet specific backup times. One of the factors behind the results of both of these backups is the firmware. Table 9 shows the same type of backup but with VCS 3.028 and cache mirroring disabled for the backup LUNs.

Table 9. Backup results for disk-to-disk backups using EVA5000, VCS 3.028, without OLTP load

Host type    Database size (GB)  Backup LUNs  Backup time (hrs:min)  RMAN channels  Backup rate (GB/hr)
IA64 RAC     3090                2            4:24                   12             702
IA64 Single  1373                1            2:46                   12             508
IA32 Multi   750                 1            2:10                   8              245

Using the back-rev firmware improved backup times dramatically, nearly doubling the throughput for each system. The largest contributing factor was turning off controller cache mirroring.

Table 10. Restore results for disk-to-disk backups using EVA5000, VCS 4.001, offline restore

Host type    Database size (GB)  Backup LUNs  Restore time (hrs:min)  RMAN channels  Restore rate (GB/hr)
IA64 RAC     2400                2            5:54                    12             406.78
IA64 Single  1300                1            2:05                    12             624.00
IA32 Multi   750                 1            7:36                    8              87.21

The two IA64 systems performed best during restores, much better than the IA32 system overall. The RAC system did not perform as well on the restore as the single-instance database. Two factors caused these results:
- The RAC system had imbalanced backups because of NetBackup.
- The disk I/O was shared on the same HBA as the backups.

The IA32 system performed poorly due to two similar factors:
- The disk I/O was shared on the same HBA as the backups.
- Bigfile tablespaces were used.

The side effects of these factors were long backups and restores using only one channel, because of the limited capabilities of the hardware and the single bigfile tablespace used to store the entire 150-GB database.


From these results, you can conclude:
- Using bigfile tablespaces can hamper overall restore results if you do not have enough tablespaces to spread across multiple devices.
- Using VCS 4.001 or higher and RAID1 yields the best protection from failures by removing the ability to turn off cache mirroring and by not taxing the controllers with RAID5 parity overhead.
- Using VCS 3.028 firmware and RAID0 LUNs yields the best performance by allowing you to turn off cache mirroring, but eliminates any possibility of recovery from failures.
- Properly balancing datafiles/tablespaces to RMAN channels and tape devices yields the best results.

EML E-Series backup and restore performance results


The following results show the speed of the disk-to-tape backups. These results were generated by performing a tape backup with NetBackup using a block size of 256 KB, with and without a workload applied.

Table 11. Backup results for disk-to-tape backup using EML 103e, with OLTP load

Host type    Database size (GB)  Backup LUNs  Backup time (hrs:min)  RMAN channels  Backup rate (GB/hr)
IA64 RAC     2400                2            5:48                   4              413.79
IA64 Single  1300                1            2:41                   4              484.47
IA32 Multi   600                 1            5:07                   4              117.26

The two IA64 backups were similar in their rates. There is still an imbalance due to NetBackup, but the backup is more streamlined on the system bus and there is no contention from the EVA5000 cache. The cause of, and fix for, this imbalance is discussed in Appendix C. Examples. The IA32 system still has fairly low performance overall, but better than disk-to-disk. This is because one of the limitations was removed by backing up from one FC port to two separate FC ports; congestion at the system bus or EVA5000 controller cache was not an issue. Table 12 shows backup results for the EML while not under load.

Table 12. Backup results for disk-to-tape backup using EML 103e, without OLTP load

Host type    Database size (GB)  Backup LUNs  Backup time (hrs:min)  RMAN channels  Backup rate (GB/hr)
IA64 RAC     2400                2            5:24                   4              441.72
IA64 Single  1300                1            2:13                   4              586.67
IA32 Multi   600                 1            3:20                   4              180


The performance characteristics of this test again look different from those of a backup performed under load: each server's backup performed better than the corresponding result in Table 11. This again reinforces that a backup under heavy load is not advised. The IA32 Multi system saw the greatest percentage increase in performance, though it is still not at the level of the IA64 systems.

Table 13. Restore results for disk-to-tape backup using EML 103e, offline restore

Host type    Database size (GB)  Backup LUNs  Restore time (hrs:min)  RMAN channels  Restore rate (GB/hr)
IA64 RAC     2400                2            3:25                    4              702.44
IA64 Single  1300                1            1:42                    4              764.71
IA32 Multi   600                 1            2:25                    4              248.28

These results are fairly good for each server. Each restore was done offline, so each server had more resources available for the restore than it would have had for the backup. The IA32 system could yield even higher performance if more bigfile tablespaces were employed, or if smallfile tablespaces were used in place of the single 150-GB tablespace.

From these results, you can conclude:
- Tape backup is very sensitive to the layout of the data paths and device visibility.
- As long as the files being backed up are of adequate size, tape streaming can perform at decent levels of I/O.
- Tape restores are fast, while backups are highly impacted if poor channel balancing occurs.

VLS6510 backup and restore performance results


The following results show the speed of the disk-to-virtual tape backups. These results were generated by performing a tape backup with NetBackup using a block size of 256 KB, with and without a workload applied.

Table 14. Backup results for disk-to-virtual tape backup using VLS6510, without OLTP load

Host type    Database size (GB)  Backup LUNs  Backup time (hrs:min)  RMAN channels  Backup rate (GB/hr)
IA64 RAC     2400                2            2:36                   12             923.08
IA64 Single  1320                1            1:44                   12             761.54
IA32 Multi   425                 1            3:20                   8              196.15

The backup results show the high throughput the VLS can deliver. The approximately 1-TB/hr target was met on the RAC system, and each system showed a large improvement over both disk and physical tape during backup.


The VLS6510 allows current tape users to enjoy the speed of disk. Since the VLS is emulating any type of tape device and multiple libraries, you could use this system for consolidation when moving to one type of tape media and devices.
Table 15. Backup results for disk-to-virtual tape backup using VLS6510, with OLTP load

Host type    Database size (GB)  Backup LUNs  Backup time (hrs:min)  RMAN channels  Backup rate (GB/hr)
IA64 RAC     2400                2            2:44                   12             883.44
IA64 Single  1320                1            2:43                   12             484.47
IA32 Multi   425                 1            3:45                   8              113.21

Although the RAC backup remains close to the approximate 1-TB/hr target, there is a large decrease in throughput under workload. The single largest percentage decrease is seen on IA32 Multi, with IA64 Single following closely behind. This further reinforces that backups during peak workloads should be avoided. Table 16 shows the VLS restore results.
Table 16. Restore results for disk-to-virtual tape backup using VLS6510, without OLTP load

Host type    Database size (GB)  Backup LUNs  Restore time (hrs:min)  RMAN channels  Restore rate (GB/hr)
IA64 RAC     2400                2            9:25                    12             254.87
IA64 Single  1320                1            2:46                    12             477.11
IA32 Multi   600                 1            1:48                    8              333.33

Some of these results are startling, the first being the low restore performance on IA64 RAC compared with its backup results. The cause is the backup imbalance mentioned earlier in the paper: when a scripted backup is performed that prevents NetBackup from modifying the RMAN script, the backup and restore results are similar rather than highly contrasted as seen here. The backup performance masks the issue, but the restore is impacted. The same is true for the IA64 Single system, though to a lesser degree: because only a single LUN is being written to on IA64 Single, the impact is smaller. Because IA64 RAC restores to two LUNs, tape channels must be waited on to complete, so the restore finishes with a lesser degree of parallelism than on IA64 Single. IA32 Multi performed very well for the VLS restore, considering that bigfile tablespaces were used and were again serialized; if several smallfile or many bigfile tablespaces were used, this restore could be improved. From these results, you can conclude:
• Backups are sensitive to the layout of the data paths, device visibility, and the ratio of the number of files to RMAN device channels.
• The VLS is very easy to use and maintain, allowing administrators to incorporate a fast backup method into any tape environment. This allows shorter backup windows than traditional tape.
• Configuring many emulated devices on the VLS is essential in achieving maximum performance from the VLS.
• Using four MSA20 disk shelves, the maximum for the VLS6510, is a must where high-throughput backups are required. It is also essential when emulating multiple types of libraries, tape devices, and media.


Conclusions
The data derived from the workload, backup, restore, and EVA5000 VCS tests provide insight into the important characteristics of each methodology discussed.

Oracle RMAN
• Using few bigfile tablespaces can hamper overall restore results if there are not enough tablespaces to spread across all backup devices.
• Preventing poorly balanced channels will improve backup performance.
• Conducting backups under load is not advised where short backup windows are required.

Disk-to-disk backup
• Using 4.001 or higher code and RAID1 yields the best protection from failures, by eliminating the capability to turn off cache mirroring and by not taxing controllers with RAID5 parity overhead.
• Using 3.028 firmware and RAID0 LUNs yields the best performance by allowing cache mirroring to be disabled, but has the potential to corrupt the backup if a failure occurs.
• Using the same HBA for disk backups can cause contention on the PCI bus, resulting in slower backups.
• The EVA5000 VCS 4.00x cannot disable use of the cache mirror port, which is used for more than just host LUN cache mirroring.
• The need for Active/Active versus Active/Passive capabilities should be weighed before implementation.

Disk-to-tape backup
• Tape backup is sensitive to the layout of the data paths and device visibility.
• Having properly sized files to back up will allow tape streaming to perform well.
• Using OS and hardware tape buffering, as well as enabling hardware compression with a proper block size, will allow the tape device to perform at its peak.
• Tape restores are fast, while backups are highly impacted if poor channel balancing occurs.

Disk-to-VLS backup
• Backups are sensitive to the layout of the data paths, device visibility, and the ratio of the number of files to RMAN device channels.
• The VLS is very easy to use and maintain, allowing administrators to incorporate a fast backup method into any tape environment. This allows shorter backup windows than traditional tape.
• Configuring many emulated devices on the VLS is essential in achieving maximum performance from the VLS.
• Using four MSA20 disk shelves, the maximum for the VLS6510, is a must where high-throughput backups are required. It is also essential when emulating multiple types of libraries, tape devices, and media.


Server configuration
• The DL580 server system was heavily loaded as a multi-database server. A DL580 server with more memory and HBAs is advised when running high transaction levels on a server of this type.
• System bus contention should be avoided by ensuring servers have enough HBAs and PCI-X busses.
• If too much swapping occurs during backup, add RAM to the system until swapping is reduced to acceptable levels (a quick way to watch for this is sketched below).
• The latest QLogic driver as of August 11, 2006, supports dynamic load balancing; earlier versions do not. This matters because servers can avoid possible I/O contention, improving backup and restore performance.
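As a quick check for the swapping item above, memory pressure can be watched from the OS while a backup runs. This is a generic sketch using standard Linux tools rather than a procedure from the test itself:

# Sample memory and swap activity every 5 seconds during the backup window;
# sustained non-zero si/so columns mean the host is swapping
vmstat 5
# One-time snapshot of memory and swap usage in megabytes
free -m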

Best practices
During testing, several best practices were developed to improve backup and recovery performance for each scenario.

Best practices for disk-to-disk backups on the EVA5000 storage array


• Use RAID0 for backup target LUNs for best performance on VCS 3.028 firmware.
• Use RAID1 for lowest-overhead data protection on VCS 4.001 firmware.
• Use VCS 3.028 firmware for higher performance by disabling cache mirroring for backup LUNs.
Note VCS 4.00x firmware includes several critical bug fixes and other improvements. Read the VCS 4.001 and 3.028 release notes carefully before choosing to use a down-rev firmware version.

• Use as many disks as possible for a given Disk Group. Create Disk Groups of at least 56 FATA disks for highest throughput per Disk Group.
• Use two or more LUNs for each host to spread data streams across controllers for improved bandwidth.

Best practices for disk-to-tape backups on the EML E-Series Tape Library
Several settings were tuned to achieve optimum performance. These settings should be tuned for the specific environment and, when possible, validated first within a test environment. The following are the important tuning options used to achieve good performance during the backup to tape.

Buffer configuration: The bulk of the NetBackup tuning was done at this level. Three touch files were used within NetBackup to achieve better levels of performance with respect to the use of memory buffers. These files are typically located in the /usr/openv/netbackup/db/config directory. A document that explains buffer configuration for NetBackup can be found at http://seer.support.veritas.com/docs/183702.htm. Another recommended document regarding buffer configuration for LTO3 and NetBackup can be found at http://h71028.www7.hp.com/ERC/downloads/5982-9971EN.pdf. A sketch of creating these touch files follows.
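As a sketch, the three touch files can be created as shown below. The values are the ones used in this test; the path assumes a default NetBackup installation on the media server:

# NetBackup reads each tuning value from a plain-text file of the same name
mkdir -p /usr/openv/netbackup/db/config
echo 32     > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS
echo 262144 > /usr/openv/netbackup/db/config/SIZE_DATA_BUFFERS
echo 32     > /usr/openv/netbackup/db/config/NUMBER_DATA_BUFFERS_RESTORE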


Block settings: The OS-level tape block settings were configured using the /etc/stinit.def file to specify settings for the LTO tape devices. The following settings were used in the test environment (a sketch of applying the stinit.def definitions follows this list):
• NUMBER_DATA_BUFFERS: The number of buffers used by NetBackup to buffer data before sending it to the tape drives. The default value is 16; it was set to 32.
• SIZE_DATA_BUFFERS: The size of each buffer; total buffer memory is this value multiplied by NUMBER_DATA_BUFFERS. The default value is 65536; it was set to 262144.
• NUMBER_DATA_BUFFERS_RESTORE: The number of buffers used by NetBackup to buffer data before writing it to disk. The default value is 16; it was set to 32.
• Blocksize (optional): A stinit.def setting, used by stinit to set each tape device's defaults. A setting of 0 is used so that the block size is determined automatically at write time.
• Drive-buffering: A stinit.def setting, used by stinit to set each tape device's defaults. Adding this to the stinit device definition enables hardware buffering for the LTO3 tape device (this parameter can only be used if the drive is buffer capable).
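After editing /etc/stinit.def (Figure 14 shows the definitions used in this environment), the settings can be applied and verified. This sketch assumes the stinit utility from the mt-st package and a first tape device at /dev/st0:

# Apply the stinit.def definitions to the matching tape devices
/sbin/stinit
# Verify the resulting block size and buffering flags on the first drive
mt -f /dev/st0 status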

Best practices for disk-to-virtual tape backups on the VLS Virtual Tape Library
Several settings were tuned to achieve optimum performance. These settings should be tuned for the specific environment and, when possible, validated first within a test environment. The VLS6510 uses four MSA20 disk arrays to emulate tape devices.

Buffer configuration: The bulk of the NetBackup tuning was done at this level. Three touch files were used within NetBackup to achieve better levels of performance with respect to the use of memory buffers. These files are typically located in the /usr/openv/netbackup/db/config directory. A thorough document that explains buffer configuration for NetBackup can be found at http://seer.support.veritas.com/docs/183702.htm. Another recommended document regarding buffer configuration for LTO3 and NetBackup can be found at http://h71028.www7.hp.com/ERC/downloads/5982-9971EN.pdf.

Block settings: The OS-level tape block settings were configured using the /etc/stinit.def file to specify settings for the LTO tape devices.

Number of devices: The number of devices emulated contributes greatly to the overall performance of the VLS. The best performing configuration is generally a 3:1 or 4:1 ratio of emulated devices to MSA20 disk shelves attached to the VLS Interface Controller.

The following settings were used in the test environment:
• NUMBER_DATA_BUFFERS: The number of buffers used by NetBackup to buffer data before sending it to the tape drives. The default value is 16; it was set to 32.
• SIZE_DATA_BUFFERS: The size of each buffer; total buffer memory is this value multiplied by NUMBER_DATA_BUFFERS. The default value is 65536; it was set to 262144.
• Blocksize: A stinit.def setting, used by stinit to set each tape device's defaults. A setting of 0 is used so that the block size is determined automatically at write time.
• Drive-buffering: A stinit.def setting, used by stinit to set each tape device's defaults. Adding this to the stinit device definition enables hardware buffering if the drive is capable.


Best practices for using Oracle Recovery Manager (RMAN)


• Use a separate RMAN catalog database: This helps manage backups by providing redundancy to the NetBackup catalog and retaining information otherwise backed up only in control files. A catalog database also makes it possible to restore the image copies required if you choose to create an Oracle Data Guard physical standby database, and can add protection against, and simplify recovery from, lost controlfiles.
• Protect your RMAN repository, NetBackup catalog, or both: The RMAN catalog should be backed up and redundant copies kept on separate media. Create a NetBackup catalog backup policy upon installation. Create a binary copy of the controlfile if not using an RMAN catalog.
• Enable Block Change Tracking to improve the speed of incrementals: This is especially important for incremental backups. Without a BCT table, an incremental level 1 backup will take as long as a full level 0 (see the sketch after this list).
• Use flash recovery for online point-in-time recovery and transaction rollback tracking: Flash recovery is great for online recovery and helps prevent the need to use backups for recovery. If implemented, flash recovery should be part of storage growth planning, as it can consume a very large amount of storage; the actual amount depends on the number of days of data retained and the database transaction rate. Configuring even a small flash recovery area will at least ensure you have a recent copy of a binary control file.
• Choose effective backup policy types: Incorporating full and incremental backups should help ease recovery, but one policy does not fit all. Understanding how and when to use cumulative versus differential backups is also important for your recovery strategy.
• Maintain your Oracle backup images effectively: Use Oracle 10g's incremental merge to create a current full image copy from incrementals. Creating a full backup often helps ensure recoverability, but even with incrementals handy you cannot recover unless you have all the archive logs since the previous full. If using disk backups as the primary recovery option, move old on-disk backups to tape with the backup database backupset command or NetBackup Media Copy to manage space.
• Test copies of backups: One definitive way to test a backup is to restore it to an alternate location and attempt to start the restored database. The next method is the RMAN validate command. Finally, use the RMAN crosscheck command to verify that backup, data, and archive log files are still located on the target media.
• Manage your archived online logs and keep them safe: Ensure archived logs are part of full and incremental backups, and create duplicate copies of media containing archive logs. Manage archive log space by deleting archive logs as part of the backup.
• If you cannot use RMAN, NetBackup can help: NetBackup can help your backup strategy by managing online backups with a single point of management. NetBackup can also provide binary block-level incremental backups and restores; these may be slower than RMAN, but the option is available.
For additional RMAN configuration information, see Appendix B, Configuring Oracle RMAN.
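Two of these practices can be sketched briefly. The tracking-file path and tag name below are assumptions, not values from the test environment:

# Enable Block Change Tracking (run once, as SYSDBA)
sqlplus / as sysdba <<'EOF'
ALTER DATABASE ENABLE BLOCK CHANGE TRACKING
  USING FILE '/u01/app/oracle/bct/ORDB1_bct.f';
EOF

# Oracle 10g incremental merge: each run rolls the previous level 1
# backup into the on-disk image copy, keeping a current full copy
rman target / <<'EOF'
RUN {
  RECOVER COPY OF DATABASE WITH TAG 'incr_merge';
  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'incr_merge' DATABASE;
}
EOF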


Appendix A. Bill of Materials


Backup Server 1: HP Integrity rx7620 server (quantity 1)
• Operating system: Red Hat Linux AS4 U3
• Multi-path solution: QLogic Driver Dynamic Load Balancing
• Backup software solution: Symantec NetBackup Enterprise Server 6.0 (MP2)
• Firmware: MP E.03.13, BMC 03.47, EFI 03.10, System 03.11
• 1.6-GHz CPUs: 8 (4 per partition)
• Memory (GB): 64 (32 per partition)
• A6826A dual-port HBAs: 8 (4 per partition), firmware 3.03.150, driver 1.42

Backup Server 2: HP ProLiant DL580 G2 server (quantity 1)
• Operating system: Red Hat Linux AS4 U3
• Multi-path solution: QLogic Driver Dynamic Load Balancing
• Backup software solution: Symantec NetBackup Enterprise Server 6.0 (MP2)
• 3.0-GHz CPUs: 4
• Memory (GB): 8
• FCA2214 dual-port HBAs: 2, firmware 3.21, driver 1.45

Storage: EVA8000
• EVA8000 (2C12D): 1, V5.110
• 300-GB FC Disk (NDSOMEOMER): 144, HP02
• HP StorageWorks SAN 2/16N switches: 2, V4.2.0c
• Brocade SilkWorm 3800 SAN switches: 2, V3.2.0a
• HP OpenView Storage Management Appliance III: 1, V2.1


Disk-to-disk backup target: EVA5000
• EVA5000 (2C8D): 1, VCS V4.001 and V3.028
• 250-GB FATA Disk (ND25058238): 56, HP01

Disk-to-virtual tape backup target: VLS6510
• VLS6510: 1, V3.020
• 250-GB SATA Disk (ND12341234): 48, HP02b
• HP OpenView Command View TL: 1, V3.2

Disk-to-tape backup target: EML 103e
• EML E-Series 103e Tape Library: 1, V3.020
• Ultrium 960 LTO-3 drives (ND25058238): 4, HP01


Appendix B. Configuring Oracle RMAN


This section shows the recommended defaults for each RMAN instance, as well as suggested options for configuring the RMAN backup.

Figure 7. RMAN configuration defaults

CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP OFF; # default
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '%F'; # default
CONFIGURE DEVICE TYPE DISK PARALLELISM 1 BACKUP TYPE TO BACKUPSET; # default
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE ENCRYPTION FOR DATABASE OFF; # default
CONFIGURE ENCRYPTION ALGORITHM 'AES128'; # default
CONFIGURE ARCHIVELOG DELETION POLICY TO NONE; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO '/u01/app/oracle/product/10.2.0/db_1/dbs/snapcf_ORDB1.f'; # default

You may want to modify some of the preceding defaults, in particular Backup Optimization, Default Device Type, Controlfile Autobackup, Parallelism, and Archivelog Deletion Policy. The following are suggested changes to these settings (a sketch of the corresponding commands follows this list):
• Backup Optimization: Enable this if you plan to use several incrementals and merge them; files that are identical to files already backed up are skipped. It is not very useful for full backups.
• Default Device Type: The default device type may need to be a tape library or WORM drive, so setting this may relieve some scripting.
• Controlfile Autobackup: Highly useful to ensure a controlfile backup is done often.
• Parallelism: When writing backup sets, a value greater than one streams multiple files together to the same channel.
• Archivelog Deletion Policy: Setting this can ease script management, since archivelogs can be deleted at a predefined interval.
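The following sketch applies the suggested changes from the command line; the parallelism value is illustrative, and whether tape or disk is the right default device depends on the environment:

rman target / <<'EOF'
CONFIGURE BACKUP OPTIMIZATION ON;            # skip files already backed up
CONFIGURE DEFAULT DEVICE TYPE TO 'SBT_TAPE'; # default to the tape library
CONFIGURE CONTROLFILE AUTOBACKUP ON;         # controlfile/spfile after each backup
CONFIGURE DEVICE TYPE DISK PARALLELISM 4 BACKUP TYPE TO BACKUPSET;
EOF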


Appendix C. Examples
This section shows RMAN script examples, NetBackup templates, and NetBackup screenshots.

Figure 8. RMAN Full Backup script (four channels configured)

RUN {
ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE';
ALLOCATE CHANNEL ch01 TYPE 'SBT_TAPE';
ALLOCATE CHANNEL ch02 TYPE 'SBT_TAPE';
ALLOCATE CHANNEL ch03 TYPE 'SBT_TAPE';
SEND 'NB_ORA_CLIENT=PRIM1,NB_ORA_SERV=EMMSRV,NB_ORA_POLICY=PRIM1-EML, \
NB_ORA_PC_SCHED=Default-Application-Backup';
BACKUP
  INCREMENTAL LEVEL=0
  FORMAT 'Data_Plus_Arch_%d_u%u_s%s_p%p_t%t'
  TAG 'DB1 Full Standby Backup'
  DATABASE PLUS ARCHIVELOG;
RELEASE CHANNEL ch00;
RELEASE CHANNEL ch01;
RELEASE CHANNEL ch02;
RELEASE CHANNEL ch03;
ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE';
SEND 'NB_ORA_CLIENT=PRIM1,NB_ORA_SERV=EMMSRV,NB_ORA_POLICY=PRIM1-EML, \
NB_ORA_SCHED=Default-Application-Backup';
BACKUP
  FORMAT 'STBYCTLFILE-_%d_u%u_s%s_p%p_t%t'
  CURRENT CONTROLFILE FOR STANDBY;
RELEASE CHANNEL ch00;
}


Figure 9. RMAN Duplicate script

run {
# Auxiliary channels are the only way to restore a database as a duplicate
allocate auxiliary channel ch00 device type 'sbt_tape';
allocate auxiliary channel ch01 device type 'sbt_tape';
allocate auxiliary channel ch02 device type 'sbt_tape';
allocate auxiliary channel ch03 device type 'sbt_tape';
SEND 'NB_ORA_CLIENT=STBY1,NB_ORA_POLICY=STBY1-EML,NB_ORA_SERV=EMMSRV, \
NB_ORA_SCHED=Default-Application-Backup';
duplicate target database for standby;
release channel ch00;
release channel ch01;
release channel ch02;
release channel ch03;
}


Figure 10. NetBackup 6 Oracle template

#^oracle template configuration file <<MUST BE FIRST IN FILE, DO NOT REMOVE>>
# Template level: 1.9.0
# Generated on: 06/28/06 16:01:13
# -----------------------------------------------------------------
TEMPLATE_ID1=<SOURCE TEMPLATE>
TEMPLATE_ID2=<CURRENT TEMPLATE>
TEMPLATE_OWNER=root
RUN_AS_USER=oracle
# -----------------------------------------------------------------
# BACKUP_TYPE is derived from the schedule type when this script
# is used in a NetBackup scheduled backup. For example, when:
BACKUP_TYPE=INCREMENTAL LEVEL=0
ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1
ORACLE_SID=PRIM1
TARGETDB_LOGIN=sys
TARGETDB_PASSWD=<SHA128 Encoded Password>
TARGETDB_TNSNAME=PRIM1
# -----------------------------------------------------------------
# RMAN command section
# -----------------------------------------------------------------
RUN {
ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE';
SEND 'NB_ORA_CLIENT=Client1,NB_ORA_POLICY=Oracle-Policy, \
NB_ORA_SERV=EmmServer,NB_ORA_SCHED=Default-Application-Backup';
BACKUP
  INCREMENTAL LEVEL=0
  FILESPERSET 1
  MAXOPENFILES 8
  FORMAT 'bk_u%u_s%s_p%p_t%t'
  DATABASE;
RELEASE CHANNEL ch00;
# Backup Archived Logs
sql 'alter system archive log current';
ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE';
SEND 'NB_ORA_CLIENT=Client1,NB_ORA_POLICY=Oracle-Policy, \
NB_ORA_SERV=EmmServer,NB_ORA_SCHED=Default-Application-Backup';
BACKUP
  FORMAT 'arch-s%s-p%p-t%t'
  ARCHIVELOG ALL DELETE INPUT;
RELEASE CHANNEL ch00;
# Control file backup
ALLOCATE CHANNEL ch00 TYPE 'SBT_TAPE';
SEND 'NB_ORA_CLIENT=Client1,NB_ORA_POLICY=Oracle-Policy, \
NB_ORA_SERV=EmmServer,NB_ORA_SCHED=Default-Application-Backup';
BACKUP
  FORMAT 'bk_u%u_s%s_p%p_t%t'
  CURRENT CONTROLFILE;
RELEASE CHANNEL ch00;
}


Figure 11. NetBackup 6 Policy Properties Dialog


Figure 12. NetBackup Policy Pane


Figure 13. NetBackup 6 Storage Unit Settings (Disk Storage Unit and Tape Storage Unit)


Figure 14. Stinit.def configurations for the EML and VLS

# HP Ultrium 960 LTO-3 devices on the EML E-Series 103e
manufacturer="HP" model="Ultrium 3-SCSI" revision="L29S" {
  scsi2logical=1
  # Common definitions for all modes
  can-bsr drive-buffering can-partitions auto-lock
  buffer-writes async-writes read-ahead compression
  timeout=800 long-timeout=14400
  mode1 blocksize=0 density=0x00
}

# HP Ultrium 960 LTO-3 devices emulated on the VLS 6510
manufacturer="HP" model="Ultrium 3-SCSI" revision="R138" {
  scsi2logical=1
  # Common definitions for all modes
  can-bsr drive-buffering can-partitions auto-lock
  buffer-writes async-writes read-ahead
  timeout=800 long-timeout=14400
  mode1 blocksize=0 density=0x00 compression=0
}


Appendix D. Other issues


This section describes general issues encountered during testing, along with suggested resolutions.

Server OS hangs/crashes
Issue: System hangs under high load.
Resolution: Upgrade from AS4 U1 to U3.

Oracle session hangs


Issue: Oracle instances would stop, leaving sessions hung, during high load.
Resolution: Upgrade Oracle to the 10.2.0.2 patch set.

NetBackup catalog synchronization


Issue: RMAN backups could not be reliably restored before the expire time.
Resolution: Upgrade NetBackup to MP2.

RMAN not backing up archive logs


Issue: RMAN would not back up archive logs at the end of a full backup.
Resolution: Upgrade NetBackup to MP2.
Resolution (if already on MP2): Ensure the NetBackup Oracle template has the proper schedule type in the SEND command for archive logs. It should be Default-Application-Backup or another schedule of type Application Backup.

RMAN specific syntax changes


Issue: RMAN commands are modified by NetBackup before the backup, based on the specific arguments used:
• If FilesPerSet is specified: MaxSetSize, Rate, and MaxPieceSize are rewritten into the RMAN script.
• If MaxOpenFiles is specified: MaxSetSize is rewritten into the RMAN script.
• If MaxSetSize is specified: Rate and MaxPieceSize are rewritten into the RMAN script.
Resolution: Set the explicit commands in the template, or create a script on the server and call it directly (a sketch follows).
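A sketch of the second resolution: a small shell script kept on the database server and named directly in the policy's backup selections, so NetBackup executes it verbatim instead of generating RMAN commands from a template. The paths and the command-file name are assumptions:

#!/bin/sh
# Hypothetical wrapper run by NetBackup as the oracle user
ORACLE_HOME=/u01/app/oracle/product/10.2.0/db_1; export ORACLE_HOME
ORACLE_SID=PRIM1; export ORACLE_SID
# full_backup.rcv holds the exact RMAN commands, e.g. the script in Figure 8
exec $ORACLE_HOME/bin/rman TARGET / CMDFILE /home/oracle/scripts/full_backup.rcv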

Imbalanced backups
Issue: Allocated channels do not back up data evenly, resulting in an overall decrease in backup performance.
Resolution: Use the FilesPerSet, DiskRatio, MaxSetSize, or MaxPieceSize arguments to create more balanced backupsets.


Poorly streaming backups


Issue: Backups to tape are not streaming well.
Resolution 1: Check the v$backup_async_io and v$backup_sync_io views (a sketch follows).
Resolution 2: Use the BlkSize parameter of RMAN.
Resolution 3: Adjust the NetBackup parameters NUMBER_DATA_BUFFERS or SIZE_DATA_BUFFERS and observe the impact.
Resolution 4: Adjust the MaxOpenFiles and/or FilesPerSet RMAN parameters and observe the impact.
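Resolution 1 can be scripted as follows; the columns are documented in the v$backup_async_io view, and a high ratio of long_waits to io_count while a backup runs suggests the tape side is waiting on the disk side:

sqlplus -s / as sysdba <<'EOF'
-- Wait profile of the backup I/O during or after a backup
SELECT filename, io_count, ready, short_waits, long_waits
FROM   v$backup_async_io;
EOF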

RAC issues
Issue: OCFS2 timeouts under load.
Resolution: Increase the heartbeat timeout above its default value of seven.
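A sketch of this change, assuming the heartbeat threshold variable exposed by the o2cb init script on this OCFS2 release; the value 31 is illustrative, and all nodes must use the same setting:

# Raise the OCFS2 heartbeat threshold above its default of 7
sed -i 's/^O2CB_HEARTBEAT_THRESHOLD=.*/O2CB_HEARTBEAT_THRESHOLD=31/' /etc/sysconfig/o2cb
# Restart the cluster stack for the new value to take effect
/etc/init.d/o2cb restart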

General Oracle changes


• Cursor_Sharing set to FORCE
• Optimizer_Mode set to ALL_ROWS
• Dbwr_io_slaves set to 4, used for disk backups
• Backup_Tape_io_slaves set to TRUE
A sketch of these changes as ALTER SYSTEM statements follows.
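Expressed as ALTER SYSTEM statements, these changes look like the following sketch. DBWR_IO_SLAVES is a static parameter and only takes effect after a restart, and the full name of the tape slave parameter is BACKUP_TAPE_IO_SLAVES:

sqlplus / as sysdba <<'EOF'
ALTER SYSTEM SET cursor_sharing = FORCE SCOPE=BOTH;
ALTER SYSTEM SET optimizer_mode = ALL_ROWS SCOPE=BOTH;
-- Static parameters: written to the spfile, applied at the next startup
ALTER SYSTEM SET dbwr_io_slaves = 4 SCOPE=SPFILE;
ALTER SYSTEM SET backup_tape_io_slaves = TRUE SCOPE=SPFILE;
EOF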


For more information


HP
Customer Focused Testing: http://www.hp.com/go/hpcft

Storage
• HP StorageWorks Enterprise Virtual Array configuration best practices
• The role of HP StorageWorks 6000 Virtual Library Systems in a modern data protection strategy
• Ultrium 960 Drive Performance Guide
• EML E-Series User Guide

Enterprise Backup Solution (EBS): http://www.hp.com/go/ebs
• HP StorageWorks Enterprise Backup Solution Near Online Backup-Restore Solution
• EBS Design Guide
• Performance and Troubleshooting: http://www.hp.com/support/pat
• HP SAN design guide

Oracle
• Backup and Recovery Best Practices Guide: http://www.oracle.com/technology/deploy/availability/pdf/S942_Chien.doc.pdf
• Backup and Restore Overview: http://www.oracle.com/technology/deploy/availability/htdocs/BR_Overview.htm

Symantec NetBackup 6.0


• NetBackup 6.0 Administrator Guide
• NetBackup 6.0 for Oracle Administrator Guide on UNIX and Linux
• NetBackup 6.0 Backup Planning and Performance Tuning Guide

Quest Benchmark Factory 5.0


Open Source Tools
• TIOBench: threaded I/O profiling
• Bonnie++: disk stress tests
• DSTAT: provides a consolidated view of iostat and vmstat on one screen

© 2006 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.

Intel, Xeon, and Itanium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation. Java is a US trademark of Sun Microsystems, Inc. Oracle is a registered US trademark of Oracle Corporation, Redwood City, California.

4AA0-8102ENW, October 2006
