STORAGE ENVIRONMENT

Data Replication and Recovery Using EMC SnapView and MirrorView


EMC SnapView 2.1 and MirrorView 1.7 software help administrators protect and recover their data in Dell|EMC storage area networks (SANs). This article explains the differences between these applications, demonstrating how they can be used to achieve high availability in a disaster recovery environment, reduce backup windows, and remove the processing overhead for backups from production servers.
BY RICHARD HOU, STEVE FEIBUS, AND PATTY YOUNG

Storage area networks (SANs) offer an effective means of storing and sharing data. As the amount of data stored on a SAN increases, however, backup windows lengthen and disaster recovery requires more time. EMC SnapView 2.1 and MirrorView 1.7 storage management software can help facilitate efficient backups and disaster recovery in Dell | EMC SAN environments.

To demonstrate how IT administrators can improve SAN management in a typical data center environment using EMC storage management software, this article presents a scenario using a fictional company called Acme. In this scenario, Acme has a local data center that uses a Dell | EMC SAN for consolidated, redundant, high-availability storage. In addition, the company has heterogeneous servers sharing the storage and tape library.

To protect the company's valuable business data, the Acme IT department implemented a disaster recovery plan. The plan involved deploying a fully redundant Dell | EMC SAN, which allows Acme to achieve high availability at the hardware level. To prevent failures at the operating system (OS) and application levels, the IT department deployed the company's main application in a Microsoft Cluster Service (MSCS) environment. Every application server that is connected to the SAN has two host bus adapter (HBA) cards to provide redundancy and increase bandwidth. Acme connected a tape library to the SAN for backup and restore; tapes are moved to a remote site after the backups at the production site have been completed and verified. The remote site also acts as a disaster recovery site for the primary (production) site, and the IT department established a secondary SAN for the applications running at the remote location. Currently, Acme faces three business challenges:

Increasing backup window: As the database grows, the backup window will soon exceed the time available for the daily backups.

Lengthy time to recovery: The length of time required for disaster recovery, referred to as mean time to recovery (MTTR), is growing because larger databases require longer restore times. The company's main application is inaccessible during restoration.

Overhead on the production environment: Acme uses its production database to perform application development work; however, developer access to the production database creates overhead on the database engine. Performing online backups also incurs overhead that affects the performance of the production servers.
August 2003


POWER SOLUTIONS


Figure 1. Primary and secondary SANs connected for disaster recovery (diagram: the primary site, with a NAS server, a clustered application server group, a stand-alone application server, and the primary storage array, linked to the secondary site, with a remote application server, a backup/restore server, a remote backup/restore tape library, and the remote storage array)

To resolve these issues, Acme decided to update its disaster recovery plan. The Acme IT department connected the primary-site SAN with the secondary-site SAN, as shown in Figure 1. The company plans to use SnapView snapshots and clones to create replicas for online backups and for development use. Acme will use MirrorView software across the Dell | EMC SANs to create remote copies for disaster recovery. Using SnapView and MirrorView will also enable Acme to create a plan for recovery at the file, logical unit number (LUN), and array levels, as well as to complete online backups without affecting the production environment.

Creating snapshots and clones for backups using SnapView

SnapView 2.1 creates either a virtual point-in-time copy (snapshot) of the original data or a full, physical point-in-time copy (clone) of the original data. Currently, SnapView is supported on Dell | EMC FC4700-2, CX400, and CX600 storage arrays as a nondisruptive upgrade, meaning that the software can be added at any time without disturbing the production environment.

Snapshots depend on source LUN

At Acme, a long backup window consumes resources from the database engine and creates overhead on the production environment. The necessity for Acme development engineers to access the production database for development work contributes to these two problems. SnapView can help resolve both of these business issues. Using SnapView, administrators can create up to eight point-in-time snapshots of a LUN, which can subsequently be made accessible to as many as eight hosts. For example, the Acme SAN administrator can make a snapshot accessible to a backup server, allowing the production server to continue processing without the downtime traditionally associated with backup processes. An administrator also can create additional snapshot sessions for use by the development engineers without affecting the data on the production, or source, LUN.

The snapshot feature uses a cache-and-pointer design, where a chunk map table keeps track of data chunks (groups of blocks) based on their state at a given time. As the first write request to a block is made to the source LUN, the chunk to be modified is copied to a snapshot cache on private LUNs, a process known as copy on first write (COFW). The source LUN, the snapshot cache, and the chunk map table work together to create the virtual snapshot LUN.

The snapshot LUN is an exact copy of the production LUN, and thus the snapshot must be accessed by a different host, such as a development or backup server. The backup server can read from and write to a snapshot LUN, but any changes made to the snapshot LUN do not replicate back to the source LUN. When the snapshot session is deactivated, the virtual snapshot LUN will be invisible to the server.

As Figure 2 indicates, every source LUN can have as many as eight sessions and eight snapshots. Snapshots have a one-to-one relationship with a server. Each snapshot must be assigned to a different server, whereas sessions can be related to any server, depending on which session is activated and when it is activated.

Figure 2. Source LUN, session, and snapshot LUN relationship (diagram: one source LUN with multiple snapshot cache LUNs, up to eight sessions, up to eight snapshots, and up to eight servers in storage groups 1 through 8, showing the possible relationships between snapshot LUNs and sessions)

The most common use of a snapshot is to produce a backup copy of a large database. Performing an online backup of a database can help to shorten the backup window without interrupting production access to the database. However, online backups create overhead on the production database server, sometimes even requiring that the database be stopped during the backup window. A SnapView snapshot allows the database to be replicated instantaneously. The replica can then be used for online backups, as well as for development work, without putting additional overhead on the application server.

SnapView snapshots also improve and simplify file-level recovery. Administrators can maintain a repository of snapshot sessions across multiple days on the network attached storage (NAS) server connected to the SAN, as shown in Figure 3. If, for example, a user wants to access files from the Friday snapshot session, the SAN administrator can simply activate the Friday session and share that snapshot LUN with the user. The user can then retrieve the needed files by copying files from the snapshot LUN to the source LUN.

Figure 3. File recovery from a snapshot LUN to the source LUN (diagram: a NAS server and a backup/restore server on the local area network, a source LUN with 6:00 P.M. snapshot sessions for Monday through Friday, and the snapshot LUN)

Clones produce full, independent copies

Although SnapView reduces the backup window and removes backup overhead from the production server, the snapshot feature's cache-and-pointer design means that snapshot LUNs depend on the existence of the source LUN. If the source LUN is damaged or destroyed, administrators would need to rebuild the source LUN and recover the data from tape or another backup medium (assuming that the local Dell | EMC storage array is still up and running). The MTTR after such an event might take hours depending on the size of the LUN and the speed of the tape technology. For a company requiring fast disaster recovery, as in the Acme scenario, snapshot LUNs (that is, virtual LUNs) are not an ideal solution.

To decrease MTTR, administrators can use the SnapView clone function to create LUN copies that are independent of the source LUN. Unlike snapshots, which are point-in-time views of a source LUN, clones are synchronous copies of the source LUN. Each clone LUN consumes exactly the same amount of physical space as the source LUN. Essentially a local mirror of the source LUN, a clone offers high availability and can withstand storage processor failures or source LUN failures, as well as path failures, provided that EMC PowerPath or Application Transparent Failover (ATF) software is installed and properly configured. Clones, therefore, are business continuance volumes (BCVs).

To create a clone, the initial data is copied, or synchronized, to the clone (see Figure 4). During synchronization, any host write requests made to the source LUN are copied to the clone. Once the clone is 100 percent synchronized, it is fractured manually at a point in time to create a stand-alone BCV that is independent of the source LUN. Servers cannot access the clone LUN until it is fractured, though application I/O can still access the source LUN during synchronization. Resynchronization can occur in either direction. To recover data from the clone to the source LUN, administrators can use the reverse synchronization feature while I/O continues to the source LUN.

A clone becomes available for read and write access once it is fractured. Administrators also can access a clone by creating a snapshot and then assigning the snapshot to a second server storage group as long as the snapshot is in a different storage group than the source LUN. This manner of implementation not only removes the overhead on the server, but it also enables the source LUN to access snapshots without I/O overhead.

After synchronization and fracturing, a clone becomes a fully populated, physical copy of its source LUN. Because clones are not pointer-based replicas, they are not affected by the COFW performance penalty; the data is replicated to the clone instead of being copied to nonvolatile memory along with the modified chunks. This process results in lower performance overhead for clones than snapshots.

A clone is commonly used in environments that require quick MTTR or online backups based on the point-in-time copies that have zero impact on the production data. A server can read from and write to a fractured clone without affecting the source LUN. Also, resynchronizing the clone is fast because clones use a space in memory called the clone private log (CPL) to keep track of the changes that occur after they have been fractured. For efficiency, 100 percent resynchronization is avoided; only post-fracture changes are resynchronized.
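
The difference between the two SnapView replica types can be illustrated with a small in-memory model. The following Python sketch is not EMC code: the class and method names (SourceLun, Snapshot, Clone, resynchronize) are hypothetical, chunks are modeled as list entries, and the clone private log is reduced to a set of changed chunk numbers. It assumes only the behavior described in this article: a snapshot copies a chunk to its cache on the first write to the source, while a clone holds a full copy and, once fractured, tracks changes so that only those chunks are resynchronized.

```python
class SourceLun:
    """Production LUN modeled as a list of chunks (groups of blocks)."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        self.snapshots = []  # active snapshot sessions
        self.clones = []     # clones in this LUN's clone group

    def write(self, index, data):
        # Give every replica a chance to react before the chunk is overwritten.
        for snap in self.snapshots:
            snap.preserve(index, self.chunks[index])
        for clone in self.clones:
            clone.on_source_write(index, data)
        self.chunks[index] = data


class Snapshot:
    """Virtual point-in-time view: a chunk map over the source plus a cache."""

    def __init__(self, source):
        self.source = source
        self.cache = {}  # chunk number -> data as it was when the session started
        source.snapshots.append(self)

    def preserve(self, index, old_data):
        # Copy on first write (COFW): only the first overwrite of a chunk is copied.
        self.cache.setdefault(index, old_data)

    def read(self, index):
        # Modified chunks come from the cache; unmodified ones still live on the source.
        if index in self.cache:
            return self.cache[index]
        return self.source.chunks[index]


class Clone:
    """Full physical copy; a set of chunk numbers stands in for the clone private log."""

    def __init__(self, source):
        self.source = source
        self.chunks = list(source.chunks)  # initial synchronization copies everything
        self.fractured = False
        self.changed = set()
        source.clones.append(self)

    def on_source_write(self, index, data):
        if self.fractured:
            self.changed.add(index)    # remember the difference for a later resync
        else:
            self.chunks[index] = data  # while synchronized, source writes reach the clone

    def fracture(self):
        self.fractured = True

    def write(self, index, data):
        if not self.fractured:
            raise RuntimeError("clone is not host-accessible until it is fractured")
        self.chunks[index] = data
        self.changed.add(index)

    def resynchronize(self):
        # Copy only the chunks that changed on either side since the fracture.
        copied = len(self.changed)
        for index in self.changed:
            self.chunks[index] = self.source.chunks[index]
        self.changed.clear()
        self.fractured = False
        return copied


source = SourceLun(["a", "b", "c", "d"])
snapshot = Snapshot(source)
clone = Clone(source)
clone.fracture()

source.write(1, "B")   # first write to chunk 1: the old "b" goes to the snapshot cache
source.write(1, "B2")  # second write to chunk 1: nothing more is copied

print([snapshot.read(i) for i in range(4)])  # ['a', 'b', 'c', 'd']
print(clone.chunks)                           # ['a', 'b', 'c', 'd']
print(clone.resynchronize())                  # 1 (only chunk 1 is copied)
```

In this model, Snapshot.read falls back to the source for every unmodified chunk, which is why a snapshot is lost along with a damaged source LUN, whereas the clone's chunk list is complete on its own.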

Figure 4. Clone creation and access (diagram: the production storage group, a clone group of up to eight clones that are fractured after synchronization, and a snapshot LUN)

Enabling array-level disaster recovery through MirrorView

The Acme disaster recovery plan protects critical business data by outlining a procedure for recovery when the primary site is down. The plan also addresses the replication of data from the primary location to the secondary location so that applications running at the secondary site can access the same business data. To implement these processes, the Acme scenario uses the EMC MirrorView add-on software option. MirrorView is similar to the SnapView clone option, but works between Dell | EMC arrays instead of within a single array. Because MirrorView is array-based software, it does not use server I/O or CPU resources, and it supports all of the operating systems used on the array.

Provision for disaster recovery is the major benefit of MirrorView mirroring. As shown in Figure 5, multiple arrays in different locations can mirror to a common disaster recovery site, which makes it the central mirroring site for disaster recovery. If a disaster cripples the primary site, a MirrorView secondary image can be used to recover data and operations at the disaster recovery site.

MirrorView runs redundantly across arrays. If one storage processor fails, MirrorView, running on the other storage processor, will take ownership of the mirrored LUNs. If the host can fail over I/O to the remaining storage processor (using PowerPath software), then mirroring will continue as normal. After the primary-site array has been recovered, the data at the secondary site can be synchronized back to the primary site. Although the mirrored target cannot be directly assigned to a server while it is acting as a mirrored target, SnapView software can be used to take a snapshot of the secondary mirrored LUN and then assign the snapshot to the servers on the secondary site for immediate access, even if the two sites are mirroring.

MirrorView mirroring is synchronous; thus, the longer the distance, the longer the delay, because the application must wait for a commitment to be returned from the remote array. For disaster recovery, primary and secondary storage systems should be physically separated (by up to 10 km) and connected through dedicated redundant pairs of fiber-optic cabling for Fibre Channel-based mirroring. For longer distances, other solutions exist.

MirrorView can ensure that data from the primary storage system replicates to the secondary array (see Figure 6). The host (if any) connected to the secondary array might normally sit idle until the primary site fails. With SnapView at the secondary site, the host at the secondary site can take snapshot copies of the mirror images (that is, secondary LUNs) and back them up to other media. This technique provides point-in-time snapshots of production data with little impact to production server performance. MirrorView provides a synchronous mirroring solution, which can help ensure that any write to the primary array also is committed on the secondary array before the production server gets an acknowledgment. Although this technique is commonly implemented on most mirroring technologies, it also requires that latency between two storage arrays be calculated and considered to prevent any performance degradation. Currently, MirrorView runs through either Fibre Channel (using dedicated fiber-optic cables) or Fibre Channel over IP (using routers and sufficient dedicated bandwidth on an IP wide area network, or WAN).
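
Because the production server waits for the remote commitment, the distance between arrays translates directly into added write latency. The Python sketch below estimates that penalty under assumptions that are not from this article: light travels through optical fiber at roughly 200,000 km/s (about 5 microseconds per kilometer), each mirrored write costs a configurable number of round trips (one by default; real Fibre Channel exchanges may need more), and the remote array's commit time is supplied by the caller. The function name, its parameters, and the 500-microsecond commit time in the loop are hypothetical values chosen for illustration.

```python
FIBER_DELAY_US_PER_KM = 5.0  # assumption: ~200,000 km/s propagation speed in glass


def mirror_write_penalty_us(distance_km, remote_commit_us, round_trips=1):
    """Estimate the extra latency (microseconds) a synchronous mirror adds to one write."""
    if distance_km < 0 or remote_commit_us < 0 or round_trips < 1:
        raise ValueError("distance and commit time must be non-negative, round_trips >= 1")
    # Each round trip crosses the link twice: data out, acknowledgment back.
    propagation_us = 2 * distance_km * FIBER_DELAY_US_PER_KM * round_trips
    return propagation_us + remote_commit_us


for km in (0.3, 10, 60, 500):
    penalty = mirror_write_penalty_us(km, remote_commit_us=500)
    print(f"{km:>6} km: about {penalty:,.0f} microseconds added per write")
```

Under these assumptions, the propagation component alone is about 100 microseconds at 10 km and about 5,000 microseconds at 500 km, which is one way to see why link distance has to be considered before enabling synchronous mirroring.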

Selecting the appropriate data-protection strategy


SnapView snapshots, SnapView clones, and MirrorView mirrors provide different levels of data protection. Snapshots are most likely to be used in a parallel processing environment to provide online backups or file-level recovery, whereas clones and mirrors are more often used in disaster recovery situations. Clones may be used for fast recovery of local corrupt LUNs; clones support read and write access to both source LUN and clone once the clone has been fractured. Mirrors usually enable recovery of arrays or sites. Mirrors also can be used to replicate data to multiple sites, and then used with snapshots for remote access. Mirroring provides read and write capability only to the source LUN, but read and write access to the remote copy of the data can be accomplished by using SnapView on the target array to take a snapshot of the mirror.

To support either MirrorView or SnapView, administrators must install the EMC Access Logix tool. This software masks source and target LUNs to different servers to prevent LUN corruption.

Figure 5. Central mirroring for disaster recovery (diagram: primary locations A through D mirroring to a single disaster recovery site)

Figure 6. Using MirrorView for data replication (diagram: production storage groups at primary locations mirrored to secondary locations, each with a snapshot of the secondary image in a snapshot storage group)


Figure 7. Decision tree for selecting snapshot, cloning, or mirroring

CX400, CX600, or FC4700-2?
  No: Tape or third-party solutions for data replication
  Yes: Single or multiple array?
    Single: What is the purpose of the data copy?
      Online backup, decision support, and testing for instantaneous copy: Snapshot
      BCV, data replication, online backup, and data recovery within array: Clone
    Multiple: Data replication across arrays; BCV on remote site
      Customer operating systems
        Mainframe: STOP, MirrorView not a solution
        Microsoft Windows 2000 Server, IBM AIX, Linux, Sun Solaris, Novell NetWare, HP-UX
      Arrays to be utilized
        CX200, Dell PowerVault 660F, Dell PowerVault 650F
        CX400, CX600, FC4700-2: MirrorView Fibre Channel or MirrorView IP
      Distance between mirrored locations
        Ranges shown: up to 300 m, up to 500 m, up to 10 km, 10 km to 60 km, 60 km to 500 km, over 500 km
        Link options shown: Fibre Channel-2, Fibre Channel-1, Fibre Channel LW-GBIC, dense wavelength division multiplexing (DWDM) extender, MirrorView IP or third-party solution

Combined solutions reduce backup window and production server overhead


In the Acme scenario, administrators were able to use both SnapView and MirrorView to solve the three business problems that the company faced. The company now uses its NAS server, to which any user can map, for storing snapshots. This server enables administrators to recover data from a specific point in time without a large backup window. The company created a local clone as a development server for its main clustered application, removing overhead from the production environment. Acme also mirrored its data to the remote site and created a snapshot of the mirror to enable online backups that will not affect the production environment. Mirroring the company's main application to the remote site provides quick MTTR and allows for remote backups in case of disaster at the primary site. Through snapshots, data can be assigned to servers at the remote location for other applications. Figure 7 provides a decision tree to help administrators choose the right replication and recovery tools for their own company's specific implementations.
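
The upper branches of the Figure 7 decision tree can also be written down as a small function. This Python sketch is an illustration only: the function name, parameter names, and purpose keywords are hypothetical, and the distance-based choice of link technology is deliberately left out.

```python
SNAPVIEW_MIRRORVIEW_ARRAYS = {"CX400", "CX600", "FC4700-2"}


def choose_replication_tool(array_model, multiple_arrays, purpose=None, host_os=None):
    """Walk the first Figure 7 questions and return the suggested tool as a string."""
    if array_model not in SNAPVIEW_MIRRORVIEW_ARRAYS:
        return "tape or third-party solution"
    if not multiple_arrays:
        # Single array: the purpose of the copy decides between snapshot and clone.
        if purpose == "instant copy":
            return "SnapView snapshot"
        if purpose == "independent copy":
            return "SnapView clone"
        raise ValueError("purpose must be 'instant copy' or 'independent copy'")
    # Multiple arrays: replication across arrays, unless the hosts are mainframes.
    if host_os == "mainframe":
        return "MirrorView not a solution"
    return "MirrorView (Fibre Channel or IP)"


print(choose_replication_tool("CX600", False, purpose="instant copy"))
print(choose_replication_tool("CX400", True, host_os="Windows 2000 Server"))
print(choose_replication_tool("CX200", False))
```

Here "instant copy" stands for online backup, decision support, and testing, and "independent copy" stands for BCV, data replication, online backup, and data recovery within the array.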

Enabling comprehensive data-recovery plans using EMC software

Dell | EMC SANs provide a reliable environment for data consolidation. The optional SnapView and MirrorView software add-ons enable administrators to create a comprehensive data-recovery plan for different disaster scenarios. When administrators use the features provided in SnapView and MirrorView, they enable online development work or data mining to be performed without affecting the production environment. These features also provide a way to replicate data to multiple locations as well as maintain data consistency.

Richard Hou (richard_hou@dell.com) is a systems engineer and consultant for the Dell Enterprise Technology and Education Center (ETEC), part of the Dell Enterprise Services and Support Group, where he specializes in SAN and Microsoft solutions. Richard has an M.S. in Electrical and Computer Engineering from The University of Texas at Austin and a B.S. in Mechanical Engineering from Zhejiang University, Hangzhou, China.

Steve Feibus (steve_feibus@dell.com) has been a storage enterprise technologist in the Advanced Systems Group at Dell for the past two years and was recently promoted to manager of the Client Technologist team at Dell. Steve has a B.S. in Electrical Engineering from the University of Florida and has spent many years solving customer storage issues using the latest technologies and products.

Patty Young (patty_young@dell.com) is a storage enterprise technologist in the Advanced Systems Group at Dell. She has been working with storage solutions for many years, supporting field system consultants in architecting storage solutions for their customers and providing feedback from customers to Dell regarding storage challenges and requirements. Patty has a B.A. from North Carolina State University.
www.dell.com/powersolutions

FOR MORE INFORMATION

EMC: http://www.emc.com
Dell | EMC: http://www.dell.com/emc
