Plan, install, and configure DS8000 Copy Services

Learn about IBM FlashCopy and Copy Services

Learn through examples and practical scenarios
Peter Cronauer
Bertrand Dufrasne
Thorsten Altmannsberger
Charles Burger
Michael Frankenberg
Jana Jamsek
Peter Klee
Flavio Gondim de Morais
Luiz Moreira
Alexander Warmuth
Mark Wells
ibm.com/redbooks
February 2013
SG24-6788-06
Note: Before using this information and the product it supports, read the information in "Notices" on page xvii.
Seventh Edition (February 2013)

This edition applies to the IBM System Storage DS8700 with DS8000 License Machine Code (LMC) 6.6.xxx.x and IBM System Storage DS8800 with DS8000 LMC 7.6.xxx.x.
Copyright International Business Machines Corporation 2008, 2013. All rights reserved.

Note to U.S. Government Users Restricted Rights -- Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Notices  xvii
Trademarks  xviii

Preface  xix
The team who wrote this book  xix
Special thanks  xxi
Now you can become a published author, too!  xxi
Comments welcome  xxii
Stay connected to IBM Redbooks  xxii

Summary of changes  xxiii
February 2013, Seventh Edition  xxiii

Part 1. Overview  1

Chapter 1. Introduction  3
1.1 Point-in-time copy functions  4
1.1.1 FlashCopy  4
1.1.2 FlashCopy SE  4
1.1.3 Remote Pair FlashCopy (Preserve Mirror)  5
1.2 Remote Mirror and Copy functions  5
1.2.1 Metro Mirror  6
1.2.2 Global Copy  6
1.2.3 Global Mirror  6
1.2.4 3-site Metro/Global Mirror with Incremental Resync  6
Chapter 2. Copy Services architecture  9
2.1 Introduction to the Copy Services structure  10
2.1.1 Management console defined  10
2.1.2 Storage Unit defined  10
2.1.3 Storage Facility Image (SFI) defined  10
2.1.4 Storage Complex defined  11
2.2 The structure of Copy Services management  11
2.2.1 Communication path for Copy Services  11
2.2.2 Differences between the DS CLI and the DS GUI  13
2.3 Easy Tier and I/O Priority Manager  13

Chapter 3. Licensing  15
3.1 Licenses  16
3.1.1 Considerations for a 2-site Metro Mirror configuration  19
3.1.2 Considerations for a 2-site Global Mirror configuration  19
3.1.3 Considerations for a 2-site Global Copy  19
3.1.4 Additional information for Metro/Global Mirror and Global Mirror licensing  19
3.1.5 DS GUI support  20
3.2 Authorized level  20
3.2.1 Licensing  20
3.2.2 Charging example  22
Part 2. Interfaces  23
Chapter 4. Copy Services interfaces overview  25
4.1 DS8000 interface network components  26

Chapter 5. DS GUI  27
5.1 Accessing the DS GUI  28
5.2 Defining another DS8000 Storage Complex  32

Chapter 6. DS Command-Line Interface  33
6.1 Introduction and functionality  34
6.1.1 User accounts  35
6.2 DS CLI profile  36
6.2.1 Simplifying the DS CLI command syntax  37
6.2.2 Creating a password file  37
6.3 Command modes  38
6.3.1 Single-shot command mode  38
6.3.2 Interactive command mode  39
6.3.3 Script command mode  39
6.4 Return codes  40
6.5 User assistance  41
6.5.1 Man pages  41
6.6 Copy Services command structure  41
Part 3. FlashCopy  43

Chapter 7. FlashCopy overview  45
7.1 FlashCopy operational environments  46
7.2 Terminology  47
7.3 Basic concepts  47
7.3.1 Full volume copy  53
7.3.2 No copy option  54
7.4 FlashCopy in combination with other Copy Services  54
7.4.1 FlashCopy with Metro Mirror and Global Copy  54
7.4.2 Remote Pair FlashCopy  55
7.4.3 FlashCopy and Global Mirror  57

Chapter 8. FlashCopy options  59
8.1 Multiple relationship FlashCopy  60
8.2 Consistency Group FlashCopy  61
8.3 FlashCopy target as a Metro Mirror or Global Copy primary  61
8.4 Incremental FlashCopy: Refreshing the target volume  62
8.5 Incremental FlashCopy: Reverse restore  67
8.6 Remote FlashCopy  67
8.7 Remote Pair FlashCopy (Preserve Mirror)  68
8.8 Persistent FlashCopy  70
8.9 Fast reverse restore  70
8.10 FlashCopy SE  70
8.11 FlashCopy with thin provisioned Extent-Space-Efficient (ESE) volumes  70
8.12 Options and interfaces  71

Chapter 9. FlashCopy interfaces  73
9.1 FlashCopy management interfaces: Overview  74
9.1.1 FlashCopy control with the interfaces  74
9.2 DS CLI and DS GUI: Commands and options  74
9.2.1 Local FlashCopy management  75
9.2.2 Remote FlashCopy management  76
9.3 Local FlashCopy using the DS CLI  76
9.3.1 Parameters that are used with local FlashCopy commands  77
9.3.2 Local FlashCopy commands: Examples  78
9.3.3 FlashCopy consistency groups  86
9.4 Remote FlashCopy using the DS CLI  87
9.4.1 Remote FlashCopy commands  87
9.4.2 Parameters that are used in Remote FlashCopy commands  87
9.5 Remote Pair FlashCopy using the DS CLI  88
9.5.1 Remote Pair FlashCopy commands  89
9.5.2 Parameters that are used in Remote Pair FlashCopy commands  89
9.6 FlashCopy management using the DS GUI  89
9.6.1 Initiating a FlashCopy relationship  90
9.6.2 Working with existing FlashCopy relationships  92
Chapter 10. IBM FlashCopy SE  95
10.1 IBM FlashCopy SE overview  96
10.2 Track-Space-Efficient volumes  97
10.3 Repository for Track-Space-Efficient (TSE) volumes  98
10.3.1 Capacity planning for FlashCopy SE  99
10.3.2 Creating a repository for Track-Space-Efficient (TSE) volumes  101
10.3.3 Creating Track-Space-Efficient volumes  105
10.4 Performing FlashCopy SE operations  108
10.4.1 Creating and resynchronizing FlashCopy SE relationships  108
10.4.2 Removing FlashCopy relationships and releasing space  111
10.4.3 Other FlashCopy SE operations  114
10.4.4 Working with Track-Space-Efficient volumes  114
10.4.5 Monitoring repository space and out-of-space conditions  114

Chapter 11. Remote Pair FlashCopy  117
11.1 Remote Pair FlashCopy overview  118
11.1.1 Features of Remote Pair FlashCopy  119
11.1.2 Considerations  119
11.1.3 Software support  120
11.2 Remote Pair FlashCopy implementation and usage  120
11.2.1 Terminology  120
11.2.2 Preparing for Remote Pair FlashCopy  121
11.2.3 Remote Pair FlashCopy establishment  121
11.2.4 Remote Pair FlashCopy withdrawal  123
11.3 Using Remote Pair FlashCopy in Open Systems environments  124
11.3.1 DS CLI  124
11.3.2 Remote Pair FlashCopy with DS GUI  125

Chapter 12. FlashCopy performance  127
12.1 FlashCopy performance overview  128
12.1.1 Distribution of the workload: Location of source and target volumes  128
12.1.2 LSS/LCU versus rank: Considerations  129
12.1.3 Rank characteristics  130
12.2 FlashCopy establish performance  130
12.3 Background copy performance  130
12.4 FlashCopy impact on applications  132
12.4.1 FlashCopy nocopy  132
12.4.2 FlashCopy full copy  132
12.4.3 Incremental FlashCopy  132
12.5 Performance planning for IBM FlashCopy SE  133
12.6 FlashCopy scenarios  135
12.6.1 Scenario #1: Backup to disk  135
12.6.2 Scenario #2: Backup to tape  135
12.6.3 Scenario #3: IBM FlashCopy SE  136
12.6.4 Scenario #4: FlashCopy during peak application activity  136
12.6.5 Scenario #5: Ranks reserved for FlashCopy  137

Chapter 13. FlashCopy examples  139
13.1 Creating a test system or integration system  140
13.1.1 One-time test system  140
13.1.2 Multiple setup of a test system with the same contents  140
13.2 Creating a backup  141
13.2.1 Creating a FlashCopy for backup purposes without volume copy  141
13.2.2 Using IBM FlashCopy SE for backup purposes  142
13.2.3 Incremental FlashCopy for backup purposes  142
13.2.4 Using a target volume to restore its contents back to the source  143
Part 4. Metro Mirror  145

Chapter 14. Metro Mirror overview  147
14.1 Metro Mirror overview  148
14.2 Metro Mirror volume state  149
14.3 Data consistency  150
14.4 Rolling disaster  150
14.5 Automation and management  151

Chapter 15. Metro Mirror operation and configuration  153
15.1 Basic Metro Mirror operation  154
15.1.1 Establishing a Metro Mirror pair  154
15.1.2 Suspending a Metro Mirror pair  155
15.1.3 Resuming a Metro Mirror pair  155
15.1.4 Removing a Metro Mirror pair  155
15.2 Metro Mirror paths and links  155
15.2.1 Fibre Channel/Physical links  155
15.2.2 Logical paths  157
15.3 Consistency Group function  158
15.3.1 Data consistency and dependent writes  158
15.3.2 Consistency Group function: How it works  159
15.4 Failover and failback  167

Chapter 16. Metro Mirror implementation considerations  171
16.1 Bandwidth  172
16.1.1 Peak bandwidth requirements  172
16.2 Performance  172
16.2.1 Managing the load  172
16.2.2 Initial synchronization  173
16.2.3 Distance  173
16.3 Scalability  173
16.3.1 Adding capacity to the same DS8000  173
16.3.2 Adding capacity to new DS8000s  174
16.4 Symmetrical configuration  174
16.5 Volume selection  175
16.6 Hardware requirements  176
16.7 Interoperability  176

Chapter 17. Metro Mirror interfaces and examples  177
17.1 Metro Mirror interfaces: Overview  178
17.1.1 Similar functions of DS CLI and DS GUI for Metro Mirror  178
17.2 DS Command-Line Interface  179
17.2.1 Setup of the Metro Mirror configuration  181
17.2.2 Removing the Metro Mirror environment using DS CLI  186
17.2.3 Managing the Metro Mirror environment with the DS CLI  188
17.2.4 Switching over to a backup site  191
17.2.5 Switching back to a primary site  198
17.2.6 The freezepprc and unfreezepprc commands  200
17.3 Using DS GUI for Metro Mirror operations  203
17.3.1 Establishing paths with the DS GUI  204
17.3.2 Adding paths  206
17.3.3 Changing the LSS options  207
17.3.4 Deleting paths  209
17.3.5 Creating volume pairs  210
17.3.6 Suspending volume pairs  214
17.3.7 Resuming volume pairs  216
17.3.8 Metro Mirror failover  217
17.3.9 Metro Mirror failback  219
Part 5. Global Copy  225

Chapter 18. Global Copy overview  227
18.1 Global Copy overview  228
18.2 Volume states and change logic  229
18.3 Global Copy positioning  230

Chapter 19. Global Copy options and configuration  231
19.1 Global Copy basic options  232
19.1.1 Establishing a Global Copy pair  232
19.1.2 Suspending a Global Copy pair  233
19.1.3 Resuming a Global Copy pair  233
19.1.4 Terminating a Global Copy pair  233
19.1.5 Converting a Global Copy pair to Metro Mirror  233
19.2 Creating a consistent point-in-time copy  234
19.3 Cascading  236
19.4 Hardware requirements  237
19.4.1 License  237
19.4.2 Interoperability  237
19.4.3 Global Copy connectivity: Ports, paths, and links  237
19.4.4 LSS and consistency group considerations  238
19.5 Bandwidth considerations  239
19.6 Distance considerations  239
19.6.1 Channel extender  239
19.6.2 Dense Wavelength Division Multiplexor (DWDM)  240
19.7 Other planning considerations  241

Chapter 20. Global Copy interfaces  243
20.1 Global Copy interfaces: Overview  244
20.2 Using DS CLI for Global Copy operations  244
20.2.1 Defining Global Copy paths  245
20.2.2 Managing Global Copy pairs  246
20.3 Using DS GUI for Global Copy operations  249
20.3.1 Defining another DS8000 Storage Complex  249
20.3.2 Establishing paths with the DS GUI  250
20.3.3 Adding paths with the DS GUI  252
20.3.4 Deleting paths with the DS GUI  252
20.3.5 Establishing Global Copy pairs using DS GUI  252
20.3.6 Managing existing Global Copy pairs  254

Chapter 21. Global Copy examples  257
21.1 Setting up a Global Copy environment  258
21.1.1 Defining the paths for Global Copy  258
21.1.2 Creating Global Copy pairs  259
21.2 Removing the Global Copy environment  260
21.2.1 Removing the Global Copy pairs  260
21.2.2 Removing the logical paths  261
21.3 Maintaining the Global Copy environment  263
21.3.1 Suspending and resuming the Global Copy data transfer  263
21.4 Changing the copy mode from Global Copy to Metro Mirror  264
21.5 Changing the copy mode from Metro Mirror to Global Copy  265
21.6 Periodic offsite backup procedure  266
21.6.1 Initial setup for this environment  267
21.6.2 Periodical backup operation  269
21.7 Global Copy cascading  272
21.8 Managing data migration with Global Copy  272
21.8.1 Migration from simplex volumes  273
21.8.2 Migration from the secondary volumes  274

Chapter 22. Global Copy performance and scalability  277
22.1 Performance  278
22.1.1 Peak bandwidth requirements  278
22.2 Scalability  278
22.2.1 Adding capacity overview  278
Part 6. Global Mirror . . . . . . . . . . 281
Chapter 23. Global Mirror overview . . . . . . . . . . 283
23.1 Terminology that is used in Global Mirror environments . . . . . . . . . . 284
23.2 Asynchronous data replication . . . . . . . . . . 286
23.2.1 Asynchronous data replication and dependent writes . . . . . . . . . . 286
23.3 Basic concepts of Global Mirror . . . . . . . . . . 288
23.4 Setting up a Global Mirror session . . . . . . . . . . 290
23.4.1 A simple configuration . . . . . . . . . . 290
23.4.2 Establishing connectivity to a secondary site . . . . . . . . . . 290
23.4.3 Creating a Global Copy relationship between the primary volume and the secondary volume . . . . . . . . . . 291
23.4.4 Introducing FlashCopy . . . . . . . . . . 292
23.4.5 Defining a Global Mirror session . . . . . . . . . . 294
23.4.6 Populating a Global Mirror session with volumes . . . . . . . . . . 295
23.4.7 Starting a Global Mirror session . . . . . . . . . . 296
23.5 Consistency groups . . . . . . . . . . 296
23.5.1 Consistency group formation . . . . . . . . . . 297
23.5.2 Consistency Group parameters . . . . . . . . . . 298
23.6 Multiple Global Mirror sessions . . . . . . . . . . 299
IBM System Storage DS8000 Copy Services for Open Systems
Chapter 24. Global Mirror options and configuration . . . . . . . . . . 305
24.1 PPRC paths and links . . . . . . . . . . 306
24.1.1 Fibre Channel links . . . . . . . . . . 307
24.1.2 Logical paths . . . . . . . . . . 307
24.2 Bandwidth . . . . . . . . . . 308
24.3 LSS design . . . . . . . . . . 309
24.4 Global Mirror remote storage system considerations . . . . . . . . . . 309
24.5 Creating a Global Mirror environment . . . . . . . . . . 309
24.6 Modifying a Global Mirror session . . . . . . . . . . 311
24.6.1 Adding to or removing volumes from a Global Mirror session . . . . . . . . . . 311
24.6.2 Adding or removing storage systems or LSSs . . . . . . . . . . 312
24.6.3 Modifying Global Mirror session parameters . . . . . . . . . . 313
24.6.4 Global Mirror environment topology changes . . . . . . . . . . 313
24.6.5 Removing FlashCopy relationships . . . . . . . . . . 313
24.6.6 Removing the Global Mirror environment . . . . . . . . . . 314
24.7 Global Mirror with multiple storage systems . . . . . . . . . . 315
24.8 Recovery scenario after a site failure . . . . . . . . . . 318
24.8.1 Normal Global Mirror operation . . . . . . . . . . 319
24.8.2 Production site failure . . . . . . . . . . 319
24.8.3 Global Copy failover B volumes . . . . . . . . . . 320
24.8.4 Verifying a valid consistency group state . . . . . . . . . . 321
24.8.5 Setting consistent data on B volumes . . . . . . . . . . 325
24.8.6 Re-establishing FlashCopy relationships between B and C . . . . . . . . . . 326
24.8.7 Restarting the application at the remote site . . . . . . . . . . 327
24.8.8 Preparing to switch back to the local site . . . . . . . . . . 328
24.8.9 Returning to the local site . . . . . . . . . . 329
24.8.10 Conclusions of the failover/failback example . . . . . . . . . . 331
24.8.11 Remote site failure . . . . . . . . . . 331
24.8.12 Restarting Global Mirrors . . . . . . . . . . 332
24.8.13 Remote site conclusions . . . . . . . . . . 333
Chapter 25. Global Mirror interfaces . . . . . . . . . . 335
25.1 Global Mirror interfaces: Overview . . . . . . . . . . 336
25.2 DS Command-Line Interface . . . . . . . . . . 336
25.3 DS Storage Manager GUI . . . . . . . . . . 338
25.4 Tivoli Storage Productivity Center for Replication . . . . . . . . . . 339
Chapter 26. Global Mirror performance and scalability . . . . . . . . . . 341
26.1 Performance aspects for Global Mirror . . . . . . . . . . 342
26.2 Performance considerations for network connectivity . . . . . . . . . . 344
26.2.1 Considerations for long-distance fabrics using FC . . . . . . . . . . 345
26.2.2 Considerations for long-distance fabrics using FC-IP . . . . . . . . . . 345
26.2.3 Considerations for host adapter usage . . . . . . . . . . 346
26.3 Global Mirror remote storage system recommendations . . . . . . . . . . 347
26.4 Performance considerations at coordination time . . . . . . . . . . 348
26.5 Consistency Group drain time . . . . . . . . . . 349
26.6 Remote storage system configuration . . . . . . . . . . 350
26.6.1 Logical configurations with classical FlashCopy . . . . . . . . . . 351
26.6.2 Logical configurations with Space-Efficient FlashCopy . . . . . . . . . . 351
26.7 Balancing the storage system configuration . . . . . . . . . . 352
26.8 Growth within Global Mirror configurations . . . . . . . . . . 353
Chapter 27. Global Mirror examples . . . . . . . . . . 355
27.1 Setting up a Global Mirror environment using the DS CLI . . . . . . . . . . 356
27.1.1 Preparing to work with the DS CLI . . . . . . . . . . 356
27.1.2 Configuration that is used for the example environment . . . . . . . . . . 356
27.1.3 Setup procedure . . . . . . . . . . 356
27.1.4 Creating Global Copy relationships: A to B volumes . . . . . . . . . . 357
27.1.5 Creating FlashCopy relationships: B to C volumes . . . . . . . . . . 358
27.1.6 Starting Global Mirror . . . . . . . . . . 359
27.2 Removing a Global Mirror environment with the DS CLI . . . . . . . . . . 365
27.2.1 Ending Global Mirror processing . . . . . . . . . . 365
27.2.2 Removing the A volumes from the Global Mirror session . . . . . . . . . . 367
27.2.3 Removing the Global Mirror session . . . . . . . . . . 367
27.2.4 Terminating FlashCopy pairs . . . . . . . . . . 368
27.2.5 Terminating Global Copy pairs and removing the paths . . . . . . . . . . 369
27.3 Managing the Global Mirror environment with the DS CLI . . . . . . . . . . 369
27.3.1 Pausing and resuming Global Mirror Consistency Group formation . . . . . . . . . . 370
27.3.2 Changing the Global Mirror tuning parameters . . . . . . . . . . 372
27.3.3 Stopping and starting Global Mirror . . . . . . . . . . 373
27.3.4 Adding and removing A volumes to the Global Mirror environment . . . . . . . . . . 374
27.3.5 Adding and removing an LSS to an existing Global Mirror environment . . . . . . . . . . 376
27.3.6 Adding and removing a subordinate disk system . . . . . . . . . . 378
27.3.7 Recovering from a suspended state after a repository fills . . . . . . . . . . 378
27.4 Recovery scenario after a local site failure using the DS CLI . . . . . . . . . . 379
27.4.1 Summary of the recovery scenario . . . . . . . . . . 380
27.4.2 Stopping Global Mirror processing . . . . . . . . . . 380
27.4.3 Performing Global Copy failover from B to A . . . . . . . . . . 381
27.4.4 Verifying a valid consistency group state . . . . . . . . . . 382
27.4.5 Reversing FlashCopy from B to C . . . . . . . . . . 384
27.4.6 Re-establishing the FlashCopy relationship from B to C . . . . . . . . . . 387
27.4.7 Restarting the application at the remote site . . . . . . . . . . 388
27.5 Returning to the local site . . . . . . . . . . 388
27.5.1 Creating paths from B to A . . . . . . . . . . 389
27.5.2 Performing Global Copy failback from B to A . . . . . . . . . . 389
27.5.3 Querying for the Global Copy first pass completion . . . . . . . . . . 391
27.5.4 Quiescing the application at the remote site . . . . . . . . . . 392
27.5.5 Querying the out-of-sync tracks until the result shows zero . . . . . . . . . . 392
27.5.6 Creating paths from A to B if they do not exist . . . . . . . . . . 392
27.5.7 Performing Global Copy failover from A to B . . . . . . . . . . 393
27.5.8 Performing Global Copy failback from A to B . . . . . . . . . . 394
27.5.9 Starting Global Mirror . . . . . . . . . . 396
27.5.10 Starting the application at the local site . . . . . . . . . . 397
27.6 Practicing disaster recovery readiness . . . . . . . . . . 397
27.6.1 Querying the Global Mirror environment . . . . . . . . . . 398
27.6.2 Pausing Global Mirror and checking its completion . . . . . . . . . . 398
27.6.3 Pausing Global Copy pairs . . . . . . . . . . 399
27.6.4 Performing Global Copy failover from B to A . . . . . . . . . . 399
27.6.5 Creating consistent data on B volumes . . . . . . . . . . 400
27.6.6 Waiting for the FlashCopy background copy to complete . . . . . . . . . . 400
27.6.7 Re-establishing the FlashCopy relationships . . . . . . . . . . 400
27.6.8 Taking a FlashCopy from B to D . . . . . . . . . . 401
27.6.9 Performing disaster recovery testing using the D volume . . . . . . . . . . 402
27.6.10 Performing Global Copy failback from A to B . . . . . . . . . . 402
27.6.11 Waiting for the Global Copy first pass to complete . . . . . . . . . . 403
27.6.12 Resuming Global Mirror . . . . . . . . . . 404
27.7 DS Storage Manager GUI: Examples . . . . . . . . . . 405
27.8 Setting up a Global Mirror environment using the DS GUI . . . . . . . . . . 405
27.8.1 Establishing paths with the DS GUI . . . . . . . . . . 406
27.8.2 Creating Global Copy pairs . . . . . . . . . . 408
27.8.3 Creating FlashCopy relationships . . . . . . . . . . 410
27.8.4 Creating the Global Mirror session . . . . . . . . . . 414
27.9 Managing the Global Mirror environment with the DS GUI . . . . . . . . . . 417
27.9.1 Viewing settings and error information of the Global Mirror session . . . . . . . . . . 417
27.9.2 Viewing the information of the volumes in the Global Mirror session . . . . . . . . . . 419
27.9.3 Pausing a Global Mirror session . . . . . . . . . . 420
27.9.4 Resuming a Global Mirror session . . . . . . . . . . 420
27.9.5 Modifying a Global Mirror session . . . . . . . . . . 421
27.9.6 Adding paths . . . . . . . . . . 422
27.9.7 Changing the LSS paths: Options . . . . . . . . . . 423
27.9.8 Deleting paths . . . . . . . . . . 425
27.10 Multiple Global Mirror sessions within DS8700 and DS8800 systems . . . . . . . . . . 425
27.10.1 Failing over Global Mirror session 20 . . . . . . . . . . 429
27.10.2 Failing back GM session 20 . . . . . . . . . . 430
27.10.3 Returning to the primary site using GM session 20 . . . . . . . . . . 431
Part 7. Metro/Global Mirror (MGM) . . . . . . . . . . 435
Chapter 28. Metro/Global Mirror overview . . . . . . . . . . 437
28.1 Metro/Global Mirror overview . . . . . . . . . . 438
28.1.1 Metro Mirror and Global Mirror: Comparison . . . . . . . . . . 438
28.1.2 Metro/Global Mirror design objectives . . . . . . . . . . 439
28.2 Metro/Global Mirror processes . . . . . . . . . . 440
Chapter 29. Metro/Global Mirror configuration and setup . . . . . . . . . . 443
29.1 Metro/Global Mirror configuration . . . . . . . . . . 444
29.1.1 Metro/Global Mirror with additional Global Mirror environments . . . . . . . . . . 444
29.1.2 Metro/Global Mirror with multiple storage systems . . . . . . . . . . 445
29.1.3 Multiple consistency groups . . . . . . . . . . 446
29.2 Architectural Metro/Global Mirror example . . . . . . . . . . 447
29.3 Initial setup of Metro/Global Mirror . . . . . . . . . . 448
29.3.1 Identifying the PPRC ports . . . . . . . . . . 449
29.4 Migrating from Metro Mirror to Metro/Global Mirror . . . . . . . . . . 456
29.5 Preferred practices for setting up Metro/Global Mirror . . . . . . . . . . 457
Chapter 30. General Metro/Global Mirror operations . . . . . . . . . . 459
30.1 Overview . . . . . . . . . . 460
30.2 General considerations for storage failover . . . . . . . . . . 460
30.3 Freezing and unfreezing Metro Mirror volumes . . . . . . . . . . 462
30.4 Checking consistency at the remote site . . . . . . . . . . 463
30.5 Setting up an additional Global Mirror from the remote site . . . . . . . . . . 466
Chapter 31. Metro/Global Mirror recovery scenarios . . . . . . . . . . 469
31.1 Overview . . . . . . . . . . 470
31.2 Recovering the production environment at the intermediate site . . . . . . . . . . 470
31.3 Returning the production environment to the local site from the intermediate site . . . . . . . . . . 473
31.4 Recovery of the production environment at the remote site . . . . . . . . . . 477
31.5 Returning the production environment to the local site from the remote site . . . . . . . . . . 480
Chapter 32. Metro/Global Mirror disaster recovery test scenarios . . . . . . . . . . 485
32.1 Overview . . . . . . . . . . 486
32.2 Providing consistency with Metro Mirror freeze . . . . . . . . . . 486
32.2.1 Disaster recovery test at the intermediate site . . . . . . . . . . 486
32.2.2 Disaster recovery test at the remote site . . . . . . . . . . 490
32.3 Providing consistency with Global Mirror . . . . . . . . . . 491
Chapter 33. Metro/Global Mirror incremental resynchronization . . . . . . . . . . 497
33.1 Overview . . . . . . . . . . 498
33.1.1 Functional description . . . . . . . . . . 500
33.1.2 Options for DS CLI . . . . . . . . . . 501
33.2 Setting up Metro/Global Mirror with Incremental Resync . . . . . . . . . . 501
33.2.1 Setting up of Metro/Global Mirror with Incremental Resync . . . . . . . . . . 501
33.2.2 Migrating from Global Mirror to Metro/Global Mirror with Incremental Resync . . . . . . . . . . 502
33.3 Incremental Resync recovery scenarios . . . . . . . . . . 509
33.3.1 Local site fails . . . . . . . . . . 509
33.3.2 Local site is back . . . . . . . . . . 512
33.3.3 Intermediate site failure . . . . . . . . . . 517
33.3.4 Intermediate site is back . . . . . . . . . . 521
33.4 Swapping between the local and intermediate sites . . . . . . . . . . 528
Chapter 34. Metro/Global Copy Incremental Resync . . . . . . . . . . 537
34.1 Overview . . . . . . . . . . 538
34.2 Replacing a Metro Mirror target system . . . . . . . . . . 539
34.3 Replacing a Global Mirror target system . . . . . . . . . . 541
Part 8. Thin provisioning and Copy Services . . . . . . . . . . 543
Chapter 35. Thin provisioning overview . . . . . . . . . . 545
35.1 Thin provisioning: Basic concept . . . . . . . . . . 546
35.2 Extent-Space-Efficient (ESE) volumes . . . . . . . . . . 547
35.2.1 Quick initialization . . . . . . . . . . 547
35.2.2 Capacity allocation with ESE . . . . . . . . . . 548
35.2.3 Out-of-space condition . . . . . . . . . . 548
35.2.4 Volume creation . . . . . . . . . . 549
35.2.5 Releasing space . . . . . . . . . . 549
Chapter 36. Thin provisioning and Copy Services considerations . . . . . . . . . . 551
36.1 Thin provisioning and Copy Services considerations . . . . . . . . . . 552
36.1.1 FlashCopy considerations . . . . . . . . . . 552
36.1.2 Metro Mirror and Global Copy considerations . . . . . . . . . . 553
36.1.3 Global Mirror and Metro/Global Mirror considerations . . . . . . . . . . 553
Part 9. Copy Services with IBM i . . . . . . . . . . 555
Chapter 37. IBM i overview . . . . . . . . . . 557
37.1 Introduction . . . . . . . . . . 558
37.2 IBM i architecture and external storage . . . . . . . . . . 558
37.2.1 The hardware models for IBM i . . . . . . . . . . 558
37.2.2 Single-level storage . . . . . . . . . . 559
37.2.3 Object-based architecture . . . . . . . . . . 559
37.2.4 Storage management . . . . . . . . . . 559
37.2.5 Clusters . . . . . . . . . . 560
37.2.6 Disk pools in IBM i . . . . . . . . . . 561
37.3 DS8000 Copy Services with IBM i . . . . . . . . . . 563
37.4 Managing solutions with DS8000 Copy Services and IBM i . . . . . . . . . . 563
37.4.1 PowerHA SystemMirror for IBM i . . . . . . . . . . 564
37.4.2 Advanced Copy Services for PowerHA on i and Copy Services Tool Kit . . . . . . . . . . 564
37.4.3 Tivoli Storage Productivity Center for Replication . . . . . . . . . . 564
37.5 Supported solutions and management . . . . . . . . . . 564
37.6 References . . . . . . . . . . 566
Chapter 38. IBM i options . . . . . . . . . . 567
38.1 Metro Mirror for independent disk pools . . . . . . . . . . 568
38.1.1 Solution description . . . . . . . . . . 568
38.1.2 Solution benefits . . . . . . . . . . 569
38.1.3 Planning and requirements . . . . . . . . . . 569
38.2 Global Mirror for independent disk pools . . . . . . . . . . 570
38.2.1 Solution description . . . . . . . . . . 570
38.2.2 Solution benefits . . . . . . . . . . 571
38.2.3 Planning and requirements . . . . . . . . . . 572
38.3 FlashCopy for independent disk pools . . . . . . . . . . 572
38.3.1 Solution description . . . . . . . . . . 572
38.3.2 Solution benefits . . . . . . . . . . 573
38.3.3 Planning and requirements . . . . . . . . . . 574
38.4 Full system Metro Mirror . . . . . . . . . . 574
38.4.1 Description . . . . . . . . . . 574
38.4.2 Solution benefits . . . . . . . . . . 575
38.4.3 Planning and requirements . . . . . . . . . . 575
38.5 Full system Global Mirror . . . . . . . . . . 576
38.5.1 Solution description . . . . . . . . . . 576
38.5.2 Solution benefits . . . . . . . . . . 577
38.5.3 Planning and requirements . . . . . . . . . . 577
38.6 Full System FlashCopy . . . . . . . . . . 578
38.6.1 Solution description . . . . . . . . . . 578
38.6.2 Solution benefits . . . . . . . . . . 579
38.6.3 Planning and requirements . . . . . . . . . . 579
38.7 Solutions with Remote Copy and FlashCopy . . . . . . . . . . 580
38.8 FlashCopy SE with IBM i . . . . . . . . . . 581
38.8.1 Sizing for a FlashCopy SE repository . . . . . . . . . . 581
38.8.2 Implementation . . . . . . . . . . 583
38.8.3 Monitoring the usage of the repository space with workload CPW . . . . . . . . . . 584
38.8.4 System behavior with a repository full condition . . . . . . . . . . 587
38.9 Metro/Global Mirror with IBM i . . . . . . . . . . 591
Chapter 39. IBM i implementation . . . . . . . . . . 593
39.1 Copy Services with independent disk pools . . . . . . . . . . 594
39.1.1 Implementing independent disk pools . . . . . . . . . . 594
39.1.2 Setting up the cluster . . . . . . . . . . 595
39.1.3 Managing DS8000 Copy Services from IBM i . . . . . . . . . . 596
39.1.4 Implementing PowerHA SystemMirror for i . . . . . . . . . . 597
39.1.5 Implementing Advanced Copy Services for PowerHA on i . . . . . . . . . . 598
39.1.6 IBM i journaling . . . . . . . . . . 600
39.1.7 Quiescing IBM i data to disk . . . . . . . . . . 601
39.2 Full System Copy Services . . . . . . . . . . 603
39.2.1 Boot from SAN . . . . . . . . . . 603
39.2.2 Cloning IBM i . . . . . . . . . . 604
39.2.3 Managing DS8000 Copy Services . . . . . . . . . . 605
39.2.4 Possibilities to automate the solutions . . . . . . . . . . 605
39.2.5 Quiescing IBM i data to disk . . . . . . . . . . 605
39.2.6 IBM i journaling . . . . . . . . . . 606
Chapter 40. IBM i examples . . . . . . . . . . 607
40.1 Metro Mirror with PowerHA for i . . . . . . . . . . 608
40.1.1 Environment . . . . . . . . . . 608
40.1.2 Solution setup . . . . . . . . . . 608
40.1.3 Switchover for planned outages . . . . . . . . . . 610
40.1.4 Failover for unplanned outages . . . . . . . . . . 616
40.2 Metro Mirror and FlashCopy on the remote site with PowerHA for i . . . . . . . . . . 620
40.2.1 Environment . . . . . . . . . . 621
40.2.2 Solution setup . . . . . . . . . . 621
40.2.3 Using FlashCopy . . . . . . . . . . 621
40.2.4 Metro Mirror switchover while FlashCopy session is detached . . . . . . . . . . 623
40.2.5 Metro Mirror failover while the FlashCopy session is attached . . . . . . . . . . 628
40.3 Full system Global Mirror with Tivoli Storage Productivity Center for Replication . . . . . . . . . . 630
40.3.1 Planning . . . . . . . . . . 631
Part 10. Solutions . . . . . 647
Chapter 41. Multi-site replication scenarios . . . . . 649
41.1 Data migration with double cascading . . . . . 650
41.2 Four-site scenario with Metro/Global Mirror with Global Copy . . . . . 651
41.3 Four-site scenario with host and storage-based mirroring . . . . . 653
Chapter 42. IBM Tivoli Storage FlashCopy Manager overview . . . . . 657
42.1 Tivoli Storage FlashCopy Manager . . . . . 658
42.1.1 Features of Tivoli Storage FlashCopy Manager . . . . . 658
42.1.2 Cloning support for SAP databases with Tivoli Storage FlashCopy Manager . . . . . 660
Chapter 43. IBM Open HyperSwap for AIX with IBM Tivoli Storage Productivity Center for Replication . . . . . 661
43.1 Open HyperSwap for AIX with Tivoli Storage Productivity Center . . . . . 662
43.1.1 Open HyperSwap examples . . . . . 664
Chapter 44. IBM PowerHA SystemMirror for IBM AIX Enterprise Edition . . . . . 673
44.1 PowerHA SystemMirror for AIX Enterprise Edition . . . . . 674
Chapter 45. VMware Site Recovery Manager . . . . . 677
45.1 VMware Site Recovery Manager . . . . . 678
Chapter 46. Geographically Dispersed Open Clusters . . . . . 681
46.1 Geographically Dispersed Open Clusters . . . . . 682
Chapter 47. IBM Tivoli Storage Productivity Center for Replication . . . . . 685
47.1 Tivoli Storage Productivity Center for Replication overview . . . . . 686
47.1.1 Sources of information . . . . . 686
47.1.2 Why is Tivoli Storage Productivity Center for Replication needed . . . . . 687
47.1.3 What Tivoli Storage Productivity Center for Replication provides . . . . . 687
47.1.4 Tivoli Storage Productivity Center for Replication reliability, availability, and serviceability . . . . . 689
47.2 Tivoli Storage Productivity Center for Replication terminology . . . . . 690
47.2.1 Copy set . . . . . 691
47.2.2 Session . . . . . 691
47.2.3 Location . . . . . 693
47.2.4 Volume types in a copy set . . . . . 694
47.2.5 Actions on sessions . . . . . 695
47.3 DS8000 specific information . . . . . 697
47.3.1 PPRC paths . . . . . 697
47.3.2 DS8000 connectivity . . . . . 698
47.3.3 Metro Mirror heartbeat . . . . . 698
47.4 Tivoli Storage Productivity Center for Replication interfaces . . . . . 699
47.4.1 Tivoli Storage Productivity Center for Replication GUI . . . . . 699
47.4.2 CLI for Tivoli Storage Productivity Center for Replication . . . . . 701
Appendix A. Open Systems specifics . . . . . 705
Database and file system specifics . . . . . 706
Consistency Groups specifics . . . . . 706
File system consistency . . . . . 707
Database consistency . . . . . 707
AIX specifics . . . . . 708
AIX and FlashCopy . . . . . 708
AIX and Remote Mirror and Copy . . . . . 712
Windows and Remote Mirror and Copy . . . . . 714
Copy services with Windows volumes . . . . . 715
Microsoft Volume Shadow Copy Services (VSS) overview . . . . . 717
Microsoft Virtual Disk Service . . . . . 720
Oracle Solaris and Copy Services . . . . . 725
Copy Services without using a volume manager . . . . . 725
Copy Services using Symantec VERITAS Volume Manager . . . . . 726
HP-UX and Copy Services . . . . . 729
HP-UX and FlashCopy . . . . . 729
HP-UX with Remote Mirror and Copy . . . . . 732
Expanding an existing Copy Services environment . . . . . 733
VMware vSphere and Copy Services . . . . . 734
Virtual machine considerations about Copy Services . . . . . 734
Appendix B. SNMP notifications . . . . . 743
SNMP overview . . . . . 744
Physical connection events . . . . . 744
Remote Mirror and Copy events . . . . . 746
Global Mirror related SNMP traps . . . . . 746
Thin Provisioning feature related SNMP traps . . . . . 749
Tivoli Storage Productivity Center for Replication related SNMP traps . . . . . 750
Correlating Remote Copy traps and possible actions . . . . . 752
Appendix C. Resource Groups . . . . . 755
Overview of Resource Groups . . . . . 756
Functional description . . . . . 757
Basic attributes and relationships . . . . . 758
Remote relationships . . . . . 758
Default behavior . . . . . 759
Special attributes . . . . . 760
Implementation examples . . . . . 760
FlashCopy and DS GUI example . . . . . 760
Metro Mirror and DS CLI example . . . . . 765
Online resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 769 How to get IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770 Help from IBM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770
Notices
This information was developed for products and services offered in the U.S.A. IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing, IBM Corporation, North Castle Drive, Armonk, NY 10504-1785 U.S.A. The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. 
Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. These and other IBM trademarked terms are marked on their first occurrence in this information with the appropriate symbol (® or ™), indicating US registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at http://www.ibm.com/legal/copytrade.shtml The following terms are trademarks of the International Business Machines Corporation in the United States, other countries, or both:
AIX 5L AIX AS/400 DB2 DS4000 DS6000 DS8000 Easy Tier ESCON eServer FICON FlashCopy GDPS HACMP HyperSwap i5/OS IBM iSeries MVS NetView Parallel Sysplex PowerHA POWER RACF Redbooks Redbooks (logo) Storwize System i5 System i System p System Storage DS System Storage System x System z SystemMirror Tivoli VM/ESA XIV z/OS z/VM zSeries
The following terms are trademarks of other companies: Intel, Intel logo, Intel Inside logo, and Intel Centrino logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Linux is a trademark of Linus Torvalds in the United States, other countries, or both. Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both. Snapshot, Network Appliance, SnapMirror, and the NetApp logo are trademarks or registered trademarks of NetApp, Inc. in the U.S. and other countries. Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates. UNIX is a registered trademark of The Open Group in the United States and other countries. Other company, product, or service names may be trademarks or service marks of others.
Preface
This IBM Redbooks publication helps you plan, install, tailor, configure, and manage Copy Services for Open Systems environments on the IBM System Storage DS8000. It should be read with IBM System Storage DS8000: Architecture and Implementation, SG24-8886. This book helps you design and implement a new Copy Services installation or migrate from an existing installation. It includes hints and tips to maximize the effectiveness of your installation, and information about tools and products to automate Copy Services functions. It is intended for anyone who needs a detailed and practical understanding of the DS8000 Copy Services. There is a companion book that supports the configuration of the Copy Services functions in an IBM z/OS environment, IBM System Storage DS8000 Copy Services for IBM System z, SG24-6787.
Charles Burger joined IBM in 1978 in Pittsburgh, Pa, US, and spent three years as a Customer Engineer in Pittsburgh and Wilmington, De, US. In 1981, Charlie became an IBM MVS level 1 representative at the Chicago IBM Support center. He worked there until 1984 and then joined the Chicago Area Systems Center (ASC) supporting MVS, JES2, SMP/E, and other MVS components for seven years. He then took a temporary assignment with the EMEA Support Center (ESC) developing large systems education courses. When the assignment was over, Charlie moved to San Jose to work for Advanced Technical Skills (ATS), supporting VSAM, catalogs, and SMS. He is currently supporting DS8000 focusing on Copy Services in an IBM System z environment. Michael Frankenberg is a Certified IT Specialist in Germany. He has 14 years of experience in IBM System p, and high-end and midrange storage. He works as a Field Technical Sales Support Specialist for high-end and midrange storage systems, including storage virtualization solutions. His areas of expertise include performance analysis, disaster recovery solutions, and implementation for storage systems. Jana Jamsek is an IBM Certified IT specialist. She works at Storage Advanced Technical Skills for Europe as a specialist for IBM Storage Systems and IBM i systems. Jana has eight years of experience in the IBM System i and IBM AS/400 areas, and 10 years of experience in IBM Storage. She has a Master's degree in Computer Science and a degree in Mathematics from the University of Ljubljana, Slovenia. Peter Klee is an IBM Professional Certified IT specialist at IBM Germany. He has 18 years of experience in Open Systems platforms, storage area networks (SANs), and high-end storage systems in large data centers. He formerly worked for a major bank in Germany where he was responsible for the architecture and the implementation of the disk storage environment, which included products from various vendors. 
He joined IBM in 2003, where he worked for Strategic Outsourcing. Since July 2004, he has worked for ATS System Storage Europe in Mainz. His main focus is Copy Services, disaster recovery, and storage architectures for DS8000 in the Open Systems environment. Flavio Gondim de Morais is a GTS Storage Specialist in Brazil. He has over six years of experience in the SAN/storage area. He holds a degree in Computer Engineering from Instituto de Ensino Superior de Brasilia. His areas of expertise include DS8000 planning, Copy Services, IBM Tivoli Storage Productivity Center for Replication, and performance troubleshooting. He has been extensively exposed to performance and Copy Services problems with Open Systems. Luiz Moreira is an IT specialist in Brazil. He has 41 years of experience in mainframes, working in various areas, such as storage, z/OS, IBM z/VM, IBM Parallel Sysplex, performance, and capacity planning. Luiz is a former ITSO Project Leader for VM/XA and IBM VM/ESA. He is working in Mainframe Storage, supporting DS8000 Copy Services. Luiz holds a degree in Metallurgical Engineering from the Instituto Militar de Engenharia (IME) at Rio de Janeiro, Brazil and also an MBA from Fundacao Getulio Vargas (FGV), also at Rio de Janeiro. Alexander Warmuth is a Senior IT Specialist for IBM at the European Storage Competence Center. Working in technical sales support, he designs and promotes new and complex storage solutions, drives the introduction of new products, and provides advice to clients, business partners, and sales. His main areas of expertise are high-end storage solutions and business resiliency. He joined IBM in 1993 and has been working in technical sales support since 2001. Alexander holds a diploma in Electrical Engineering from the University of Erlangen, Germany.
Mark Wells is a Certified Consulting IT Specialist working in the United States. As a Client Technical Specialist since 2005, he provides storage solution design, technical consulting, and implementation support to his customers. He has extensive experience on the DS8000 platform in large-scale replication environments. Mark has worked for IBM since 1997 in various roles and has worked in the System z and Enterprise environment for over 25 years.
Special thanks
We especially want to thank John Bynum (Global Technical Sales Enablement) and Peter Kimmel (ATS System Storage Europe). Many thanks to the authors of the previous editions of this book: Doug Acuff, Pat Atkinson, Urban Biel, Denise Brown, Hans-Paul Drumm, Wilhelm Gardt, Jean Iyabi, Jana Jamsek, Peter Kimmel, Peter Klee, Jukka Myyrylainen, Lu Nguyen, Markus Oscheka, Gerhard Pieper, Gero Schmidt, Shin Takata, Ying Thia, Robert Tondini, Paulus Usong, Anthony Vandewerdt, Bjoern Wesselbaum, Stephen West, Axel Westphal, Roland Wolf. We also would like to thank: Dale Anderson, John Cherbini, Nick Clayton, Matthew Craig, Jenny Dervin, Selwyn Dickey, Ingo Dimmer, Hans-Paul Drumm, Dieter Flaut, Robert Gensler, Craig Gordon, Lisa Gundy, Theodore (TJ) Harris, Kai Jehnen, Bob Kern, Steve E. Klein, Mike Koester, James Lembke, Thomas Luther, Alan McClure, Rosemary McCutchen, Allen Marin, Carol Mellgren, Markus Oscheka, Richard Ripberger, Michael Romano, Torsten Rothenwaldt, Hank Sautter, Guenter Schmitt, Mike Schneider, Dietmar Schniering, Uwe Schweikhard, Jim Sedgwick, David Shackelford, Brian Sherman, Paul Spagnolo, Gail Spear, Warren Stanley, Jeff Steffan, Edgar Strubel, Steve Wilkins, Bjoern Wesselbaum, Sonny Williams, Jens Wissenbach, Allen Wright, Yan Xu
Comments welcome
Your comments are important to us! We want our books to be as helpful as possible. Send us your comments about this book or other IBM Redbooks publications in one of the following ways:
- Use the online Contact us review Redbooks form found at:
  ibm.com/redbooks
- Send your comments in an email to:
  redbooks@us.ibm.com
- Mail your comments to:
  IBM Corporation, International Technical Support Organization
  Dept. HYTD Mail Station P099
  2455 South Road
  Poughkeepsie, NY 12601-5400
Summary of changes
This section describes the technical changes that were made in this edition of the book and in previous editions. This edition might also include minor corrections and editorial changes that are not identified. Summary of Changes for SG24-6788-06 IBM System Storage DS8000 Copy Services for Open Systems as created or updated on April 5, 2013.
New information
- New graphical user interface (GUI)
- Enhancement for IBM FlashCopy and thin provisioning
Changed information
- Updated IBM Tivoli Storage Productivity Center for Replication section
- Various other content reorganizations and updates
Part 1
Overview
In this part of the book, we provide a general introduction to the various Copy Services offerings for the DS8000 series. We describe their overall architecture and review some of the licensing requirements.

The Copy Services configuration is done by using either the IBM System Storage DS8000 Command-Line Interface (DS CLI) or the IBM System Storage DS8000 graphical user interface (DS GUI). Copy Services can also be managed by using the IBM Tivoli Storage Productivity Center for Replication application. Note the following requirements and considerations:
- The DS CLI provides a consistent interface for current and planned IBM System Storage products. The DS CLI invokes Copy Services functions directly, and DS CLI commands can be saved in reusable scripts.
- The DS GUI can be used only for one-time execution of a Copy Services operation; it cannot save tasks.
- Tivoli Storage Productivity Center for Replication is an automated application for managing Copy Services functions. It is easy to use and customizable through different existing scenarios.
Chapter 1.
Introduction
Copy Services are a collection of functions that provide disaster recovery, data migration, and data duplication solutions. There are two primary types of Copy Services functions: Point-in-Time Copy and Remote Mirror and Copy. Generally, the Point-in-Time Copy functions are used for data duplication, and the Remote Mirror and Copy functions are used for data migration and disaster recovery. With the Copy Services functions, for example, you can create backup data with little or no disruption to your application, and you can back up your application data to the remote site for disaster recovery.

Copy Services run on the DS8000 Storage Unit and support Open Systems and System z environments. The optional licensed functions of Copy Services are:
- IBM FlashCopy, which is a point-in-time copy function
- IBM FlashCopy SE, a Space-Efficient point-in-time copy function
- Remote Mirror and Copy functions, previously known as Peer-to-Peer Remote Copy (PPRC), which include:
  - Metro Mirror, previously known as Synchronous PPRC
  - Global Copy, previously known as PPRC Extended Distance
  - Global Mirror, previously known as Asynchronous PPRC
  - 3-site Metro/Global Mirror with Incremental Resync
The Copy Services functions are optional licensed functions of the DS8000. Additional licensing information for Copy Services functions can be found in Chapter 3, Licensing on page 15. You can manage the Copy Services functions through a command-line interface (DS CLI) and a graphical user interface (DS GUI). You can also manage them through the open application programming interface (DS Open API). The IBM Tivoli Storage Productivity Center for Replication program provides yet another interface for managing Copy Services functions. When you manage Copy Services through any of these interfaces, they invoke the Copy Services functions through the Ethernet network. We explain these interfaces in Part 2, Interfaces on page 23.
1.1.1 FlashCopy
When you set up a FlashCopy operation, a relationship is established between the source and target volumes, and a bitmap of the source volume is created. After this relationship and bitmap are created, the target volume can be accessed as though all the data was physically copied. While the relationship between the source and target volumes exists, an optional background process copies the tracks from the source to the target volume. When a FlashCopy operation is invoked, it takes only a few seconds to establish the FlashCopy pair and create the necessary control bitmaps. Thereafter, you have access to a point-in-time copy of the source volume. As soon as the pair is established, you can read and write to both the source and target volumes. After the bitmap is created, the background process begins to copy the real data from the source to the target volume. If you access the source or the target volume during the background copy, FlashCopy manages these I/O requests and facilitates both reading from and writing to the source and target copies. When all the data is copied to the target, the FlashCopy relationship ends, unless it was set up as a persistent relationship (used for incremental copies, for example). You can withdraw a FlashCopy relationship at any time before all the data is copied to the target.
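The bitmap-driven behavior described above can be illustrated with a small simulation. The following Python sketch is not part of the DS8000 microcode; the class name and the track-level granularity are simplifying assumptions used only to show how reads and writes are redirected while the background copy is incomplete.

```python
class FlashCopyRelation:
    """Simplified model of a FlashCopy source/target pair.

    A bitmap records, per track, whether the point-in-time data
    has already been physically copied to the target.
    """

    def __init__(self, source):
        self.source = source                  # list of track contents
        self.target = [None] * len(source)    # physical target tracks
        self.copied = [False] * len(source)   # the FlashCopy bitmap

    def read_target(self, track):
        # Until a track is physically copied, target reads are
        # satisfied from the source volume through the bitmap.
        return self.target[track] if self.copied[track] else self.source[track]

    def write_source(self, track, data):
        # Copy-on-write: preserve the point-in-time version on the
        # target before the source update is destaged.
        if not self.copied[track]:
            self.target[track] = self.source[track]
            self.copied[track] = True
        self.source[track] = data

    def background_copy_step(self):
        # Optional background process: copy one not-yet-copied track.
        for t, done in enumerate(self.copied):
            if not done:
                self.target[t] = self.source[t]
                self.copied[t] = True
                return t
        return None  # all tracks copied; relationship could end here


rel = FlashCopyRelation(["A0", "B0", "C0"])
rel.write_source(1, "B1")            # triggers copy-on-write of track 1
print(rel.read_target(0))            # A0 (read redirected to source)
print(rel.read_target(1))            # B0 (point-in-time version preserved)
while rel.background_copy_step() is not None:
    pass
print(rel.target)                    # ['A0', 'B0', 'C0']
```

The key point of the sketch is that the target appears complete from the first moment, even though physical copying happens lazily, either on demand (copy-on-write) or through the background process.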
1.1.2 FlashCopy SE
IBM FlashCopy SE is a FlashCopy relationship for which the target volume is a Track-Space-Efficient volume. By using FlashCopy SE, you can reduce the amount of physical space that is used on your DS8000. Important: Do not confuse Track-Space-Efficient volumes and Extent-Space-Efficient volumes. A Track-Space-Efficient volume is a volume for which physical space is allocated dynamically on a track basis. Initially, a Track-Space-Efficient volume does not use any physical space. When data is written to a Track-Space-Efficient volume, a track of physical space is taken from a common preallocated repository and is used to hold the data for the Track-Space-Efficient volume. Contrast this situation with a traditional, fully provisioned volume for which all of the physical space is allocated when the volume is created. A repository is a special volume that is used to contain the physical space for Track-Space-Efficient volumes. As tracks are written to a Track-Space-Efficient volume, storage for the tracks is obtained from the space that is assigned to the repository. The data for a Track-Space-Efficient volume is stored on the repository, but the data is only accessible from the Track-Space-Efficient volume. The host does not have access to the repository, only the associated Track-Space-Efficient volumes. The repository provides the physical space for multiple Track-Space-Efficient volumes. There can be multiple repositories in a DS8000, one in each Extent Pool.
When a track on the source volume of any FlashCopy relationship is updated, the current version of the track must be copied to the target device before the update can be destaged on the source device. For FlashCopy SE, the current version of the track is written to space taken from the repository and assigned to the Track-Space-Efficient volume. In this manner, the amount of physical space that is used by the target volume of a FlashCopy SE relationship is limited to the minimum amount of space that is required to maintain the copy. FlashCopy SE should be used for copies that are short term in nature. Examples include copies that are backed up to tape and the FlashCopy relationships in a Global Mirror session. FlashCopy SE could also be used for copies that are kept long term, if the installation knows that there are few updates to the source and target volumes. FlashCopy SE supports full volume relationships. It does not support data set level copies or partial volume copies, as standard FlashCopy does in System z environments. For specific information about FlashCopy SE, see IBM System Storage DS8000 Series: IBM FlashCopy SE, REDP-4368. IBM FlashCopy SE is licensed separately from FlashCopy.
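The space accounting described above can be sketched in a few lines of Python. This is a conceptual illustration, not the DS8000 implementation: the class names are invented, and real repositories manage physical extents rather than Python dictionaries. It shows why an SE target consumes space only in proportion to the updates made, regardless of the source volume size.

```python
class Repository:
    """Shared pool of physical tracks for Track-Space-Efficient volumes."""

    def __init__(self, capacity_tracks):
        self.capacity = capacity_tracks
        self.used = 0

    def allocate_track(self):
        # An SE copy fails if the repository runs out of physical space.
        if self.used >= self.capacity:
            raise RuntimeError("repository full")
        self.used += 1


class SpaceEfficientTarget:
    """FlashCopy SE target: stores only tracks whose source was updated."""

    def __init__(self, repository):
        self.repo = repository
        self.tracks = {}          # track number -> point-in-time data

    def preserve(self, track, data):
        # Physical space is taken from the repository only on the
        # first write to a given track.
        if track not in self.tracks:
            self.repo.allocate_track()
        self.tracks[track] = data


repo = Repository(capacity_tracks=2)
target = SpaceEfficientTarget(repo)

# The source volume may have thousands of tracks, but only the
# updated tracks consume repository space.
target.preserve(17, "old data of track 17")
target.preserve(901, "old data of track 901")
print(repo.used)                  # 2
```

This also illustrates why FlashCopy SE suits short-lived copies with few source updates: every source update consumes repository space, and an undersized repository causes the relationship to fail.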
This configuration provides a resilient and flexible solution for recovery in various disaster situations. The customer also benefits from a synchronous replication of the data to a close location that acts as the intermediate site. It also makes it possible to copy the data across virtually unlimited distances, while data consistency can be maintained at any time in each location. With Incremental Resync, it is possible to change the copy target destination of a copy relation without requiring a full copy of the data. This functionality can be used, for example, when an intermediate site fails because of a disaster. In this case, a Global Mirror is established from the local to the remote site, which bypasses the intermediate site. When the intermediate site becomes available again, the Incremental Resync is used to bring it back into the Metro/Global Mirror setup. The 3-site Metro/Global Mirror is an optional chargeable feature available on all models.
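Incremental Resync avoids a full copy because the storage system records which tracks changed while the relationship was suspended or redirected, so only those tracks must be retransmitted when the relationship is re-established. The following Python sketch illustrates this change-recording idea; the class and the track-level granularity are simplifications for illustration, not the actual microcode design.

```python
class ChangeRecorder:
    """Toy model of a change-recording bitmap for resynchronization."""

    def __init__(self, num_tracks):
        self.num_tracks = num_tracks
        self.changed = set()       # tracks written since the last sync

    def record_write(self, track):
        self.changed.add(track)

    def resync(self, source, target):
        # Only tracks marked in the bitmap are copied; a full copy
        # of all num_tracks tracks is avoided.
        count = 0
        for t in sorted(self.changed):
            target[t] = source[t]
            count += 1
        self.changed.clear()
        return count


source = ["t%d" % i for i in range(1000)]
target = list(source)              # volumes start fully in sync
bitmap = ChangeRecorder(len(source))

# Two updates arrive while the secondary is unreachable.
source[3] = "t3-new"
bitmap.record_write(3)
source[512] = "t512-new"
bitmap.record_write(512)

copied = bitmap.resync(source, target)
print(copied)                      # 2 tracks copied instead of 1000
```

The benefit scales with volume size: reintegrating the intermediate site requires transmitting only the delta accumulated during the outage, not the entire volume contents.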
Chapter 2.
dscli> lssu
Name   ID               Model WWNN             pw state
=======================================================
ATS_04 IBM.2107-75TV180 951   500507630AFFFA9F On
dscli> lssi
Name   ID               Storage Unit     Model WWNN             State  ESSNet
==============================================================================
ATS_04 IBM.2107-75TV181 IBM.2107-75TV180 951   500507630AFFC29F Online Enabled
[Figure: Two storage complexes, each consisting of a DS8000 and its DS HMC, connected through an Ethernet connection]
On DS8000 systems with Rel. 3 through Rel. 6.1 code level, this action is no longer possible. Instead, you access the DS GUI by starting the DS8000 Element Manager on a Tivoli Storage Productivity Center server. This system can be either a customer workstation that is running Tivoli Storage Productivity Center for Replication, or a System Storage Productivity Center (SSPC) workstation, which is delivered with the DS8000 and has Tivoli Storage Productivity Center for Replication preinstalled. The Element Manager in the Tivoli Storage Productivity Center for Replication server provides the means to access the DS GUI. With Rel. 6.2, it is again possible to directly point a web browser to the DS HMC and access the GUI. The communication paths for DS CLI and DS GUI are illustrated in Figure 2-2. The DS CLI communication path is not changed in Rel. 3.
Figure 2-2 Communication paths for DS CLI and DS GUI. On DS8000 pre-R3 systems, the client connects with the DS CLI or a web browser directly to the DS HMC. On DS8000 R3 through R6.1 systems, web browser access goes through the DS8000 Element Manager on the SSPC/TPC server. On DS8000 R6.2 and later systems, direct browser access to the DS HMC is again possible. In each case, the Network Interface Server on the DS HMC communicates with the microcode on the processor complexes of the Storage Facility Image (SFI).
DS8000 I/O Priority Manager is also a new licensed function feature that is introduced for IBM System Storage DS8700 and DS8800 storage systems with DS8000 Licensed Machine Code (LMC) Rel. 6.1 or higher. It enables more effective storage consolidation and performance management, and the ability to align QoS levels to separate workloads in the system that compete for the same shared, and possibly constrained, storage resources. For more information, see DS8000 I/O Priority Manager, REDP-4760. In an Easy Tier or I/O Priority Manager environment, Copy Services is not aware of the tier from which the data is copied. Therefore, Copy Services cannot ensure that data that is copied from one tier at the primary or source volume is written to the same tier on the secondary or target volume. In a Remote Mirror environment, if the data in a remote storage system is accessed by a server, and if Easy Tier is active on the remote storage system, the DS8000 learns the characteristics of the workload and moves the data to the appropriate tier.
Chapter 3.
Licensing
This chapter describes how the licensing functions for Copy Services for the DS8000 Series are arranged. This chapter covers the following topics:
- Licenses
- Authorized level
3.1 Licenses
All DS8000 Series machines must have an Operating Environment License (OEL) for the total storage that is installed, as defined in gross decimal TB. In addition, IBM offers value-based licensing for the Operating Environment License, which is priced based on various disk drive characteristics, such as performance, capacity, and speed. These features are required in addition to the per TB OEL features, but might provide more flexible and optimal price/performance configurations, for example, for high capacity drives. Licenses are also required to use the Copy Services functions. Licensed functions require the selection of a DS8000 Series feature number (IBM 242x or 2107) and the acquisition of DS8000 Series Function Authorization (IBM 239x) feature numbers:
- The 242x or 2107 licensed function indicator feature number enables technical activation of the function, subject to the client applying an activation code that is made available by IBM.
- The 239x function authorization feature numbers establish the extent of IBM authorization for that function on the DS machine for which it was acquired.
Table 3-1 lists the DS8000 Series feature numbers and corresponding DS8000 Series Function Authorization feature numbers.
Table 3-1 DS8000 licensed functions

Licensed function                            IBM 242x indicator    IBM 239x function authorization
                                             feature numbers       (Model LFA feature numbers)
Operating Environment                        0700 and 70xx         7030-7065
Thin Provisioning                            0707 and 7071         7071
IBM FICON Attachment                         0703 and 7091         7091
DB protection indicator                      0708                  7080
High Performance FICON                       0709 and 7092         7092
IBM System Storage Easy Tier                 0713 and 7083         7083
z/OS Distributed Data Backup                 0714 and 7094         7094
FlashCopy                                    0720 and 72xx         7250-7260
Space-Efficient FlashCopy                    0730 and 73xx         7350-7360
Metro/Global Mirror                          0742 and 74xx         7480-7490
Metro Mirror                                 0744 and 75xx         7500-7510
Global Mirror                                0746 and 75xx         7520-7530
z/OS Global Mirror                           0760 and 76xx         7650-7660
z/OS Metro/Global Mirror Incremental Resync  0763 and 76xx         7680-7690
Parallel Access Volumes                      0780 and 78xx         7820-7830
Table 3-1 (continued)

Licensed function                            IBM 242x indicator    IBM 239x function authorization
                                             feature numbers       (Model LFA feature numbers)
I/O Priority Manager                         0784 and 78xx         7840-7850
HyperPAV                                     0782 and 7899         7899
The license for Space-Efficient (SE) FlashCopy does not require the ordinary FlashCopy (PTC) license. As with ordinary FlashCopy, FlashCopy SE is licensed in tiers by gross amount of TB installed. FlashCopy (PTC) and FlashCopy SE can be complementary licenses. FlashCopy SE is targeted at FlashCopies onto Track Space-Efficient (TSE) target volumes and requires that background nocopy be used. If you also want to do FlashCopies with background copy, a PTC license is also needed. Here is the breakdown of DS8000 Series Function Authorization feature numbers:
OEL: Operating Environment License:
- OEL - inactive: 7030
- OEL - 1 TB: 7031
- OEL - 5 TB: 7032
- OEL - 10 TB: 7033
- OEL - 25 TB: 7034
- OEL - 50 TB: 7035
- OEL - 100 TB: 7040
- OEL - 200 TB: 7045
- OEL - Value Unit inactive: 7050
- OEL - 1 Value Unit: 7051
- OEL - 5 Value Unit: 7052
- OEL - 10 Value Unit: 7053
- OEL - 25 Value Unit: 7054
- OEL - 50 Value Unit: 7055
- OEL - 100 Value Unit: 7060
- OEL - 200 Value Unit: 7065
FICON: z/OS attachment:
- FICON Attachment: 7091
- High Performance FICON: 7092
PAV: Parallel Access Volumes:
- PAV - inactive: 7820
- PAV - 1 TB: 7821
- PAV - 5 TB: 7822
- PAV - 10 TB: 7823
- PAV - 25 TB: 7824
- PAV - 50 TB: 7825
- PAV - 100 TB: 7830
- HyperPAV: 7899
The following licenses apply to Copy Services:
PTC: Point-in-Time Copy, also known as FlashCopy:
- PTC - inactive: 7250
- PTC - 1 TB: 7251
- PTC - 5 TB: 7252
- PTC - 10 TB: 7253
- PTC - 25 TB: 7254
- PTC - 50 TB: 7255
- PTC - 100 TB: 7260
SE: FlashCopy SE, also known as Space-Efficient FlashCopy:
- SE - inactive: 7350
- SE - 1 TB: 7351
- SE - 5 TB: 7352
- SE - 10 TB: 7353
- SE - 25 TB: 7354
- SE - 50 TB: 7355
- SE - 100 TB: 7360
MGM: Metro/Global Mirror:
- MGM - inactive: 7480
- MGM - 1 TB: 7481
- MGM - 5 TB: 7482
- MGM - 10 TB: 7483
- MGM - 25 TB: 7484
- MGM - 50 TB: 7485
- MGM - 100 TB: 7490
MM: Metro Mirror:
- MM - inactive: 7500
- MM - 1 TB: 7501
- MM - 5 TB: 7502
- MM - 10 TB: 7503
- MM - 25 TB: 7504
- MM - 50 TB: 7505
- MM - 100 TB: 7510
GM: Global Mirror:
- GM - inactive: 7520
- GM - 1 TB: 7521
- GM - 5 TB: 7522
- GM - 10 TB: 7523
- GM - 25 TB: 7524
- GM - 50 TB: 7525
- GM - 100 TB: 7530
RMZ: Remote Mirror and Copy for z/OS, also known as z/OS Global Mirror (zGM, or XRC):
- RMZ - inactive: 7650
- RMZ - 1 TB: 7651
- RMZ - 5 TB: 7652
- RMZ - 10 TB: 7653
- RMZ - 25 TB: 7654
- RMZ - 50 TB: 7655
- RMZ - 100 TB: 7660
RMZ resync:
- RMZ resync - inactive: 7680
- RMZ resync - 1 TB: 7681
- RMZ resync - 5 TB: 7682
- RMZ resync - 10 TB: 7683
- RMZ resync - 25 TB: 7684
3.1.4 Additional information for Metro/Global Mirror and Global Mirror licensing
For the 3-site Metro Mirror and Global Mirror solution, the following licensed functions are required for the DS8800 model 951, the DS8700 model 941, and the DS8100/8300 Turbo Models 931, 932, and 9B2:
- Site A: A Metro/Global Mirror (MGM) license and a Metro Mirror (MM) license. Additionally, a Global Mirror (GM) license is required for Site A if Site B is lost and you want to resynchronize between Site A and Site C.
- Site B: A Metro/Global Mirror (MGM) license, a Metro Mirror (MM) license, and a Global Mirror (GM) license.
- Site C: A Metro/Global Mirror (MGM) license, a Global Mirror (GM) license, and a FlashCopy (PTC or, alternatively, SE) license.
3.2.1 Licensing
All Copy Services functions require licensing to be activated, which means that the customer must purchase a license for the appropriate level of storage for each Copy Services function that is required. The customer must install the license key that is generated by using the Disk Storage Feature Activation (DSFA) application, which can be found at the following website:
http://www.ibm.com/storage/dsfa
Another consideration relates to the authorized level that is required. In most cases, the total capacity that is installed must be licensed. This capacity is the total capacity in decimal TB equal to or greater than the actual capacity installed, including all RAID parity disks and hot spares. An exception might be where a mix of both System z and Open Systems hosts use the same storage system. In this case, it is possible to acquire Copy Services licenses for just the capacity that is formatted for CKD, or just the capacity that is formatted for FB storage. This situation implies that the licensed Copy Services function is required only for Open Systems hosts, or only for System z hosts. If, however, a Copy Services function is required for both CKD and FB, then that Copy Services license must match the total configured capacity of the machine. The authorization level is maintained by the licensed code in the controller and the DSFA application.
For example, if the actual capacity is 15 TB, used for both CKD and FB, then the scope for the OEL is ALL and the installed OEL must be at least 15 TB. If the client splits the storage allocation, with 8 TB for CKD, and only CKD storage uses FlashCopy, then the scope type for the PTC license can be set to CKD. The PTC license can then be purchased at the CKD level of 8 TB. However, this configuration means that no Open Systems hosts can use the FlashCopy function. See Table 3-2 for the licensed functions and license scope options.
Table 3-2 Licensed functions and license scope options

Licensed function                            License scope options
Operating Environment                        ALL
Thin Provisioning                            FB
FICON Attachment                             CKD
Database Protection                          FB, CKD, or ALL
High Performance FICON                       CKD
IBM System Storage Easy Tier                 FB, CKD, or ALL
z/OS Distributed Data Backup                 CKD
FlashCopy                                    FB, CKD, or ALL
Space-Efficient FlashCopy                    FB, CKD, or ALL
Metro/Global Mirror                          FB, CKD, or ALL
Metro Mirror                                 FB, CKD, or ALL
Global Mirror                                FB, CKD, or ALL
z/OS Global Mirror                           CKD
z/OS Metro/Global Mirror Incremental Resync  CKD
Parallel Access Volumes                      CKD
I/O Priority Manager                         FB
HyperPAV                                     CKD
The actual ordered level of any Copy Service license can be any level above what is required or installed. Licenses can be added and have their capacities increased, non-disruptively to an installed system. Important: A decrease in the scope or the capacity of a license requires a disruptive Initial Microcode Load (IML) of the DS8000 Storage Facility Image.
Part 2
Interfaces
In this part of the book, we describe the interfaces that are available to manage the Copy Services features of the DS8000. We provide an overview of the interfaces, describe the available options and configuration considerations, and provide some interface usage examples.
Chapter 4.
To use a z/OS interface to manage Open Systems LUNs, the DS8000 must have at least one CKD volume. If you are interested in this possibility, see IBM System Storage DS8000 Copy Services for IBM System z, SG24-6787.
Figure 4-1 Dual DS HMC configuration: SSPC/TPC and DS GUI clients on the customer LAN connect to DS HMC 1 (internal) and DS HMC 2 (external), which attach through Ethernet Switch 1 and Ethernet Switch 2 to Server 0 and Server 1 of the DS8000.
DS GUI and DS CLI (and DS Open API) calls are run through the Ethernet network to the DS HMC. When the DS HMC receives the command request, it communicates with each server in the disk system through the internal Ethernet network. Therefore, the DS HMC is a key component to configure and manage the DS8000 and its functions. Each DS8000 has an internal DS HMC in the base frame, and you can have an external DS HMC for redundancy. You need at least one available DS HMC to run Copy Services commands. If you have only one DS HMC and it fails, you cannot run Copy Services commands. Therefore, a dual DS HMC configuration, as shown in Figure 4-1, is crucial, especially when you use automation scripts to run Copy Services functions, so that the scripts keep working if one DS HMC fails.
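Because automation scripts must keep working when one DS HMC fails, a script typically probes the HMCs in order of preference and uses the first one that responds. The following is a minimal, hypothetical sketch of such a selection helper; the probe function and the HMC addresses are placeholders for this illustration and are not part of the DS CLI:

```python
def first_reachable(hmcs, probe):
    """Return the first DS HMC address for which probe(addr) succeeds.

    hmcs  -- list of HMC addresses, ordered by preference
    probe -- callable returning True if the HMC answers (for example,
             a TCP connect test or a cheap read-only DS CLI command)
    """
    for addr in hmcs:
        if probe(addr):
            return addr
    raise RuntimeError("no DS HMC reachable; cannot run Copy Services commands")

# Example with a stand-in probe: pretend only the external HMC answers.
reachable = {"10.0.0.2"}
hmc = first_reachable(["10.0.0.1", "10.0.0.2"], lambda a: a in reachable)
print(hmc)  # -> 10.0.0.2
```

A real script would then pass the selected address to the dscli -hmc1 parameter.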
Chapter 5.
DS GUI
The DS GUI provides a graphical user interface to configure the DS8000 and manage DS8000 Copy Services. The DS GUI is started remotely from any compatible web browser. This chapter covers the following topics:
- Accessing the DS GUI
- Defining another DS8000 Storage Complex
Figure 5-1 shows the welcome and overview window of the new DS GUI. It presents the navigation icons on the left side. In the right pane, it displays the most important icons that are required for the initial setup of the machine and also links to directly navigate to other setup tasks and helpful information, such as e-Learning video tutorials and the DS8000 Information Center website.
To start the Copy Services functions within the DS GUI, use the window shown in Figure 5-2. From here, you can create, manage, and delete paths (mirroring connectivity) and FlashCopy, Metro Mirror, Global Copy, and Global Mirror relationships.
You can change to the Legacy View by clicking the Navigation Choice icon in the lower left. Figure 5-3 shows the Legacy View, which can be helpful if you search for a specific window and cannot find it in the new icon view. Legacy View refers only to the navigation pane on the left side and displays the menu in text style rather than with the new icons. The content of the windows on the right is always displayed in the new DS GUI style.
Figure 5-4 Showing the system status to add another DS8000 storage complex
The user ID and password that you use to log on to the source system DS GUI are also used to establish the connection to the target storage complex. For this reason, the same user ID with the appropriate role must exist, and the password must be the same, in the target storage complex.
Attention: After you complete actions in the DS GUI, it is sometimes necessary to click the Refresh button at the top of the window to reflect the latest changes. This situation is especially true when you work with the DS GUI and DS CLI in parallel and you make changes by using the DS CLI, which must be used with extreme care. For examples and illustrations about how to use the DS GUI to define and manage those Copy Services, see the specific Copy Services parts of this book.
Chapter 6.
DS Command-Line Interface
This chapter provides an introduction to the DS Command-Line Interface (DS CLI), which you can use to configure and administer the DS storage system. We explain how you can use the DS CLI to manage Copy Services relationships. This chapter describes the following topics:
- Functionality and authentication
- Command modes
- Return codes
- User assistance
- Copy Services command structure
In this chapter, we describe the usage of the DS CLI for Copy Services configuration in the DS8000. For information about the storage configuration of the DS8000 using the DS CLI, see the following books:
- IBM System Storage DS8000: Command-Line Interface User's Guide, GC53-1127
- IBM System Storage DS8000: Architecture and Implementation, SG24-8886
The password of the admin user ID must be changed before it can be used. The GUI forces you to change the password when you first log in. You can use the DS CLI to log in, but you cannot run any other commands until you change the password. For example, to change the admin user's password to passw0rd, run the following DS CLI command:
chuser -pw passw0rd admin
After you run that command, you can then run other commands.
Single Point of Authentication: The DS8800 supports the Single Point of Authentication function for the GUI and CLI through a centralized LDAP server. This capability requires a Tivoli Storage Productivity Center for Replication Version 4.1 server or higher. For detailed information about LDAP-based authentication, see IBM System Storage DS8000: LDAP Authentication, REDP-4505.
User roles
During the planning phase of a project, typically a worksheet is created with a list of all people who need access to the DS GUI or DS CLI. A user can be assigned to more than one group. At least one person should be assigned to each of the following roles (user_id):
- The Administrator (admin) has access to all HMC service methods and all Storage Image resources, except for encryption functionality. This user authorizes the actions of the Security Administrator during the encryption deadlock prevention and resolution process.
- The Security Administrator (secadmin) has access to all encryption functions. secadmin requires an Administrator user to confirm the actions that are taken during the encryption deadlock prevention and resolution process.
- The Physical operator (op_storage) has access to physical configuration service methods and resources, such as managing storage complex, Storage Image, rank, array, and extent pool objects.
- The Logical operator (op_volume) has access to all service methods and resources that relate to logical volumes, hosts, host ports, logical subsystems, and Volume Groups, excluding security methods.
- The Monitor group has access to all read-only, nonsecurity HMC service methods, such as list and show commands.
- The Service group has access to all HMC service methods and resources, such as performing code loads and retrieving problem logs, plus the privileges of the Monitor group, excluding security methods.
- The Copy Services operator has access to all Copy Services methods and resources, plus the privileges of the Monitor group, excluding security methods.
- No access prevents access to any service method or Storage Image resources. This group is used by an administrator to temporarily deactivate a user ID. By default, this user group is assigned to any user account in the security repository that is not associated with any other user group.
# cat p2s.profile
#Primary to Secondary
hmc1: 9.1.2.3
username: admin
devid: IBM.2107-7520781
remotedevid: IBM.2107-7503461
banner: off
paging: off
header: on

# cat s2p.profile
#Secondary to Primary
hmc1: 9.1.2.4
username: admin
devid: IBM.2107-7503461
remotedevid: IBM.2107-7520781
banner: off
paging: off
header: on
dscli> managepwfile -action add -mc1 9.177.88.99 -name admin -pw xxxxxxxx
CMUC00205I managepwfile: Password file C:\Users\IBM_ADMIN\dscli\security.dat successfully created.
CMUC00206I managepwfile: Record 9.177.88.99/admin successfully added to password file C:\Users\IBM_ADMIN\dscli\security.dat.
C:\Program Files\ibm\dscli>dscli -hmc1 10.10.10.1 -user admin -passwd pwd lsuser
Name  Group                      State
===============================================
admin admin                      active
copy  op_volume,op_copy_services active
doug  op_storage                 locked
test1 admin                      active

Considerations for using the single-shot command mode:
- When you submit the command, you can use the host name or the IP address of the HMC (on the command line or provided in the profile file).
- Every time a command is run in single-shot mode, the user must be authenticated. The authentication process time adds to the time that is needed to perform the submitted command.
C:\Program Files\ibm\dscli>dscli
Enter your username: admin
Enter your password:
IBM.2107-1312345
dscli> lsarraysite
arsite DA Pair dkcap (Decimal GB) State    Array
================================================
S1     0       146.0             Assigned A0
S2     0       146.0             Assigned A1
S3     0       146.0             Assigned A2
S4     0       146.0             Assigned A3
dscli> lssi
Name ID               Storage Unit     Model WWNN             State  ESSNet
============================================================================
IBM.2107-1312345 IBM.2107-1312345 951 500507630EFFFC6F Online Enabled
dscli> quit
lsarray
lsrank
In Example 6-6, you start the DS CLI by using the -script parameter and specifying a profile and the name of the script that contains the commands from Example 6-5 on page 39.
Example 6-6 Starting DS CLI with a script file
C:\Program Files\ibm\dscli>dscli -cfg ds8000a.profile -script sample.script
arsite DA Pair dkcap (10^9B) State      Array
=============================================
S1     0       300.0        Unassigned
S2     0       300.0        Unassigned
S3     0       300.0        Unassigned
S4     0       300.0        Unassigned
S5     0       300.0        Unassigned
S6     0       300.0        Unassigned
CMUC00234I lsarray: No Array found.
CMUC00234I lsrank: No Rank found.
C:\Program Files\ibm\dscli>
DS CLI script: The DS CLI script can contain only DS CLI commands. Using shell commands results in a process failure. You can add comments in the scripts, which are prefixed by the hash symbol (#). It must be the first non-blank character on the line. Empty lines are allowed in the script file. Only one single authentication process is needed to run all the script commands.
dscli> help -s mkflash
mkflash The mkflash command initiates a point-in-time copy from source volumes to target volumes.

In Example 6-8, the -l parameter is used to get a list of all parameters that can be used with the mkflash command.
Example 6-8 Usage of the help -l command
dscli> help -l mkflash
mkflash [ { -help|-h|-? } ] { -nocp|-cp } [-v on|off] [-bnr on|off]
[-dev storage_image_ID] [-tgtpprc] [-tgtoffline] [-tgtinhibit] [-freeze]
[-record] [-persist] [-tgtse] [-wait] [-seqnum Flash_Sequence_Num]
[-pmir no|required|preferred]
SourceVolumeID:TargetVolumeID ... | -
5. ch commands: These commands are used to change the attributes of existing objects. For example, chsession modifies a Global Mirror session.
6. Other commands that are specific to a Copy Services type. For example, freezepprc initiates a set of actions to preserve data consistency on a group of secondary volumes.
For a complete description of the available commands for a specific Copy Services discipline (for example, FlashCopy), see the respective parts of this book.
Part 3
FlashCopy
This part describes IBM System Storage FlashCopy and IBM FlashCopy SE when used in Open Systems environments with the DS8000. We describe the FlashCopy and FlashCopy SE features and the options for their setup. We also show which management interfaces can be used and the important aspects to consider when you establish FlashCopy relationships.
Chapter 7.
FlashCopy overview
FlashCopy creates a copy of a volume at a specific point in time, which is also known as a point-in-time copy, instantaneous copy, or time-zero copy (t0 copy). This chapter explains the basic characteristics of FlashCopy when used in an Open Systems environment with the DS8000. This chapter describes the following topics:
- FlashCopy operational areas
- FlashCopy basic concepts
- FlashCopy in combination with other Copy Services
- IBM FlashCopy SE (Space-Efficient)
- IBM FlashCopy with thin provisioned Extent-Space-Efficient (ESE) volumes
- Remote Pair FlashCopy
[Figure 7-1: FlashCopy operational environments. Production data on the production system is copied with FlashCopy for use by a production backup system, an application, a help desk or system operation, and other systems; Reverse FlashCopy restores the production data.]
FlashCopy is suitable for the following operational environments:
Production backup system: A periodic FlashCopy of the production data allows data recovery from an earlier version of the data. This action might be necessary because of a user error or a logical application error. Assume that a user accidentally deletes a customer record. The production backup system could work with one of the periodic FlashCopy copies of the data. The necessary part of the customer data can be exported and then imported into the production environment. Thus, production continues while a specific problem is being fixed, and most users continue to work without any knowledge of this issue. The FlashCopy of the data can also be used by another operating system to re-establish production in case there are any server errors. A FlashCopy of the production data allows the client to create backups with the shortest possible application outage. An additional reason for data backup is to provide protection in case there is source data loss because of a disaster, hardware failure, or software failure.
Data mining system: A FlashCopy of the data can be used for data analysis, thus avoiding performance impacts on the production system because of long-running data mining tasks.
Test system: Test environments that are created with FlashCopy can be used by the development team to test new application functions with real production data, which leads to a faster test setup process.
Integration system: New application releases (for example, SAP releases) are likely to be tested before you implement them on a production server. By using FlashCopy, a copy of the production data can be established and used for integration tests.
With the capability to reverse a FlashCopy, a previously created FlashCopy can be used within seconds to bring production back to the point in time when the FlashCopy was taken.
7.2 Terminology
In a discussion about Metro Mirror, Global Copy, and Global Mirror, the following terms are frequently used interchangeably: The terms local, production, application, primary, or source, denote the site where the production applications run while in normal operation. These applications create, modify, and read the application data. The meaning is extended to the storage system that holds the data as well as to its components, that is, volumes and LSS. The terms remote, recovery, backup, secondary, or target denote the site to where the data is replicated (the copy of the application data). The meaning is extended to the storage system that holds the data as well as to its components (volumes and LSS). When you describe FlashCopy, we use the term source to refer to the original data that is created by the application, and we use the term target to refer to the point-in-time backup copy. The terms LUN and volume are also used interchangeably in our descriptions.
Three variations of FlashCopy are available. Standard FlashCopy uses a fully provisioned volume as a target volume. FlashCopy SE (Space-Efficient) uses Track-Space-Efficient volumes as FlashCopy target volumes and must be in a background nocopy relationship. A Space-Efficient volume has a virtual size that is equal to the source volume size. However, space is not allocated when the volume is initially created and the FlashCopy initiated. Space is allocated in a repository when a first update is made to a track on the source volumes, which causes the source track to be copied to the FlashCopy SE target volume to maintain the t0 copy. Writes to the SE target also use repository space. For more information about Space-Efficient volumes and the concept of a repository, see Chapter 10, IBM FlashCopy SE on page 95. FlashCopy is supported on thin provisioned Extent-Space-Efficient (ESE) volumes with LMC 6.6.20.nnn for the DS8700 and with LMC 7.6.20.nnn for the DS8800. Space is not allocated when the thin provisioned volume is initially created. Extents are allocated from an extent pool when a first update is made to an extent on the thin provisioned volume. Thin provisioning does not use tracks from a repository, but rather uses extents from the extent pool. For more information about thin provisioned Extent-Space-Efficient volumes and Copy Services, see Chapter 36, Thin provisioning and Copy Services considerations on page 551. FlashCopy, FlashCopy SE, and Thin Provisioning are optional and distinct licensed features of the DS8000. All features can coexist on a DS8000. Typically, large databases have their data spread across multiple volumes. If these volumes are copied, the order of dependent writes must be maintained to ensure that the target volumes have consistent data. Consistent data allows a database restart, as opposed to a database recovery, which could take a long time to complete.
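The allocate-on-first-write behavior that distinguishes thin provisioned ESE volumes can be sketched as follows. This is an illustrative toy model only, not DS8000 microcode; the class, extent counts, and names are invented for the example:

```python
class ThinVolume:
    """Toy model of an Extent-Space-Efficient volume: virtual extents are
    mapped to physical extents from a shared pool only on first write."""

    def __init__(self, virtual_extents, pool):
        self.virtual_extents = virtual_extents
        self.pool = pool              # shared list of free physical extents
        self.mapping = {}             # virtual extent index -> physical extent

    def write(self, extent_index, data):
        if extent_index not in self.mapping:      # first write to this extent
            if not self.pool:
                raise RuntimeError("extent pool exhausted")
            self.mapping[extent_index] = self.pool.pop()
        self.mapping[extent_index]["data"] = data  # rewrites reuse the extent

pool = [{"data": None} for _ in range(4)]
vol = ThinVolume(virtual_extents=1000, pool=pool)  # 1000 virtual, 4 physical
vol.write(42, "payload")
print(len(vol.mapping), len(pool))  # -> 1 3 : one extent allocated on demand
```

The same pattern explains why the virtual size can far exceed the physical capacity: physical extents are consumed only as data is actually written.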
Consistency Group FlashCopy can maintain the order of dependent writes and create volume copies that have consistent data. The following characteristics are basic to the FlashCopy operation: Establishing a FlashCopy relationship When a FlashCopy is started, the relationship between source and target is established within seconds by creating a pointer table, including a bitmap for the target. While the FlashCopy relationship is being created, the DS8000 holds off I/O activity to a volume for a period of time. No user disruption or intervention is required. I/O activity resumes when the FlashCopy is established.
If all bits of the bitmap for the target are set to their initial values, no data block has been copied so far. A bitmap entry of '1' indicates that the track is not copied yet, and a '0' indicates that it is copied. The data in the target is not modified during the setup of the bitmaps. At this first step, the bitmap and the data look as illustrated in Figure 7-2.
[Figure 7-2: source and target volumes right after the FlashCopy relationship is established; all bitmap entries are set to 1]
The target volume, as depicted in various figures in this section, can be a normal volume or a Space-Efficient volume. In both cases, the logic is the same. The difference between standard FlashCopy and FlashCopy SE is where the physical storage is. For standard FlashCopy, it is a fully provisioned volume; for IBM FlashCopy SE, it is a repository (see Figure 7-4 on page 51).
When the relationship is established, it is possible to perform read and write I/Os on both the source and the target. Assuming that the target is used for reads only while production is ongoing, things look as illustrated in Figure 7-3.
Figure 7-3 Reads from source and target volumes and writes to source volume (if the time-zero data is not yet available in the target volume, a read of the target is satisfied from the source volume; before a physical write to the source volume, the time-zero data is copied from the source volume to the target volume)
Figure 7-4 on page 51 shows reads and writes for IBM FlashCopy SE.
Reading from the source: The data is read immediately from the source volume, as shown in Figure 7-3.
Writing to the source: Whenever data is written to the source volume while the FlashCopy relationship exists, the storage system makes sure that the t0 data is copied to the target volume before it is overwritten in the source volume. When the target volume is a Space-Efficient volume, the data is written to a repository. To identify whether the data of the physical track on the source volume must be copied to the target volume, the bitmap is analyzed. If the bitmap shows that the t0 data is not yet available on the target volume, the data is copied from the source to the target. If the bitmap entry is '0', the t0 data was already copied to the target volume, so no further action is needed and the I/O is written straight to the source volume. See Figure 7-3. The target volume is immediately available for reading data and for writing data.
Reading from the target: Whenever a read request goes to the target while the FlashCopy relationship exists, the bitmap is used to identify whether the data must be retrieved from the source or from the target. If the bitmap states that the t0 data is not yet copied to the target, the physical read is directed to the source. If the t0 data is copied to the target, the read is performed immediately against the target. See Figure 7-3 on page 50 or Figure 7-4.
Figure 7-4 Reads from source and target volumes and writes to source volume for an IBM FlashCopy SE relationship
Writing to the target: Whenever data is written to the target volume while the FlashCopy relationship exists, the storage system ensures that the bitmap is updated. This way, the t0 data from the source volume never overwrites updates that were made directly to the target volume. So, if the bitmap entry is '1', it is set to '0' to prevent the data from being overwritten by source data in the future. See Figure 7-5.
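The bitmap mechanics described above (a write to the source first copies the t0 track to the target, a read from the target is redirected to the source while the bitmap entry is '1', and a direct write to the target clears the bitmap entry) can be summarized in a small model. This is an illustrative sketch of the logic only, not the DS8000 implementation:

```python
class FlashCopyRelation:
    """Toy track-level model of the FlashCopy bitmap logic.
    bitmap[i] == 1: t0 data for track i is not yet copied to the target.
    bitmap[i] == 0: the target already holds the data for track i."""

    def __init__(self, source_tracks):
        self.source = list(source_tracks)
        self.target = [None] * len(self.source)
        self.bitmap = [1] * len(self.source)      # nothing copied at t0

    def write_source(self, i, data):
        if self.bitmap[i] == 1:                   # preserve t0 data first
            self.target[i] = self.source[i]
            self.bitmap[i] = 0
        self.source[i] = data                     # then apply the update

    def read_target(self, i):
        if self.bitmap[i] == 1:                   # t0 data still on the source
            return self.source[i]
        return self.target[i]

    def write_target(self, i, data):
        self.bitmap[i] = 0                        # t0 data must never overwrite this
        self.target[i] = data

rel = FlashCopyRelation(["a0", "b0", "c0"])
rel.write_source(0, "a1")           # copies t0 track "a0" to the target first
print(rel.read_target(0))           # -> a0 (t0 image preserved)
print(rel.read_target(1))           # -> b0 (read redirected to the source)
rel.write_target(2, "z")
print(rel.read_target(2))           # -> z  (direct target update kept)
```

For FlashCopy SE, the same logic applies, except that the target tracks live in a shared repository instead of a fully provisioned volume.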
Terminating the FlashCopy relationship The FlashCopy relationship is automatically removed when all tracks are copied from the source volume to the target volume. The relationship can also be explicitly withdrawn by running the relevant commands. If the -persistent option is specified, then the FlashCopy relationship continues until it is explicitly withdrawn. An IBM FlashCopy SE relationship ends when it is withdrawn. When the relationship is withdrawn, there is an option to release the allocated space of the Space-Efficient volume.
Background copy
Figure 7-6 Target volume after the full volume FlashCopy relationship is finished
If the relationship is not explicitly defined as persistent, it ends as soon as all data is copied. Only classical FlashCopy offers a full background copy; IBM FlashCopy SE has no such function. Remember that both features can coexist. If there are writes to the target, the result looks similar to the chart in Figure 7-7.
Figure 7-7 Target volume after the FlashCopy relationship is finished, with writes to the target (the background copy copies all time-zero data from the source volume to the target volume)
Figure 7-8 FlashCopy and Metro Mirror
A FlashCopy source volume can become a Metro Mirror primary volume and vice versa; the order in which the relationships are created does not matter. A FlashCopy target volume can also become a Metro Mirror primary volume and vice versa. If you want to use a FlashCopy target volume as a Metro Mirror primary, be aware of the following considerations:
The issue with this approach is that when the user initiates a FlashCopy onto a Metro Mirror or Global Copy primary volume that is in FULL DUPLEX mode, the pair switches to the COPY PENDING state during the FlashCopy background copy operation. After the FlashCopy is finished, the primary volume returns to FULL DUPLEX. While the configuration is in the COPY PENDING state, however, the system is vulnerable to disaster: there is no disaster recovery protection until the FlashCopy finishes and the pair returns to the FULL DUPLEX state. This issue is resolved with Remote Pair FlashCopy (see 7.4.2, Remote Pair FlashCopy on page 55). Figure 7-9 shows how the system is vulnerable during the COPY PENDING state.
Figure 7-9 FlashCopy onto a Metro Mirror primary: the Local B / Remote B pair drops from FULL DUPLEX to COPY PENDING while the FlashCopy data is replicated to the remote storage system
At the secondary site of the Metro Mirror, a FlashCopy source volume can be the Metro Mirror secondary volume and vice versa. There are no restrictions on which relationship should be defined first.
The function preserves the existing Metro Mirror status of FULL DUPLEX during the copy operation. Figure 7-10 shows this approach, which ensures that there is no loss of disaster recovery capability:
1. A FlashCopy command is issued by an application or by the user to Local A, with the Local B volume as the FlashCopy target. The DS8000 firmware propagates the FlashCopy command through the Metro Mirror links from the local storage system to the remote storage system. This inband propagation of a Copy Services command is only possible for FlashCopy commands.
2. Independently of each other, the local storage system and the remote storage system then run the FlashCopy operation. The local storage system coordinates the activities at its end and acts if the FlashCopies do not succeed at both storage systems.
Remote Pair FlashCopy supports both data set (z/OS environments only) and full volume FlashCopy. The key point is that disaster recovery protection is never absent and FlashCopy operations can be freely taken within the disk storage configuration.
Figure 7-10 Remote Pair FlashCopy preserves the Metro Mirror FULL DUPLEX state
The following conditions are required to establish Remote Pair FlashCopy:
- Both the Local A / Remote A and the Local B / Remote B Metro Mirror pairs are in the FULL DUPLEX state. LMC 6.6.20.nnn and 7.6.20.nnn or later also allow Remote Pair FlashCopy to Metro Mirror pairs that are SUSPENDED or COPY PENDING.
- The Remote A and Remote B volumes are in the same DS8000 Storage Facility Image (SFI).
- The required microcode level is installed on both the local and the remote storage system.
Remote Pair FlashCopy can be initiated by using the DS CLI, the DS GUI, and Tivoli Storage Productivity Center for Replication on fixed block (FB) volumes for Open Systems.
To establish Remote Pair FlashCopy by using the DS CLI, run mkflash with the parameter -pmir (for preserve mirror). The -tgtpprc parameter must also be included to indicate that the FlashCopy target is a Metro Mirror primary volume. For example, the command mkflash -tgtpprc -pmir required 0001:0101 initiates a FlashCopy with the requirement that the Remote Pair FlashCopy function is used. For more information about Remote Pair FlashCopy, see Chapter 11, Remote Pair FlashCopy on page 117.
On the Global Mirror secondary site, the Global Mirror secondary volume cannot be used as a FlashCopy source or FlashCopy target unless the Global Mirror pair is first suspended.
Chapter 8.
FlashCopy options
This chapter describes the options that are available for FlashCopy when you work with the IBM System Storage DS8000 series in an Open Systems environment. This chapter covers the following options:
- Multiple Relationship FlashCopy
- Consistency Group FlashCopy
- FlashCopy on an existing Metro Mirror or Global Copy source
- Remote Pair FlashCopy (Preserve Mirror)
- Incremental FlashCopy
- Remote FlashCopy
- Persistent FlashCopy
- Reverse Restore and Fast Reverse Restore
- IBM FlashCopy SE (Space-Efficient)
- IBM FlashCopy with thin provisioned Extent-Space-Efficient (ESE) volumes
Most of the considerations in the following sections apply to standard FlashCopy, FlashCopy SE (Track-Space-Efficient), and FlashCopy with thin provisioned (ESE) volumes.
Source or target?: At any point in time, a volume or LUN can be only a source or a target.
Figure 8-2 illustrates this capability. In this figure, the FlashCopy target and the Metro Mirror (or Global Copy) primary are the same volume. They are displayed as two separate volumes for ease of understanding.
Figure 8-2 FlashCopy target is Metro Mirror (or Global Copy) primary
A Metro Mirror or Global Copy primary volume can be the target of a FlashCopy relationship if the correct parameter is specified (for example, -tgtpprc). If the target is a Metro Mirror primary, the FlashCopy causes the Metro Mirror pair to drop out of the DUPLEX state into COPY PENDING. The Metro Mirror primary returns to DUPLEX after all of the data from the FlashCopy is transferred to the secondary. To preserve the DUPLEX state of a Metro Mirror primary, use the Remote Pair FlashCopy function, which is covered in Chapter 11, Remote Pair FlashCopy on page 117. If the target of a FlashCopy is a Global Copy primary, the primary volume is already in the COPY PENDING state.

The combination of FlashCopy and Global Copy can be used to create consistent data at a remote site for disaster recovery. Use FlashCopy from a production volume to a Global Copy primary that is dedicated to this procedure, either after quiescing I/O to the primaries or by using Consistency Group FlashCopy. The data is flashed to the Global Copy primary and then transmitted to the remote site. This procedure can be repeated regularly, which can satisfy the user's Recovery Point Objective (RPO).

If you are bringing a new volume online for the first time, you can create the Metro Mirror (or Global Copy) relationship first with the nocopy option, to avoid sending unnecessary initial data to the Metro Mirror (or Global Copy) secondary, and then do a full copy FlashCopy.
If the relationship is established with the copy option, a bitmap is created for both the source and target volumes. The background copy proceeds as a full volume copy, but after all of the tracks are copied, the relationship is kept (change recording forces Persistent) and changes to both the source and the target are recorded in their bitmaps. An Incremental FlashCopy can be performed at any time after the initial Incremental FlashCopy if the direction of the flash is the same. When a new increment is created, only the tracks that changed since the last increment are copied to the target (instead of all of the volume's tracks), and the tracks on the target that changed are overlaid by tracks from the source to put the two volumes back in sync. You must specify incremental (change recording) on each flash to maintain the incremental relationship. There is also the option to inhibit target writes when you use Incremental FlashCopy.

To reverse the FlashCopy, you must wait for the background copy to complete, and then the relationship can be reversed. Once again, the bitmaps are checked to see which tracks changed on what was the target (now the source after the reversal) and must be copied, and which tracks changed on what was the source (now the target after the reversal) and must be overlaid by the new source's tracks to put the two volumes back into sync.

If the nocopy option is chosen with change recording, bitmaps are created for the source and target volumes/LUNs. Because nocopy is specified, the only tracks copied to the target volume are those tracks that are updated for the first time on the source volume; in other words, copy-on-writes. Changes to the source and target volumes are recorded in their associated bitmaps. Like Incremental FlashCopy with the copy option, Incremental FlashCopy with the nocopy option can flash again in the same direction at any time. Unlike Incremental FlashCopy with the copy option, Incremental FlashCopy with the nocopy option can be reversed immediately.
After the relationship is reversed, the data is background copied from what was the target (now the source) to what was the source (now the target). After the background copy completes, the relationship is withdrawn. After the relationship is withdrawn, a new Incremental FlashCopy with nocopy can be established.

Important: A refresh of the target volume always overwrites any writes previously made to the target volume.

Considerations: If a FlashCopy source has multiple targets, an Incremental FlashCopy relationship can be established with only one of the targets.
To review, with Incremental FlashCopy, the initial FlashCopy (copy or nocopy) relationship between a source and a target volume is subject to the following considerations (see Figure 8-3):
- FlashCopy with the nocopy option: If the original FlashCopy is established with the nocopy option, the bitmap for the target volume is reset, and the updates on the target volume are overwritten.
- FlashCopy with the copy option: If the original FlashCopy is established with the copy option (full volume copy), the updates that took place on the source volume since the last FlashCopy are copied to the target volume. In addition, the updates that were made on the target volume are overwritten with the contents of the source volume.
- No updates in the source and no updates in the target: nothing to be done; the data was already copied from source to target and is identical on both sides.
- Updates took place in the source volume and no updates in the target volume: the current source data is copied to the target.
- Updates took place in the source volume and in the target volume: the current source data is copied to the target.
- No updates in the source and updates took place in the target volume: the current source data is copied to the target.
Figure 8-3 Updates to the target volume caused by refreshing a FlashCopy target volume
When you establish a FlashCopy with Start Change Recording activated, a second and a third bitmap are used to identify writes that are done to the source or the target volume (see Figure 8-4).
Figure 8-4 Change recording: writes to the source and target volumes are tracked in separate bitmaps; before a physical write to the source, the time-zero data is copied from the source to the target
All three bitmaps are necessary for Incremental FlashCopy:
- Target bitmap: Keeps track of the tracks that are not yet copied from source to target.
- Source Change Recording bitmap: Keeps track of changes to the source.
- Target Change Recording bitmap: Keeps track of changes to the target.
These bitmaps allow subsequent FlashCopies to transmit only those blocks of data for which updates occurred. Every write operation to the source or the target volume is recorded by marking the corresponding track as changed in its change recording bitmap.
When the refresh takes place, the bitmaps that are used for change recording are analyzed to determine which blocks must be copied from the source volume to the target volume (see Figure 8-5).
Figure 8-5 Refresh of an Incremental FlashCopy: a track is copied when an update was made to the source or when a write occurred on the target
After the refresh (which takes place only at the bitmap level), the new FlashCopy based on the new time-zero is active. The copy of the time-zero data to the target is done in the background. Tip: You can do the incremental copy at any time if the direction is the same as the previous increment. You do not have to wait for the previous background copy to complete.
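As an illustration of the refresh logic (a simplified model, not DS8000 firmware), the set of tracks that a refresh must copy is the union of the two change recording bitmaps: everything that changed on the source plus everything that was written on the target:

```python
# Illustrative sketch of an Incremental FlashCopy refresh, assuming the
# three-bitmap model described in this chapter. Not actual DS8000 code.

def tracks_to_copy(src_changed, tgt_changed):
    """A refresh must copy every track that changed on the source since the
    last flash, plus every track that was written on the target (those target
    writes are overwritten so that both volumes match the source again)."""
    return [i for i, (s, t) in enumerate(zip(src_changed, tgt_changed)) if s or t]

def refresh(source, target, src_changed, tgt_changed):
    copy = tracks_to_copy(src_changed, tgt_changed)
    for i in copy:
        target[i] = source[i]          # background copy of only those tracks
    # A new point in time is established: reset both change recording bitmaps.
    for cr in (src_changed, tgt_changed):
        for i in range(len(cr)):
            cr[i] = False
    return copy

source = ["tx", "t0", "tz", "t0"]
target = ["t0", "t0", "t0", "tb"]      # "tb" was written directly to the target
src_cr = [True, False, True, False]    # tracks 0 and 2 changed on the source
tgt_cr = [False, False, False, True]   # track 3 changed on the target
print(refresh(source, target, src_cr, tgt_cr))   # [0, 2, 3]
print(target)                                    # ['tx', 't0', 'tz', 't0']
```

Only three of the four tracks are transmitted; the unchanged track is skipped, which is the whole point of the incremental option.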
- No updates in the source and no updates in the target: nothing to be done; the data was already copied from source to target and is identical on both sides.
- Updates took place in the source volume and no updates in the target volume: the data of the previous target (now the source) is copied to the previous source (now the target).
- Updates took place in the source volume and in the target volume: the data of the previous target (now the source) is copied to the previous source (now the target).
- No updates in the source and updates took place in the target volume: the data of the previous target (now the source) is copied to the previous source (now the target).
The source and target bitmaps (illustrated in Figure 8-4 on page 65) are exchanged and then handled as described with the Incremental FlashCopy option.
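A reversal can be sketched with the same kind of simplified model (illustrative only, not DS8000 firmware): after the bitmaps are exchanged, every track that changed on either volume is overlaid with the data of the previous target, which is now the source:

```python
# Illustrative sketch of reversing an incremental relationship, assuming the
# bitmap exchange described in this chapter. Not actual DS8000 code.

def reverse_refresh(old_source, old_target, src_changed, tgt_changed):
    # Any track that changed on either side must be overlaid with the data
    # of the previous target (the new source of the background copy).
    for i, changed in enumerate(s or t for s, t in zip(src_changed, tgt_changed)):
        if changed:
            old_source[i] = old_target[i]
    # The roles are now exchanged: the old target is the source,
    # and the old source is the target.
    return old_source

vol_a = ["a1", "t0", "a2"]   # previous source, with updates a1 and a2
vol_b = ["t0", "t0", "t0"]   # previous target, still holding time-zero data
print(reverse_refresh(vol_a, vol_b, [True, False, True], [False, False, False]))
# ['t0', 't0', 't0']  -- vol_a is back at the time-zero contents of vol_b
```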
Figure 8-7 illustrates Remote FlashCopy. In this figure, the Metro Mirror (or Global Copy) secondary and the FlashCopy source are the same volume. They are displayed as two separate volumes for ease of understanding.
Figure 8-7 Remote FlashCopy
In Figure 8-8, when a FlashCopy command is received to perform a FlashCopy from the Local A volume to Local B volume, a similar command is generated and sent to perform a FlashCopy from the Remote A volume to Remote B volume. This approach ensures that there is no blackout of disaster recovery functionality for these volumes.
Figure 8-8 Remote Pair FlashCopy preserves the Metro Mirror FULL DUPLEX state
The following conditions are required to establish Remote Pair FlashCopy:
- Both the Local A / Remote A and the Local B / Remote B Metro Mirror pairs are in the FULL DUPLEX state. LMC 6.6.20.nnn or later for the DS8700 and 7.6.20.nnn or later for the DS8800 also allow Remote Pair FlashCopy to Metro Mirror pairs that are SUSPENDED or COPY PENDING.
- The Remote A and Remote B volumes are in the same Storage Facility Image (SFI).
- The required microcode level is installed on both the local and the remote storage system.
The FlashCopy establish command has three options (required, preferred, and no) that specify how the Remote Pair FlashCopy is performed. The corresponding keywords differ slightly on the various interfaces, but the functionality is the same.

Keywords for FlashCopy: The FlashCopy to a PPRC Primary OK keyword must be specified when the Remote Pair FlashCopy required or preferred option is used. The keyword differs depending on the interface that is used to establish the FlashCopy.

If Remote Pair FlashCopy is used in combination with Incremental FlashCopy, the usage of preferred or required when you issue a resync must be consistent with the existing relationship. If the existing incremental relationship was established without Remote Pair FlashCopy, a resync cannot be issued with Remote Pair FlashCopy required, because the remote relationship does not exist. For more information about Remote Pair FlashCopy, see Chapter 11, Remote Pair FlashCopy on page 117.
8.10 FlashCopy SE
FlashCopy SE is a FlashCopy relationship in which the target volume is a Space-Efficient volume. For more details about FlashCopy SE, see Chapter 10, IBM FlashCopy SE on page 95.
- Multiple Relationship FlashCopy
- Consistency Group FlashCopy
- Target on an existing Metro Mirror or Global Copy primary
- Incremental FlashCopy
- Remote FlashCopy
- Persistent FlashCopy
- Reverse Restore and Fast Reverse Restore
- Remote Pair FlashCopy
Chapter 9.
FlashCopy interfaces
The setup of FlashCopy in an Open Systems environment can be done by using different interfaces. This chapter explains these interfaces and gives you some examples of their usage for FlashCopy management on the IBM System Storage DS8000 in an Open Systems environment. This chapter describes standard FlashCopy. For information about IBM FlashCopy SE, see Chapter 10, IBM FlashCopy SE on page 95.
- Change the source-target relationship A → B to B → A, re-establishing the contents of target B on source A as they were during the last consistency formation.
- Reset a Consistency Group FlashCopy.
- Run a new background copy for a persistent FlashCopy.
- Terminate a FlashCopy and remove a local FlashCopy by using the rmflash command. A FlashCopy relationship is removed automatically as soon as all the data is copied if the pair was not established by using the -persist parameter.
Remote FlashCopy support: Remote FlashCopy is not supported by the DS GUI or DS Open API interfaces.
The DS CLI FlashCopy commands (mkflash, lsflash, revertflash, unfreezeflash, and rmflash) accept the following parameters; not every parameter applies to every command:
- Source: freeze
- Target: tgtpprc, tgtoffline, tgtinhibit, tgtonly, resettgtinhibit
- FlashCopy pair: dev, record, persist, nocp, seqnum, source:target, fast, cp, sourceLSS, l, s, activecp, revertible, state, retry
- Command: wait, quiet
When FlashCopy receives these parameters, the following actions result:
- freeze: Establishes a Consistency Group FlashCopy. With the DS CLI, it is possible to establish a consistency group by using the -freeze parameter and identifying all the FlashCopy pairs and target volumes that belong to the consistency group.
- tgtpprc: Establishes a target on an existing Metro Mirror or Global Copy primary. When this option is selected, the target volume can be or become a primary volume for a Metro Mirror or Global Copy relationship.
- tgtinhibit: Inhibits writes to the target volume. While the FlashCopy is active, writes to the target volume are not allowed (inhibited).
- record: Activates change recording. Activating the change recording option during the setup of a FlashCopy enables subsequent refreshes of the target volume. It records the tracks that change on both volumes within a FlashCopy pair. Select this parameter when you establish an initial FlashCopy volume pair that you want to use with the resyncflash command. The -persist parameter is automatically selected when this parameter is selected.
- persist: Makes the FlashCopy persistent. The FlashCopy relationship continues to exist until it is explicitly removed by an interface method. If this option is not selected, the FlashCopy relationship exists until all data is copied from the source volume to the target.
- nocp: Suppresses the full volume background copy. With the -nocp parameter, it is possible to indicate whether the data of the source volume is copied to the target volume in the background. If -nocp is not used, a copy of all data from source to target takes place in the background. With -nocp selected, only updates to the source volume cause writes to the target volume. This way, the time-zero data can be preserved.
- seqnum: Sequence number for FlashCopy pairs. A number that identifies the FlashCopy relationship. After it is used with the initial mkflash command, it can be used in subsequent commands to refer to multiple FlashCopy relationships.
- source:target: Identifies the source volume and the target volume.
- fast: Reverses the FlashCopy before the background copy finishes. You can run reverseflash before the background copy finishes by using this option.
- cp: Restricts a command to FlashCopy relationships with background copy.
- sourceLSS: Resets a consistency group for source logical subsystems.
- s: Displays only FlashCopy pair IDs when used with the lsflash command. The shortened output of the lsflash command is returned.
- l: Displays more FlashCopy information. The standard output of the lsflash command is enhanced with the values for the copy indicator, out-of-sync tracks, date created, and date synchronized.
- activecp: Selects FlashCopy pairs with an active background copy.
- revertible: Selects FlashCopy pairs with the revertible attribute.
- state: Displays the FlashCopy relationships that are in the specified state.
- retry: Specifies how the system handles a validation-required state.
Example 9-1 Listing of the properties of the FlashCopies

lsflash -dev IBM.2107-7506571 0001-0004
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
0001:0101 00     1           300     Disabled   Disabled  Disabled   Disabled   Enabled            Enabled            Enabled
0002:0102 00     2           300     Disabled   Enabled   Enabled    Disabled   Enabled            Enabled            Enabled
0003:0103 00     3           300     Disabled   Disabled  Enabled    Disabled   Enabled            Enabled            Enabled
0004:0104 00     4           300     Disabled   Disabled  Disabled   Disabled   Enabled            Enabled            Disabled
The following explanations apply to the cases presented in Example 9-1 on page 78:
- Example 1: 0001:0101. The FlashCopy between volume 0001 and volume 0101 is established by using the default parameters. By default, the following properties are enabled: SourceWriteEnabled, TargetWriteEnabled, and BackgroundCopy (BackgroundCopy is the default; it can be turned off by using the -nocp parameter). All other properties are disabled. The background copy takes place immediately, and after everything is copied, the FlashCopy relationship is automatically removed.
- Example 2: 0002:0102. The FlashCopy between volume 0002 and volume 0102 is established by enabling the following FlashCopy properties: Recording, Persistent, and BackgroundCopy. Persistence: The -persist parameter is automatically added whenever -record is used. The background copy takes place immediately and the relationship remains persistent. Using other DS CLI commands, it could be reversed and resynchronized.
- Example 3: 0003:0103. The FlashCopy between volume 0003 and volume 0103 is established by enabling the following FlashCopy properties: Persistent and BackgroundCopy. The background copy takes place immediately. When the background copy finishes, the FlashCopy relationship remains because of the persistent flag.
- Example 4: 0004:0104. The FlashCopy between volume 0004 and volume 0104 is established with the -nocp parameter, which means that no full background copy is done. Only the data that is changed on the source is copied to the target before it is changed. Over time, this situation could result in all the data being copied to the target, at which point the FlashCopy relationship ends. The relationship would also end after a background copy is initiated by using the DS GUI. In this way, the relationship is temporarily persistent, even though the Persistent property is not activated.
0005:0105 00     5           300     Disabled   Disabled  Disabled   Disabled   Disabled           Disabled           Disabled
Example 9-2 lsflash command examples

#--- Example 3
lsflash -dev IBM.2107-7506571 -l 0001-0005
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy OutOfSyncTracks DateCreated DateSynced
=======================================================================================================================
0003:0103 00 3 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled 0 Mon Jul 11 19:30:06 CEST 2005 Mon Jul 11 19:30:06 CEST 2005
0004:0104 00 4 300 Disabled Disabled Enabled Disabled Disabled Disabled Enabled 0 Mon Jul 11 19:30:10 CEST 2005 Mon Jul 11 19:30:10 CEST 2005
0005:0105 00 5 300 Disabled Disabled Disabled Disabled Disabled Disabled Disabled 50085 Mon Jul 11 19:30:13 CEST 2005 Mon Jul 11 19:30:13 CEST 2005
#--- Example 4
lsflash -dev IBM.2107-7506571 -s 0001-0005
ID
=========
0003:0103
0004:0104
0005:0105
#--- Example 5
lsflash -dev IBM.2107-7506571 -activecp 0001-0004
#--- Example 6
lsflash -dev IBM.2107-7506571 -record 0001-0004
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
0003:0103 00 3 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled
#--- Example 7
lsflash -dev IBM.2107-7506571 -persist 0001-0004
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
0003:0103 00 3 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled
0004:0104 00 4 300 Disabled Disabled Enabled Disabled Disabled Disabled Enabled
#--- Example 8
lsflash -dev IBM.2107-7506571 -revertible 0001-0004
#--- Example 9
lsflash -dev IBM.2107-7506571 -cp 0001-0004
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
0003:0103 00 3 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled
0004:0104 00 4 300 Disabled Disabled Enabled Disabled Disabled Disabled Enabled
The following explanations apply to the cases presented in Example 9-2 on page 79:
- Example 1: Lists FlashCopy information for a specific volume. In this example, the lsflash command shows the FlashCopy relationship information for volume 0004, showing the status (enabled/disabled) of the FlashCopy properties.
- Example 2: Lists information about the existing FlashCopy relationships within a range of volumes. In this example, the lsflash command shows the FlashCopy relationship information for volumes 0001 - 0005, showing the properties status (enabled/disabled).
- Example 3: Lists existing FlashCopy relationships with full information. Using the -l parameter with the lsflash command displays the default output plus information about the following properties: OutOfSyncTracks, DateCreated, and DateSynced.
- Example 4: Lists volume numbers of existing FlashCopy pairs within a volume range. Using the -s parameter displays only the FlashCopy source and target volume IDs for the specified range of volumes.
- Example 5: Lists FlashCopy relationships with an active background copy running. Using the -activecp parameter displays only those FlashCopy relationships within the selected range of volumes for which a background copy is actively running. The output format is the default output. In our example, there were no active background copies.
- Example 6: Lists existing FlashCopy relationships with Recording enabled. Using the -record parameter displays only those FlashCopy relationships within the selected range of volumes that are established with the -record parameter.
- Example 7: Lists existing FlashCopy relationships with the Persistent attribute enabled. When you use the -persist parameter, only those FlashCopy relationships within the range of selected volumes for which the Persistent option is enabled are displayed.
- Example 8: Lists existing FlashCopy relationships that are revertible. When you use the -revertible parameter, only those FlashCopy relationships within the range of selected volumes for which the Revertible option is enabled are displayed. There are no revertible relationships in our example.
- Example 9: Lists existing FlashCopy relationships for which BackgroundCopy is enabled. When you use the -cp parameter, only those FlashCopy relationships within the range of selected volumes for which the BackgroundCopy option is enabled are displayed.
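Because the lsflash default output uses fixed, space-separated columns, listings such as the ones in these examples are easy to post-process in scripts. The following hypothetical Python helper (not part of the DS CLI) parses a default lsflash listing into dictionaries keyed by the header column names:

```python
# Hypothetical helper that parses a default lsflash listing, such as the ones
# shown in the examples above, into dictionaries. Column names are taken from
# the lsflash header line.

COLUMNS = ["ID", "SrcLSS", "SequenceNum", "Timeout", "ActiveCopy", "Recording",
           "Persistent", "Revertible", "SourceWriteEnabled",
           "TargetWriteEnabled", "BackgroundCopy"]

def parse_lsflash(output):
    pairs = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows start with a source:target volume ID such as 0001:0101;
        # the header and the ==== separator line are skipped.
        if fields and ":" in fields[0]:
            pairs.append(dict(zip(COLUMNS, fields)))
    return pairs

sample = """\
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
0001:0101 00 1 300 Disabled Disabled Disabled Disabled Enabled Enabled Enabled
0002:0102 00 2 300 Disabled Enabled Enabled Disabled Enabled Enabled Enabled
"""

for pair in parse_lsflash(sample):
    print(pair["ID"], pair["Recording"], pair["BackgroundCopy"])
# 0001:0101 Disabled Enabled
# 0002:0102 Enabled Enabled
```

A script like this could, for example, select only the pairs with Recording enabled before issuing a resyncflash against them.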
The following explanations apply to the cases presented in Example 9-3:
- Example 1: Sets the FlashCopy relationship to revertible. This command sets the existing FlashCopy for source volume 0002 and target volume 0102 to revertible. After the Revertible property is enabled, any subsequent commands result in an error message similar to the one displayed in Example 9-3.
- Example 2: Error occurs when you try to set a FlashCopy relationship to revertible. When you try to set a FlashCopy relationship to revertible for which the Recording property is disabled, an error results. The script ends after this command with return code 2, and any other commands that follow the one that causes the error are not run.
The following explanations apply to the cases presented in Example 9-4:
- Example 1: Commits the FlashCopy relationship. This example shows, by running lsflash, the properties of the two FlashCopy relationships 0001:0101 and 0005:0105 before and after you run setflashrevertible. After the commitflash command is run, the properties of the two FlashCopy relationships are listed again.
- Example 2: Error occurs when you try to commit a FlashCopy relationship. When you try to commit a FlashCopy relationship that is not revertible (the Revertible property is disabled), an error results. The script ends after this command with return code 2. Any other commands that follow the one that causes the error are not run.
To ensure that an existing FlashCopy relationship can be incremented multiple times, you must repeat the resyncflash command with the -record and -persist parameters. Example 9-5 shows examples where the resyncflash command is used.
Example 9-5 resyncflash command examples
#--- Example 1 mkflash -dev IBM.2107-7506571 -record -persist -seqnum 01 0001:0101 0005:0105 CMUC00137I mkflash: FlashCopy pair 0001:0101 successfully created. CMUC00137I mkflash: FlashCopy pair 0005:0105 successfully created. mkflash -dev IBM.2107-7506571 -record -seqnum 03 0003:0103 CMUC00137I mkflash: FlashCopy pair 0003:0103 successfully created. lsflash -dev IBM.2107-7506571 0000-0005 ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy ==================================================================================================================================== 0001:0101 00 1 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled 0003:0103 00 3 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled 0005:0105 00 1 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled resyncflash -dev IBM.2107-7506571 -record -persist -seqnum 11 0001:0101 0005:0105 CMUC00168I resyncflash: FlashCopy volume pair 0001:0101 successfully resynchronized. CMUC00168I resyncflash: FlashCopy volume pair 0005:0105 successfully resynchronized. resyncflash -dev IBM.2107-7506571 seqnum 13 0003:0103 CMUC00168I resyncflash: FlashCopy volume pair 0003:0103 successfully resynchronized. lsflash -dev IBM.2107-7506571 0000-0005 ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy ==================================================================================================================================== 0001:0101 00 11 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled 0003:0103 00 13 300 Disabled Disabled Disabled Disabled Disabled Disabled Enabled 0005:0105 00 11 300 Disabled Enabled Enabled Disabled Disabled Disabled Enabled #--- Example 2 mkflash -dev IBM.2107-7506571 -nocp -seqnum 03 0004:0104 CMUC00137I mkflash: FlashCopy pair 0004:0104 successfully created. 
lsflash -dev IBM.2107-7506571 0004
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
0004:0104 00     3           300     Disabled   Disabled  Disabled   Disabled   Disabled           Disabled           Disabled
resyncflash -dev IBM.2107-7506571 -record -persist -seqnum 14 0004:0104
CMUN03027E resyncflash: 0004:0104: FlashCopy operation failure: action prohibited by current FlashCopy state
The following explanations apply to the examples shown in Example 9-5:
Example 1: Increments a FlashCopy relationship. In this example, three FlashCopy relationships are created with the -record and -persist parameters. The resyncflash commands are run with a different sequence number, which replaces the sequence number of the current FlashCopy relationship. The sequence number changes only if resyncflash finishes successfully. The resyncflash commands for the 0001:0101 and 0005:0105 relationships run with the -record and -persist parameters. Because these two parameters are omitted for the 0003:0103 FlashCopy relationship, its Recording and Persistent properties change to Disabled. When the background copy for the 0003:0103 FlashCopy relationship finishes, the FlashCopy relationship terminates.
Example 2: An error occurs when you try to increment a FlashCopy relationship whose Recording and Persistent properties are disabled. The script ends after this command with return code 2. Any other commands that follow the one that caused the error are not run.
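The incremental FlashCopy semantics described above can be modeled in a few lines. The sketch below is a conceptual illustration, not DS8000 internals: the class and method names are invented for this example. It shows the three rules from Example 9-5: with change recording enabled, a resynchronization recopies only the tracks written since the last establish or resync; the sequence number is replaced only on success; and omitting the record/persist properties on a resync disables them, after which a further resync is rejected, as in Example 2.

```python
# Conceptual model of incremental FlashCopy (illustration only, not DS8000 code).

class IncrementalFlashCopy:
    def __init__(self, source, target, seqnum=0):
        self.source, self.target = source, target
        self.record = True            # established with -record
        self.persist = True           # established with -persist
        self.seqnum = seqnum
        self.changed = set()          # change recording bitmap: tracks written since last sync

    def host_write(self, track):
        if self.record:
            self.changed.add(track)

    def resync(self, seqnum, record=True, persist=True):
        if not (self.record and self.persist):
            # corresponds to CMUN03027E: action prohibited by current FlashCopy state
            raise RuntimeError("FlashCopy operation failure: action prohibited")
        copied = len(self.changed)    # only out-of-sync tracks are recopied
        self.changed.clear()
        self.seqnum = seqnum          # sequence number is replaced on success
        self.record, self.persist = record, persist
        return copied

fc = IncrementalFlashCopy("0001", "0101", seqnum=1)
for t in (10, 11, 12):
    fc.host_write(t)
print(fc.resync(seqnum=11))          # 3: only the three changed tracks are recopied
fc.resync(seqnum=13, record=False, persist=False)
try:
    fc.resync(seqnum=14)             # fails, as in Example 2
except RuntimeError as e:
    print(e)
```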
The following explanations apply to the examples shown in Example 9-6:
Example 1: Reverses a FlashCopy relationship. In this example, three FlashCopy relationships are created with the -record and -persist parameters. The reverseflash commands are run with a different sequence number, which replaces the sequence number of the current FlashCopy relationships. The reverseflash commands for the 0001:0101 and 0005:0105 relationships run with the -record and -persist parameters. Because these two parameters are omitted for the 0003:0103 FlashCopy relationship, its Recording and Persistent properties change to Disabled. This action terminates the 0003:0103 FlashCopy relationship as soon as it is successfully reversed.
Example 2: Reverses a FlashCopy relationship multiple times. It is possible to reverse a FlashCopy relationship multiple times, thus recopying the contents of the original FlashCopy target volume multiple times back to the original source volume. In this example, the 0002:0102 relationship is reversed once as part of example 1. Then, changes are made to data on volume 0002. A subsequent reverseflash command for 0002:0102 eliminates the changes that were made to 0002 and brings the data from volume 0102 back to volume 0002, as it was at the time of the initial FlashCopy.
Example 3: Reestablishes the original FlashCopy direction by reversing again. It is possible to reverse a FlashCopy relationship back again. In example 3, this action is shown for the reversed FlashCopy relationship 0102:0002. Reversing it a second time, now referring to it as FlashCopy pair 0102:0002, is similar to establishing a new FlashCopy for the volume pair 0002:0102. In this case, a sequence number that is provided with the reverseflash command is used to identify the new FlashCopy relationship.
Resetting a target to the contents of the last consistency point using revertflash
The revertflash command can be used to reset the target volume to the contents of the last consistency point. Like the commitflash command, this command is intended to be used in asynchronous environments, such as Global Mirror. Before this command can be run, the relationship must be made revertible, either automatically by Global Mirror or manually by running setflashrevertible. See Example 9-7.
Example 9-7 revertflash command example
#--- Example 1
mkflash -dev IBM.2107-7506571 -record -persist -seqnum 01 0001:0101
CMUC00137I mkflash: FlashCopy pair 0001:0101 successfully created.
mkflash -dev IBM.2107-7506571 -nocp -seqnum 04 0001:0104 0001:0105
CMUC00137I mkflash: FlashCopy pair 0001:0104 successfully created.
CMUC00137I mkflash: FlashCopy pair 0001:0105 successfully created.
lsflash -dev IBM.2107-7506571 0000-0005
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
0001:0101 00     1           300     Disabled   Enabled   Enabled    Disabled   Disabled           Disabled           Enabled
0001:0104 00     4           300     Disabled   Disabled  Disabled   Disabled   Disabled           Disabled           Disabled
0001:0105 00     4           300     Disabled   Disabled  Disabled   Disabled   Disabled           Disabled           Disabled
setflashrevertible -dev IBM.2107-7506571 0001:0101
CMUC00167I setflashrevertible: FlashCopy volume pair 0001:0101 successfully made revertible.
revertflash -dev IBM.2107-7506571 0001
CMUC00171I revertflash: FlashCopy volume pair 0001:0001 successfully reverted.
In Example 9-7, three FlashCopy relationships are created for one source volume: 0001:0101, 0001:0104, and 0001:0105. The revertflash command is run for source 0001, and because the FlashCopy relationship 0001:0101 has the Recording and Persistent properties enabled, this command refers to the FlashCopy relationship 0001:0101. Any updates that were made to volume 0101 are overwritten.
#--- Example 1
rmflash -dev IBM.2107-7506571 -quiet -cp 0001:0101
CMUC00143I rmflash: Background copy process for FlashCopy pair 0001:0101 successfully started. The relationship will be removed when the copy ends.
In Example 9-8, a background copy is started for the existing FlashCopy relationship 0001:0101 by running rmflash with the -cp parameter.
In scripts, the rmflash command should always be used with the -quiet parameter to avoid the confirmation prompt. See Example 9-9.
Example 9-9 rmflash command example
#--- Example 1
rmflash -dev IBM.2107-7506571 -quiet 0001:0101
CMUC00140I rmflash: FlashCopy pair 0001:0101 successfully removed.
dscli> mkflash -dev IBM.2107-7506571 -freeze 1500-1501:1502-1503
CMUC00137I mkflash: FlashCopy pair 1500:1502 successfully created.
CMUC00137I mkflash: FlashCopy pair 1501:1503 successfully created.
#--- Example 1
unfreezeflash -dev IBM.2107-7506571/00
CMUC00172I unfreezeflash: FlashCopy consistency group for logical subsystem 00: successfully reset.
Figure 9-2 summarizes the parameters and the corresponding DS CLI commands that can be used when performing Remote FlashCopy.
Figure 9-2 Parameters and corresponding DS CLI commands for Remote FlashCopy
The description of the parameters is similar to the description presented in 9.3.1 Parameters that are used with local FlashCopy commands on page 77. In regard to the -conduit parameter, which applies only to remote FlashCopy, the following explanation applies: Within a Remote Mirror environment, the FlashCopy commands are sent across the mirror paths, thus avoiding the necessity of having separate network connections to the remote site solely for the management of the remote FlashCopy. The -conduit parameter identifies the path to be used for transmitting the commands to the remote site.
5. Select the Filter by option (the available filter options are All Volumes, Host, LSS, Storage Allocation Method, and Volume Group) to display the volumes for which you want to create a FlashCopy relationship, and then select a Source Volume and a Target Volume.
6. Select the necessary options for your FlashCopy relationship and then click Add.
Here is a description of the options you can select from the Create FlashCopy window:
Persistent: Retains the FlashCopy relationship after the background copy completes on the volume pair. When you select this option, the relationship between the source and the target volume remains until the FlashCopy relationship is deleted. This option is a prerequisite for using the Refresh target volume operation. If you do not select this option, an automatic withdrawal of the FlashCopy relationship occurs when the background copy completes. When you select the Change Recording option, the Persistent option is automatically selected and you cannot clear it until you clear the Change Recording option, because a relationship must be persistent to enable change recording.
Change Recording: Monitors writes and records changes on the volume pair that is participating in a FlashCopy relationship. Selecting this option automatically enables the Persistent option and makes disabling it unavailable because a relationship must be persistent to enable change recording. Both options are required to refresh a FlashCopy relationship.
Initiate Background Copy (default): Starts a physical copy of all tracks on the source volume to the target volume. After a FlashCopy pair is created, an automatic withdrawal of the FlashCopy relationship occurs when all tracks on the source volume are physically copied to the target volume, unless the Persistent option is selected.
Note: Some options are available only if you click Advanced (Figure 9-4).
Here is a description of all options you can select for FlashCopy from the Create FlashCopy Options window:
Permit FlashCopy if target is online for host access: Not valid for Open Systems environments.
Inhibit writes to target volume: If a FlashCopy relationship exists, this option prevents host system write operations to the target volume.
Establish target on existing Metro Mirror source: Creates a local point-in-time copy of a volume and uses the Metro Mirror feature to create a point-in-time copy at a remote site. If this option is not selected and the FlashCopy target volume is a Metro Mirror source volume, the create FlashCopy relationship task fails. This option defaults to not selected and displays in the Create FlashCopy Verification box on the Create FlashCopy page as disabled.
The following Preserve Mirror (used for Remote Pair FlashCopy) options are supported:
No: FlashCopy operations are not performed on the remote site. If the target volume is a Metro Mirror primary volume, the Remote Copy might temporarily change to the duplex pending state.
Preferred: Uses the Preserve Mirror function for FlashCopy operations when possible. The Preserve Mirror function cannot be used if the configuration is not correct or the state of the volume is not supported by this function.
Required: FlashCopy operations do not change the state of the Metro Mirror primary volume pair to duplex pending. Both the source Metro Mirror volume pair and the target Metro Mirror volume pair must be in the Full Duplex state.
Sequence Number: Displays the sequence number that is defined for the FlashCopy relationships. The sequence number is a maximum of eight hexadecimal digits in length. The FlashCopy sequence number corresponds to a particular relationship that is created. If the FlashCopy sequence number that is specified does not match the sequence number of a current relationship, or if a sequence number is not specified, the selected operation is performed. If the FlashCopy sequence number that is specified matches the sequence number of a current relationship, the operation is not performed. The default value is zero.
7. If you want to add multiple target volumes to one source volume, you must select the source volume again and then select another target volume. Do not forget to select the necessary options again. Then, if you click Create, all FlashCopy relationships that are shown under Create FlashCopy Verification (see Figure 9-3 on page 90) are established if no error message occurs.
From the Action drop-down menu, you can select the following actions:
Create: Creates a FlashCopy relationship.
Delete: Deletes an existing FlashCopy relationship.
Reset Target Write Inhibit: Allows writes to the FlashCopy target volume.
Initiate Background Copy: Starts a background copy to a target volume for a persistent FlashCopy relationship.
Resync Target (Resync FlashCopy): Does an incremental resynchronization of a target volume.
Reverse FlashCopy: Reverses the FlashCopy direction, where the source volume becomes a target volume and vice versa.
Properties: On the Overview tab, you find all the selected options and the out-of-sync tracks for a FlashCopy relationship (Figure 9-6). The Volumes tab shows all the related volumes for a FlashCopy relationship, including the allocation method: Extent-Space-Efficient (ESE), Track-Space-Efficient (TSE), or standard (Figure 9-7).
As with the Properties and Create actions, a new window opens for each action you select. Follow the instructions in the new window, depending on the action you take.
For more information about how to set up a FlashCopy SE, see Chapter 10, IBM FlashCopy SE on page 95. For more information about Remote Pair FlashCopy, see Chapter 11, Remote Pair FlashCopy on page 117.
Chapter 10. IBM FlashCopy SE
IBM FlashCopy Space-Efficient (SE) is functionally not very different from standard FlashCopy. The concept of Track-Space-Efficient (TSE) volumes with IBM FlashCopy SE relates to the attributes or properties of a DS8000 volume. FlashCopy SE can coexist with standard FlashCopy. This chapter describes the setup and usage of IBM FlashCopy SE.
This chapter covers the following topics:
- IBM FlashCopy SE overview
- Setting up Track-Space-Efficient volumes
- Doing FlashCopies onto Track-Space-Efficient volumes
When data is read from the target volume, it can be retrieved from the source if it is still there, just as it would be in standard FlashCopy. If the data is in the repository, the mapping structure is used to locate it.
FlashCopy SE is designed for temporary copies. Because the target storage capacity is smaller than the source, a background copy does not make much sense and is not permitted with FlashCopy SE. The copy duration should generally not last longer than 24 hours unless the source data has little write activity. Durations for typical use cases are expected to generally be less than 8 hours.
FlashCopy SE is optimized for use cases where less than 5% of the source volume is updated during the lifetime of the relationship. If more than 20% of the source is expected to change, then standard FlashCopy is likely a better choice. Standard FlashCopy generally has superior performance to FlashCopy SE. If performance on the source or target volumes is important, use standard FlashCopy.
ESE volumes: Since Licensed Machine Code (LMC) 6.6.20.nnn for DS8700 and 7.6.20.nnn for DS8800, another Space-Efficient FlashCopy method is supported. Now you can use thin provisioned Extent-Space-Efficient (ESE) volumes together with FlashCopy. For more information about Extent-Space-Efficient volumes and FlashCopy, see 36.1, Thin provisioning and Copy Services considerations on page 552.
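The copy-on-write mechanism described above can be sketched as follows. This is a conceptual illustration only, with invented names, not the DS8000 implementation: a write to a not-yet-protected source track first preserves the old track image in the shared repository, and a target read consults the repository mapping before falling through to the source.

```python
# Conceptual sketch of the FlashCopy SE copy-on-write path (illustration only).

class SpaceEfficientFlashCopy:
    def __init__(self, source):
        self.source = source          # track number -> data on the source volume
        self.repository = {}          # sparse mapping: track -> point-in-time image

    def write_source(self, track, data):
        if track not in self.repository:
            # First update since the FlashCopy: preserve the old track image.
            self.repository[track] = self.source[track]
        self.source[track] = data     # the source write then proceeds

    def read_target(self, track):
        # The target always sees the point-in-time image.
        if track in self.repository:
            return self.repository[track]
        return self.source[track]     # unchanged track: read through to the source

src = {0: "a0", 1: "b0", 2: "c0"}
fc = SpaceEfficientFlashCopy(src)
fc.write_source(1, "b1")
fc.write_source(1, "b2")              # second update copies nothing more
print(fc.read_target(1))              # b0 (point-in-time image from the repository)
print(fc.read_target(2))              # c0 (still read from the source)
print(len(fc.repository))             # 1 track consumed in the repository
```

Note how repeated updates to the same source track consume only one track in the repository, which is why moderately skewed write workloads fit FlashCopy SE well.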
Here are some scenarios for the use of FlashCopy SE:
- Create a temporary copy with FlashCopy SE to dump it to tape.
- Create a temporary snapshot for application development or DR testing.
- Run online backups for different points in time, for example, to protect your data against virus infection.
- Create checkpoints (only if the source volumes undergo moderate updates).
- Create FlashCopy target volumes in a Global Mirror (GM) environment. However, if the Global Mirror session is suspended, the repository fills up and eventually becomes full.
From the write data rate (MBps), you can estimate the amount of changed data by multiplying this number by the planned lifetime of the FlashCopy SE relationship. Assume a set of volumes for a 1 TB database with an average of 3 MBps write activity. Within 10 hours (36,000 seconds), about 100 GB is updated, which is about 10% of the capacity. In many cases, the change rate is much lower. However, you cannot assume that this amount of changed data is identical to the capacity needed for the repository. Two factors are important here:
- The required capacity in the repository could be higher because a full track (64 KB) is always copied to the repository when there is any change to the source track, even if the change is only 4 KB, for example.
- The required capacity in the repository could be lower because several changes to the same source data track do not add anything to the repository.
Important: If your source volume has several FlashCopy SE relationships and a source volume track that has not been updated since the last FlashCopy copies were taken is updated, this track is copied once for each FlashCopy SE relationship of the source volume.
Because in most cases you do not know the workload in detail, assume that both effects even out.
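The arithmetic in the paragraph above can be captured in a small helper. The function name is illustrative; it simply multiplies the write rate by the relationship lifetime, reproducing the 1 TB database example (3 MBps for 10 hours gives roughly 100 GB, about 10% of the capacity).

```python
# Rough estimate of changed data for a FlashCopy SE relationship:
# changed data ~ write rate x relationship lifetime (decimal GB).

def estimated_changed_gb(write_mbps, lifetime_hours):
    return write_mbps * lifetime_hours * 3600 / 1000  # MB -> GB

changed = estimated_changed_gb(write_mbps=3, lifetime_hours=10)
print(round(changed))                    # 108 GB, in line with the "about 100 GB" estimate
print(round(100 * changed / 1000, 1))    # percentage of a 1 TB (1000 GB) database
```

Remember that this is only a starting point: the two factors listed above (full-track copies versus repeated updates to the same track) push the real repository consumption in opposite directions.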
Repository impact
Some capacity is needed in the repository for internal tables. The size depends on the physical and logical size of the repository. The space for these internal tables is allocated in addition to the specified repository size when the repository is created. Usually this additional storage is in the range of about 2% of the repository capacity. However, if you define your virtual capacity to be much larger than the physical capacity, the ratio is different.
An estimate of the additional capacity (repoverh) that is allocated when a repository with a certain repository capacity (repcap) and a certain virtual capacity (vircap) is created can be obtained from the following equation:
repoverh (GiB) = 0.01 * repcap (GiB) + 0.005 * vircap (GiB)
For example, if a repository of 5,000 GiB is created and a virtual capacity of 50,000 GiB is specified, about 300 GiB is allocated in addition to the specified 5,000 GiB.
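The overhead equation above translates directly into a one-line function (the function name is illustrative). The constants come straight from the equation: 1% of the repository capacity plus 0.5% of the virtual capacity.

```python
# Repository overhead estimate: repoverh = 0.01 * repcap + 0.005 * vircap (all GiB).

def repository_overhead_gib(repcap_gib, vircap_gib):
    return 0.01 * repcap_gib + 0.005 * vircap_gib

# The worked example: 5,000 GiB repository with 50,000 GiB virtual capacity.
print(repository_overhead_gib(5000, 50000))   # 300.0 GiB of additional allocation
```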
There can be one repository per extent pool. A repository has a physical capacity that is available for storage allocations by Track-Space-Efficient volumes and a virtual capacity that is the sum of all LUN/volume sizes of the Track-Space-Efficient volumes. The physical repository capacity is allocated when the repository is created.
dscli> mksestg -repcap 40 -vircap 200 -extpool p4
CMUC00342I mksestg: The Space-Efficient storage for the extent pool P4 has been created successfully.
dscli>
The parameters have the following meanings:
- -repcap or -reppercent specifies the repository size. The -repcap option specifies the actual repository size and the -reppercent option specifies a percentage of the virtual capacity. The minimum repository capacity is 16 GiB.
- -vircap specifies the virtual capacity. With Licensed Machine Code (LMC) Rel. 6.2 or later, this parameter is optional. The virtual capacity of a repository increases automatically if a new Track-Space-Efficient (TSE) volume is created in a repository.
- -captype optionally specifies the capacity unit type: gb (default), cyl, or blocks.
- -repcapthreshold optionally sets the user warning threshold. This threshold is the repository threshold, not the virtual threshold. It defaults to 0% available (100% used).
The chsestg command is used to change an SE repository, but currently it can change only the user warning threshold and the virtual capacity. It cannot change the repository capacity (physical capacity).
Repeat this step for all extent pools in which you want to define Space-Efficient storage.
You can get information about the repository by running showsestg. Example 10-2 shows the output of the showsestg command. You can determine how much capacity within the repository is used by checking the repcapalloc value.
Example 10-2 Getting information about a Track-Space-Efficient repository
dscli> showsestg p4
extpool               P4
stgtype               fb
datastate             Normal
configstate           Normal
repcapstatus          below
%repcapthreshold      0
repcap(GiB)           40.0
repcap(Mod1)          -
repcap(blocks)        83886080
repcap(cyl)           -
repcapalloc(GiB/Mod1) 0.0
%repcapalloc          0
vircap(GiB)           200.0
vircap(Mod1)          -
vircap(blocks)        419430400
vircap(cyl)           -
vircapalloc(GiB/Mod1) 180.0
%vircapalloc          90
overhead(GiB/Mod1)    2.0
reqrepcap(GiB/Mod1)   40.0
reqvircap(GiB/Mod1)   200.0
The lssestg command provides information about all repositories in the DS8000 (see Example 10-4 on page 105). You can delete a repository by running rmsestg if it is no longer needed.
DS GUI windows: All DS GUI screen captures in this chapter are done with LMC Rel. 6.3.
To create the repository, complete the following steps:
1. Click the Pool icon.
2. Click Internal Storage.
3. Select the check box next to the extent pool where you want to create the repository.
4. Click Add Space-Efficient Repository.
After you click Add Space-Efficient Repository, the window that is shown in Figure 10-4 opens.
In this window, you can specify the physical size of the repository that is allocated on the ranks within this extent pool. There are also options to set thresholds for warnings when the repository fills up. There are similar options for the DS CLI. For the complete syntax, see IBM System Storage DS8000: Command-Line Interface User's Guide, GC53-1127. When you create a repository with a certain repository capacity, the actual capacity that is allocated in the extent pool is larger than the specified capacity to hold some internal tables.
In Figure 10-3 on page 103 you see that the DS GUI has an action named Delete Space-Efficient Storage that you can use to delete a repository. Here, Figure 10-5 shows the DS GUI when you select the Properties action.
dscli> mkfbvol -extpool p4 -cap 20 -name ITSO_CS_FC_#h -sam tse 420D
CMUC00025I mkfbvol: FB volume 420D successfully created.
dscli>
When you list Space-Efficient repositories by running lssestg (see Example 10-4), you can see that in extent pool P4 you have a virtual allocation of 40 extents (GiB), but that the allocated (used) capacity repcapalloc is still zero.
Example 10-4 Getting information about Track-Space-Efficient repositories
dscli> lssestg -l
extentpoolID stgtype datastate configstate repcapstatus %repcapthreshold repcap (2^30B) vircap repcapalloc vircapalloc
======================================================================================================================
P2           ckd     Normal    Normal      below        0                64.0           1.0    0.0         0.0
P3           fb      Normal    Normal      below        0                70.0           282.0  0.0         264.0
P4           fb      Normal    Normal      below        0                40.0           200.0  0.0         40.0
This allocation comes from the volumes you created. To see the allocated space in the repository for just this volume, run showfbvol (see Example 10-5).
Example 10-5 Checking the repository usage for a volume
dscli> showfbvol 420D
Name            ITSO_CS_FC_420D
ID              420D
accstate        Online
datastate       Normal
configstate     Normal
deviceMTM       2107-900
datatype        FB 512
addrgrp         4
extpool         P4
exts            20
captype         DS
cap (2^30B)     20.0
cap (10^9B)     -
cap (blocks)    41943040
volgrp          -
ranks           0
dbexts          0
sam             TSE
repcapalloc     0.0
eam             -
reqcap (blocks) 41943040
realextents     0
virtualextents  20
migrating       0
perfgrp         PG0
migratingfrom   -
resgrp          RG0
Figure 10-6 Selecting an extent pool with a repository for Space-Efficient volumes
The next window is where you define a Track-Space-Efficient (TSE) volume (see Figure 10-7).
To create a Track-Space-Efficient volume, select the storage allocation method Track-Space-Efficient (TSE). The remaining steps are the same as for a standard volume.
When you want to establish a FlashCopy SE relationship between the two volumes, you can use any option that is available for standard FlashCopy, such as -record or -persist, but you must specify -tgtse, and you cannot specify -cp, which means you establish a nocopy relationship (see Example 10-7).
Example 10-7 Establishing a FlashCopy SE relationship
dscli> mkflash -tgtse -record -persist 1720:1740
CMUC00137I mkflash: FlashCopy pair 1720:1740 successfully created.
Example 10-8 shows the result of a lsflash -l command. You see that BackgroundCopy is disabled when you run a FlashCopy SE copy. The isTgtSE Enabled attribute indicates that it actually is a FlashCopy SE relationship.
Example 10-8 Listing a FlashCopy SE relationship dscli> lsflash -l 1720
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled ======================================================================================================================== 1720:1740 17 0 60 Disabled Enabled Enabled Disabled Enabled Enabled
BackgroundCopy OutOfSyncTracks DateCreated DateSynced State isTgtSE ========================================================================================================= Disabled 409600 Wed Oct 24 15:55:55 CEST 2007 Wed Oct 24 15:55:55 CEST 2007 Valid Enabled
The lsflash command has an option to show only FlashCopy SE relationships: -tgtse. When you use this command, you should specify a range of volume addresses where you want to look for FlashCopy SE relationships (see Example 10-9).
Example 10-9 Listing FlashCopy SE relationships dscli> lsflash -tgtse 1700-1750
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy ==================================================================================================================================== 1720:1740 17 0 60 Disabled Disabled Disabled Disabled Enabled Enabled Disabled
Example 10-10 shows the resyncflash command with some additional options to show that they can be used as in standard FlashCopy operations. Only the -tgtse parameter is important for FlashCopy SE.
Example 10-10 Resynchronizing a FlashCopy SE pair
dscli> resyncflash -record -persist -tgtpprc -tgtinhibit -tgtse 1720:1740
CMUC00168I resyncflash: FlashCopy volume pair 1720:1740 successfully resynchronized.
A normal reverseflash operation is not possible for a FlashCopy pair that is in a nocopy relationship (a FlashCopy SE relationship always is a nocopy relationship), but you can do a fast reverse restore operation by running reverseflash -fast.
If you want to check how much space is allocated for a Track-Space-Efficient volume, select a TSE volume from the Volumes window and select Properties. In Figure 10-8, you can see that the volume that you selected currently occupies 0.0 GiB of physical storage.
In a similar way, you can select an extent pool with a virtual capacity (which means that the extent pool has a repository) and click Properties for that extent pool. Figure 10-9 shows an example for an extent pool that has 100.0 GiB TSE Virtual Capacity allocated, but no Repository Capacity allocated. The overhead for the Space-Efficient volumes is 2.0 GiB.
Figure 10-9 Properties of an extent pool with a repository for Space-Efficient storage
In addition, you see that in this extent pool, 20 GiB Virtual Capacity is allocated for thin provisioned (ESE) volumes. For more information about FlashCopy with thin provisioned volumes, see Chapter 36, Thin provisioning and Copy Services considerations on page 551.
extentpoolID stgtype datastate configstate repcapstatus %repcapthreshold repcap (2^30B) vircap repcapalloc vircapalloc
======================================================================================================================
P3           fb      Normal    Normal      below        0                50.0           100.0  8.9         37.0
If you run a normal rmflash command to withdraw a FlashCopy relationship, the target volume is in an undefined state, as is always the case with nocopy relationships when they are withdrawn. The allocated space for that Space-Efficient volume is still allocated and uses up space in the repository. There are four ways to release that space:
- By specifying -tgtreleasespace with the rmflash command.
- By running initfbvol -action releasespace.
- Whenever you do a new FlashCopy SE onto the Track-Space-Efficient target volume.
- When you delete a Track-Space-Efficient volume with the rmfbvol command.
Similar functions are available when you use the DS GUI.
However, the Space-Efficient volume still exists with the same virtual size and it can be reused for another FlashCopy SE relationship. Incidentally, if the virtual size of the Space-Efficient volume does not match the size of your new source volume, you can dynamically expand the virtual size by running chfbvol -cap newsize volume. However, you cannot make the volume smaller. If you want to make it smaller, you must delete it and re-create it with a different size.
If you did not specify the -tgtreleasespace parameter on the rmflash command, you can use the initfbvol -action releasespace command to release space for the specified volume (see Example 10-13).
Example 10-13 Releasing space with the initfbvol command
dscli> initfbvol -action releasespace 1740
CMUC00337W initfbvol: Are you sure that you want to submit the command releasespace for the FB volume 1740? [Y/N]: y
CMUC00340I initfbvol: 1740: The command releasespace has completed successfully.
Important: Your DS8000 user ID needs Administrator rights to run initfbvol.
After you run this command for a volume, the volume is empty; all space is released, but the virtual volume still exists.
When the repository fills up and the threshold is reached, a warning is sent out, depending on the options that are set in the DS8000. If SNMP notification is configured, you receive a trap, as shown in Example 10-15. You can also configure email notification.
Example 10-15 SNMP alert when the repository threshold is reached
2007/10/25 15:22:26 CEST
Space-Efficient Repository or Over-provisioned Volume has reached a warning watermark
UNIT: Mnf Type-Mod SerialNm
      IBM 2107-922 75-03461
Volume Type: 0
Reason code: 0
Extent Pool ID: 3
Percentage Full: 50
You can also specify a threshold for the repository when you use the DS GUI. You can specify it in the window that is shown in Figure 10-9 on page 111. You can also set a limit and a threshold for the virtual capacity in a repository by running mkextpool or chextpool. There are new options (-virextentlimit, -virlimit, and -virthreshold) to enable this limit and set a threshold.
When space is exhausted while a FlashCopy SE relationship exists, the relationship is placed in a failed state, which means that the target copy becomes invalid. Writes continue to the source volume. The relationship remains in the failed state: reads and writes to the source are allowed, but updated tracks are no longer copied to the target. All reads and writes to the target fail, causing any jobs that run against the target volume to fail rather than succeed without data integrity. To clear this condition for the target volume, withdraw the relationship and release the space on the Track-Space-Efficient volume. You can also establish a new FlashCopy SE relationship, which releases all space in the repository that is associated with the target volume. Another possibility is to run initfbvol to release the space.
As space approaches depletion on a repository, the control unit begins delaying writes to the Track-Space-Efficient volumes that are backed by that repository. This delay allows the data in cache to be destaged before the space is exhausted, which minimizes the amount of data that gets trapped in NVS when the space is exhausted. The delay is based on how many updates are occurring and how much space is left.
In a Global Mirror environment where the FlashCopy target volumes are Track-Space-Efficient volumes, the behavior of a FlashCopy SE relationship is different when the repository becomes full. In this case, you want to keep the FlashCopy relationship because it represents a consistent state of the Global Mirror target volumes. The FlashCopy source volumes are put in a write-source-inhibit state. Because these FlashCopy source volumes are target volumes of Global Mirror, the Global Mirror pairs are suspended at the next mirror write to the remote volume.
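The failed-state behavior described above can be modeled with a small sketch. This is an illustration of the documented behavior only, with invented names, not DS8000 code: when a copy-on-write finds no repository space left, the relationship fails, source I/O continues, and target reads are rejected instead of returning an inconsistent image.

```python
# Conceptual model of repository space exhaustion for a FlashCopy SE target
# (illustration only): the pair fails, the source stays usable, target I/O fails.

class SETarget:
    def __init__(self, repository_capacity):
        self.capacity = repository_capacity   # repository size in tracks
        self.repository = {}                  # track -> preserved point-in-time image
        self.failed = False

    def copy_on_write(self, track, old_data):
        if track in self.repository or self.failed:
            return                            # nothing more to preserve
        if len(self.repository) >= self.capacity:
            self.failed = True                # out of space: relationship fails,
            return                            # but the source write still succeeds
        self.repository[track] = old_data

    def read_target(self, track, source):
        if self.failed:
            # target copy is invalid: fail the I/O rather than return bad data
            raise IOError("target invalid: repository space exhausted")
        return self.repository.get(track, source[track])

src = {0: "a0", 1: "b0", 2: "c0"}
tgt = SETarget(repository_capacity=1)
tgt.copy_on_write(0, "a0"); src[0] = "a1"     # fits in the repository
tgt.copy_on_write(1, "b0"); src[1] = "b1"     # no space left: pair fails
print(tgt.failed)                             # True; source writes still completed
```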
Chapter 11. Remote Pair FlashCopy
Figure 11-1 Remote Pair FlashCopy preserves the Metro Mirror FULL DUPLEX state
The following conditions are required to establish Remote Pair FlashCopy:
1. Both the Local A / Remote A and the Local B / Remote B Metro Mirror pairs must be in the FULL DUPLEX state.
2. The Remote A and Remote B volumes must be in the same Storage Facility Image (SFI).

Remote Pair FlashCopy required code: Licensed Machine Code (LMC) 6.6.2.nnn and 7.6.2.nnn allows Remote Pair FlashCopy to Metro Mirror pairs that are SUSPENDED or COPY PENDING.
11.1.2 Considerations
Certain FlashCopy options are not supported by Remote Pair FlashCopy:
Commit is not supported.
Revert is not supported.
Fast Reverse Restore is not supported.
In addition, neither local nor remote targets can be Space-Efficient volumes.
Remote Pair FlashCopy onto a Global Mirror target is not supported. Remote Pair FlashCopy onto a Global Copy target is not supported.

Remote Pair FlashCopy with cascading configurations has the following limitations:

Metro/Global Copy: Remote Pair FlashCopy can be used by this configuration if the configuration requirements for the devices that are involved in the Metro Mirror relationships are met. The FlashCopy command is run from Local A to Local B, and an inband FlashCopy command is run to perform the FlashCopy from Remote A to Remote B. The tracks in the relationship are copied from Remote B to its corresponding PPRC secondary device through the Global Copy mechanism.
Metro/Global Mirror: Because a FlashCopy to a primary device that is participating in a Global Mirror session is not allowed, any attempt to perform a Remote Pair FlashCopy (Preserve Mirror) Required operation fails. This failure occurs because the inband FlashCopy operation attempts to perform a FlashCopy from Remote A to Remote B, and Remote B is a PPRC primary that is in a Global Mirror session. In this case, you must use Preserve Mirror with the Preferred or the No option. Existing Copy Services restrictions still apply (the remote target cannot be a source, and so on).
11.2.1 Terminology
The following terms describe the configuration details of Remote Pair FlashCopy:

Local A: The device at the local site that is the source of the FlashCopy relationship that is being requested.

Local B: The device at the local site that is the intended target of the FlashCopy relationship that is being requested. Local A and Local B can be the same device for a data set level operation.

Remote A: When Local A is a Metro Mirror primary device, Remote A is the Metro Mirror secondary that is associated with Local A.

Remote B: When Local B is a Metro Mirror primary device, Remote B is the Metro Mirror secondary that is associated with Local B. Remote A and Remote B can be the same device for a data set level operation.

Mirrored FlashCopy relationship (mirrored relationship): A FlashCopy relationship that is established as a Remote Pair FlashCopy operation.

Preserve Mirror: The software terminology that is used to describe the hardware Remote Pair FlashCopy function.

Local Relationship: The FlashCopy relationship that is established between Local A and Local B as part of a Remote Pair FlashCopy operation.

Remote Relationship: The FlashCopy relationship that is established between Remote A and Remote B as part of a Remote Pair FlashCopy operation.
Keyword: The FlashCopy to a Metro Mirror or Global Copy Primary OK (for example, -tgtpprc) keyword must be specified when the Remote Pair FlashCopy required or preferred options are used. The keyword differs depending on the interface that is used to establish FlashCopy. Resynchronizing: If Remote Pair FlashCopy is used in combination with Incremental FlashCopy, the usage of preferred or required when you issue a resync must be consistent with the existing relationship. If the existing incremental relationship is established without Remote Pair FlashCopy, a resync cannot be issued with Remote Pair FlashCopy Required because the remote relationship does not exist.
Table 11-1 summarizes the behavior of Remote Pair FlashCopy (either as Required or Preferred) for different combinations of the Remote Mirror (Local B / Remote B) pair status.

Table 11-1 Remote Pair FlashCopy behavior

Source    | Target              | Situation                                    | Required                                                                                                       | Preferred
Duplex    | Duplex              | Normal                                       | Remote Pair FlashCopy is performed.                                                                            | Remote Pair FlashCopy is performed.
Duplex    | Duplex              | Problem with remote FlashCopy detected early | FlashCopy failed.                                                                                              | Target device duplex pending.
Duplex    | Duplex              | Problem with remote FlashCopy detected late  | FlashCopy failed.                                                                                              | Target device duplex pending.
Suspended | Suspended           | Any                                          | With Rel. 6.2 or later, perform a local FlashCopy and set the Metro Mirror OOS bitmap, or the FlashCopy fails. | FlashCopy issued locally.
Pending   | Pending             | Any                                          | With Rel. 6.2 or later, perform a local FlashCopy and set the Metro Mirror OOS bitmap, or the FlashCopy fails. | FlashCopy issued locally.
Duplex    | Suspended / Pending | Any                                          | With Rel. 6.2 or later, perform a local FlashCopy and set the Metro Mirror OOS bitmap, or the FlashCopy fails. | Target device duplex pending. Set target device OOS bitmap.
11.3.1 DS CLI
DS CLI is updated to support the new Remote Pair FlashCopy function for full volume FlashCopy operations. These operations can be performed on both FB volumes for Open Systems and CKD volumes on z/OS systems.
The -pmir parameter has three options: [-pmir no|required|preferred]. Each option is described in detail in Remote Pair FlashCopy establish options on page 121.
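For example, establishing a full volume FlashCopy with the Remote Pair FlashCopy required option might look like the following sketch (the volume IDs are illustrative; the -tgtpprc keyword allows the target volume to be a Metro Mirror primary, as noted earlier):

#--- establish FlashCopy with Remote Pair FlashCopy required
mkflash -tgtpprc -pmir required 6100:6300

With -pmir preferred instead, the command does not fail if the remote relationship cannot be established; the FlashCopy is then performed locally, as summarized in Table 11-1.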
The lsremoteflash command shows the status of the remote pair. This output is also updated with the Pmir column (Example 11-3).
Example 11-3 DS CLI lsremoteflash showing Remote Pair FlashCopy status on a remote pair

dscli> lsremoteflash -fmt delim -l -conduit IBM.2107-1301411/c0 IBM.2107-75HT431/c000:IBM.2107-75HT431/c040
ID,SrcLSS,SequenceNum,ActiveCopy,Recording,Persistent,Revertible,SourceWriteEnabled,TargetWriteEnabled,BackgroundCopy,OutOfSyncTracks,State,isTgtSE,Pmir
===========================================================================================================
c000:c040,c0,0,Disabled,Disabled,Disabled,Disabled,Enabled,Enabled,Disabled,163840,Valid,No,Remote
Chapter 12. FlashCopy performance
This chapter describes the preferred practices when you configure FlashCopy for specific environments or scenarios. This chapter covers the following topics:
FlashCopy performance overview
FlashCopy establish performance
Background copy performance
FlashCopy impact to applications
FlashCopy options
FlashCopy scenarios
IBM FlashCopy Space-Efficient (SE) performance considerations
Terminology
Before you proceed with the description of FlashCopy preferred practices, review some of the basic terminology that we use in this chapter:

Server: The current DS8000 models have one pair of servers (server 0 and server 1, one on each processor complex), both integrated in a single Storage Facility Image (SFI). You can run lsserver to see the available servers.

Device adapter: A physical component of the DS8000 that provides communications between the servers and the storage devices. The lsda command lists the available device adapters.

Rank: An array site that is made into an array, which is then made into a rank. For the DS8000, a rank is a collection of eight disk drive modules (DDMs). The lsrank command displays detailed information about the ranks.
It is a preferred practice to locate the FlashCopy target volume on the same DS8000 server as the FlashCopy source volume. Before the advent of Storage Pool Striping, when the preferred practice for configuring extent pools was one rank per extent pool, it was a preferred practice to put the FlashCopy target volume on a different device adapter (DA) and rank than the source volume. It was also a preferred practice that the source and target devices be on separate ranks. With Storage Pool Striping (the default with Rel. 6.0) and Easy Tier, the preferred practice for logical configuration is to have as few extent pools as possible. Extent pools should have multiple ranks from different DAs and use Storage Pool Striping. This configuration spreads the FlashCopy activity across DAs and ranks. Therefore, the only consideration that is left to the user is to have the source and targets in the same server. See Table 12-1 for a summary of the volume placement considerations.
Table 12-1 FlashCopy source and target volume location

               | FlashCopy establish performance | Background copy performance | FlashCopy impact to applications
Server         | Same server                     | Same server                 | Same server
Device adapter | Unimportant                     | Different device adapter    | Unimportant
Rank           | Different ranks                 | Different ranks             | Different ranks
Tip: To find the relative location of your volumes, you can use the following procedure:
1. Run lsfbvol to learn which extent pool contains the relevant volumes.
2. Run the lsrank command to display both the device adapter and the rank for each extent pool.
3. To determine which server contains your volumes, look at the extent pool name. Even-numbered extent pools are always from server 0, while odd-numbered extent pools are always from server 1.
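As a sketch of this procedure (the volume ID is the one used in the examples of this book; the output is not shown, and the exact parameters can vary by DS CLI level):

dscli> lsfbvol 6100
dscli> lsrank -l

The extent pool ID that lsfbvol reports (for example, P3) identifies the server: P3 is odd-numbered, so the volume is on server 1. The lsrank output then shows the device adapter and rank behind that extent pool.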
Copy-on-write: The term copy-on-write describes a forced copy from the source to the target because a write to the source occurs. This situation occurs on the first write to a track only. Because the DS8000 writes to a non-volatile cache, there is typically no direct response time delay on host writes. The forced copy occurs only when the write is destaged onto disk.

If the copy option is used, then upon completion of the logical FlashCopy establish phase, the source is copied to the target in an expedient manner. If many volumes are established, then do not expect to see all pairs actively copying data when their logical FlashCopy relationship is complete. The DS8000 microcode has algorithms that limit the number of active pairs that copy data. This algorithm tries to balance active copy pairs across the DS8000 device adapter resources. Additionally, the algorithm limits the number of active pairs so that there is bandwidth for host or server I/Os.

Tip: The DS8000 gives higher priority to application performance than background copy performance. The DS8000 throttles the background copy if necessary so that applications are not unduly impacted.

The preferred placement of the FlashCopy source and target volumes with regard to the
Incremental FlashCopy has the least impact on applications. During normal operation, no copy-on-write is done (as in a nocopy relationship), and during a resync, the load on the back end is much lower compared to a full copy. There is only a small impact for the maintenance of out-of-sync bitmaps for the source and target volumes.

The resyncflash command: The Incremental FlashCopy resyncflash command does not have a nocopy option. Running resyncflash automatically uses the copy option, regardless of whether the original FlashCopy was copy or nocopy.
Figure: New destage process with IBM FlashCopy SE. During destaging, the DS8000 checks whether the track is in a FlashCopy relationship and whether it is a new update; the track table of the repository is used before the update is written and the data in NVS is released.
Because of space efficiency, data is not physically ordered in the same sequence on the repository disks as it is on the source. Processes that might access the source data in a sequential manner might not benefit from sequential processing when they access the target.
Another important consideration for FlashCopy SE is that it always has nocopy relationships. A full copy or incremental copy is not possible.

If there are many source volumes that have targets in the same extent pool, all updates to these source volumes cause write activity to this one extent pool's repository. We can consider a repository as something similar to a volume. So, we have writes to many source volumes that are copied to just one volume (the repository). There is less space in the repository than the total capacity (sum) of the source volumes, so you might be tempted to use fewer disk spindles (DDMs). By definition, fewer spindles mean less performance. You can see that careful planning is needed to achieve the required throughput and response times from the Space-Efficient volumes. A good strategy is to keep the number of spindles roughly equivalent, but use smaller/faster drives (but do not use Nearline drives). For example, if your source volumes are 300 GB 15K RPM disks, then using 73 GB 15K RPM disks on the repository can provide both space efficiency and excellent repository performance.

RAID 6: There is no advantage in using RAID 6 for the repository other than resilience. It should be considered only where RAID 6 is used as the standard throughout the DS8000.

Another possibility is to consider RAID 10 for the repository, although that configuration goes somewhat against space efficiency (you might be better off using standard FlashCopy with RAID 5 than SE with RAID 10). However, there might be cases where trading off some of the space efficiency gains for a performance boost justifies RAID 10. Certainly, if RAID 10 is used at the source, you should consider it for the repository (the repository always uses striping when in a multi-rank extent pool). Storage Pool Striping has good synergy with the repository function.

With Storage Pool Striping, the repository space is striped across multiple RAID arrays in an extent pool, which helps balance the volume skew that might appear on the sources. It is generally preferred to not use more than eight RAID arrays in the multi-rank extent pool that is intended to hold the repository. Finally, try to use at least the same number of disk spindles on the repository as the source volumes. Avoid severe fan-in configurations, such as 32 ranks of source disk being mapped to an 8-rank repository. This type of configuration likely has performance problems unless the update rate to the source is modest.

It is possible to share the repository with production volumes on the same extent pool. These configurations must be handled carefully from a performance point of view. You can expect a high random write workload for the repository. To prevent the repository from becoming overloaded, take the following precautions:
Have the repository in an extent pool with several ranks (a repository is always striped).
Do not use more than eight ranks.
Use fast 15K RPM and small capacity disk drives for the repository ranks.
Use RAID 10 instead of RAID 5, as it can sustain a higher random write workload.

These recommendations are not required, but you should consider them in your planning for FlashCopy SE. Because FlashCopy SE does not need much capacity (if your update rate is not too high), you might want to make several FlashCopy copies from the same source volume. For example, you might want to make a FlashCopy copy several times a day to set checkpoints, to protect your data against viruses, or for other reasons.
Creating more than one FlashCopy SE relationship for a source volume increases the overhead because each first change to a source volume track must be copied several times for each FlashCopy SE relationship. Therefore, you should keep the number of concurrent FlashCopy SE relationships to a minimum, or test how many relationships you can do without affecting your application performance too much.
If you choose the copy option, that is probably because the data that is being backed up is coming from the target volumes (assuming that the backup to tape does not start until the background copy completes). If the backup starts sooner, the data could be coming from a mixture of source volumes and target volumes. As the backup continues, more of the data comes from the target volumes as the background copy moves more of the data to the target volumes. To have the least impact on the application and to have a fast backup to tape, spread the source volumes evenly across the available storage system resources. After the backup to tape is complete, withdraw the FlashCopy relationship. Tip: Withdraw the pairs as soon as the backup to tape is finished. This action eliminates any additional copying from the source volume, either because of copy or copy-on-write. These recommendations would be equally valid for a copy or nocopy environment.
Using copy: The goal of using copy is to quickly complete the background copy so that the overlapping situations between FlashCopy and application processing end sooner. If copy is used, then all I/Os experience some degradation as they compete for resources with the background copy activity. However, this impact might be less than the impact to the individual writes that a copy-on-write causes.

If FlashCopy nocopy is active during a period of high application activity, there could be a high rate of copy-on-demand (that is, destages being delayed so that the track image can be read and then written to the FlashCopy target track to preserve the point-in-time copy). The destage delay could degrade the performance of all writes that occur during the delayed destage periods. It is only the first write to a track that causes a collision, and only when that write is destaged. Reads do not suffer collision degradation.

If you use the copy option, also consider these tips:
Examine the application environment for the highest activity volumes and the most performance sensitive volumes.
Consider arranging the FlashCopy order such that the highest activity and most performance sensitive volumes are copied early and the least active and least performance sensitive volumes are copied last.

Tip: One approach to achieve a specified FlashCopy order is to partition the volumes into priority groups. Issue the appropriate FlashCopy commands for all volumes, but use copy on only the highest priority group and nocopy on all other groups. After a specified period or after some observable event, convert the next highest priority group from nocopy to copy by issuing the appropriate FlashCopy commands. Continue in this manner until all volumes are fully copied.

If a background copy is the wanted result and FlashCopy is started just before or during a high activity period, consider the possibility of starting with nocopy and converting to copy after the high activity period completes.
You might also want to examine the use of Incremental FlashCopy in a performance sensitive, high activity period. Incremental FlashCopy automatically uses the copy option, so if the nocopy option was selected, using Incremental FlashCopy might impact performance by causing a full background copy. If the Incremental FlashCopy approach is chosen, it might be best to create the FlashCopy relationship during a quiet time. To minimize the amount of data to be copied when you take the wanted point-in-time copy, schedule an incremental refresh sufficiently in advance of the point-in-time refresh so that the copy of the changed data can complete. Finally, take the required point-in-time copy with an incremental refresh at the required time.
There is a trade-off that must be decided upon:
Use all ranks for your application: This action maximizes normal application performance. FlashCopy performance is reduced.
Use only half of the ranks for your applications: This action maximizes FlashCopy performance. Normal application performance is reduced.

If you plan for a FlashCopy implementation at the disaster recovery (DR) site, you must consider two distinct environments:
DR mirroring performance with and without FlashCopy active
Application performance if DR failover occurs
The solution should provide acceptable performance for both environments.
Chapter 13. FlashCopy examples
This chapter presents examples of the usage of FlashCopy in the following scenarios:
Fast setup of test systems or integration systems
Fast creation of volume copies for backup purposes
#--- remove existing FlashCopy relationships for volume 6100
rmflash -quiet 6100:6300
#--- establish FlashCopy relationships for source volume 6100
mkflash -seqnum 01 6100:6300
#--- list FlashCopy relationships for volume 6100
lsflash -l 6100

The application typically should be quiesced or briefly suspended before you run the FlashCopy. Also, some applications cache their data, so you might have to flush this data to disk, by using application methods, before running the FlashCopy (this action is not covered in our example).
#=== Part 1: establish FlashCopy relationship
#--- remove, establish, list FlashCopy relationships
rmflash -quiet 6100:6101
rmflash -quiet 6101:6300
mkflash -seqnum 01 6100:6101
lsflash -l 6100-6400

#=== Part 2: establish the second FlashCopy relationship (for example, at 03:00 pm)
#--- after the full volume copy of 6100 to 6101 finished
#--- establish relationship from 6101:6300
mkflash -seqnum 02 6101:6300
lsflash -l 6100-6300

Alternatively, you can also run rmflash with the -cp and -wait parameters. These parameters cause the command to wait until the background copy is complete before you continue with the next step.
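As a sketch, the alternative with the -cp and -wait parameters might look like the following command (the volume IDs match the script above):

#--- wait until the background copy is complete before removing the relationship
rmflash -cp -wait -quiet 6100:6101

The command returns only after the background copy for 6100:6101 has finished, so the next step of the script can safely rely on a complete target volume.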
Whenever the test environment must be reset to the original data, run Part 2 of the scripts or use the DS GUI to perform a FlashCopy.
#=== Part 1: establish FlashCopy relationship
#--- remove existing FlashCopy relationships for volume 6100
rmflash -quiet 6100:6300
#--- establish FlashCopy relationships for source volume 6100
mkflash -nocp -seqnum 01 6100:6300
#--- list FlashCopy relationships for volume 6100
lsflash -l 6100

2. Run the backup.
3. Remove the FlashCopy relationship after the volume backup completes (see Example 13-4).
Example 13-4 Withdraw the relationship
#=== Part 2: remove FlashCopy relationships
rmflash -quiet 6100:6300

After you take the backup, remove the FlashCopy relationship if you do not intend to use it for other purposes. Thus, you can avoid unnecessary writes (see Example 13-4). When you do backups that are based on the target volume, you can reuse a target multiple times. In complex application environments (for example, SAP), FlashCopy is often used as part of the backup solution. A good example of such a solution is IBM Tivoli Storage FlashCopy Manager for UNIX and Linux, which integrates into the IBM Tivoli Storage Manager backup infrastructure.
#=== Part 1: establish FlashCopy relationship
#--- remove existing FlashCopy relationships for volume 6100
rmflash -quiet -tgtreleasespace 6100:6300
#--- establish FlashCopy relationship for source volume 6100 and a Space-Efficient target
mkflash -tgtse -nocp -seqnum 01 6100:6300
#--- list FlashCopy relationships for volume 6100
lsflash -l 6100

After you take the backup, remove the FlashCopy relationship (see Example 13-6). The -tgtreleasespace option is specified to release storage for the target volume in the repository. Thus, you can avoid unnecessary writes and an increase of the used capacity in the repository.
Example 13-6 Withdrawing the IBM FlashCopy SE relationship
#=== Part 2: remove FlashCopy relationships and release space for the target
rmflash -quiet -tgtreleasespace 6100:6300

The target volume still exists, but it is empty.
#=== Part 1: establish FlashCopy relationship
#--- remove existing FlashCopy relationships for volume 6100
rmflash -quiet 6100:6300
#--- establish FlashCopy relationships
mkflash -record -persist -seqnum 01 6100:6300
#--- list FlashCopy relationships for volume 6100
lsflash -l 6100
After the initial full volume copy, the script that is shown in Example 13-8 supports the incremental copy of the FlashCopy relationship.
Example 13-8 Create an Incremental FlashCopy
#=== Part 2: resynch FlashCopy relationship
resyncflash -record -persist -seqnum 01 6100:6300
lsflash -l 6100
13.2.4 Using a target volume to restore its contents back to the source
You might have to apply logs to the target, and then reverse the target volume to the source volume. For each source volume, one FlashCopy relationship can exist with the -record and -persist attributes set. Use this relationship to refresh the source volume. To reverse the relationship, the data must be copied to the target before you reverse it back to the source. To avoid a situation where the full volume must be copied with each FlashCopy, Incremental FlashCopy should be used. Because logs might need to be applied to the target volume before you reverse it, the target volume should be write-enabled. This example consists of the following steps:
1. Establish the initial FlashCopy (see Example 13-9, Part 1).
2. Establish the Incremental FlashCopy (see Example 13-10, Part 2).
3. Reverse the relationship (see Example 13-11, Part 3).
Consider applying application or database logs before you reverse the relationship.
Example 13-9 Run the initial FlashCopy to support the refresh of the source volume
#=== Part 1: Establish Incremental FlashCopy
#--- remove, establish, list FlashCopy relationships
rmflash -quiet 6100:6101
mkflash -persist -record -tgtinhibit -seqnum 01 6100:6101
lsflash -l 6100

After the initial FlashCopy, the incremental copies can be done (see Example 13-10).
Example 13-10 Create an Incremental FlashCopy
#=== Part 2: Resynch FlashCopy relationship
resyncflash -record -persist -tgtinhibit -seqnum 01 6100:6101
lsflash -l 6100

The reverse of the FlashCopy is done by running reverseflash (Example 13-11).
Example 13-11 Reverse the volumes
#=== Part 3: Reverse FlashCopy relationship
reverseflash -persist -record -tgtinhibit -seqnum 01 6100:6101
lsflash -l 6100
Part 4. Metro Mirror
This part of the book describes IBM System Storage Metro Mirror for DS8000 when used in a System z environment. It covers the characteristics of Metro Mirror and the options for its setup, shows which management interfaces can be used, and discusses the important aspects to consider when you establish a Metro Mirror environment. This part concludes with examples of Metro Mirror management and setup.
Chapter 14.
When the application performs a write update operation to a source volume, the following actions occur:
1. Write to source volume (DS8000 cache and NVS).
2. Write to target volume (DS8000 cache and NVS).
3. Signal write complete from the remote target DS8000.
4. Post I/O complete to host server.
The Fibre Channel connection between the local and the remote storage systems can be direct, through a switch, or through other supported distance solutions, for example, Dense Wavelength Division Multiplexing (DWDM).
Metro Mirror by itself does not offer the means of controlling such a scenario. It offers the Consistency Group and Critical attributes, which, along with appropriate automation solutions, can manage data consistency and integrity at the remote site. The Metro Mirror volume pairs are always consistent because of the synchronous nature of Metro Mirror. However, cross-system or cross-LSS data consistency must have an external management method. IBM offers Tivoli Storage Productivity Center for Replication to deliver a solution in this area. Tivoli Storage Productivity Center for Replication is described in Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685.
Chapter 15.
Disable autoresync: Optional for suspended Global Copy (GC) relationships.

Wait option: This option delays the command response until the volume pairs are in one of the final states: Simplex, Full Duplex, Suspended, Target Full Duplex, or Target Suspended (that is, until the pair is no longer in the Copy Pending state). This parameter cannot be used with -type gcp or -mode nocp.
Fibre Channel support: For Metro Mirror, the DS8000 supports Fibre Channel links only; you cannot use FICON links. Metro Mirror source and target volumes are supported on the same DS8000.

Metro Mirror paths: Consider defining all Metro Mirror paths that are used by one application environment on the same set of physical links if you intend to keep the data consistent. With this approach, the paths between multiple Metro Mirror volume relationships cannot fail at different times. For more information, see 15.3.2, Consistency Group function: How it works on page 159.

A DS8000 Fibre Channel port can simultaneously be:
A sender for a Metro Mirror source
A receiver for a Metro Mirror target
A target for Fibre Channel Protocol (FCP) host I/O from Open Systems and Linux on System z

Each Metro Mirror port provides connectivity for all LSSs within the DS8000 and can carry multiple logical Metro Mirror paths. Although one FCP link has sufficient bandwidth for most Metro Mirror environments, the preferred practices are to:
Configure at least two Fibre Channel links between each source and remote disk system, to provide redundancy for continuous availability in the event of a physical path failure, and to provide multiple logical paths between the LSSs.
Dedicate Fibre Channel ports for Metro Mirror usage, ensuring no interference from host I/O activity. This action is essential with Metro Mirror, which is time critical and should not be impacted by host I/O activity.

IBM technical services are available to assist you in determining the number of links that are needed, by using a bandwidth analysis to ensure that the environment is able to effectively handle the workload.

Sharing links: In general, you should not share the FCP links that are used for Metro Mirror with asynchronous Remote Copy functions. For more information, see 26.2.3, Considerations for host adapter usage on page 346. Metro Mirror FCP links can be directly connected, or connected by up to two switches.
Channel extension: If you use channel extension technology devices for Metro Mirror links, verify with the product's vendor which environment (directly connected or connected with a SAN switch) and which SAN switches are supported.
Figure 15-1 Logical paths
Logical paths are unidirectional, that is, they can operate in either one direction or the other. Metro Mirror is bidirectional, allowing any particular pair of LSSs to have logical paths that are defined in opposite directions (for example, an LSS can be both a primary and a secondary at the same time). Also, logical paths in opposite directions can be defined on the same Fibre Channel physical link. For bandwidth and redundancy, one logical path can use more than one physical path. The maximum number of logical paths per LSS pair is eight. Metro Mirror balances the workload across the available physical paths. Figure 15-2 shows an example where you have a 1:1 mapping of source to target LSSs, and where the three logical paths are accommodated over one physical link: LSS1 in DS8000-1 to LSS1 in DS8000-2 LSS2 in DS8000-1 to LSS2 in DS8000-2 LSS3 in DS8000-1 to LSS3 in DS8000-2
Figure 15-2 Logical paths over a physical link for Metro Mirror
Alternatively, if the volumes in each of the LSSs of DS8000-1 map to volumes in all three target LSSs in DS8000-2, there are nine logical paths over the physical link (not fully illustrated in Figure 15-2 on page 157). You should use a 1:1 LSS mapping.

Metro Mirror FCP paths have certain architectural limits:
A source LSS can maintain paths to a maximum of four target LSSs. Each target LSS can be in a separate DS8000.
You can define up to eight physical port pairs per LSS-LSS relationship.
An FCP port can host up to 1280 logical paths. These are the logical and directional paths that are made from LSS to LSS.
An FCP physical link (the physical connection from one port to another port) can host up to 256 logical paths.
An FCP port can accommodate up to 126 different physical links (DS8000 port to DS8000 port through the SAN).
If the copy of the data contains any of the following combinations, the data is inconsistent because the order of dependent writes is not preserved:
- Operations 2 and 3
- Operations 1 and 3
- Operation 2 only
- Operation 3 only
For the Consistency Group function, data consistency means that this sequence is always kept in the copied data. The order of non-dependent writes does not necessarily have to be preserved. For example, consider the following four operations (two dependent sequences, one for each account):
1. Deposit paycheck in checking account A.
2. Withdraw cash from checking account A.
3. Deposit paycheck in checking account B.
4. Withdraw cash from checking account B.
For the data to be consistent, the deposit of the paycheck must be applied before the withdrawal of cash for each of the checking accounts. However, it does not matter whether the deposit to checking account A or checking account B occurred first, provided that the associated withdrawals are in the correct order. So, for example, the data copy is consistent if the following sequence occurred at the copy:
1. Deposit paycheck in checking account B.
2. Deposit paycheck in checking account A.
3. Withdraw cash from checking account B.
4. Withdraw cash from checking account A.
The order of updates is not the same as it is for the source data, but the order of dependent writes is preserved within each account, so the copy is consistent.
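The consistency rule above can be sketched in a few lines of Python; this is an illustration of the rule, not DS8000 code:

```python
# Sketch: dependent-write consistency as described above. Within each
# account, the deposit must precede the withdrawal; ordering across
# accounts does not matter.
def is_consistent(copy_order):
    """copy_order: list of (account, operation) tuples in applied order."""
    seen_deposit = set()
    for account, op in copy_order:
        if op == "deposit":
            seen_deposit.add(account)
        elif op == "withdraw" and account not in seen_deposit:
            return False  # withdrawal applied before its deposit
    return True

print(is_consistent([("B", "deposit"), ("A", "deposit"),
                     ("B", "withdraw"), ("A", "withdraw")]))  # True
print(is_consistent([("A", "withdraw"), ("A", "deposit")]))   # False
```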
Operation principles
Whether writes are dependent or not depends on the application architecture. Typically, you want your database file system, database log file system, and application file system to be consistent. To balance your workload across both DS8000 internal storage servers, your data is usually placed in at least two LSSs.
In Figure 15-3 we have a file system that is placed on four volumes in two different LSSs, one even and one odd. There are two logical Metro Mirror paths between LSS 11 on the source storage and LSS 11 on the target storage. There are another two different logical Metro Mirror paths between LSS 12 on source and LSS 12 on target storage. The logical paths between LSSs 11:11 and LSSs 12:12 might fail at different times.
Figure 15-3 Four dependent writes to volumes 1101, 1102, 1201, and 1202 in LSS 11 and LSS 12, mirrored over logical paths from DS8000 #1 to DS8000 #2, with Consistency Group aware management on the DS HMC
The application issues four dependent writes in the order #1, #2, #3, and #4. The application does not issue write #N until write #(N-1) is confirmed. Each write in our scenario goes to a different volume. There is also a volume (1103) with no Remote Copy relationship.
Assume that, before write #1 is issued, both logical paths between LSSs 11:11 fail, as shown in Figure 15-4. Failure for write #1 causes volume 1101 to be put into an ELB or queue full condition during the timeout, which is 60 seconds by default. As the application gets no write acknowledgement for that time period, it does not issue any of the dependent writes #2, #3, and #4.
Figure 15-4 Metro Mirror path failure with Consistency Group enabled
At the time that the first copy operation fails, a system message is issued (IEA494I EXTENDED LONG BUSY STATE) so that management software can react to that event within the ELB timeout. The Consistency Group aware management application (for example, Tivoli Storage Productivity Center for Replication) must ensure data consistency on the target storage within the ELB timeout window by issuing a freeze to all volumes that belong to a group that requires consistency. To do this task, the application must know where the application (or applications that are related and dependent) keeps the data that must be consistent. In this example, these volumes are 1101, 1102, 1201, and 1202 on LSSs 11 and 12.

Within the ELB timeout, the management software creates a consistency group by freezing paths for all volumes that are mirrored, in this example, for paths 11:11 and 12:12. All mirrored volumes in these LSSs are put in Extended Long Busy for the ELB timeout, and the logical paths 11:11 and 12:12 are removed (see Figure 15-5). The paths are still visible on the storage system in a failed state, with the Failed Reason set to System Reserved Path and with no physical ports defined. The data consistency at the target storage system is now preserved.
Figure 15-5 Consistency Group freeze (CGROUP FREEZE 11:11,12:12): frozen volumes in extended long busy, and consistent data at the target
An unfreeze of LSS 11 and LSS 12 resets the ELB on all volumes in these LSSs, even before the ELB timeout expires. Write operation #1 finishes and the application can issue the next dependent operation. All writes are performed only at the source storage system, and no data is copied to the target storage system for volume pairs in the source LSS 11 and the target LSS 11 and also for volume pairs in the source LSS 12 and the target LSS 12 (see Figure 15-6).
Figure 15-6 After the unfreeze (CGROUP RUN 11:11,12:12): writes complete at the source only, and the target data remains consistent
If no manual or automated actions are taken during the ELB timeout, write #1 is written locally only, and path 11:11 is put into suspended mode. The application gets the write #1 commitment and issues write #2. This action does not cause another Extended Long Busy on volume 1102 or another SNMP trap. Write #2 is written locally only, and write OK is sent to the application. Write #3 is written locally and also copied to the second storage system, as shown in Figure 15-7. The data at the remote site is no longer consistent. The same situation applies to write #4.
Figure 15-7 No freeze within the ELB timeout: paths 11:11 are suspended, writes over paths 12:12 are still copied, and the target data is inconsistent
Figure 15-8 shows a more realistic scenario, together with timing information. There are the same volumes, 1101, 1102, 1201, and 1202, as in "Operation principles" on page 160. The additional volume 1103 is not in a Metro Mirror relationship. When the DS8000 detects a condition where it cannot update the Metro Mirror target volume, the Metro Mirror source volume within the LSS that has the Consistency Group option set becomes suspended and enters the queue full condition. At the same time, the DS8000 issues an SNMP trap (Trap 202: source PPRC Devices on LSS Suspended Due to Error).
Figure 15-8 Timeline of read and write I/Os with queue full conditions, the XLB timeout, and the freeze
An automation program, triggered by the SNMP trap, can issue the freezepprc command to all LSS pairs that have volumes that are related to the application. This action causes all Metro Mirror source volumes in the LSSs with the Consistency Group option set to become suspended and to enter the queue full condition upon a write attempt. The default 60 seconds of the queue full condition gives the automation enough time to issue a freezepprc command to the necessary LSSs. Volume 1103 is not affected by the consistency group actions because it has no Metro Mirror relationship.

In Figure 15-8, the read and write operations are shown in different colors. At the time of the path 11:11 failure, there are read and write I/Os going to all volumes except volume 1102, which has only read I/Os. The first write attempt to volume 1101 causes this volume to be put into the queue full state for the ELB timeout. Later, a write attempt to volume 1102 causes this volume to be put into the queue full condition for 60 seconds. Within the ELB timeout, the automation program runs freezepprc 11:11 12:12, which holds all write I/O attempts to source volumes of a Metro Mirror relationship between LSSs 11:11 and 12:12 and puts those volumes into the queue full state. The ELB on volume 1202 ends when the ELB timeout that was started by the freezepprc expires, no matter when the first write attempt is issued. All I/O resumes after the default ELB timeout if the unfreezepprc command is not received first.
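As a toy model only (not DS8000 code), the freeze behavior described above can be illustrated in Python: while the source volumes are frozen, writes are held in the queue full state, the application receives no acknowledgement, and no further dependent writes can reach the target:

```python
# Toy model of the freeze/queue-full behavior described above.
# It only illustrates why dependent writes cannot leak to the target
# while their source LSSs are frozen; all names are hypothetical.
class Lss:
    def __init__(self, volumes):
        self.frozen = False       # set by the (simulated) freezepprc
        self.volumes = set(volumes)

def write(lss, volume, target_copy):
    """Returns True if the write completes (and is mirrored synchronously)."""
    if lss.frozen:
        return False              # held in queue full; application sees no ack
    target_copy.append(volume)    # synchronous mirror to the target
    return True

lss11 = Lss({"1101", "1102"})
lss12 = Lss({"1201", "1202"})
target = []

lss11.frozen = True   # freezepprc 11:11
lss12.frozen = True   # freezepprc 12:12
assert not write(lss11, "1101", target)  # write #1 is held
# Dependent writes #2..#4 are never issued, so the target stays consistent.
print(target)  # []
```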
To minimize the application outage, run unfreezepprc 11:11 12:12 to thaw the consistency group, which resets the queue full state on all related volumes and allows the storage system to finish all write attempts, as shown in Figure 15-9.
Figure 15-9 Timeline of read and write I/Os with queue full conditions, the freeze, and the unfreeze
It is not possible to issue Consistency Group (freeze/unfreeze) type commands from the DS GUI, but you can change the timeout values by using the DS GUI.

Important: The queue full condition is presented only for the source volumes that are affected by the error (in the case of path failures, multiple volumes are often affected). Still, the freeze operation is performed at the Metro Mirror path level, causing all Metro Mirror volumes that use those paths to go into a suspended state with an Extended Long Busy condition and terminating all associated paths. Therefore, when you plan your implementation, avoid intermixing volumes from different applications in an LSS pair that is part of a consistency group. Otherwise, the not-in-error volumes that belong to other applications are frozen as well.
At the recovery site, the Metro Mirror failover function combines the three steps that are involved in the switch over (planned or unplanned) to the remote site into a single task. This design takes into account the possibility that the original source LSS might no longer be reachable. The failover:
1. Suspends all copy operations from the primary site.
2. Changes the state of the target volume to source volume suspended.
3. Tracks all updates to the new source volume.
The state of the original source volume (Primary Full Duplex) at the normal production site is not changed by the Metro Mirror failover operation. The device state changes to Primary Suspended when either of two things occurs:
1. An attempt to perform certain Remote Copy operations from the original source to the original target volume (for example, through pausepprc). The operation fails, but the former primary device is now Primary Suspended. The failure of the command is expected.
2. A write operation from the host to the original source volume. The failure of Metro Mirror to copy the change to the former target results in the state change.
The volumes at both sites are now in the Primary Suspended state and track any updates that are made. To initiate the return to a mirroring status toward the production site, the Metro Mirror failback function is submitted at the recovery site. The failback operation checks the state of the original source volume at the production site to determine how much data to copy back. Then, either all tracks or only out-of-sync tracks are copied, with the original source volume becoming a Target Full Duplex.

Metro Mirror failback: The Metro Mirror failback does not mean switchback to the production site itself. It is an initial step of the procedure for switching back to the primary site, after it is available.
Metro Mirror failback operates in the following way:
- If a volume at the production site is in the full duplex or suspended state without changed tracks, only the modified data on the volume at the recovery site is copied back to the volume at the production site.
- If a volume at the production site is in a suspended state and has tracks that were updated, both the tracks that changed on the production site and the tracks that are marked at the recovery site are copied back.
Finally, the volume at the production site becomes a write-inhibited target volume. This action is performed on an individual volume basis.
The switchback is completed with one more sequence of a Metro Mirror failover followed by a Metro Mirror failback operation, both given at the now recovered production site. Figure 15-10 summarizes the whole process.
Figure 15-10 Switchback summary: establish with PPRC failback (target A and source B copy pending); at (A): 1. Metro Mirror failover, 2. start application processing, 3. Metro Mirror failback; final establish with PPRC failback (source A and target B full duplex)
Depending on the conditions at the primary site and your requirements, you can initiate the failback from either site. A failback re-establishes mirroring in either direction, and the failback operation functions in the same manner regardless of where it is initiated. In this manner, Metro Mirror failover and failback are dual, interdependent operations. It is possible to implement all site switch operations with two pairs of failover and failback tasks (one pair for each direction).

Planned switchover: For a planned switch over from site (A) to site (B), and to keep data consistency at (B), the application at site (A) must be quiesced before the Metro Mirror failover operation at (B) is performed. Alternatively, you can use the freezepprc and unfreezepprc commands. The same consideration applies when switching back from site (B) to (A). The Metro Mirror failover and failback modes can be invoked from the DS GUI or the DS CLI.
Chapter 16.
16.1 Bandwidth
Before you establish your Metro Mirror solution, determine the bandwidth requirements for connecting the primary and remote storage systems. This step is critical in designing the Metro Mirror environment. Bandwidth analysis and capacity planning help you define how many links you require initially, and when you need to add more, to ensure continuous optimal performance. A software suite that assists with this analysis is Tivoli Storage Productivity Center. Operating system tools such as iostat do not provide precise numbers for this purpose, and they cannot provide accumulated I/O workload figures from several servers. Tivoli Storage Productivity Center for Replication gives you precise information about the amounts of data, throughput, and IOPS for a specific storage system. Another, much less exact, method is to collect historical traffic data from the Fibre Channel (FC) switches by using FC switch tools.
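As a rough planning aid only, the arithmetic behind such a link-count estimate can be sketched as follows. Only writes are mirrored, so the input is peak write throughput per server; the per-link throughput and headroom figures here are illustrative assumptions, not DS8000 specifications:

```python
import math

# Sketch: estimate the Metro Mirror link count from peak *write*
# throughput (only writes are mirrored). The link_mbps and headroom
# values are assumptions for illustration, not DS8000 specifications.
def links_needed(peak_write_mbps_per_server, link_mbps=800, headroom=0.7):
    total = sum(peak_write_mbps_per_server)   # accumulate across servers
    usable = link_mbps * headroom             # leave headroom for peaks
    # At least two links, so that a single link failure is survivable.
    return max(2, math.ceil(total / usable))

print(links_needed([120, 250, 90]))  # total 460 MB/s of peak writes
```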
16.2 Performance
Because Metro Mirror is a synchronous mirroring technology, it has a greater performance impact on the storage system (for write operations) than a similar environment without remote mirroring. Unlike host-based or operating system mirroring, it does not use any host processor resources. Consider this aspect as part of the planning process for Metro Mirror.
16.2.3 Distance
Distance is an important topic. The distance between your local and remote DS8000 systems affects the response time impact of the Metro Mirror implementation. The maximum supported distance for Metro Mirror is 300 km. Light in a fiber travels at less than 300,000 km/s (roughly 200,000 km/s, or about 200 km per millisecond). The data must go to the other site, and then an acknowledgement must come back. Add the possible latency of any active components in the configuration (for example, Fibre Channel directors), and you get approximately a 1 ms impact per 100 km for write I/Os.
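The rule of thumb above follows from simple arithmetic; here is a sketch, assuming a propagation speed of about 200 km/ms in fiber and treating component latency as an optional add-on:

```python
# Sketch: estimated synchronous write penalty from distance alone,
# assuming ~200 km/ms propagation speed in fiber (an approximation).
def round_trip_ms(distance_km, km_per_ms=200.0, component_latency_ms=0.0):
    # Data out plus acknowledgement back: twice the one-way distance.
    return 2 * distance_km / km_per_ms + component_latency_ms

print(round_trip_ms(100))  # 1.0 ms per 100 km
print(round_trip_ms(300))  # 3.0 ms at the 300 km maximum
```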
16.3 Scalability
The DS8000 Metro Mirror environment can be scaled up or down as required. If new volumes that require mirroring are added to the DS8000, they can be added dynamically. If additional Metro Mirror paths are required, they can also be added dynamically. The logical nature of the LSS makes a Metro Mirror implementation on the DS8000 easier to plan, implement, and manage. However, if you must add more LSSs to your Metro Mirror environment, your management and automation solutions should be set up to handle this task. Tivoli Storage Productivity Center for Replication is designed to provide this functionality. For more information, see Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685.
Figure 16-1 shows a logical configuration. This idea applies equally to the physical aspects of the DS8000. You should attempt to balance workload and apply symmetrical concepts to other aspects of your DS8000, like Extent Pools.
Figure 16-1 Symmetrical logical configuration: identical volume layouts in matching LSSs (for example, LSS 10, 11, 2E, and 2F) on DS8000 #1 and DS8000 #2
16.7 Interoperability
Metro Mirror pairs can be established only between storage systems with the same (or similar) type and features. For more information, see the System Storage Interoperation Center (SSIC) at the following website: http://www-03.ibm.com/systems/support/storage/ssic/interoperability.wss
Chapter 17. Metro Mirror interfaces and examples
Metro Mirror and Global Copy path commands:
- List available I/O ports that can be used to establish Metro Mirror paths: lsavailpprcport (DS GUI: this information is shown during the process when a path is established)
- List established Metro Mirror paths: lspprcpath (DS GUI: Copy Services > Mirroring Connectivity)
- Establish path: mkpprcpath (DS GUI: Copy Services > Mirroring Connectivity > Action > Create)
- Delete path: rmpprcpath (DS GUI: Copy Services > Mirroring Connectivity > Action > Delete)

Metro Mirror pair commands:
- Failback: failbackpprc (DS GUI: Copy Services > Metro Mirror / Global Copy > Action > Recovery Failback)
- Failover: failoverpprc (DS GUI: Copy Services > Metro Mirror / Global Copy > Action > Recovery Failover)
- List pairs: lspprc (DS GUI: Copy Services > Metro Mirror / Global Copy)
- Create pair: mkpprc (DS GUI: Copy Services > Metro Mirror / Global Copy > Action > Create Metro Mirror)
- Suspend pair: pausepprc (DS GUI: Copy Services > Metro Mirror / Global Copy > Action > Suspend)
- Resume pair: resumepprc (DS GUI: Copy Services > Metro Mirror / Global Copy > Action > Resume)
- Delete pair: rmpprc (DS GUI: Copy Services > Metro Mirror / Global Copy > Action > Delete)
- Freeze and thaw a Consistency Group: freezepprc and unfreezepprc (no equivalent is available in the DS GUI)
The DS CLI has the advantage that you can create scripts and use them for automation. The DS GUI is a web-based graphical user interface and is more intuitive than the DS CLI.

Restriction: Not all of the commands or parameters that are available in the DS CLI are fully implemented in the DS GUI. For example, freezepprc and unfreezepprc are used in scripts or by automation software and would not make sense in the DS GUI, given the comparatively longer response times of a graphical interface. Another example is the -unconditional parameter of the pausepprc command, which is implemented in the DS CLI only.
Table 17-2 shows the DS CLI commands that are typically used for managing a Metro Mirror environment, including path and volume relationship commands.
Table 17-2 DS CLI Remote Mirror and Copy pair and path commands
- failbackpprc: Starts resynchronizing changed tracks between the primary and the secondary volume. Usually used after a failoverpprc to reverse the direction of the synchronization.
- failoverpprc: Changes a secondary volume to a primary suspended status and starts bitmap recording for changed blocks. Suspended status and bitmap recording for the primary volume start as soon as a write I/O is received for this volume. This command is usually used to switch production because of a planned or unplanned outage.
- lspprc: Lists Remote Mirror and Copy volume relationships.
- mkpprc: Establishes a Remote Mirror and Copy relationship.
- freezepprc: Places the primary logical subsystem (LSS) in the extended long busy state during the defined timeout, sets a queue full condition for the primary volumes, and removes the Remote Copy paths between the primary and secondary LSSs.
- pausepprc: Pauses an existing Remote Mirror and Copy volume pair relationship.
- resumepprc: Resumes a Remote Mirror and Copy relationship for a volume pair.
- rmpprc: Removes a Remote Mirror and Copy relationship.
- unfreezepprc: Thaws an existing Remote Copy Consistency Group by resetting the queue full condition for those primary volumes where the freezepprc command was issued.
- lsavailpprcport: Lists available ports that can be defined as Remote Mirror and Copy ports.
- lspprcpath: Lists the existing Remote Mirror and Copy path definitions.
- mkesconpprcpath: Creates a Remote Mirror and Copy path over an IBM ESCON connection.
- mkpprcpath: Establishes or replaces a Remote Mirror and Copy path over a Fibre Channel connection.
- rmpprcpath: Removes a Remote Mirror and Copy path.
Tip: The mkpprc and resumepprc commands offer the -wait flag, which delays the command response until the copy complete status is achieved. You can use this flag if you want to be sure of successful completion.

For the most current list of DS CLI supported environments, see the IBM System Storage DS8000 Information Center at the following website:
http://publib.boulder.ibm.com/infocenter/dsichelp/ds8000ic/index.jsp

As you prepare to work with the DS CLI, it is assumed that you set up a DS CLI environment using DS CLI profiles that contain devid and remotedevid statements for your primary and secondary DS8000 systems. Using these statements spares you from typing -dev storage_image_ID and -remotedev storage_image_ID in each command. For information about setting up DS CLI profiles, see 6.2, DS CLI profile on page 36.
IBM System Storage DS8000 Copy Services for Open Systems
The following sections present an example of how to set up a Metro Mirror environment using the DS CLI. Figure 17-1 shows the configuration that is implemented.
Figure 17-1 DS8000 configuration in the Metro Mirror setup example (DS8000 #1: volumes 4000 and 4001 in LSS 40, volumes 4100 and 4101 in LSS 41; DS8000 #2: volumes 5000 and 5001 in LSS 50, volumes 5100 and 5101 in LSS 51)
In our example, we use different LSS and LUN numbers for the Metro Mirror source and target elements so that you can more clearly understand which one is being specified when you read through the example. Real-world Metro Mirror environment: In a real environment (and different from our example), to simplify the management of your Metro Mirror environment, maintain a symmetrical configuration in terms of both physical and logical elements.
Figure 17-2 Metro Mirror environment to be set up (source volumes in LSS 40 and LSS 41 on DS8000 #1, target volumes in LSS 50 and LSS 51 on DS8000 #2)
181
To configure the Metro Mirror environment, complete the following steps: 1. Determine the available Fibre Channel links for paths definition. 2. Define the paths that Metro Mirror uses. 3. Create Metro Mirror pairs.
dscli> lsavailpprcport -l -remotewwnn 5005076303FFD18E 40:50
Local Port Attached Port Type Switch ID Switch Port
===================================================
I0102      I0137         FCP  NA        NA
I0201      I0237         FCP  NA        NA
An FCP port ID shown by the lsavailpprcport command has four hexadecimal characters in the format 0xEEAP, where EE is the port enclosure number (00-3F), A is the adapter number (0-F), and P is the port number (0-7). The FCP port ID number is prefixed with the letter I.
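As a sketch, the decoding rule above can be expressed in a few lines of Python (illustrative code, not part of the DS CLI):

```python
# Sketch: decode a DS8000 FCP port ID of the form I0xEEAP, as described
# above (EE = enclosure 00-3F, A = adapter 0-F, P = port 0-7).
def decode_port_id(port_id):
    assert port_id.upper().startswith("I") and len(port_id) == 5
    hex_part = port_id[1:]
    enclosure = int(hex_part[0:2], 16)
    adapter = int(hex_part[2], 16)
    port = int(hex_part[3], 16)
    assert 0 <= enclosure <= 0x3F and 0 <= port <= 7
    return enclosure, adapter, port

print(decode_port_id("I0102"))  # (1, 0, 2): enclosure 1, adapter 0, port 2
```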
You can use the -fullid parameter to display the DS8000 Storage Image ID in the command output (see Example 17-2).
Example 17-2 List available fibre links with DS8000 Storage Image ID
dscli> lsavailpprcport -l -fullid -remotewwnn 5005076303FFD18E 40:50
Local Port             Attached Port          Type Switch ID Switch Port
========================================================================
IBM.2107-75TV181/I0102 IBM.2107-75TV181/I0137 FCP  NA        NA
IBM.2107-75TV181/I0201 IBM.2107-75TV181/I0237 FCP  NA        NA

You need the worldwide node name (WWNN) of your target DS8000 to issue the lsavailpprcport command. You can get the WWNN by running lssi (see Example 17-3). You must issue this command to the DS HMC connected to DS8000 #2, which is the Metro Mirror target.
Example 17-3 Get WWNN of target DS8000
dscli> lssi
Name         ID               Storage Unit     Model WWNN             State  ESSNet
====================================================================================
DS8k05_3Tier IBM.2107-75ACV21 IBM.2107-75ACV20 951   5005076303FFD18E Online Enabled
dscli> mkpprcpath -remotewwnn 5005076303FFD18E -srclss 40 -tgtlss 50 -consistgrp i0102:i0137 i0201:i0237
CMUC00149I mkpprcpath: Remote Mirror and Copy path 40:50 successfully established.
dscli> mkpprcpath -remotewwnn 5005076303FFD18E -srclss 41 -tgtlss 51 -consistgrp i0102:i0137 i0201:i0237
CMUC00149I mkpprcpath: Remote Mirror and Copy path 41:51 successfully established.
dscli> lspprcpath -l 40-41
Src Tgt State   SS   Port  Attached Port Tgt WWNN         Failed Reason PPRC CG
================================================================================
40  50  Success FF50 I0102 I0137         5005076303FFD18E -             Enabled
40  50  Success FF50 I0201 I0237         5005076303FFD18E -             Enabled
41  51  Success FF51 I0102 I0137         5005076303FFD18E -             Disabled
41  51  Success FF51 I0201 I0237         5005076303FFD18E -             Disabled

With the lspprcpath command, you can use the -fullid command flag to display the fully qualified DS8000 Storage Image ID in the command output (see Example 17-5).
Example 17-5 List paths with DS8000 Storage Image ID
dscli> lspprcpath -fullid 40-41
Src                 Tgt                 State   SS   Port                   Attached Port          Tgt WWNN
==================================================================================
IBM.2107-75TV181/40 IBM.2107-75ACV21/50 Success FF50 IBM.2107-75TV181/I0102 IBM.2107-75ACV21/I0137 5005076303FFD18E
IBM.2107-75TV181/40 IBM.2107-75ACV21/50 Success FF50 IBM.2107-75TV181/I0201 IBM.2107-75ACV21/I0237 5005076303FFD18E
IBM.2107-75TV181/41 IBM.2107-75ACV21/51 Success FF51 IBM.2107-75TV181/I0102 IBM.2107-75ACV21/I0137 5005076303FFD18E
IBM.2107-75TV181/41 IBM.2107-75ACV21/51 Success FF51 IBM.2107-75TV181/I0201 IBM.2107-75ACV21/I0237 5005076303FFD18E

Mirroring direction: Remember that logical paths are unidirectional. To reverse the mirroring direction (as required for failover/failback scenarios), you must establish logical paths from the remote to the local LSSs, for example, 50:40 and 51:41.
dscli> lspprc 4000-4001 4100-4101
ID        State        Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
===========================================================================================
4000:5000 Copy Pending -      Metro Mirror 40        60             Disabled      Invalid
4001:5001 Copy Pending -      Metro Mirror 40        60             Disabled      Invalid
4100:5100 Copy Pending -      Metro Mirror 41        60             Disabled      Invalid
4101:5101 Copy Pending -      Metro Mirror 41        60             Disabled      Invalid
After the Metro Mirror source and target volumes are synchronized, the volume state changes to Full Duplex from Copy Pending (see Example 17-7).
Example 17-7 List Metro Mirror status after Metro Mirror initial copy completes
dscli> lspprc 4000-4001 4100-4101
ID        State       Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
===========================================================================================
4000:5000 Full Duplex -      Metro Mirror 40        60             Disabled      Invalid
4001:5001 Full Duplex -      Metro Mirror 40        60             Disabled      Invalid
4100:5100 Full Duplex -      Metro Mirror 41        60             Disabled      Invalid
4101:5101 Full Duplex -      Metro Mirror 41        60             Disabled      Invalid
The states of Full Duplex and Copy Pending indicate the Metro Mirror source state. In the case of the target state, the states are Target Full Duplex and Target Copy Pending (see Example 17-8). You must issue this command to the DS HMC connected to DS8000 #2, which is the Metro Mirror target.
Example 17-8 lspprc for Metro Mirror target volumes
dscli> lspprc 5000-5001 5100-5101
ID        State               Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
======================================================================================================
4000:5000 Target Copy Pending -      Metro Mirror 40        unknown        Disabled      Invalid
4001:5001 Target Copy Pending -      Metro Mirror 40        unknown        Disabled      Invalid
4100:5100 Target Copy Pending -      Metro Mirror 41        unknown        Disabled      Invalid
4101:5101 Target Copy Pending -      Metro Mirror 41        unknown        Disabled      Invalid
dscli> lspprc 5000-5001 5100-5101
ID        State              Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
======================================================================================================
4000:5000 Target Full Duplex -      Metro Mirror 40        unknown        Disabled      Invalid
4001:5001 Target Full Duplex -      Metro Mirror 40        unknown        Disabled      Invalid
4100:5100 Target Full Duplex -      Metro Mirror 41        unknown        Disabled      Invalid
4101:5101 Target Full Duplex -      Metro Mirror 41        unknown        Disabled      Invalid
In the Copy Pending state, you can check the data transfer status of the Metro Mirror initial copy by running lspprc -l. Out Of Sync Tracks shows the remaining tracks to be sent to the target volume. The size of the logical track for an FB volume on the DS8000 is 64 KB. The lspprc -l command output is wide; you can use the stanza output format to break it into a more readable form. See Example 17-9 for stanza output.
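The amount of data still to be copied follows from the track count and the 64 KB FB track size stated above; as a sketch:

```python
# Sketch: convert the Out Of Sync Tracks count from lspprc -l into the
# amount of data still to copy, using the 64 KB FB logical track size.
TRACK_KB = 64

def remaining_gib(out_of_sync_tracks):
    return out_of_sync_tracks * TRACK_KB / (1024 * 1024)

# For example, 297216 out-of-sync tracks are about 18.1 GiB left to copy.
print(round(remaining_gib(297216), 1))  # 18.1
```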
Example 17-9 Long lspprc -l output formatted to stanzas
dscli> lspprc -l -fmt stanza 5000-5001 5100-5101
ID                 4000:5000
State              Copy Pending
Reason             -
Type               Metro Mirror
Out Of Sync Tracks 297216
Tgt Read           Disabled
Src Cascade        Disabled
Tgt Cascade        Invalid
Date Suspended     -
SourceLSS          40
Timeout (secs)     60
Critical Mode      Disabled
First Pass Status  Invalid
Incremental Resync Disabled
Tgt Write          Disabled
GMIR CG            N/A
PPRC CG            Enabled
isTgtSE            Unknown
DisableAutoResync  -

ID                 4001:5001
State              Copy Pending
Reason             -
Type               Metro Mirror
Out Of Sync Tracks 295234
Tgt Read           Disabled
Src Cascade        Disabled
Tgt Cascade        Invalid
Date Suspended     -
SourceLSS          40
Timeout (secs)     60
Critical Mode      Disabled
First Pass Status  Invalid
Incremental Resync Disabled
Tgt Write          Disabled
GMIR CG            N/A
PPRC CG            Enabled
isTgtSE            Unknown
DisableAutoResync  -

ID                 4100:5100
State              Copy Pending
Reason             -
Type               Metro Mirror
Out Of Sync Tracks 304637
Tgt Read           Disabled
Src Cascade        Disabled
Tgt Cascade        Invalid
Date Suspended     -
SourceLSS          41
Timeout (secs)     60
Critical Mode      Disabled
First Pass Status  Invalid
Incremental Resync Disabled
Tgt Write          Disabled
GMIR CG            N/A
PPRC CG            Enabled
isTgtSE            Unknown
DisableAutoResync  -

ID                 4101:5101
State              Copy Pending
Reason             -
Type               Metro Mirror
Out Of Sync Tracks 301122
Tgt Read           Disabled
Src Cascade        Disabled
Tgt Cascade        Invalid
Date Suspended     -
SourceLSS          41
Timeout (secs)     60
Critical Mode      Disabled
First Pass Status  Invalid
Incremental Resync Disabled
Tgt Write          Disabled
GMIR CG            N/A
PPRC CG            Enabled
isTgtSE            Unknown
DisableAutoResync  -
CMUC00160W rmpprc: Are you sure you want to delete the Remote Mirror and Copy volume pair 4000-4001:5000-5001:? [y/n]:y
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4000:5000 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4001:5001 relationship successfully withdrawn.
CMUC00160W rmpprc: Are you sure you want to delete the Remote Mirror and Copy volume pair 4100-4101:5100-5101:? [y/n]:y
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4100:5100 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4101:5101 relationship successfully withdrawn.
You can add the -at tgt parameter to the rmpprc command to remove only the Metro Mirror target volume (see Example 17-11). The commands are given to the DS HMC connected to DS8000 #2, which is the Metro Mirror target.
Example 17-11 Results of rmpprc with -at tgt dscli> lspprc 5000 ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status ==================================================================================================== 4000:5000 Target Full Duplex Metro Mirror 40 unknown Disabled Invalid dscli> rmpprc -quiet -remotedev IBM.2107-75ACV21 -at tgt 4000:5000 CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4000:5000 relationship successfully withdrawn. dscli> lspprc 5000 CMUC00234I lspprc: No Remote Mirror and Copy found.
Example 17-12 shows the Metro Mirror source volume status after the rmpprc -at tgt command completes. The source status changed to Suspended with a reason of Simplex Target as a result of the rmpprc -at tgt command. If there are no available paths, the state of the Metro Mirror source volume is preserved. To remove the relationship on the source, you must run rmpprc. You must issue the command to the DS HMC connected to DS8000 #1, which is the Metro Mirror source.
Example 17-12 Metro Mirror source volume status after rmpprc with -at tgt, and removing the source relationship
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================
4000:5000 Suspended Simplex Target Metro Mirror 40 60 Disabled Invalid
dscli> rmpprc -quiet -remotedev IBM.2107-75ACV21 4000:5000
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4000:5000 relationship successfully withdrawn.
dscli> lspprc 4000
CMUC00234I lspprc: No Remote Mirror and Copy found.
Removing paths
The rmpprcpath command removes paths. Before you remove the paths, you must remove all Remote Mirror pairs that are using the paths or you must use the -force parameter with the rmpprcpath command (see Example 17-13, Example 17-14 on page 188, and Example 17-15 on page 188).
Example 17-13 Remove paths
dscli> lspprc 4000-4001
CMUC00234I lspprc: No Remote Mirror and Copy found.
dscli> rmpprcpath -remotedev IBM.2107-75ACV21 40:50
CMUC00152W rmpprcpath: Are you sure you want to remove the Remote Mirror and Copy path 40:50:? [y/n]:y
CMUC00150I rmpprcpath: Remote Mirror and Copy path 40:50 successfully removed.

If you do not remove the Metro Mirror pairs that are using the paths, the rmpprcpath command fails (see Example 17-14).
Example 17-14 Removed paths that still have Metro Mirror pairs
dscli> lspprc 4100
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
4100:5100 Full Duplex Metro Mirror 41 60 Disabled Invalid
dscli> rmpprcpath -remotedev IBM.2107-75ACV21 41:51
CMUC00152W rmpprcpath: Are you sure you want to remove the Remote Mirror and Copy path 41:51:? [y/n]:y
CMUN03070E rmpprcpath: 41:51: Copy Services operation failure: pairs remain
If you want to remove logical paths while Metro Mirror pairs still exist, you can use the -force parameter (see Example 17-15). After the path is removed, the Metro Mirror pair remains in the Full Duplex state until the Metro Mirror source receives I/O from the servers. When I/O arrives at the Metro Mirror source, the source volume becomes Suspended. If the Consistency Group option is set for the LSS that contains the Metro Mirror volumes, I/Os from the servers are held with a queue-full status for the specified timeout value.
Example 17-15 Removing paths while still having Metro Mirror pairs with the -force parameter
dscli> lspprc 4100
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
4100:5100 Full Duplex Metro Mirror 41 60 Disabled Invalid
dscli> rmpprcpath -remotedev IBM.2107-75ACV21 -quiet -force 41:51
CMUC00150I rmpprcpath: Remote Mirror and Copy path 41:51 successfully removed.
dscli> lspprc 4100
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
4100:5100 Full Duplex Metro Mirror 41 60 Disabled Invalid
<< After I/O goes to the source volume >>
dscli> lspprc 4100
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================================
4100:5100 Suspended Internal Conditions Target Metro Mirror 41 60 Disabled Invalid
Because the source DS8000 tracks all changed data on the source volume, you can resume Metro Mirror operations later. The resumepprc command resumes a Metro Mirror relationship for a volume pair and restarts transferring data. You must specify the Remote Mirror and Copy type, such as Metro Mirror or Global Copy, with the -type parameter (see Example 17-17).
Example 17-17 Resuming Metro Mirror pairs
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
======================================================================================================
4000:5000 Suspended Host Source Metro Mirror 40 60 Disabled Invalid
4001:5001 Suspended Host Source Metro Mirror 40 60 Disabled Invalid
4100:5100 Full Duplex Metro Mirror 41 60 Disabled Invalid
4101:5101 Full Duplex Metro Mirror 41 60 Disabled Invalid
dscli> resumepprc -remotedev IBM.2107-75ACV21 -type mmir 4000-4001:5000-5001
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 4000:5000 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 4001:5001 relationship successfully resumed. This message is being returned before the copy completes.
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
4000:5000 Full Duplex Metro Mirror 40 60 Disabled Invalid
4001:5001 Full Duplex Metro Mirror 40 60 Disabled Invalid
4100:5100 Full Duplex Metro Mirror 41 60 Disabled Invalid
4101:5101 Full Duplex Metro Mirror 41 60 Disabled Invalid
When you resume the Metro Mirror pairs, the pairs enter the Copy Pending state, and while the resynchronization runs, data consistency at the target volumes is not maintained. Therefore, you must take action to keep a consistent copy of the data at the recovery site while you resume the Metro Mirror pairs. Taking a FlashCopy of the target volumes at the recovery site before you resume is one way to preserve data consistency.
Important: When you add paths with the mkpprcpath command, you must specify all of the paths that you want to use, including the existing paths. Otherwise, you lose the definitions that were already there.

Important: Pay attention to whether the path was created with the -consistgrp option. Otherwise, you change the consistency group behavior when you re-create the path.
To reduce the number of paths, you can also use the mkpprcpath command. In Example 17-19, for each LSS pair (40:50 and 41:51), we remove one path (I0331:I0337) from the existing three paths (I0102:I0137, I0201:I0237, and I0331:I0337).
Example 17-19 Removing paths
dscli> lspprcpath 40-41
Src Tgt State SS Port Attached Port Tgt WWNN
=========================================================
40 50 Success FF50 I0102 I0137 5005076303FFD18E
40 50 Success FF50 I0201 I0237 5005076303FFD18E
40 50 Success FF50 I0331 I0337 5005076303FFD18E
41 51 Success FF51 I0102 I0137 5005076303FFD18E
41 51 Success FF51 I0201 I0237 5005076303FFD18E
41 51 Success FF51 I0331 I0337 5005076303FFD18E
dscli> mkpprcpath -remotewwnn 5005076303FFD18E -srclss 40 -tgtlss 50 -consistgrp i0102:i0137 i0201:i0237
CMUC00149I mkpprcpath: Remote Mirror and Copy path 40:50 successfully established.
dscli> mkpprcpath -remotewwnn 5005076303FFD18E -srclss 41 -tgtlss 51 -consistgrp i0102:i0137 i0201:i0237
CMUC00149I mkpprcpath: Remote Mirror and Copy path 41:51 successfully established.
dscli> lspprcpath 40-41
Src Tgt State SS Port Attached Port Tgt WWNN
=========================================================
40 50 Success FF50 I0102 I0137 5005076303FFD18E
40 50 Success FF50 I0201 I0237 5005076303FFD18E
41 51 Success FF51 I0102 I0137 5005076303FFD18E
41 51 Success FF51 I0201 I0237 5005076303FFD18E
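Because mkpprcpath replaces the complete path list for an LSS pair, dropping one path means re-issuing the command with every path you want to keep. The following sketch, a hypothetical helper that is not part of DSCLI, only builds the command string from a keep/drop decision; it does not talk to a DS8000.

```python
# Hypothetical helper: build a dscli mkpprcpath command that keeps all
# paths for an LSS pair except those listed in 'drop'. This illustrates
# that mkpprcpath always takes the FULL set of paths to retain.

def mkpprcpath_args(remote_wwnn, src_lss, tgt_lss, paths, drop=(), consistgrp=True):
    """Return the mkpprcpath command keeping every path not in 'drop'.

    paths and drop are iterables of 'srcport:tgtport' strings.
    """
    keep = [p for p in paths if p not in set(drop)]
    if not keep:
        raise ValueError("at least one path must remain")
    cmd = ["mkpprcpath", "-remotewwnn", remote_wwnn,
           "-srclss", src_lss, "-tgtlss", tgt_lss]
    if consistgrp:
        cmd.append("-consistgrp")  # preserve consistency group behavior
    cmd.extend(keep)
    return " ".join(cmd)

if __name__ == "__main__":
    existing = ["i0102:i0137", "i0201:i0237", "i0331:i0337"]
    # Drop I0331:I0337, as in Example 17-19
    print(mkpprcpath_args("5005076303FFD18E", "40", "50",
                          existing, drop=["i0331:i0337"]))
```

The generated command matches the one issued in Example 17-19 for LSS pair 40:50.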
Planned outages
The planned outage procedures rely on two facts:
- The Metro Mirror source and target volumes are in a consistent and current state.
- Both DS8000s are functional and reachable.

By combining the Metro Mirror initialization modes, you can swap sites without any full copy operation.
Unplanned outages
In contrast to the assumptions for planned outages, the situation in a disaster is more difficult:
- In an unplanned outage situation, only the DS8000 at the recovery site is functioning. The production site DS8000 might be lost or unreachable.
- In an unplanned situation, volumes at the production and recovery site might be in different states.
As opposed to a planned situation, where you can stop all I/Os at the production site so that all volumes at the recovery site reach a consistent state, this action cannot be done in an unplanned situation. If you are not using consistency groups, in the case of, for example, a power failure, you can assume consistency only at the level of a single volume pair, not at the application level.

In either case (planned or unplanned), you typically perform four major steps to switch production to the recovery site and back to the original production site after the problem is repaired:
1. Perform a failover to make the secondary volumes on the recovery site available to the host to continue operation or restart the host.
2. Perform a failback to reestablish the mirroring in the opposite direction, from the recovery site to the original production site, after it is repaired. This action is typically done for two reasons:
   - To reestablish mirroring as a data protection and business continuity measure
   - As a preparation to switch back to production in a controlled manner
3. Perform a failover again to make the volumes on the original production site available to the host, so that the host can restart operations.
4. Perform a failback again to complete the site switch by turning around the mirroring direction and starting to synchronize the data.
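The four steps above can be sketched as the ordered dscli commands they map to, using the volume ranges and storage image IDs from this chapter's examples (IBM.2107-75TV181 for DS8000 #1 and IBM.2107-75ACV21 for DS8000 #2). This is a planning sketch only; nothing here connects to a real DS HMC.

```python
# Sketch: the failover/failback round trip for a site switch, expressed
# as the dscli commands from this chapter's examples. Volume pair ranges
# are written source:target for the direction each command establishes.

A_TO_B = "4000-4001:5000-5001 4100-4101:5100-5101"   # A volumes as source
B_TO_A = "5000-5001:4000-4001 5100-5101:4100-4101"   # B volumes as source

def site_switch_plan():
    """Return (target DS HMC, dscli command) tuples for a full site switch."""
    return [
        # 1. Failover at the recovery site: B volumes become suspended sources.
        ("DS8000 #2", f"failoverpprc -remotedev IBM.2107-75TV181 -type mmir {B_TO_A}"),
        # 2. Failback B -> A: copy the tracks changed at the recovery site.
        ("DS8000 #2", f"failbackpprc -remotedev IBM.2107-75TV181 -type mmir {B_TO_A}"),
        # 3. Failover at the production site: A volumes become suspended sources.
        ("DS8000 #1", f"failoverpprc -remotedev IBM.2107-75ACV21 -type mmir {A_TO_B}"),
        # 4. Failback A -> B: turn the mirror around and resynchronize.
        ("DS8000 #1", f"failbackpprc -remotedev IBM.2107-75ACV21 -type mmir {A_TO_B}"),
    ]

if __name__ == "__main__":
    for hmc, cmd in site_switch_plan():
        print(f"{hmc}: dscli> {cmd}")
```

Steps 1 and 2 are issued to the HMC of DS8000 #2, steps 3 and 4 to the HMC of DS8000 #1, matching Examples 17-22, 17-25, 17-28, and 17-30.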
Figure 17-3 Metro Mirror environment for the site switch example (DS8000 #1 holds the source volumes 4000 and 4001 in LSS 40 and 4100 and 4101 in LSS 41; DS8000 #2 holds the target volumes 5000 and 5001 in LSS 50 and 5100 and 5101 in LSS 51)
A planned site switch using the Metro Mirror failover function involves the following steps. If the site switch is because of an unplanned outage, the procedure starts from step 4 on page 194:
1. When the planned outage window is reached, the applications at the production site (A) must be quiesced to cease all write I/O activity, so that there are no more updates to the source volumes. Depending on the host operating system, it might be necessary to unmount the source volumes.
2. Ensure that all Metro Mirror pairs are in the Full Duplex state. It is better to check on both sites, on DS8000 #1 and on DS8000 #2. To do this task, run lspprc against each DS HMC (see Example 17-20).
Example 17-20 Check the Metro Mirror state at the production site and the recovery site
<< DS8000#1 >>
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
4000:5000 Full Duplex Metro Mirror 40 60 Disabled Invalid
4001:5001 Full Duplex Metro Mirror 40 60 Disabled Invalid
4100:5100 Full Duplex Metro Mirror 41 60 Disabled Invalid
4101:5101 Full Duplex Metro Mirror 41 60 Disabled Invalid
<< DS8000#2 >>
dscli> lspprc 5000-5101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
4000:5000 Target Full Duplex Metro Mirror 40 unknown Disabled Invalid
4001:5001 Target Full Duplex Metro Mirror 40 unknown Disabled Invalid
4100:5100 Target Full Duplex Metro Mirror 41 unknown Disabled Invalid
4101:5101 Target Full Duplex Metro Mirror 41 unknown Disabled Invalid
3. You can submit the freezepprc command to ensure that no data can possibly be transferred to the target volumes (B volumes). You can then submit the unfreezepprc command to release the source volumes again and allow I/O from the host. Because the volume state is Suspended, no data is sent to the target volumes anymore (see Example 17-21).

LSS level command: The freezepprc command is an LSS level command, which means that all Remote Mirror and Copy pairs (Metro Mirror and Global Copy) in the particular LSS are affected by this command. This command also removes the logical paths between the LSS pair.
Example 17-21 freezepprc and unfreezepprc
dscli> freezepprc -remotedev IBM.2107-75ACV21 40:50 41:51
CMUC00161I freezepprc: Remote Mirror and Copy consistency group 40:50 successfully created.
CMUC00161I freezepprc: Remote Mirror and Copy consistency group 41:51 successfully created.
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
================================================================================================
4000:5000 Suspended Freeze Metro Mirror 40 60 Disabled Invalid
4001:5001 Suspended Freeze Metro Mirror 40 60 Disabled Invalid
4100:5100 Suspended Freeze Metro Mirror 41 60 Disabled Invalid
4101:5101 Suspended Freeze Metro Mirror 41 60 Disabled Invalid
dscli> lspprcpath 40-41
Src Tgt State SS Port Attached Port Tgt WWNN
=======================================================
40 50 Failed FF50 5005076303FFD18E
41 51 Failed FF51 5005076303FFD18E
dscli> unfreezepprc -remotedev IBM.2107-75ACV21 40:50 41:51
CMUC00198I unfreezepprc: Remote Mirror and Copy pair 40:50 successfully thawed.
CMUC00198I unfreezepprc: Remote Mirror and Copy pair 41:51 successfully thawed.
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
================================================================================================
4000:5000 Suspended Freeze Metro Mirror 40 60 Disabled Invalid
4001:5001 Suspended Freeze Metro Mirror 40 60 Disabled Invalid
4100:5100 Suspended Freeze Metro Mirror 41 60 Disabled Invalid
4101:5101 Suspended Freeze Metro Mirror 41 60 Disabled Invalid
4. Submit a failoverpprc command to the HMC connected to DS8000 #2. You must specify the volume pairs according to the roles the volumes have after the command completes. In our example, you must specify the B volumes as the source volumes. After the failoverpprc command runs successfully, the B volumes become the new source volumes in the Suspended state (see Example 17-22). The state of the A volumes is preserved.

Disconnecting the physical links: If there is an unplanned outage, before you run failoverpprc, you might consider disconnecting the physical links between the production and the recovery sites, which ensures that no unexpected data transfer to the recovery site occurs at all.
Example 17-22 failoverpprc command
dscli> lspprc 5000-5101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
4000:5000 Target Full Duplex Metro Mirror 40 unknown Disabled Invalid
4001:5001 Target Full Duplex Metro Mirror 40 unknown Disabled Invalid
4100:5100 Target Full Duplex Metro Mirror 41 unknown Disabled Invalid
4101:5101 Target Full Duplex Metro Mirror 41 unknown Disabled Invalid
dscli> failoverpprc -remotedev IBM.2107-75TV181 -type mmir 5000-5001:4000-4001 5100-5101:4100-4101
CMUC00196I failoverpprc: Remote Mirror and Copy pair 5000:4000 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 5001:4001 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 5100:4100 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 5101:4101 successfully reversed.
dscli> lspprc 5000-5101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=====================================================================================================
5000:4000 Suspended Host Source Metro Mirror 50 60 Disabled Invalid
5001:4001 Suspended Host Source Metro Mirror 50 60 Disabled Invalid
5100:4100 Suspended Host Source Metro Mirror 51 60 Disabled Invalid
5101:4101 Suspended Host Source Metro Mirror 51 60 Disabled Invalid
State column: The lspprc command shows Target in the State column only for a target volume. In the case of a source volume, there is no indication.

5. Create paths in the direction from the recovery site to the production site (B to A) (see Example 17-23). You must issue the mkpprcpath command to the DS HMC connected to DS8000 #2. Although it is not strictly necessary to reverse the paths, do it so that you have a well-defined situation at the end of the procedure. Additionally, you need the paths to transfer the updates back to the production site.
Example 17-23 Create paths from B to A
dscli> mkpprcpath -remotewwnn 500507630AFFC29F -srclss 50 -tgtlss 40 i0137:i0102 i0237:i0201
CMUC00149I mkpprcpath: Remote Mirror and Copy path 50:40 successfully established.
dscli> mkpprcpath -remotewwnn 500507630AFFC29F -srclss 51 -tgtlss 41 i0137:i0102 i0237:i0201
CMUC00149I mkpprcpath: Remote Mirror and Copy path 51:41 successfully established.
dscli> lspprcpath 50-51
Src Tgt State SS Port Attached Port Tgt WWNN
=========================================================
50 40 Success FF40 I0137 I0102 500507630AFFC29F
50 40 Success FF40 I0237 I0201 500507630AFFC29F
51 41 Success FF41 I0137 I0102 500507630AFFC29F
51 41 Success FF41 I0237 I0201 500507630AFFC29F
6. Depending on your operating system, it might be necessary to rescan Fibre Channel devices (to remove device objects and recognize the new sources) and mount the new source volumes (B volumes). Start all applications at the recovery site (B). Now that the applications have started, Metro Mirror starts tracking the updated data on the new source volumes at B.
dscli> lspprcpath -fullid 50-51
Src Tgt State SS Port Attached Port Tgt WWNN
===============================================================================
IBM.2107-75ACV21/50 IBM.2107-75TV181/40 Success FF40 IBM.2107-75ACV21/I0137 IBM.2107-75TV181/I0102 500507630AFFC29F
IBM.2107-75ACV21/50 IBM.2107-75TV181/40 Success FF40 IBM.2107-75ACV21/I0237 IBM.2107-75TV181/I0201 500507630AFFC29F
IBM.2107-75ACV21/51 IBM.2107-75TV181/41 Success FF41 IBM.2107-75ACV21/I0137 IBM.2107-75TV181/I0102 500507630AFFC29F
IBM.2107-75ACV21/51 IBM.2107-75TV181/41 Success FF41 IBM.2107-75ACV21/I0237 IBM.2107-75TV181/I0201 500507630AFFC29F

2. If you did not reverse the paths before (in step 5 on page 195 of the previous procedure), you must now establish paths from the recovery site to the production site before you run failbackpprc.
3. Run failbackpprc from the recovery site to the production site. You must submit this command to the DS HMC connected to DS8000 #2. The failbackpprc command copies all the modified tracks from the B volumes to the A volumes (see Example 17-25).
Example 17-25 failbackpprc command
<< DS8000#2 >>
dscli> lspprc 5000-5001 5100-5101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=====================================================================================================
5000:4000 Suspended Host Source Metro Mirror 50 60 Disabled Invalid
5001:4001 Suspended Host Source Metro Mirror 50 60 Disabled Invalid
5100:4100 Suspended Host Source Metro Mirror 51 60 Disabled Invalid
5101:4101 Suspended Host Source Metro Mirror 51 60 Disabled Invalid
dscli> failbackpprc -remotedev IBM.2107-75TV181 -type mmir 5000-5001:4000-4001 5100-5101:4100-4101
CMUC00197I failbackpprc: Remote Mirror and Copy pair 5000:4000 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 5001:4001 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 5100:4100 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 5101:4101 successfully failed back.
dscli> lspprc 5000-5001 5100-5101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
5000:4000 Full Duplex Metro Mirror 50 60 Disabled Invalid
5001:4001 Full Duplex Metro Mirror 50 60 Disabled Invalid
5100:4100 Full Duplex Metro Mirror 51 60 Disabled Invalid
5101:4101 Full Duplex Metro Mirror 51 60 Disabled Invalid
<< DS8000#1 >>
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
5000:4000 Target Full Duplex Metro Mirror 50 unknown Disabled Invalid
5001:4001 Target Full Duplex Metro Mirror 50 unknown Disabled Invalid
5100:4100 Target Full Duplex Metro Mirror 51 unknown Disabled Invalid
5101:4101 Target Full Duplex Metro Mirror 51 unknown Disabled Invalid
You must run failbackpprc according to the roles the volumes have after the command completes. In our example, you must specify the B volumes as the source volumes and the A volumes as the target volumes. After the failbackpprc command runs successfully, the B volumes become source volumes in the Copy Pending state, and the A volumes become target volumes in the Target Copy Pending state. After all changes are transferred back to the A volumes, the states change to Full Duplex and Target Full Duplex.

Note: When you run failbackpprc, if you specify the A volume as the source and the B volume as the target, and you issue the command to the DS HMC connected to DS8000 #1, then data is copied from A to B. This command overwrites any changes on the B volume that might have occurred in the meantime.
However, if the server that set the SCSI persistent reserve on the A volumes is no longer up for whatever reason, there is a -resetreserve parameter for the failbackpprc command. This parameter resets the reserved status so that the operation can complete. After a planned site switch, you must not use this parameter because the server at the production site still owns the A volume, and might be using it, while the failback operation suddenly changes the contents of the volume. This situation might cause corruption in the server's file system.
2. Before you return to normal operation, the applications, which are still updating the B volumes at the recovery site, must be quiesced to cease all write I/O to the B volumes. Depending on the host operating system, it might be necessary to unmount the B volumes.
3. Run one more failoverpprc command. This time, you must specify the A volumes as the source volumes and the B volumes as the target volumes. You must submit this command to the DS HMC connected to DS8000 #1. This operation changes the state of the A volumes from Target Full Duplex to (source) Suspended. The state of the B volumes is preserved (see Example 17-28).
Example 17-28 Running failoverpprc to convert A volumes to source Suspended
<< DS8000#1 >>
dscli> failoverpprc -remotedev IBM.2107-75ACV21 -type mmir 4000-4001:5000-5001 4100-4101:5100-5101
CMUC00196I failoverpprc: Remote Mirror and Copy pair 4000:5000 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 4001:5001 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 4100:5100 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 4101:5101 successfully reversed.
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=====================================================================================================
4000:5000 Suspended Host Source Metro Mirror 40 60 Disabled Invalid
4001:5001 Suspended Host Source Metro Mirror 40 60 Disabled Invalid
4100:5100 Suspended Host Source Metro Mirror 41 60 Disabled Invalid
4101:5101 Suspended Host Source Metro Mirror 41 60 Disabled Invalid
<< DS8000#2 >>
dscli> lspprc 5000-5001 5100-5101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
5000:4000 Full Duplex Metro Mirror 50 60 Disabled Invalid
5001:4001 Full Duplex Metro Mirror 50 60 Disabled Invalid
5100:4100 Full Duplex Metro Mirror 51 60 Disabled Invalid
5101:4101 Full Duplex Metro Mirror 51 60 Disabled Invalid
4. Define paths in the direction from the production site to the recovery site (A to B) (see Example 17-29). You must create the paths if you ran the freezepprc command in the optional step 3 on page 195 of the production to recovery site switchover procedure, because freezepprc removed the paths.
Example 17-29 Create Metro Mirror paths from A to B
dscli> lspprcpath 40-41
Src Tgt State SS Port Attached Port Tgt WWNN
=======================================================
40 50 Failed FF50 5005076303FFD18E
41 51 Failed FF51 5005076303FFD18E
dscli> mkpprcpath -remotewwnn 5005076303FFD18E -srclss 40 -tgtlss 50 -consistgrp i0102:i0137 i0201:i0237
CMUC00149I mkpprcpath: Remote Mirror and Copy path 40:50 successfully established.
dscli> mkpprcpath -remotewwnn 5005076303FFD18E -srclss 41 -tgtlss 51 -consistgrp i0102:i0137 i0201:i0237
CMUC00149I mkpprcpath: Remote Mirror and Copy path 41:51 successfully established.
dscli> lspprcpath 40-41
Src Tgt State SS Port Attached Port Tgt WWNN
=========================================================
40 50 Success FF50 I0102 I0137 5005076303FFD18E
40 50 Success FF50 I0201 I0237 5005076303FFD18E
41 51 Success FF51 I0102 I0137 5005076303FFD18E
41 51 Success FF51 I0201 I0237 5005076303FFD18E

5. Run another failbackpprc command. This time, you must specify the A volumes as the source volumes and the B volumes as the target volumes. You must submit this command to the DS HMC connected to DS8000 #1. After the failbackpprc command runs successfully, the A volumes become source volumes in the Copy Pending state and the B volumes become target volumes in the Target Copy Pending state (see Example 17-30).
Example 17-30 Running failbackpprc to restart the A to B Metro Mirror operation
dscli> failbackpprc -remotedev IBM.2107-75ACV21 -type mmir 4000-4001:5000-5001 4100-4101:5100-5101
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4000:5000 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4001:5001 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4100:5100 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4101:5101 successfully failed back.
6. Wait until the Metro Mirror pairs are synchronized. Normally, this operation does not take much time because no data transfer is necessary. After the Metro Mirror pairs are synchronized, the state of the source volumes (A) becomes Full Duplex and the state of the target volumes (B) becomes Target Full Duplex (see Example 17-31).
Example 17-31 After the Metro Mirror pairs are synchronized
<< DS8000#1 >>
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
4000:5000 Full Duplex Metro Mirror 40 60 Disabled Invalid
4001:5001 Full Duplex Metro Mirror 40 60 Disabled Invalid
4100:5100 Full Duplex Metro Mirror 41 60 Disabled Invalid
4101:5101 Full Duplex Metro Mirror 41 60 Disabled Invalid
<< DS8000#2 >>
dscli> lspprc 5000-5001 5100-5101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
4000:5000 Target Full Duplex Metro Mirror 40 unknown Disabled Invalid
4001:5001 Target Full Duplex Metro Mirror 40 unknown Disabled Invalid
4100:5100 Target Full Duplex Metro Mirror 41 unknown Disabled Invalid
4101:5101 Target Full Duplex Metro Mirror 41 unknown Disabled Invalid
7. Depending on your operating system, it might be necessary to rescan the Fibre Channel devices and mount the new source volumes (A) at the production site. Start all applications at the production site and check for consistency. Now that the applications are started, all write I/Os to the source volumes (A) are tracked by Metro Mirror. Verify the integrity of the applications.
8. Finally, you can remove the paths from the recovery site to the production site LSSs, depending on your requirements.
The Consistency Group option is set at the LSS level with the chlss command. Example 17-32 shows the pprcconsistgrp attribute of LSS 40 and 41 before and after the option is enabled.

Example 17-32 Enabling the Consistency Group option on the LSSs
dscli> showlss 40
ID Group addrgrp stgtype confgvols subsys pprcconsistgrp xtndlbztimout resgrp
=============================================================================
40 0 4 fb 2 0xFF40 Disabled 60 secs RG0
dscli> showlss 41
ID Group addrgrp stgtype confgvols subsys pprcconsistgrp xtndlbztimout resgrp
=============================================================================
41 1 4 fb 2 0xFF41 Disabled 60 secs RG0
dscli> chlss -pprcconsistgrp enable 40-41
CMUC00029I chlss: LSS 41 successfully modified.
CMUC00029I chlss: LSS 40 successfully modified.
dscli> showlss 40
ID Group addrgrp stgtype confgvols subsys pprcconsistgrp xtndlbztimout resgrp
=============================================================================
40 0 4 fb 2 0xFF40 Enabled 60 secs RG0
dscli> showlss 41
ID Group addrgrp stgtype confgvols subsys pprcconsistgrp xtndlbztimout resgrp
=============================================================================
41 1 4 fb 2 0xFF41 Enabled 60 secs RG0
When the DS8000 detects a condition where it cannot update a Metro Mirror target volume, it sends an SNMP alert. At that moment, if an automation procedure is in place, the SNMP alert triggers the automation procedure, which runs a freezepprc command (see Example 17-33).
Example 17-33 The results of the freezepprc command
dscli> freezepprc -remotedev IBM.2107-75ACV21 40:50 41:51
CMUC00161I freezepprc: Remote Mirror and Copy consistency group 40:50 successfully created.
CMUC00161I freezepprc: Remote Mirror and Copy consistency group 41:51 successfully created.
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
================================================================================================
4000:5000 Suspended Freeze Metro Mirror 40 60 Disabled Invalid
4001:5001 Suspended Freeze Metro Mirror 40 60 Disabled Invalid
4100:5100 Suspended Freeze Metro Mirror 41 60 Disabled Invalid
4101:5101 Suspended Freeze Metro Mirror 41 60 Disabled Invalid
dscli> lspprcpath 40-41
Src Tgt State SS Port Attached Port Tgt WWNN
=======================================================
40 50 Failed FF50 5005076303FFD18E
41 51 Failed FF51 5005076303FFD18E

With the freezepprc command, the DS8000 holds the I/O activity to the addressed LSSs by putting the source volumes in a queue-full state for a time period. Example 17-34 shows, for an AIX environment, what the iostat command reports during this time interval.
Example 17-34 AIX iostat command output report during the queue full condition
# lsvpcfg
vpath8 (Avail pv ) 75TV1814000 = hdisk18 (Avail ) hdisk22 (Avail )
vpath9 (Avail pv ) 75TV1814001 = hdisk19 (Avail ) hdisk23 (Avail )
vpath10 (Avail pv ) 75TV1814100 = hdisk20 (Avail ) hdisk24 (Avail )
vpath11 (Avail pv ) 75TV1814101 = hdisk21 (Avail ) hdisk25 (Avail )
# iostat -d vpath8 vpath9 vpath10 vpath11 1
Disks: % tm_act Kbps tps Kb_read Kb_wrtn
hdisk18 0.0 0.0 0.0 0 0
hdisk19 100.0 0.0 0.0 0 0
hdisk21 0.0 0.0 0.0 0 0
hdisk20 0.0 0.0 0.0 0 0
hdisk25 0.0 0.0 0.0 0 0
hdisk22 100.0 0.0 0.0 0 0
hdisk23 0.0 0.0 0.0 0 0
hdisk24 0.0 0.0 0.0 0 0
The freezepprc command: In addition to holding (freezing) the I/O activity, the freezepprc command also removes the paths between the affected LSSs.

After the freezepprc command completes for all related LSS pairs, you have consistent data at the recovery site. Therefore, the automation procedure can run the unfreezepprc command to release (thaw) the I/O that was on hold (frozen) on the affected source LSSs. Example 17-35 shows the unfreezepprc command that thaws the frozen I/O queues on the LSS pairs 40:50 and 41:51.
Example 17-35 The unfreezepprc command
dscli> unfreezepprc -remotedev IBM.2107-75ACV21 40:50 41:51
CMUC00198I unfreezepprc: Remote Mirror and Copy pair 40:50 successfully thawed.
CMUC00198I unfreezepprc: Remote Mirror and Copy pair 41:51 successfully thawed.

If the data could not be replicated because of a link failure while the production site kept running, Metro Mirror processing can resume after the links are recovered. However, if the automation was triggered and freezepprc ran, the Metro Mirror paths must be defined again, because the freezepprc command removes the paths between the affected LSSs.
After the paths are re-established, run resumepprc to resynchronize the Metro Mirror pairs. Example 17-36 shows this scenario.
Example 17-36 Resume the Metro Mirror environment
dscli> mkpprcpath -remotewwnn 5005076303FFD18E -srclss 40 -tgtlss 50 -consistgrp i0102:i0137 i0201:i0237
CMUC00149I mkpprcpath: Remote Mirror and Copy path 40:50 successfully established.
dscli> mkpprcpath -remotewwnn 5005076303FFD18E -srclss 41 -tgtlss 51 -consistgrp i0102:i0137 i0201:i0237
CMUC00149I mkpprcpath: Remote Mirror and Copy path 41:51 successfully established.
dscli> resumepprc -remotedev IBM.2107-75ACV21 -type mmir 4000-4001:5000-5001 4100-4101:5100-5101
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 4000:5000 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 4001:5001 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 4100:5100 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 4101:5101 relationship successfully resumed. This message is being returned before the copy completes.
dscli> lspprc 4000-4001 4100-4101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
4000:5000 Full Duplex Metro Mirror 40 60 Disabled Invalid
4001:5001 Full Duplex Metro Mirror 40 60 Disabled Invalid
4100:5100 Full Duplex Metro Mirror 41 60 Disabled Invalid
4101:5101 Full Duplex Metro Mirror 41 60 Disabled Invalid
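The freeze, thaw, and later resynchronization sequence that such an automation procedure issues can be sketched as follows. This is a hypothetical illustration, not IBM automation code: it only generates the dscli command strings for the LSS pairs from this chapter, and the mkpprcpath port arguments are deliberately elided because they depend on your configuration.

```python
# Hedged sketch of the freeze/unfreeze automation described above: on a
# mirroring failure, freeze every Metro Mirror LSS pair to preserve
# consistency, then thaw the held I/O. After the links are repaired, the
# paths must be re-created (freezepprc removed them) and the pairs resumed.

LSS_PAIRS = ["40:50", "41:51"]
REMOTE = "IBM.2107-75ACV21"

def freeze_thaw_sequence(lss_pairs, remotedev):
    """Commands an automation procedure would issue on a mirroring failure."""
    pairs = " ".join(lss_pairs)
    return [
        f"freezepprc -remotedev {remotedev} {pairs}",    # suspend pairs, drop paths
        f"unfreezepprc -remotedev {remotedev} {pairs}",  # release the queue-full hold
    ]

def recovery_sequence(lss_pairs, remotedev, volume_pairs):
    """Commands to resynchronize after the links are repaired (paths first)."""
    cmds = [f"mkpprcpath ... -srclss {s} -tgtlss {t} -consistgrp ..."  # ports elided
            for s, t in (p.split(":") for p in lss_pairs)]
    cmds.append(f"resumepprc -remotedev {remotedev} -type mmir {' '.join(volume_pairs)}")
    return cmds
```

The resulting order matches Examples 17-33 through 17-36: freeze, thaw, re-create the paths, then resumepprc.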
Data consistency: While you do the resynchronization, the volume pairs are in a Copy Pending state. During this time, there is no data consistency in the target volumes. Therefore, you might want to take specific action to keep data consistency at the recovery site while you resume the Metro Mirror pairs. Completing a FlashCopy at the recovery site first is one way to accomplish this task.
Because there are no paths to the DS8000 named DS8k05_3Tier for LSS 40, you must configure them. In our example, we want to configure two paths from the DS8000 named DS8K-ATS - ATS_04 LSS 40 to LSS 50 on DS8k05_3Tier. Select the correct Storage Image and click Action → Create. In the window that opens, specify an LSS on the source and on the target side. From the auto-populated I/O port list, select an I/O port on both the source and target side. Select the Define as consistency group check box if you want to manage the path as part of a consistency group. Click Add to move the definition to the Create Mirroring Connectivity Verification field. Repeat the previous steps for any additional logical path you want to add (see Figure 17-5). When finished, click Create.
After the paths are created, you can verify the results by selecting LSS 40 from the drop-down menu (see Figure 17-6).
The Manage LSSs window opens (Figure 17-8). Scroll down until you see the LSS that you want to modify, click it once to highlight it, and then click Action → Properties. In the Single LSS properties input window, you can change the options of the LSS, for example, select the Consistency Group Enabled check box or enter a new Long Busy Timeout Value. For more information about these options, see Chapter 15, Metro Mirror operation and configuration on page 153.
The Delete Mirroring Connectivity window opens and shows a summary of the action to take. You can also select the Delete mirroring connectivity even if volume pairs exist check box, which we do not do in our example (Figure 17-10).
Deleting paths: Deleting paths reduces bandwidth. For more information about this topic, see Chapter 16, Metro Mirror implementation considerations on page 171.

After you click Delete, the deletion is completed. The Mirroring Connectivity window opens again, where you can confirm that the path is no longer there.
To start creating volume pairs, in the left navigation pane of the DS GUI, click Copy Services → Metro Mirror/Global Copy. The Metro Mirror/Global Copy window opens (Figure 17-11). You can filter the list of existing mirroring relationships by LSS, Host, Volume Group, Storage Type, and PPRC Type for a better overview.
Figure 17-11 Create Metro Mirror pair - Metro Mirror/Global Copy window
In the Metro Mirror/Global Copy window, click Action → Create Metro Mirror to display the Create Metro Mirror window. Here you must first select the source and target storage systems, and then the volume type of the source volume (FB or CKD); the volume type of the target volume adapts to the source automatically. You can filter the list by Host, LSS, Storage Allocation Method, or Volume Group. In this example, the list is filtered by a volume group called ITSO_MM. Now, you can select a single pair by clicking a source and a target volume, or you can select multiple pairs by holding the Ctrl key while you click the volumes. Note the Showing 4 items | Selected 4 items indicator at the bottom of the volume list. Click Add to move the definitions to the Create Metro Mirror Verification box at the bottom. The DS GUI automatically determines the relationships for you (Figure 17-12).
You might have to increase the size of the Create Metro Mirror window by clicking its lower right corner and dragging it to show the entire content. If no paths are defined yet, you can create them by clicking Create mirroring connectivity to open the corresponding window. Then, follow the steps that are described in 17.3.1, Establishing paths with the DS GUI on page 204.
Before you click the Create button, you may want to open the Create Metro Mirror Options window by clicking Advanced first. Here you have different choices about how the Metro Mirror relationship should be set up (Figure 17-13).
For a description of these options, see Creating Metro Mirror pairs on page 184. Click OK to save the options and then click Create to establish the mirroring relationships.
The Metro Mirror/Global Copy window opens again. You may have to filter again by LSS, volume group, or other criteria. This time, it shows the list of volume pairs created. You can see that the state of the volumes is Copy Pending, which indicates that the initial copy from the source to the target volumes is still in progress (Figure 17-14).
Figure 17-14 Create Metro Mirror pair - Volume pairs that are created and in the Copy Pending state
After a period that depends on the size and number of volumes, all volumes are synchronized (in-sync). Then, the State column of the Metro Mirror window shows Full Duplex for the volumes. Click Refresh to see whether the status changed.

Response times: With high workloads, a high number of established Metro Mirror volumes, or when you share ports for Metro Mirror and host attachments, you might see higher response times for your applications during this synchronization time.
To suspend volume pairs, in the left navigation pane of the DS GUI, click Copy Services → Metro Mirror/Global Copy. The Metro Mirror/Global Copy window opens. Filter and select the mirroring relationships that you want to suspend and click Action → Suspend to open the Suspend Metro Mirror/Global Copy window. Here you can choose whether to suspend the volume pairs at the source or at the target (Figure 17-15).
Click Suspend to carry out the command. The Metro Mirror/Global Copy window opens again. The state of the mirror relationships is Suspended (see Figure 17-16).
To resume volume pairs, in the left navigation pane of the DS GUI, click Copy Services → Metro Mirror/Global Copy. The Metro Mirror/Global Copy window opens. Filter and select the mirroring relationships that you want to re-establish and click Action → Resume to open the Resume Metro Mirror/Global Copy window. You might want to open the Resume Options window by clicking Advanced (see Figure 17-17).
For a description of these options, see Creating Metro Mirror paths on page 183. Click OK to save the options and then click Resume to re-establish the mirroring relationships. The Metro Mirror/Global Copy window opens again and you see the Copy Pending status. After all the changes are transferred from the source to the target volumes, the status changes to Full Duplex, although you might have to click Refresh to see this change.
Volume access: If you try to access this volume on the same server where the previous source was or still is, an error might occur because some operating systems have problems with disks that have the same serial number or signature. In most cases, there is a procedure to handle this situation (see Appendix A, Open Systems specifics on page 705).

In our example, we assume that the production site failed. We decide to start production processing using the backup servers at the recovery site. You must start using the Metro Mirror target volumes (B). These volumes must become source volumes because you are writing to them, and these changes must later be mirrored back to the production site.

To fail over volume pairs, in the left navigation pane of the DS GUI, click Copy Services → Metro Mirror/Global Copy. In the Metro Mirror/Global Copy window, filter the list to display the mirroring relationships of interest. Click the mirroring relationships once to select them for failover. Click Action → Recovery Failover... to open the Recovery failover window. You might want to open the Recovery failover options window by clicking Advanced Options (see Figure 17-18).
For a description of these options, see Creating Metro Mirror pairs on page 184. Click OK to save the options, and then run the failover.
The failover operation now takes place. Figure 17-19 shows the result: the status of the B volumes is Suspended. The A volumes are still in the Full Duplex state because the Metro Mirror failover function does not consider the state of the former source volumes (A). You can now start the servers at the backup site, and they can access the B volumes.
In the Metro Mirror/Global Copy window, filter the list to display the mirroring relationships of interest. Click the mirroring relationships once to select them for failback. Click Action → Recovery Failback... to open the Recovery failback window (see Figure 17-20). You might want to open the Recovery failback options window by clicking Advanced Options.
During the failback, changes are copied. You can click Refresh to see if the copy is finished. When the copy completes, the volumes at the backup site (B) show as Full Duplex source volumes (see Figure 17-21).
Failback failure: If the failback fails, check whether the previously active server can still access the new target volume or still holds a reserve on it, and verify that the Metro Mirror paths are established between these LSSs.
If you view the statuses of the volumes at the production site, they appear as Target Full Duplex (see Figure 17-22).
Now, shut down servers or disconnect them from volumes at the backup site. After they are shut down, no more changes are written to the volumes at the backup site. Repeat the failover and failback process, but do so from the production site storage unit (A).
In Figure 17-23, you see the confirmation window of the selected volumes on the production storage unit where you perform a failover.
When the failover is complete, the volumes on the production site show as Suspended Metro Mirror source volumes. Now, perform a failback using the production site storage unit (see Figure 17-24).
When the final failback is complete, the production site storage unit is now the source again and the backup site storage unit is now the target again, which means the production site storage unit shows the volumes in the Full Duplex state (Figure 17-25).
At the recovery site, the volumes are back in the Target Full Duplex state (see Figure 17-26).
Part 5
Global Copy
This part describes the IBM System Storage DS8000 Global Copy feature. After an overview of Global Copy, the chapters in this part describe the available options and interfaces, discuss configuration considerations, and provide usage examples.
Chapter 18.
With Global Copy, write operations complete on the primary storage system before they are received by the remote storage system. This capability prevents the primary system's performance from being affected by wait time from writes on the secondary system. Therefore, the primary and secondary copies can be separated by any distance. Figure 18-1 illustrates how Global Copy operates:
1. The host server makes a write I/O to the primary DS8000. The write is staged through cache and non-volatile storage (NVS).
2. The write returns as completed to the host server's application.
3. Some moments later, that is, asynchronously, the primary DS8000 transmits the data so that the updates are reflected on the secondary volumes. The updates are grouped in batches for efficient transmission.
4. The secondary DS8000 returns write complete to the primary DS8000 when the updates are secured in the secondary DS8000 cache and NVS. The primary DS8000 then resets its Global Copy change recording information.
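As an illustration of this write flow (not the DS8000 implementation), the following Python sketch models the change-recording idea: host writes complete immediately and mark tracks in an out-of-sync bitmap, and a later asynchronous drain transmits only the marked tracks before clearing the bitmap.

```python
# Minimal model of Global Copy change recording, for illustration only:
# writes mark tracks in an out-of-sync set and complete at once; a background
# drain later copies only the marked tracks and clears the markings.

class GlobalCopyVolume:
    def __init__(self, n_tracks):
        self.primary = [0] * n_tracks
        self.secondary = [0] * n_tracks
        self.out_of_sync = set()          # change-recording "bitmap"

    def host_write(self, track, data):
        self.primary[track] = data        # steps 1-2: write completes to host
        self.out_of_sync.add(track)       # track is now out of sync

    def drain(self):
        for track in sorted(self.out_of_sync):   # steps 3-4: async transmit
            self.secondary[track] = self.primary[track]
        self.out_of_sync.clear()          # reset change recording

vol = GlobalCopyVolume(8)
vol.host_write(2, "a")
vol.host_write(5, "b")
assert vol.secondary[2] == 0              # not yet replicated: a fuzzy copy
vol.drain()
assert vol.secondary == vol.primary       # identical only after the drain
```

The two assertions show why a running Global Copy secondary is called a fuzzy copy: it is identical to the primary only at the moments when the drain has caught up.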
Algorithms: The efficient extended distance mirroring technique of Global Copy is achieved with sophisticated algorithms. For example, if changed data is in the cache, then Global Copy sends only the changed sectors. There are also sophisticated queuing algorithms to schedule the processing to update the tracks for each volume and set the batches of updates to be transmitted.
Figure 18-2 Global Copy and Metro Mirror volume state change logic
Regarding the state change logic, the following considerations apply (see the arrows that are shown in Figure 18-2 on page 229):
- When you initially establish a mirror relationship from a volume in the simplex state, you can request that it becomes a Global Copy pair (establish Global Copy arrow) or a Metro Mirror pair (establish Metro Mirror arrow).
- Pairs change from the Copy Pending state to the Full Duplex state when a go-to-SYNC operation, which is a conversion from Global Copy to Metro Mirror, runs (go-to-SYNC arrow). You can also request that a pair be suspended when it reaches the Full Duplex state (go-to-SYNC and suspend arrows).
- Pairs cannot change directly from the Full Duplex state to the Copy Pending state. They must go through an intermediate Suspended state.
- You can go from the Suspended state to the Copy Pending state by doing an incremental copy (copying out-of-sync tracks only). This process is similar to the transition from the Suspended state to the Full Duplex state (Resync arrow).

ESE volumes: You can use Global Copy with thin-provisioned Extent Space Efficient (ESE) volumes with Licensed Machine Code (LMC) 6.6.20.nnn for the DS8700 and 7.6.20.nnn for the DS8800 or later. The ESE option is supported for FB volumes and you must activate the DS8000 Thin Provisioning license on your DS8000 systems. For more information about Thin Provisioning, see Part 8, Thin provisioning and Copy Services on page 543.
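The state-change rules of Figure 18-2 can be expressed as a small transition table. The following Python sketch is illustrative only: the action names are invented labels for the arrows in the figure, not dscli commands, and the table models only the transitions that the text describes.

```python
# Volume state-change logic of Figure 18-2 as a transition table.
# Action names are invented labels for the arrows; they are not dscli commands.
ALLOWED = {
    ("Simplex", "establish_gcp"): "Copy Pending",    # establish Global Copy
    ("Simplex", "establish_mmir"): "Copy Pending",   # Metro Mirror first pass
    ("Copy Pending", "go_to_sync"): "Full Duplex",   # GC-to-MM conversion
    ("Copy Pending", "suspend"): "Suspended",
    ("Full Duplex", "suspend"): "Suspended",
    ("Suspended", "resync_gcp"): "Copy Pending",     # incremental copy
    ("Suspended", "resync_mmir"): "Full Duplex",     # resync arrow
    ("Copy Pending", "terminate"): "Simplex",
    ("Full Duplex", "terminate"): "Simplex",
    ("Suspended", "terminate"): "Simplex",
}

def next_state(state, action):
    """Apply one arrow of the state diagram, rejecting illegal transitions
    (for example, Full Duplex cannot go directly back to Copy Pending)."""
    key = (state, action)
    if key not in ALLOWED:
        raise ValueError(f"{action} is not allowed from the {state} state")
    return ALLOWED[key]
```

A table like this is a convenient way for automation scripts to validate a requested operation before issuing the real command.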
Chapter 19.
Target online: You can use this option to establish a Global Copy relationship when the secondary volume is online to a host system. If this option is not specified and the secondary volume is online to a host system, the command fails.

Cascade option: Specifies whether the primary volume in the pair is also eligible to be the secondary volume in another Global Copy or Metro Mirror pair. You can use this option to set up cascading remote copy relationships.
Copy pending catch-up transition

Copy pending catch-up is the name of the transition for a Global Copy pair that goes from its normal out-of-sync condition to a fully synchronous condition, that is, a full duplex Metro Mirror pair. The pair goes from the Copy Pending state to the Full Duplex state. At the end of this transition, the primary and secondary volumes are fully synchronized, with all their respective tracks identical.
As a result of the catch-up, the Global Copy pair becomes a Metro Mirror pair. This has an impact on application response times because write I/Os in a Metro Mirror pair are written to both the primary and secondary volumes before the write is acknowledged to the host application. For this reason, a Global Copy pair is normally synchronized only for short periods, typically to make a FlashCopy, and host application updates are quiesced for the duration. For a more detailed description of this topic, see 19.2, Creating a consistent point-in-time copy on page 234. For an example of how to convert a Global Copy pair to a Metro Mirror relationship, see 21.4, Changing the copy mode from Global Copy to Metro Mirror on page 264.
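One way for a script to detect the end of a catch-up is to poll the out-of-sync track count until it reaches zero. In the sketch below, read_out_of_sync_tracks is a hypothetical stand-in for parsing the out-of-sync track counter from lspprc -l output; it returns canned sample values so that the loop can run standalone.

```python
# Hedged sketch of catch-up monitoring: poll until no tracks remain out of
# sync. The "reader" is a stand-in fed with hypothetical sample values; a real
# script would parse the Out Of Sync Tracks column of lspprc -l instead.

SAMPLES = iter([1200, 640, 88, 0])   # hypothetical drain of one volume pair

def read_out_of_sync_tracks(pair):
    # stand-in for: dscli> lspprc -l <pair>
    return next(SAMPLES)

def wait_for_catch_up(pair, max_polls=10):
    """Return the number of polls needed until the pair is fully caught up."""
    for polls in range(1, max_polls + 1):
        if read_out_of_sync_tracks(pair) == 0:
            return polls
    raise TimeoutError(f"pair {pair} did not catch up in {max_polls} polls")

polls = wait_for_catch_up("000A:808A")
```

In production you would add a sleep between polls and an operator alert on timeout; the point here is only the drain-to-zero condition that marks the Copy Pending to Full Duplex transition.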
Figure 19-1 illustrates a procedure to get a consistent point-in-time copy at the secondary site.
Figure 19-1 shows the primary and secondary sites connected through channel extenders over a non-synchronous, long-distance link. The secondary volume holds a fuzzy copy of the data; a FlashCopy to a tertiary volume provides the consistent copy. The steps annotated in the figure are: 1. Quiesce application updates; 2. Catch-up, that is, synchronize the volume pairs (go-to-sync and suspend, or wait for application writes to quiesce); 3. Build consistency on the recovery data (resync and freeze, or freeze/suspend); 4. Resume application writes as soon as the freeze is done; 5. FlashCopy; 6. Re-establish the suspended pairs (resync) so that the individual volume pairs synchronize.
Here is a more detailed description of the steps in the procedure that is shown in Figure 19-1:
1. Quiesce the application updates.
2. Synchronize the volume pairs by using one of these methods:
   - Perform the catch-up by doing a go-to-sync operation, as described in 19.1.5, Converting a Global Copy pair to Metro Mirror on page 233. The volume pair changes from the Copy Pending state to the Full Duplex state. From this moment, primary write updates are synchronously transmitted to the recovery site if the application updates were not quiesced.
   - Perform the catch-up by waiting until all application updates are transmitted to the secondary site. You can monitor the number of out-of-sync tracks with the DS CLI or with the DS GUI, as shown in Figure 20-7 on page 255.
3. Suspend the Global Copy pairs after they reach the Full Duplex state. If you use consistency groups, you can do a freeze operation. Now, you have a set of consistent secondary volumes.
4. Resume the application. Updates are not transmitted to the secondary volumes because the pairs are suspended. The secondary volumes remain consistent, and the applications do not experience any response time impact.
5. Perform a FlashCopy on the secondary volumes.
6. Resume Global Copy mode for the copy pairs.
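The six steps above can be sketched as a dscli command sequence. This is a simplified, hypothetical outline: the volume pair, FlashCopy target, and LSS IDs are invented, the -dev/-remotedev parameters are omitted for brevity, and the exact go-to-sync conversion commands are the ones described in 21.4.

```python
# Hypothetical sketch of the six-step consistency procedure as a dscli
# command sequence for one LSS pair. IDs are invented; -dev/-remotedev
# parameters are omitted, and the real go-to-sync conversion is in 21.4.
def consistent_copy_sequence(pairs="4000:5000", flash="5000:5200", lss="40:50"):
    return [
        "# 1. quiesce application writes (done outside dscli)",
        f"resumepprc -type mmir {pairs}",  # 2. go-to-sync catch-up (see 21.4)
        f"freezepprc {lss}",               # 3. consistency: suspend, paths removed
        f"unfreezepprc {lss}",             # 4. release application I/O again
        f"mkflash {flash}",                # 5. consistent tertiary copy
        "# 6. re-create paths with mkpprcpath, then resume Global Copy:",
        f"resumepprc -type gcp {pairs}",
    ]

seq = consistent_copy_sequence()
```

Note the comment at step 6: because freezepprc removes the paths between the LSSs, they must be re-created before the final resume.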
For application recovery based on point-in-time copies, you must plan for appropriate checkpoints to briefly quiesce the application and synchronize the volume pairs. When you plan the recovery of the application, remember that, while in an active Global Copy relationship, the secondary volumes always hold a current fuzzy copy of the primary volumes. So, you must keep the tertiary volumes where you did a FlashCopy of the last globally consistent catch-up. This tertiary copy does not reflect the current updates; it reflects the updates up until the last global catch-up operation.

It is not always possible to quiesce an application when the systems and applications must always be online and available to the user. If so, consider the following approaches:
- Plan the quiesce for when it has the least possible impact on the users. For example, if possible, quiesce the application on the second or third shift.
- Use a different approach than quiescing the application by completing the following steps:
  a. Perform a Consistency Group FlashCopy of all volumes that are used by the application. This action causes I/O operations to the frozen volumes to return Extended Long Busy (ELB) or I/O queue full messages for a short period to maintain the order of dependent writes, which in turn keeps the copied data consistent. For many users, this action is preferable to quiescing the application. For a complete description of this function, see 8.2, Consistency Group FlashCopy on page 61.
  b. Now you have a consistent FlashCopy of the application data at the primary site, and you can perform a Global Copy of this data to the secondary site.
  c. Consider using Incremental FlashCopy with Consistency Group FlashCopy to reduce the amount of data that is transmitted to the secondary site when you repeat the procedure. For more information about this function, see 8.4, Incremental FlashCopy: Refreshing the target volume on page 62.
  d. Perform a FlashCopy of the secondary volumes to a target (tertiary) at the secondary site. These volumes can be used if there is a disaster or a problem at the primary site.
  e. Perform a Consistency Group FlashCopy with Incremental FlashCopy again.

An example of how to set up a consistent point-in-time copy is shown in 21.6.2, Periodical backup operation on page 269.
19.3 Cascading
Cascading is a capability of the DS8000 storage systems, where the secondary volume of a Global Copy or Metro Mirror relationship is at the same time the primary volume of another Global Copy or Metro Mirror relationship. You can use cascading to perform remote copy from a primary volume through an intermediate volume to a target volume. The two volume pairs in a cascaded relationship can be either Global Copy or Metro Mirror pairs with the exception that the first pair (primary to intermediate) cannot be Global Copy if the second pair (intermediate to target) is Metro Mirror.
If both pairs are Metro Mirror, then the second pair is suspended when any write I/O is directed to the target volume. This action prevents potential performance impacts that a cascading Metro Mirror to Metro Mirror remote copy relationship might have on application write I/Os. A cascading Metro Mirror to Metro Mirror setup is intended for temporary use only, to allow the target volume to be synchronized with the intermediate volume while the applications are quiesced (possibly followed by suspending the first Metro Mirror pair) so that there is no write I/O activity on the second Metro Mirror pair.

Cascading is not limited to two copy pairs. You can create a chain of cascaded copy pairs. The first pair can be Metro Mirror while the other pairs are normally Global Copy. The two pairs can be established in any order: first and second here do not refer to the order of establishment. To minimize the amount of data to be copied on the second pairing, it is usually preferable to establish the primary-to-intermediate pair first.

Cascading can be used for migrating data from one storage system to another compatible system. If the volumes to be migrated are already mirrored between two storage systems, you can set up a cascaded Global Copy from the secondary of the current relationship to the intended target system. This way, you can perform the migration without having to remove the existing remote copy relationships. For an example of Global Copy cascading, see 21.7, Global Copy cascading on page 272.
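The pair-type restriction that is stated above can be captured in a small validity check. This is an illustrative sketch of the rule as the text describes it, not a complete model of every cascading constraint.

```python
# The cascading restriction from the text: in a two-hop cascade, the first
# pair (primary -> intermediate) cannot be Global Copy when the second pair
# (intermediate -> target) is Metro Mirror. Illustrative sketch only.

def cascade_allowed(first_pair, second_pair):
    """Pair types are 'gcp' (Global Copy) or 'mmir' (Metro Mirror)."""
    if first_pair == "gcp" and second_pair == "mmir":
        return False
    return True

assert cascade_allowed("mmir", "gcp")      # typical migration setup
assert cascade_allowed("gcp", "gcp")       # chained Global Copy
assert cascade_allowed("mmir", "mmir")     # allowed, but for temporary use only
assert not cascade_allowed("gcp", "mmir")  # the disallowed combination
```

A check like this is useful in provisioning scripts that build cascaded configurations, so an invalid combination is rejected before any mkpprc command is issued.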
19.4.1 License
Global Copy is an optional licensed function of the DS8000. You must purchase the corresponding licensed function for the primary and secondary DS8000 systems. For detailed information about DS8000 licenses, see 3.1, Licenses on page 16.
19.4.2 Interoperability
Global Copy pairs can be established only between disk subsystems of the same (or similar) type and features. For example, a DS8800 can have a Global Copy pair with a DS8300. All disk subsystems must have the appropriate licensed function feature installed. For more information, see the System Storage Interoperation Center (SSIC) at:
http://www-03.ibm.com/systems/support/storage/ssic/interoperability.wss
Fibre Channel ports: Although it is possible to share a Fibre Channel port for Global Copy and host data traffic, use different Fibre Channel ports and adapters in the DS8000 systems to avoid contention and potential host performance degradation.

Global Copy pairs are set up between volumes in LSSs, usually in different disk subsystems that are normally in separate locations. A path (or group of paths) must be defined between the source LSS and the target LSS. These logical paths are defined over physical links between the disk subsystems. The physical link includes the host adapter in the primary DS8000, the cabling, switches or directors, any wideband or long-distance transport devices (DWDM, channel extenders, WAN), and the host adapters in the remote storage system. Physical links can carry multiple Global Copy logical paths. For more detailed information about logical and physical paths, see 15.2, Metro Mirror paths and links on page 155.

Links: For Global Copy, the DS8000 supports Fibre Channel links only (no FICON links).

To facilitate testing, the DS8000 supports Global Copy primary and secondary volumes on the same DS8000.

Paths are unidirectional, that is, they are defined to operate in either one direction or the other. Global Copy is bidirectional, which means that any particular pair of LSSs can have paths defined between them that have opposite directions; each LSS holds both primary and secondary volumes from the other LSS. Furthermore, opposite direction paths can be defined on the same Fibre Channel physical link. For bandwidth and redundancy, more than one path can be created between an LSS pair. Global Copy balances the workload across the available paths between the primary and secondary LSSs.

LSSs: Remember that the LSS is not a physical construct in the DS8000; it is a logical construct. Volumes in an LSS can be on different disk arrays.
Physical links are bidirectional and can be shared by other Global Copy pairs and other remote mirror and copy functions, such as Metro Mirror and Global Mirror.

Note: In general, you should not share the FCP links used for synchronous and asynchronous remote copy functions. For more information, see 26.2.3, Considerations for host adapter usage on page 346.
Channel extenders typically work in pairs (configurations with three or more channel extenders are also possible), where network links connect the machines over very long distances, and each channel extender has FICON (used for z/OS environments only), FCP, or SCSI connections to a host server. A DS8000 connected to another DS8000 over a large distance with channel extenders uses FCP-WAN-FCP protocol encapsulation. Channel extenders usually emulate a local connection and run software that provides many facilities. Contact the channel extender vendor to learn their distance capability, line quality requirements, and WAN attachment capabilities when you use Global Copy between the primary and secondary DS8000 systems. Also contact the vendor about hardware and software prerequisites when you use their products in a DS8000 Global Copy configuration. Evaluation, qualification, approval, and support of Global Copy configurations that use channel extender products are the sole responsibility of the channel extender vendor.
The corresponding figure shows the primary site connected to the secondary site through a pair of channel extenders, with Global Copy maintaining a fuzzy copy of the data on the secondary volume and a FlashCopy to a tertiary volume providing the consistent copy.
Chapter 20.
The following DS CLI commands are used for Global Copy tasks:
lspprc - Lists the Metro Mirror and Global Copy volume relationships.
mkpprc -type gcp - Establishes the Metro Mirror and Global Copy volume relationships.
pausepprc - Suspends the Metro Mirror and Global Copy volume relationships.
resumepprc -type gcp - Resumes the Metro Mirror and Global Copy volume relationships.
rmpprc - Deletes the Metro Mirror and Global Copy volume relationships.
freezepprc - Places the primary logical subsystem (LSS) in the extended long busy state during the defined timeout, sets a queue full condition for the primary volumes, and removes the remote copy paths between the primary and secondary LSS.
unfreezepprc - Thaws an existing Remote Copy Consistency Group by resetting the queue full condition for those primary volumes where the freezepprc command was issued.
lsavailpprcport
The lsavailpprcport command displays the Fibre Channel ports that you can use to establish Global Copy paths (see Example 20-1).
Example 20-1 lsavailpprcport command
dscli> lsavailpprcport -l -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 -remotewwnn 5005076303FFC1A5 00:80
Local Port Attached Port Type Switch ID Switch Port
===================================================
I0010      I0143         FCP  NA        NA
I0031      I0102         FCP  NA        NA
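If you script path creation, the lsavailpprcport rows can be turned into the port-pair argument that mkpprcpath expects (I0010:I0143 I0031:I0102). The following sketch parses the sample rows from Example 20-1; parsing output from a real dscli session may need adjustment to its exact banner and column formatting.

```python
# Sketch: extract local:attached port pairs from lsavailpprcport output.
# The sample text reproduces the rows of Example 20-1; real output may differ.

output = """\
Local Port Attached Port Type Switch ID Switch Port
===================================================
I0010      I0143         FCP  NA        NA
I0031      I0102         FCP  NA        NA
"""

def port_pairs(text):
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        # data rows start with an I/O port ID such as I0010
        if fields and fields[0].startswith("I0"):
            pairs.append(f"{fields[0]}:{fields[1]}")
    return " ".join(pairs)

print(port_pairs(output))
```

The returned string can be appended directly to a mkpprcpath invocation, which avoids retyping port IDs by hand.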
lspprcpath
The lspprcpath command shows you the established paths from an LSS. In Example 20-2, we list the paths from LSS 00.
Example 20-2 lspprcpath command
dscli> lspprcpath -dev ibm.2107-75abtv1 00
Src Tgt State   SS   Port  Attached Port Tgt WWNN
=========================================================
00  80  Success 8000 I0010 I0143         5005076303FFC1A5
00  80  Success 8000 I0031 I0102         5005076303FFC1A5
mkpprcpath
With the mkpprcpath command, you can establish paths between two LSSs. In Example 20-3, we establish a path from 2107-75ABTV1 LSS 00 to 2107-7520781 LSS 80.
Example 20-3 mkpprcpath command
dscli> mkpprcpath -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 -remotewwnn 5005076303FFC1A5 -srclss 00 -tgtlss 80 I0010:I0143 I0031:I0102
CMUC00149I mkpprcpath: Remote Mirror and Copy path 00:80 successfully established.
rmpprcpath
The rmpprcpath command removes all paths between two LSSs (see Example 20-4).
Example 20-4 rmpprcpath command
dscli> rmpprcpath -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 00:80
CMUC00152W rmpprcpath: Are you sure you want to remove the Remote Mirror and Copy path 00:80:? [y/n]:y
CMUC00150I rmpprcpath: Remote Mirror and Copy path 00:80 successfully removed.

You can suppress the CMUC00152W confirmation message by adding the -quiet parameter to the command.
mkpprc
With the mkpprc command, you can create a Global Copy pair. To create a Global Copy pair, specify the -type gcp parameter (see Example 20-5).
Example 20-5 mkpprc command
dscli> mkpprc -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 -type gcp -mode full 000a:808a
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 000A:808A successfully created.
lspprc
The lspprc command shows you which volumes are in a Global Copy relationship (see Example 20-6).
Example 20-6 lspprc command
pausepprc
The pausepprc command suspends a Global Copy pair (see Example 20-7).
Example 20-7 pausepprc command
dscli> pausepprc -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 000a:808a
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 000A:808A relationship successfully paused.
resumepprc
With the resumepprc command you can bring a pair from Suspended to Copy Pending, as shown in Example 20-8.
Example 20-8 resumepprc command
dscli> resumepprc -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 -type gcp 000a:808a
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 000A:808A relationship successfully resumed. This message is being returned before the copy completes.
rmpprc
The rmpprc command removes the relationship from a volume pair and sets the volumes to simplex state (see Example 20-9).
Example 20-9 rmpprc command
dscli> rmpprc -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 000a:808a
CMUC00160W rmpprc: Are you sure you want to delete the Remote Mirror and Copy volume pair relationship 000A:808A:? [y/n]:y
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 000A:808A relationship successfully withdrawn.

You can suppress the CMUC00160W confirmation message by adding the -quiet parameter to the command.
freezepprc
The freezepprc command suspends all Global Copy and Metro Mirror volume pairs on a given primary and secondary LSS pair and deletes all paths between the two LSSs. It sets the primary volumes on the primary LSS to the Suspended state. The secondary volumes are left in their current state. If the paths between the LSSs are created with the -consistgrp parameter, the command sets the queue full condition on the primary volumes. This action causes the I/Os directed to the primary volume to be queued in the LSS until the queue full condition times out (the default is 60 seconds for Open Systems) or the condition is reset with the unfreezepprc command. During the long busy condition, the primary volume reports queue full to the host. The freezepprc command is also used by automation software to provide data consistency at the secondary site during planned or unplanned primary site outages.
The command in Example 20-10 suspends all volume pairs whose primary volume is on LSS 00 of IBM.2107-75ABTV1 and the secondary is on LSS 80 of IBM.2107-7520781. In addition, all paths between the two LSSs are set to a Failed state with a failed reason of System Reserved.
Example 20-10 freezepprc command
dscli> freezepprc -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 00:80
CMUC00161W freezepprc: Remote Mirror and Copy consistency group 00:80 successfully created.

Failed state: Because the freezepprc command sets the paths between the specified LSS pair into a Failed state, you must re-establish them to resume normal Global Copy operation.
unfreezepprc
With the unfreezepprc command, you can thaw an existing consistency group (see Example 20-11). It resumes host application I/O activity for primary volumes on the consistency group that was created by an earlier freezepprc command. The unfreezepprc command is used to reset the extended long busy (queue full) condition for the primary volume and release application I/O to the primary volumes. All queued writes to the primary volume are written. The status of the volumes does not change; the primary volumes remain in the suspended state as set by the freezepprc command.
Example 20-11 unfreezepprc command
dscli> unfreezepprc -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 00:80
CMUC00198I unfreezepprc: Remote Mirror and Copy pair 00:80 successfully thawed.
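Automation software typically drives this freeze and thaw sequence across every LSS pair that belongs to an application, so that all secondary volumes share one consistent point in time. A minimal sketch, assuming the device pair from the examples above and three hypothetical LSS pairs (00:80, 01:81, and 02:82 are illustrative, not from a real configuration):

```shell
# Illustrative sketch only; the LSS pair list is hypothetical.
PAIRS="00:80 01:81 02:82"
# Freeze all LSS pairs in quick succession: writes to the primaries queue up,
# so the secondaries stay mutually consistent.
for p in $PAIRS; do
  dscli freezepprc -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 "$p"
done
# The secondary volumes now hold one consistent point in time.
# Thaw to release the queued host I/O before the queue full condition times out.
for p in $PAIRS; do
  dscli unfreezepprc -dev ibm.2107-75abtv1 -remotedev ibm.2107-7520781 "$p"
done
```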
failoverpprc
You can use the failoverpprc command to change a secondary volume into a (source) Suspended state while you leave the original primary volume in its current state (see Example 20-12).
Example 20-12 failoverpprc command
dscli> failoverpprc -dev ibm.2107-7520781 -remotedev ibm.2107-75abtv1 -type gcp 808a:000a
CMUC00196I failoverpprc: Remote Mirror and Copy pair 808A:000A successfully reversed.

The command must be issued to the secondary site of a volume pair. The volume must be in the Full Duplex, Suspended, or Copy Pending state, or be a cascading Global Copy volume whose secondary state is Full Duplex, Suspended, or Copy Pending.
failbackpprc
You can issue the failbackpprc command against any volume that is in the Suspended state. The command copies the required data from the local volume to the remote volume to resume mirroring. The local volume is the volume on which you run the failbackpprc command (this volume becomes the primary volume). The command is usually used after a failoverpprc command to restart mirroring either in the reverse direction (recovery site to production site) or the original direction (production site to recovery site).
The command in Example 20-13 is issued to a former secondary volume 808A that was changed to a Suspended primary state by a preceding failoverpprc command. After the command completes, 808A becomes the new primary volume and 000A the new secondary volume. Because the command reverses the copy direction, you first must establish paths in that direction.
Example 20-13 failbackpprc command
dscli> failbackpprc -dev ibm.2107-7520781 -remotedev ibm.2107-75abtv1 -type gcp 808a:000a
CMUC00197I failbackpprc: Remote Mirror and Copy pair 808A:000A successfully failed back.
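Taken together, Examples 20-12 and 20-13 form the usual recovery cycle. A sketch, assuming that paths already exist in both directions and reusing the device IDs and volume pair from the examples (not a complete recovery procedure):

```shell
# Run against the HMC of the recovery-site DS8000 (the box that owns 808A).
# 1. Site switch: turn the former secondary into a suspended primary.
dscli failoverpprc -dev ibm.2107-7520781 -remotedev ibm.2107-75abtv1 -type gcp 808a:000a
# 2. Production now runs on 808A; changed tracks are recorded while suspended.
# 3. Return: copy only the changed data back and resume mirroring in the
#    reverse direction (recovery site to production site).
dscli failbackpprc -dev ibm.2107-7520781 -remotedev ibm.2107-75abtv1 -type gcp 808a:000a
```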
Click Action > Create. The Create Mirroring Connectivity window opens, where you select the source LSS and the target LSS of the Storage Images between which you want to establish paths. Select the I/O port for the Source and the Target Storage Image and click Add to move the definition to the Create Mirroring Connectivity Verification field. To add more than one connection, repeat the previous steps for each additional logical path that you want to add (see Figure 20-2). When you have added all the connections that you want to establish, click Create.
After the paths are created, you can verify the results by selecting the LSS for which you created paths at the Mirroring Connectivity window (see Figure 20-3).
3. From the Action drop-down menu, click Create Global Copy. The Create Global Copy window opens (Figure 20-4).
4. Use the Filter by option (available filter options are All Volumes, Host, LSS, Storage Allocation Method, and Volume Group) to find the volumes for which you want to create a Global Copy relationship, and then select a Source Volume and Target Volume.
5. If you have not established Global Copy paths for your selected Global Copy pairs, you can do so by clicking Create mirroring connectivity.
6. To select more options for the Global Copy relationship, click Advanced, add your options by selecting the appropriate check boxes (see Figure 20-5), and click OK.
Here is a short description of the options that you can select for a Global Copy:
- Reset reservation: Clears any host reservations on the target volume of the pair. This option is only available when a fixed-block (FB) volume pair is being created.
- Perform initial copy (default): Copies the entire source volume to the target volume. Select this option the first time that you create a relationship because it ensures that the source and target volume contain the same data. If you do not select this option, no data is copied from the source volume to the target volume; only data that the host writes after the relationship is created is copied.
- Cascading: When enabled, creates a cascading relationship. This option enables a Metro Mirror or Global Copy target volume to also be the source volume of a different Metro Mirror or Global Copy relationship.
- Permit read access from target: Enables hosts to read from the target volume. For the host to read the volume, the state of the pair must be Full Duplex. This option is only valid for fixed-block (FB) volumes.
- Disable auto-resync: Disables the auto-resync function for the Global Copy pair. At least one of the selected relationships must be Global Copy for this option to be enabled. This check box must be selected or cleared every time the Global Copy pair is established or resynchronized.
7. Repeat the previous steps for all the Global Copy pairs that you want to establish.
8. Click Create. All the Global Copy relationships that are shown in the Create Global Copy Verification field (Figure 20-4 on page 253) are established if no error message occurs.
Properties: On the Overview tab, you find all the selected options and the out-of-sync tracks for a Global Copy (see Figure 20-7).
The Volumes tab shows all the related volumes for a Global Copy relationship, including the allocation method (see Figure 20-8).
As with the Properties and Create actions, a new window opens for each action that you select. Follow the instructions in that window. Depending on the action that you choose, the offered options may change.

Options for Open Systems: Not all displayed options are valid for Global Copy in an Open Systems environment.
Chapter 21.
(Figure 21-1 shows the Global Copy configuration for this example: source volumes 1000 and 1001 in LSS10 and 1100 and 1101 in LSS11 on DS8000#1, -dev IBM.2107-7520781, connected through paths to target volumes 2000 and 2001 in LSS20 and 2100 and 2101 in LSS21 on DS8000#2, -dev IBM.2107-75ABTV1.)
The configuration has four Global Copy pairs that are in two LSSs. Two paths are defined between each primary and secondary LSS. To configure the Global Copy environment, complete the following steps:
1. Define the paths that Global Copy uses.
2. Create the Global Copy pairs.
dscli> mkpprcpath -dev ibm.2107-7520781 -remotedev ibm.2107-75abtv1 -remotewwnn 5005076303FFC1A5 -srclss 10 -tgtlss 20 I0010:I0143 I0031:I0102
CMUC00149I mkpprcpath: Remote Mirror and Copy path 10:20 successfully established.
dscli> mkpprcpath -dev ibm.2107-7520781 -remotedev ibm.2107-75abtv1 -remotewwnn 5005076303FFC1A5 -srclss 11 -tgtlss 21 I0010:I0143 I0031:I0102
CMUC00149I mkpprcpath: Remote Mirror and Copy path 11:21 successfully established.
You can check the status of Global Copy by running lspprc -l. Out Of Sync Tracks shows the number of tracks that remain to be sent to the target volume (the logical track size for an FB volume on the DS8000 is 64 KB). You can run lspprc -fullid to display the fully qualified DS8000 Storage Image ID in the command output (see Example 21-3).
Example 21-3 lspprc -l and lspprc -fullid for Global Copy pairs
dscli> lspprc -l 1000-1001 1100-1101
ID        State        Reason Type        Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================================================================================
1000:2000 Copy Pending -      Global Copy 38379              Disabled Disabled    invalid     -              10        unknown        Disabled      False
1001:2001 Copy Pending -      Global Copy 38083              Disabled Disabled    invalid     -              10        unknown        Disabled      False
1100:2100 Copy Pending -      Global Copy 60840              Disabled Disabled    invalid     -              11        unknown        Disabled      False
1101:2101 Copy Pending -      Global Copy 60838              Disabled Disabled    invalid     -              11        unknown        Disabled      False
dscli> lspprc -fullid 1000-1001 1100-1101
ID                                          State        Reason Type        SourceLSS           Timeout (secs) Critical Mode First Pass Status
=============================================================================================================================================
IBM.2107-7520781/1000:IBM.2107-75ABTV1/2000 Copy Pending -      Global Copy IBM.2107-7520781/10 unknown        Disabled      True
IBM.2107-7520781/1001:IBM.2107-75ABTV1/2001 Copy Pending -      Global Copy IBM.2107-7520781/10 unknown        Disabled      True
IBM.2107-7520781/1100:IBM.2107-75ABTV1/2100 Copy Pending -      Global Copy IBM.2107-7520781/11 unknown        Disabled      True
IBM.2107-7520781/1101:IBM.2107-75ABTV1/2101 Copy Pending -      Global Copy IBM.2107-7520781/11 unknown        Disabled      True
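Because an FB logical track is 64 KB, the Out Of Sync Tracks value translates directly into an estimate of the data still to be copied. A small sketch, using the track count reported for pair 1000:2000 in Example 21-3:

```shell
# Convert Out Of Sync Tracks into an approximate remaining amount of data.
# One DS8000 FB logical track is 64 KB.
tracks=38379
remaining_mib=$(( tracks * 64 / 1024 ))
echo "about ${remaining_mib} MiB left to copy"   # about 2398 MiB left to copy
```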
Unlike the Metro Mirror initial copy (first pass), the state of the Global Copy volumes still shows Copy Pending even after the Out Of Sync Tracks become 0 (see Example 21-4).
Example 21-4 List the Global Copy pairs status after the Global Copy first pass completes
dscli> lspprc -l 1000-1001 1100-1101
ID        State        Reason Type        Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================================================================================
1000:2000 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              11        unknown        Disabled      True
Copy Pending is the state of the Global Copy primary. The secondary state is Target Copy Pending. To see the secondary state in this example, you must issue the lspprc command to the DS HMC connected to DS8000#2, which is the Global Copy secondary (see Example 21-5).
Example 21-5 lspprc for Global Copy target volumes
dscli> lspprc 2000-2001 2100-2101
ID        State               Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
========================================================================================================
1000:2000 Target Copy Pending -      Global Copy 10        unknown        Disabled      Invalid
1001:2001 Target Copy Pending -      Global Copy 10        unknown        Disabled      Invalid
1100:2100 Target Copy Pending -      Global Copy 11        unknown        Disabled      Invalid
1101:2101 Target Copy Pending -      Global Copy 11        unknown        Disabled      Invalid
You can add the -at tgt parameter to the rmpprc command to remove only the available Global Copy secondary volumes, as shown in Example 21-7. You must issue this command to the HMC connected to DS8000 #2, which is the Global Copy secondary.
Example 21-7 Results of rmpprc with -at tgt
dscli> lspprc 2002
ID        State               Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
========================================================================================================
1002:2002 Target Copy Pending -      Global Copy 10        unknown        Disabled      Invalid
dscli> rmpprc -remotedev IBM.2107-75ABTV1 -quiet -at tgt 1002:2002
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 1002:2002 relationship successfully withdrawn.
Example 21-8 shows the Global Copy primary volume status after the rmpprc -at tgt command completes. It also shows the result of a rmpprc -at src command. In this case, there are still available paths, so the primary volume state changes after the rmpprc -at tgt command completes. If there were no available paths, the state of the Global Copy primary volume would be preserved. In Example 21-8, you must issue these commands to the HMC connected to DS8000 #1, which is the Global Copy primary.
Example 21-8 Global Copy source volume status after rmpprc with -at tgt and rmpprc with -at src
dscli> lspprc 1002
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1002:2002 Copy Pending -      Global Copy 10        unknown        Disabled      False

<< After rmpprc -at tgt command completes >>

dscli> lspprc 1002
ID        State     Reason         Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
=======================================================================================================
1002:2002 Suspended Simplex Target Global Copy 10        unknown        Disabled      True
dscli> rmpprc -remotedev IBM.2107-75ABTV1 -quiet -at src 1002:2002
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 1002:2002 relationship successfully withdrawn.
dscli> lspprc 1002
CMUC00234I lspprc: No Remote Mirror and Copy found.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 1101:2101 relationship successfully withdrawn.
dscli> lspprc 1000-1001 1100-1101
CMUC00234I lspprc: No Remote Mirror and Copy found.
dscli> rmpprcpath -quiet -remotedev IBM.2107-75ABTV1 10:20 11:21
CMUC00150I rmpprcpath: Remote Mirror and Copy path 10:20 successfully removed.
CMUC00150I rmpprcpath: Remote Mirror and Copy path 11:21 successfully removed.
If you do not remove the Global Copy pairs that are using the paths, the rmpprcpath command fails (see Example 21-10).
Example 21-10 Remove paths without removing the Global Copy pairs
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True
dscli> rmpprcpath -quiet -remotedev IBM.2107-75ABTV1 10:20 11:21
CMUN03070E rmpprcpath: 10:20: Copy Services operation failure: pairs remain
CMUN03070E rmpprcpath: 11:21: Copy Services operation failure: pairs remain
If you want to remove the paths that still have the Global Copy pairs, you can use the -force parameter (see Example 21-11). After the path is removed, the Global Copy pair is still in the Copy Pending state until the primary receives I/O from the servers. When I/O goes to the Global Copy primary, the primary volume becomes suspended. If you set the Consistency Group option for the LSS in which the volumes are, I/Os from the servers are held with queue full status for the specified timeout value. For more information about the Consistency Group option, see 15.3, Consistency Group function on page 158.
Example 21-11 Remove paths that still have Global Copy pairs - use -force parameter
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True
dscli> rmpprcpath -quiet -remotedev IBM.2107-75ABTV1 -force 10:20 11:21
CMUC00150I rmpprcpath: Remote Mirror and Copy path 10:20 successfully removed.
CMUC00150I rmpprcpath: Remote Mirror and Copy path 11:21 successfully removed.
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True

<< After I/O goes to the source volumes (1000 and 1001) >>

dscli> lspprc 1000-1001 1100-1101
ID        State        Reason                     Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
======================================================================================================================
1000:2000 Suspended    Internal Conditions Target Global Copy 10        unknown        Disabled      True
1001:2001 Suspended    Internal Conditions Target Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -                          Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -                          Global Copy 11        unknown        Disabled      True
Because the primary DS8000 keeps records of all changed data on the primary volume, you can resume Global Copy data transfer later. The resumepprc command resumes a Global Copy relationship for a volume pair and restarts transferring data. You must specify the copy mode, such as Metro Mirror or Global Copy, with the -type parameter (see Example 21-13).
Example 21-13 Resume Global Copy pairs
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason      Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
=======================================================================================================
1000:2000 Suspended    Host Source Global Copy 10        unknown        Disabled      True
1001:2001 Suspended    Host Source Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -           Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -           Global Copy 11        unknown        Disabled      True
dscli> resumepprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1000:2000 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1001:2001 relationship successfully resumed. This message is being returned before the copy completes.
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True
21.4 Changing the copy mode from Global Copy to Metro Mirror
You can change the copy mode from Global Copy to Metro Mirror by running mkpprc (see Example 21-14). This operation is called go-to-sync. Depending on the amount of data that must be sent to the secondary, it can take some time until the pairs reach the Full Duplex state.
Example 21-14 Change the copy mode from Global Copy to Metro Mirror
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True
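The go-to-sync itself is the same mkpprc call that created the pairs, now issued with -type mmir against the existing Global Copy pairs. A sketch using the pairs from this example; the pairs then synchronize and eventually reach the Full Duplex state (output not shown):

```shell
# Convert the running Global Copy pairs to Metro Mirror (go-to-sync).
dscli mkpprc -remotedev IBM.2107-75ABTV1 -type mmir 1000-1001:2000-2001 1100-1101:2100-2101
```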
You can use the -suspend parameter with the mkpprc -type mmir command. If you use this parameter, the state of the pairs becomes suspended when the data synchronization is complete (see Example 21-15). You can use this option for your off-site backup scenario with the Global Copy function.
Example 21-15 mkpprc -type mmir with -suspend
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True
dscli> mkpprc -remotedev IBM.2107-75ABTV1 -type mmir -suspend 1000-1001:2000-2001 1100-1101:2100-2101
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1000:2000 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1001:2001 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1100:2100 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1101:2101 successfully created.
dscli> lspprc 1000-1001 1100-1101
ID        State     Reason      Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
=====================================================================================================
1000:2000 Suspended Host Source Metro Mirror 10        unknown        Disabled      Invalid
1001:2001 Suspended Host Source Metro Mirror 10        unknown        Disabled      Invalid
1100:2100 Suspended Host Source Metro Mirror 11        unknown        Disabled      Invalid
1101:2101 Suspended Host Source Metro Mirror 11        unknown        Disabled      Invalid
You can add the -wait parameter to the mkpprc command. With the -wait parameter, the mkpprc -type mmir -suspend command does not return to the command prompt until the pairs complete data synchronization and reach the Suspended state (see Example 21-16).

Data synchronization: If you do not specify the -wait parameter, the mkpprc -type mmir -suspend command does not wait for the data synchronization. In that case, you must check the completion of the synchronization by running lspprc.
Example 21-16 mkpprc -type mmir -suspend with -wait
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True
dscli> mkpprc -remotedev IBM.2107-75ABTV1 -type mmir -suspend -wait 1000-1001:2000-2001 1100-1101:2100-2101
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1000:2000 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1001:2001 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1100:2100 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1101:2101 successfully created.
1/4 pair 1001:2001 state: Suspended
2/4 pair 1000:2000 state: Suspended
3/4 pair 1101:2101 state: Suspended
4/4 pair 1100:2100 state: Suspended
dscli> lspprc 1000-1001 1100-1101
ID        State     Reason      Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
=====================================================================================================
1000:2000 Suspended Host Source Metro Mirror 10        unknown        Disabled      Invalid
1001:2001 Suspended Host Source Metro Mirror 10        unknown        Disabled      Invalid
1100:2100 Suspended Host Source Metro Mirror 11        unknown        Disabled      Invalid
1101:2101 Suspended Host Source Metro Mirror 11        unknown        Disabled      Invalid
21.5 Changing the copy mode from Metro Mirror to Global Copy
You cannot change the copy mode from Metro Mirror to Global Copy directly. To do this task, you must first run pausepprc to suspend the Metro Mirror pair, and then resume the pair in Global Copy mode by running resumepprc -type gcp (see Example 21-17).
Example 21-17 Change the copy mode from Metro Mirror to Global Copy
dscli> lspprc 1000-1001 1100-1101
ID        State       Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Full Duplex -      Metro Mirror 10        unknown        Disabled      Invalid
1001:2001 Full Duplex -      Metro Mirror 10        unknown        Disabled      Invalid
1100:2100 Full Duplex -      Metro Mirror 11        unknown        Disabled      Invalid
1101:2101 Full Duplex -      Metro Mirror 11        unknown        Disabled      Invalid
dscli> mkpprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001 1100-1101:2100-2101
CMUN03053E mkpprc: 1000:2000: Copy Services operation failure: invalid transition
CMUN03053E mkpprc: 1001:2001: Copy Services operation failure: invalid transition
CMUN03053E mkpprc: 1100:2100: Copy Services operation failure: invalid transition
CMUN03053E mkpprc: 1101:2101: Copy Services operation failure: invalid transition
dscli> pausepprc -remotedev IBM.2107-75ABTV1 1000-1001:2000-2001 1100-1101:2100-2101
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 1000:2000 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 1001:2001 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 1100:2100 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 1101:2101 relationship successfully paused.
dscli> lspprc 1000-1001 1100-1101
ID        State     Reason      Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
=====================================================================================================
1000:2000 Suspended Host Source Metro Mirror 10        unknown        Disabled      Invalid
1001:2001 Suspended Host Source Metro Mirror 10        unknown        Disabled      Invalid
1100:2100 Suspended Host Source Metro Mirror 11        unknown        Disabled      Invalid
1101:2101 Suspended Host Source Metro Mirror 11        unknown        Disabled      Invalid
dscli> resumepprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001 1100-1101:2100-2101
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1000:2000 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1001:2001 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1100:2100 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1101:2101 relationship successfully resumed. This message is being returned before the copy completes.
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True
Figure 21-2 shows a diagram of the DS8000 Global Copy environment for this example. We use the Remote Incremental FlashCopy function for the FlashCopy at the recovery site DS8000 #2. We use 1000, 1001, 1100, and 1101 in DS8000 #1 for Global Copy primary, 2000, 2001, 2100, and 2101 in DS8000 #2 for Global Copy secondary and FlashCopy sources, and 2002, 2003, 2102, and 2103 in the DS8000 #2 for FlashCopy targets.
(Figure 21-2 shows the Global Copy primary volumes in LSS10 and LSS11 on DS8000#1, -dev IBM.2107-7520781; the Global Copy secondary and FlashCopy source volumes in LSS20 and LSS21 on DS8000#2, -dev IBM.2107-75ABTV1; and the FlashCopy target volumes 2002, 2003, 2102, and 2103 on DS8000#2.)
Figure 21-2 The DS8000 environment for Global Copy offsite backup
Verify that the FlashCopy background copy completes by running the command that is shown in Example 21-20 and reviewing the output.
Example 21-20 lsremoteflash to check the FlashCopy background copy completion
dscli> lsremoteflash -l -conduit IBM.2107-7520781/10 -dev IBM.2107-75ABTV1 2000-2001
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy OutOfSyncTracks
===========================================================================================================================================
2000:2002 20     0           Enabled    Enabled   Enabled    Disabled   Enabled            Enabled            Enabled        44626
2001:2003 20     0           Enabled    Enabled   Enabled    Disabled   Enabled            Enabled            Enabled        11153
dscli> lsremoteflash -l -conduit IBM.2107-7520781/11 -dev IBM.2107-75ABTV1 2100-2101
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy OutOfSyncTracks
===========================================================================================================================================
2100:2102 21     0           Enabled    Enabled   Enabled    Disabled   Enabled            Enabled            Enabled        46289
2101:2103 21     0           Enabled    Enabled   Enabled    Disabled   Enabled            Enabled            Enabled        14695
dscli> lsremoteflash -l -conduit IBM.2107-7520781/10 -dev IBM.2107-75ABTV1 2000-2001
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy OutOfSyncTracks
===========================================================================================================================================
2000:2002 20     0           Disabled   Enabled   Enabled    Disabled   Enabled            Enabled            Enabled        0
2001:2003 20     0           Disabled   Enabled   Enabled    Disabled   Enabled            Enabled            Enabled        0
dscli> lsremoteflash -l -conduit IBM.2107-7520781/11 -dev IBM.2107-75ABTV1 2100-2101
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy OutOfSyncTracks
===========================================================================================================================================
2100:2102 21     0           Disabled   Enabled   Enabled    Disabled   Enabled            Enabled            Enabled        0
2101:2103 21     0           Disabled   Enabled   Enabled    Disabled   Enabled            Enabled            Enabled        0
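Rather than rerunning lsremoteflash by hand, a script can poll until every relation reports zero OutOfSyncTracks. A hypothetical sketch; the awk field handling assumes the -l output layout shown above and is illustrative only, not a stable interface:

```shell
# Poll until the background copy completes, that is, until the OutOfSyncTracks
# column (the last field of each data row) is 0 for every relation.
while dscli lsremoteflash -l -conduit IBM.2107-7520781/10 \
        -dev IBM.2107-75ABTV1 2000-2001 \
      | awk 'NR > 2 { print $NF }' | grep -qv '^0$'; do
  sleep 30   # check again in 30 seconds
done
echo "Background copy complete for 2000-2001"
```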
(Figure 21-3 shows the off-site backup setup: the primary volumes at the primary site are copied non-synchronously over long distance, through channel extenders, to the secondary volumes at the secondary site, where FlashCopy produces a consistent tertiary copy of the data. The figure annotates the backup cycle: quiesce application updates; catch up, that is, synchronize the volume pairs (go-to-sync and suspend, or wait for application writes to quiesce); build consistency on the recovery data (freeze or suspend); and resume application writes as soon as the freeze is done, after which the individual volume pairs synchronize.)
Here is a more detailed description of the steps in the procedure that is shown in Figure 21-3:
1. Normal Global Copy mode operation.
2. Stop or quiesce the application at the production site.
3. Run a go-to-sync and suspend the Global Copy pairs.
4. Resume or restart the application at the production site.
5. Take a FlashCopy from the secondary to the tertiary volumes.
For a detailed description of the preceding steps, see 19.2, Creating a consistent point-in-time copy on page 234.
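As one consolidated sketch, the cycle can be scripted with the commands used earlier in this chapter. Quiescing and restarting the application is outside the DS CLI, and the resyncremoteflash invocations assume that the incremental FlashCopy relations from Example 21-20 already exist:

```shell
# Sketch of one off-site backup cycle; volume IDs are from this chapter.
# (Quiesce the application at the production site first.)
# Catch-up: synchronize the Global Copy pairs, then leave them suspended.
dscli mkpprc -remotedev IBM.2107-75ABTV1 -type mmir -suspend -wait \
      1000-1001:2000-2001 1100-1101:2100-2101
# The application can be resumed now; the secondaries hold a consistent image.
# Refresh the incremental FlashCopy at the recovery site.
dscli resyncremoteflash -conduit IBM.2107-7520781/10 -dev IBM.2107-75ABTV1 2000-2001:2002-2003
dscli resyncremoteflash -conduit IBM.2107-7520781/11 -dev IBM.2107-75ABTV1 2100-2101:2102-2103
# Return to normal Global Copy mode.
dscli resumepprc -remotedev IBM.2107-75ABTV1 -type gcp \
      1000-1001:2000-2001 1100-1101:2100-2101
```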
============================================================================================================================
2000:2002 20 0 Disabled Enabled Enabled Disabled Enabled Enabled Enabled
2001:2003 20 0 Disabled Enabled Enabled Disabled Enabled Enabled Enabled
2100:2102 21 0 Disabled Enabled Enabled Disabled Enabled Enabled Enabled
2101:2103 21 0 Disabled Enabled Enabled Disabled Enabled Enabled Enabled
Example 21-24 Resume normal Global Copy mode
dscli> lspprc 1000-1001 1100-1101
ID        State     Reason      Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
=====================================================================================================
1000:2000 Suspended Host Source Metro Mirror 10        unknown        Disabled      Invalid
1001:2001 Suspended Host Source Metro Mirror 10        unknown        Disabled      Invalid
1100:2100 Suspended Host Source Metro Mirror 11        unknown        Disabled      Invalid
1101:2101 Suspended Host Source Metro Mirror 11        unknown        Disabled      Invalid
dscli> resumepprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001 1100-1101:2100-2101
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1000:2000 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1001:2001 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1100:2100 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 1101:2101 relationship successfully resumed. This message is being returned before the copy completes.
dscli> lspprc 1000-1001 1100-1101
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending -      Global Copy 10        unknown        Disabled      True
1001:2001 Copy Pending -      Global Copy 10        unknown        Disabled      True
1100:2100 Copy Pending -      Global Copy 11        unknown        Disabled      True
1101:2101 Copy Pending -      Global Copy 11        unknown        Disabled      True
dscli> mkpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -type gcp -mode full -cascade 050a:031a
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 050A:031A successfully created.
dscli> lspprc -dev ibm.2107-7522391 050a
ID State Reason Type SourceLSS Timeout (secs) Critical Mode
===========================================================================================
011A:050A Target Copy Pending Global Copy 01 unknown Disabled
050A:031A Copy Pending Global Copy 05 unknown Disabled

In the example, 011A is the primary volume, 050A is the intermediate volume, and 031A is the target volume. The lspprc command shows the status of both volume pairs to which the intermediate volume 050A belongs: 011A:050A is the first pair, where 050A is the secondary, and 050A:031A is the second pair, where 050A is the cascading primary.
C:\IBM\DSCLI> lssi
Name ID Storage Unit Model WWNN State ESSNet
============================================================================
IBM.2107-7503461 IBM.2107-7503460 951 5005076303FFC08F Online Enabled

2. Check what FCP ports are available for establishing paths between LSS 05 on the old DS8000 and LSS 03 on the new DS8000. In our example, we use the DS CLI lsavailpprcport command (see Example 21-27).
Example 21-27 Check the ports
C:\IBM\DSCLI> lsavailpprcport -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -remotewwnn 5005076303ffc08f 05:03
Local Port Attached Port Type
=============================
I0003 I0003 FCP
I0002 I0002 FCP

3. The next step is to establish logical paths from LSS 05 to LSS 03. The DS CLI command is shown in Example 21-28.
Example 21-28 Establish the paths
C:\IBM\DSCLI> mkpprcpath -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -srclss 05 -tgtlss 03 -remotewwnn 5005076303ffc08f I0003:I0003 I0002:I0002
CMUC00149I mkpprcpath: Remote Mirror and Copy path 05:03 successfully established.
C:\IBM\DSCLI> lspprcpath -dev ibm.2107-7522391 05
Src Tgt State SS Port Attached Port Tgt WWNN
=========================================================
05 03 Success 7B00 I0003 I0003 5005076303FFC08F
05 03 Success 7B00 I0002 I0002 5005076303FFC08F

4. You can now establish the Global Copy pairs. The DS CLI commands to do this task are shown in Example 21-29.
Example 21-29 Establish the pairs
C:\IBM\DSCLI> mkpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -type gcp -mode full 050a:031a
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 050A:031A successfully created.
C:\IBM\DSCLI> mkpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -type gcp -mode full 050b:031b
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 050B:031B successfully created.
C:\IBM\DSCLI> lspprc -dev ibm.2107-7522391 -l 050a-050b
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade
===========================================================================================
050A:031A Copy Pending Global Copy 0 Disabled Unknown Unknown
050B:031B Copy Pending Global Copy 0 Disabled Unknown Unknown
5. Shut down the host systems that are using the volumes that are being migrated. This task should not be done until the number of out-of-sync tracks for most of the Global Copy volume pairs is zero or close to zero, so that synchronization in the next step does not take long. This action reduces application downtime.

6. The next step is to change the Global Copy pairs to Metro Mirror pairs, or wait until the out-of-sync tracks reach zero so that the target volumes are consistent. The DS CLI commands to synchronize the pairs are shown in Example 21-30.
Example 21-30 Synchronize the pairs
C:\IBM\DSCLI> mkpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -type mmir 050a:031a
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 050A:031A successfully created.
C:\IBM\DSCLI> mkpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -type mmir 050b:031b
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 050B:031B successfully created.
C:\IBM\DSCLI> lspprc -dev ibm.2107-7522391 050a-050b
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass
===========================================================================================
050A:031A Full Duplex Metro Mirror 05 unknown Disabled Invalid
050B:031B Full Duplex Metro Mirror 05 unknown Disabled Invalid
7. You can now stop mirroring and change the secondary volumes to the simplex status, as shown in Example 21-31.
Example 21-31 Terminate the Metro Mirror pairs
C:\IBM\DSCLI> rmpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 050a:031a
CMUC00155I rmpprc: Remote Mirror and Copy volume pair successfully withdrawn.
C:\IBM\DSCLI> rmpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 050b:031b
CMUC00155I rmpprc: Remote Mirror and Copy volume pair successfully withdrawn.
C:\IBM\DSCLI> lspprc -dev ibm.2107-7522391 050a-050b
CMUC00234I lspprc: No Remote Mirror and Copy found.
8. After the volume pairs are deleted, you can restart the host systems from the volumes on the new DS8000 and start the applications again.

9. Remove the paths between the old DS8000 and the new DS8000 by running the DS CLI rmpprcpath command (see Example 21-32).
Example 21-32 Delete the paths
C:\IBM\DSCLI> rmpprcpath -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -quiet 05:03
CMUC00150I rmpprcpath: Remote Mirror and Copy path 05:03 successfully removed.
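Step 5 of this procedure waits until the Global Copy out-of-sync track count is at or near zero before the hosts are shut down. The short Python sketch below shows one way to automate that wait. It is only a sketch: the `query_oos_tracks` callback is a hypothetical stand-in for whatever actually collects the counts (for example, a script that sums the Out Of Sync Tracks column of `lspprc -l` output), and the poll interval and retry limit are arbitrary assumptions.

```python
import time

def wait_for_drain(query_oos_tracks, threshold=0, poll_secs=30, max_polls=120):
    """Poll the total out-of-sync track count until it falls to the threshold.

    query_oos_tracks is a caller-supplied function, for example one that
    parses 'lspprc -l' output and returns the summed track count.
    """
    for _ in range(max_polls):
        remaining = query_oos_tracks()
        if remaining <= threshold:
            return remaining
        time.sleep(poll_secs)
    raise TimeoutError("Global Copy pairs did not drain in time")

# Simulated counter samples standing in for repeated DS CLI queries
samples = iter([1500, 40, 0])
print(wait_for_drain(lambda: next(samples), poll_secs=0))  # prints 0
```

In a real migration, the same loop could gate an automated host shutdown, which keeps application downtime limited to the final synchronization.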
The modified command for one volume pair is shown in Example 21-33.
Example 21-33 Establish the cascaded pairs
C:\IBM\DSCLI> mkpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -type gcp -mode full -cascade 050a:031a
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 050A:031A successfully created.
C:\IBM\DSCLI> mkpprc -dev ibm.2107-7522391 -remotedev ibm.2107-7503461 -type gcp -mode full -cascade 050b:031b
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 050B:031B successfully created.
C:\IBM\DSCLI> lspprc -dev ibm.2107-7522391 -l 050a
ID State Reason Type SourceLSS Timeout (secs) Critical Mode
===========================================================================================
011A:050A Target Copy Pending Global Copy 01 unknown Disabled
050A:031A Copy Pending Global Copy 05 unknown Disabled
Here, 011A is the source volume, 050A is the intermediate volume, and 031A is the target volume. The lspprc command shows the status of both volume pairs where the intermediate volume 050A belongs. 011A:050A is the first pair, where 050A is the secondary, and 050A:031A is the second pair, where 050A is the cascading primary. If the existing pairs are in a Global Copy relationship, you must change them to Metro Mirror pairs before you perform the synchronization in step 6 on page 274 because the first pair of a cascading relationship cannot be Global Copy if the second pair is Metro Mirror.
Chapter 22.
22.1 Performance
As the distance between DS8000 systems increases, the Metro Mirror response time is proportionally affected, which elongates the write response time to the primary volumes because of its synchronous processing. When implementations over extended distances are required, Global Copy becomes an excellent trade-off because of its asynchronous replication mechanism. Compared to Metro Mirror, the impact of Global Copy on source volume write response time is low.

The Global Copy primary DS8000 system must manage the out-of-sync bitmaps to track which data must be transferred, and all these changes must be sent asynchronously to the remote DS8000. However, these changes have a negligible effect on the response time, as compared with the synchronous Metro Mirror latency. On a highly used primary storage system, response time performance could be affected by the additional Global Copy I/Os.

If you take a FlashCopy at the recovery site in your Global Copy implementation, take into account the relationship between Global Copy and the FlashCopy background copy, as described in 19.5, Bandwidth considerations on page 239.
22.2 Scalability
The DS8000 Global Copy environment can be scaled up or down as required. If new volumes are added to the DS8000 that require mirroring, they can be dynamically added. If more Global Copy paths are required, they also can be dynamically added.
Part 6
Global Mirror
This part describes IBM System Storage Global Mirror for DS8000 when used in an Open Systems environment. It describes the characteristics of Global Mirror, explains the options for its setup, shows which management interfaces you can use, and covers the important aspects to consider when you establish a Global Mirror environment. This part concludes with examples of Global Mirror management and setup.

This part covers the following topics:
- Global Mirror overview
- Global Mirror options and configuration
- Global Mirror interfaces
- Performance and scalability
- Examples
Chapter 23.
Global Mirror is also an asynchronous replication method that provides consistent data at the secondary site. For more information, see 23.3, Basic concepts of Global Mirror on page 288. It is a hardware solution, and it supports all platforms (System z, System p, IBM System x, and IBM i).
Master
The master is a function inside a primary storage system that communicates with subordinates in other storage systems and controls the creation of consistency groups while managing the Global Mirror session. The master is defined when the start command for a session is issued to any LSS in a primary storage system. This command determines which DS8000 becomes the master storage system and which LSS becomes the master LSS. The master requires communication paths over Fibre Channel links to any one of the LSSs in each subordinate disk storage server. These communication paths function as regular PPRC paths, with a known limit of up to four secondary LSSs connected from a primary LSS. With more than four subordinate disk storage systems, you must use more than one primary LSS when you define these communication paths between the master storage system and its subordinate storage systems.

Session
A Global Mirror session is a collection of volumes that are managed together when you create consistent copies of data volumes. This set of volumes can be in one or more LSSs and one or more storage systems at the primary site. Open Systems volumes and z/OS volumes can both be members of the same session. When you start or resume a session, the consistency groups are created, and the master storage disk system controls the session by communicating with the subordinate storage disk systems. There is also a session concept at the LSS level, but all LSS sessions are combined and grouped within a Global Mirror session. With DS8700 at R5.1+ and DS8800 at R6.1+, this limit is extended to up to 32 Global Mirror sessions within the same DS8700 or DS8800.

Subordinate
The subordinate is a function inside a primary storage system that communicates with the master and is controlled by the master. At least one of the LSSs of each subordinate primary storage system requires Fibre Channel communication paths to the master. These paths enable the communication between the master and the subordinate, and are required to create consistency groups of volumes that spread across more than one storage system. If all the volumes of a Global Mirror session are in one primary storage system, no subordinate is required, because the master can communicate with all LSSs inside the primary storage system.

Consistency group
A group of volumes in one or more remote storage systems whose data must be kept consistent at the secondary site.

Local site
The site that contains the production servers. This term is used interchangeably with the terms primary, production, source, or host1 site.

Remote site
The site that contains the backup servers of a disaster recovery solution. This term is used interchangeably with the terms secondary, backup, standby, target, or host2 site.
In an asynchronous data replication environment, an application write I/O has the following steps; see Figure 23-1: 1. Write application data to the primary storage system cache. 2. Acknowledge a successful I/O to the application so that the next I/O can be immediately scheduled. 3. Replicate the data from the primary storage system cache to the remote storage system cache and NVS. 4. Acknowledge to the primary storage system that data successfully arrived at the remote storage system.
Figure 23-1 Asynchronous data replication between a local and a remote storage disk system
Note how, in an asynchronous technique, the data transmission and the I/O completion acknowledgement are independent processes, which results in virtually no application I/O impact, or at most a minimal one. This technique is convenient when you must replicate over long distances.
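The four-step flow above can be modeled in a few lines of Python. This is a conceptual sketch only (the class and method names are invented for illustration, not DS8000 interfaces); it shows why the host sees the write acknowledged in step 2 while the transfer in steps 3 and 4 happens later:

```python
from collections import deque

class AsyncMirror:
    """Toy model of the four-step asynchronous write flow."""
    def __init__(self):
        self.primary = {}          # primary storage system cache
        self.remote = {}           # remote storage system cache/NVS
        self.in_transit = deque()  # writes waiting to be replicated

    def host_write(self, track, data):
        self.primary[track] = data      # step 1: write to primary cache
        self.in_transit.append(track)   # step 3 is deferred
        return "ack"                    # step 2: immediate acknowledgement

    def replicate_one(self):
        track = self.in_transit.popleft()
        self.remote[track] = self.primary[track]  # step 3: send to remote
        return track                              # step 4: remote acknowledges

m = AsyncMirror()
assert m.host_write("A1", "data") == "ack"  # host is not held up
assert m.remote == {}                       # data not yet at the remote site
m.replicate_one()
assert m.remote == {"A1": "data"}
```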
Summary
In summary, an asynchronous data replication technique should provide the following features:
- Data replication to the secondary site is independent of application write I/O processing at the primary site, which results in no impact, or at most minimal impact, to the application write I/O response time.
- The order of dependent writes is maintained at the primary site, so data consistency is maintained at the secondary site.
- Data currency at the secondary site lags behind the primary site. How much it lags depends upon network bandwidth and storage system configuration. In periods of peak write workloads, this difference increases.
- The bandwidth between primary and secondary sites does not have to be configured for the peak write workload; link bandwidth utilization is improved over synchronous solutions.
- Tertiary copies are required at the secondary site to preserve data consistency.
- Data loss in disaster recovery situations is limited to the data in transit plus the data that might still be in the queue at the primary site that is waiting to be replicated to the secondary site.
Figure 23-2 A server distributing tasks to clients at the local and remote sites over a network
The server distributes the work to its clients. The server also coordinates all individual feedback from the clients and decides further actions. Looking at this diagram, the communication paths between the server and all its clients are key. Without communication paths between these four components, the functions eventually come to a complete stop. Matters get more complicated when the communication fails unexpectedly in the middle of an information exchange between the server and its clients, or some of its clients.

Usually, a two-phase commit process provides a consistent state for certain functions and determines whether they complete successfully at the client site. After a function completes successfully and is acknowledged to the server, the server progresses to the next function task. Concurrently, the server tries to parallelize operations (for example, I/O requests and coordination communication) to minimize the impact on throughput because of serialization and checkpoints. When certain activities depend on each other, the server must coordinate these activities to ensure a correct sequence. The server and client can also be referred to as master and subordinate (see Figure 23-3). These terms are used later when describing Global Mirror.
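The two-phase pattern described above can be condensed into a Python sketch. This is a generic illustration of two-phase commit between a coordinator and its participants, not DS8000 microcode; all names are invented:

```python
class Subordinate:
    """Minimal participant: votes in phase one, acts in phase two."""
    def __init__(self, healthy=True):
        self.healthy = healthy
        self.state = "idle"
    def prepare(self, action):
        return self.healthy          # phase one: vote yes or no
    def commit(self):
        self.state = "committed"     # phase two: apply the action
    def abort(self):
        self.state = "aborted"       # phase two: roll back

def two_phase(action, subordinates):
    votes = [sub.prepare(action) for sub in subordinates]
    if all(votes):                   # proceed only on unanimous agreement
        for sub in subordinates:
            sub.commit()
        return "committed"
    for sub in subordinates:
        sub.abort()
    return "aborted"

assert two_phase("flashcopy", [Subordinate(), Subordinate()]) == "committed"
assert two_phase("flashcopy", [Subordinate(), Subordinate(healthy=False)]) == "aborted"
```

The key design point the sketch captures is that no participant acts until every participant has acknowledged readiness, which is how the master can guarantee an all-or-nothing outcome.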
Figure 23-3 Master and subordinate in a Global Mirror configuration
Figure 23-3 shows the basic Global Mirror structure. A master coordinates all efforts within a Global Mirror environment. After the master is started and manages a Global Mirror environment, the master issues all related commands over PPRC links to its attached subordinates at the primary site. These subordinates could include a subordinate within the master itself. This communication between the master and an internal subordinate is transparent and does not require any extra attention from the user. The subordinates use inband communication to communicate with their related remote storage systems at the remote site. The master also receives all acknowledgements from the subordinates and coordinates and serializes all the activities in the session. When the master and subordinate are in a single storage system, the subordinate is internally managed by the master. With two or more storage systems at the primary site, which participate in a Global Mirror session, the subordinate is external and requires separate attention when you create and manage a Global Mirror session or environment.
The following sections explain how Global Mirror works and how Global Mirror ensures consistent data at any time at the secondary site. First, we go through the process of how to create a Global Mirror environment, which gives you an insight into how Global Mirror works.
Figure 23-4 Start with a simple application environment
Figure 23-5 Establish Global Copy paths between the primary and secondary sites
In Figure 23-5 on page 290, we establish Global Copy paths over an existing network, which may be based on FCP transport technology or on an IP-based infrastructure. Global Copy paths are logical connections that are defined over the physical links that interconnect both sites. All Remote Mirror and Copy paths (Metro Mirror, Global Copy, and Global Mirror paths) are similar and are based on FCP; the term Global Copy path simply denotes that the path is intended for Global Copy usage. Metro Mirror and Global Copy paths are also referred to as PPRC paths.
23.4.3 Creating a Global Copy relationship between the primary volume and the secondary volume
Next, create a Global Copy relationship between the primary volume and the secondary volume (see Figure 23-6).
Figure 23-6 Create a Global Copy volume pair between the primary and secondary volumes
In the following paragraphs, we refer to the primary volume as the A volume and to the secondary volume as the B volume. In Figure 23-6, we change the target volume state from simplex (no relationship) to target Copy Pending. This Copy Pending state applies to both volumes, source Copy Pending and target Copy Pending. Concurrently, data is copied from the source volume to the target volume. After a first complete pass through the entire A volume, Global Copy scans constantly through the out-of-sync (OOS) bitmap. This bitmap indicates changed data as it arrives from the applications to the source disk system. Global Copy replicates the data from the A volume to the B volume based on this out-of-sync bitmap. Global Copy does not immediately copy the data as it arrives on the A volume. Instead, this process is an asynchronous one. When a track is changed by an application write I/O, it is reflected in the out-of-sync bitmap with all the other changed tracks. There can be several concurrent replication processes that work through this bitmap, thus maximizing the usage of the high-bandwidth Fibre Channel links.
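The out-of-sync bitmap mechanism just described can be illustrated with a small Python sketch. The class below is a conceptual model, not DS8000 code: application writes only mark tracks as changed, and a separate asynchronous pass later copies the marked tracks and clears the bits.

```python
class GlobalCopyPair:
    """Toy model of a Global Copy A->B pair with an out-of-sync bitmap."""
    def __init__(self, tracks):
        self.oos = set(range(tracks))   # first pass: every track is dirty
        self.source = {}
        self.target = {}

    def app_write(self, track, data):
        self.source[track] = data
        self.oos.add(track)             # note the change; do not send it now

    def replicate(self, batch):
        """Asynchronous pass: copy up to 'batch' dirty tracks, clear bits."""
        for track in sorted(self.oos)[:batch]:
            self.target[track] = self.source.get(track)
            self.oos.discard(track)

pair = GlobalCopyPair(tracks=4)
pair.app_write(1, "new data")           # returns immediately; bit 1 is set
pair.replicate(batch=10)                # later, the bitmap is drained
assert not pair.oos and pair.target[1] == "new data"
```

Several such replication passes can run concurrently in the real machine, which is what lets Global Copy keep the high-bandwidth Fibre Channel links busy.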
This replication process keeps running until the Global Copy volume pair A-B is explicitly or implicitly suspended or terminated. A Global Mirror session command, for example, to pause or to terminate a Global Mirror session does not affect the Global Copy operation between both volumes. Now, data consistency does not yet exist at the secondary site.
Figure 23-7 Establish a FlashCopy relationship between volume B and volume C at the secondary site
FlashCopy has two unique licenses available, which can be ordered separately or together: one for classical FlashCopy (PTC) and one for Space-Efficient FlashCopy (IBM FlashCopy SE). Volumes under either license can serve as C volumes in a Global Mirror environment, but FlashCopy SE reduces the required storage capacity at the secondary site. With standard FlashCopy, you need twice the capacity at the secondary site, as compared to the primary site. With FlashCopy SE, you need the same capacity as on the primary site plus additional capacity for the repository, which is smaller than fully provisioned C volumes.

The size of the repository is determined by how long you can tolerate an inactive Global Mirror session. If Global Mirror does not initiate FlashCopies by forming consistency groups, which starts FlashCopy and releases the space currently in use, the repository fills up as tracks receive updates from the primary site. The creation and handling of the Global Mirror environment is almost identical for FlashCopy and FlashCopy SE; there are dedicated parameters only for the creation and removal of FlashCopy SE pairs (for more information, see Chapter 10, IBM FlashCopy SE on page 95).
The focus is now on the secondary site. Figure 23-7 on page 292 shows a FlashCopy relationship with a Global Copy secondary volume as the FlashCopy source volume. Volume B is now simultaneously a Global Copy secondary volume and a FlashCopy source volume, and the corresponding FlashCopy target volume is in the same storage server. This FlashCopy relationship has certain attributes that are typical and required when you create a Global Mirror session:
- Inhibit target write: Protect the FlashCopy target volume from being modified by anything other than Global Mirror related actions.
- Start change recording: Apply only the changes that occur to the source volume between FlashCopy establish operations, except for the first time when FlashCopy is initially established.
- Persist: Keep the FlashCopy relationship until it is explicitly or implicitly terminated. This parameter is automatic because of the change recording property.
- Nocopy: Do not start background copy from source to target, but keep the set of FlashCopy bitmaps required for tracking the source and target volumes. These bitmaps are established when a FlashCopy relationship is created. Before a track on the source volume B is modified between consistency group creations, the track is copied to the target volume C to preserve the previous point-in-time copy. This copy includes updates to the corresponding bitmaps to reflect the new location of the track that belongs to the point-in-time copy. The first Global Copy write to a secondary volume track within the window of two adjacent consistency groups causes FlashCopy to perform a copy-on-write operation.
- Space-Efficient target: Use Space-Efficient volumes as FlashCopy targets, which means FlashCopy SE is used in the Global Mirror setup. Virtual capacity was allocated in a Space-Efficient repository when these volumes were created.
A repository volume per extent pool is used to provide physical storage for all Space-Efficient volumes in that extent pool. Background copy is not allowed if Space-Efficient targets are used. For a detailed description of FlashCopy SE, see Chapter 10, IBM FlashCopy SE on page 95. Where required, check the IBM System Storage support website for the availability of Copy Services features at: http://www.ibm.com/support/entry/portal/overview
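The copy-on-write behavior of a nocopy FlashCopy relationship can be sketched as follows. This is a conceptual Python model with invented names, not DS8000 code: no background copy runs, and a source track is copied to the target only just before the source overwrites it, which preserves the point-in-time image.

```python
class NocopyFlash:
    """Toy nocopy FlashCopy: target tracks are copied only on demand."""
    def __init__(self, source):
        self.source = dict(source)
        self.target = {}   # physically empty; bitmap says "still on source"

    def write_source(self, track, data):
        if track not in self.target:   # copy on write: preserve the
            self.target[track] = self.source[track]  # point-in-time version
        self.source[track] = data

    def read_point_in_time(self, track):
        # Reads of the copy fall through to the source if never diverged
        return self.target.get(track, self.source[track])

flash = NocopyFlash({"t1": "old"})
flash.write_source("t1", "new")
assert flash.read_point_in_time("t1") == "old"   # PiT image preserved
assert flash.source["t1"] == "new"               # source has moved on
```

This on-demand copying is also why a Space-Efficient repository only needs capacity for the tracks that actually change between consistency groups.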
Figure 23-8 Define a Global Mirror session
Defining a Global Mirror session creates a token, which is a number from 1 to 255. This number represents the Global Mirror session. The session number is defined at the LSS level; each LSS that has volumes that are part of the session needs a corresponding define session command. Until recently, only a single Global Mirror session was supported per DS8000 Storage Facility Image (SFI). With the DS8700 at LMC level 6.5.1.xx or later and the DS8800 at LMC 7.6.1.xx or later, up to 32 Global Mirror hardware sessions can be supported within the same primary DS8700 or DS8800. Session here means a hardware/firmware-based session in the DS8000, which is managed by the DS8000 in an autonomic fashion.
Figure 23-9 Add a Global Copy primary volume to a Global Mirror session
This process adds primary volumes to a list of volumes in the Global Mirror session, but it does not yet perform consistency group formation. Global Copy replicates to the Global Copy secondary volumes the application updates that arrive at the Global Copy primary volumes. Initially, the Global Copy primary volumes are placed in a join pending status. After a consistency group is formed, the Global Copy primary volume is added to the session and placed in an in-session status. Nothing happens to the C volume after its initial establishment in a Global Mirror session.
Figure 23-10 Start the Global Mirror session
This start command triggers events that involve all the volumes within the session. These events include fast bitmap management on the primary storage system, issuing inband FlashCopy commands from the primary site to the secondary site, and verifying that the corresponding FlashCopy operations finished successfully. This process happens at the microcode level of the related storage systems that are part of the session, and is fully transparent and autonomic from the user's perspective. All B and C volumes that belong to the Global Mirror session comprise the consistency group. Now let us examine some more details about the consistency group creation at the secondary site.
Figure 23-11 Consistency group creation across multiple volume pairs
Notice that before step 1 and after step 3, Global Copy constantly scans through the out-of-sync (OOS) bitmaps and replicates data from A volumes to B volumes, as described in 23.4.3, Creating a Global Copy relationship between the primary volume and the secondary volume on page 291. When the creation of a consistency group is triggered by the Global Mirror master, the following steps occur:
1. All Global Copy primary volumes are serialized. This serialization imposes a brief hold on all incoming write I/Os to all involved Global Copy primary volumes. After all primary volumes are serialized across all involved primary DS8000s, the pause on incoming write I/O is released and all further write I/Os are noted in the change recording (CR) bitmap. They are not replicated until step 3 on page 298 is done, but application write I/Os can continue immediately. This serialization phase takes only a few milliseconds; the default coordination time is 50 ms.
2. Drain is the process of replicating all remaining data that is indicated in the out-of-sync (OOS) bitmap and not yet replicated. After all out-of-sync bitmaps are empty (empty here is not meant in a strictly literal sense), step 3 on page 298 is triggered by the microcode from the primary site.
3. Now the B volumes contain all data as a quasi point-in-time copy, and are consistent because of the serialization process in step 1 on page 297 and the completed replication, or drain, process in step 2 on page 297. Step 3 is a FlashCopy that is triggered by the primary system's microcode as an inband FlashCopy command, with volume B as the FlashCopy source and volume C as the FlashCopy target volume.

This FlashCopy is a two-phase process. First, the FlashCopy command is issued to all involved FlashCopy pairs in the Global Mirror session. Then, the master collects the feedback and all incoming FlashCopy completion messages. When all FlashCopy operations complete successfully, the master concludes that a new consistency group was created successfully. FlashCopy applies here only to data changed since the last FlashCopy operation, because the start change recording property was set when the FlashCopy relationship was established. The FlashCopy relationship does not end, because the change recording property forces it to be persistent. Because of the nocopy attribute, only copy-on-write operations cause physical tracks to be copied from the source to the target.

When step 3 is complete, a consistent set of volumes is created at the secondary site. This set of volumes, the B and C volumes, represents the consistency group. For this brief moment only, the B volumes and the C volumes are equal in their content. Immediately after the FlashCopy process is logically complete, the primary system's microcode is notified to continue with the Global Copy process from A to B. To replicate the changes to the A volumes that occurred during the step 1 to step 3 window, the change recording bitmap is mapped against the empty out-of-sync bitmap, and from then on, all arriving write I/Os are again noted in the out-of-sync bitmap.
From now on, the conventional Global Copy process, as outlined in 23.4.3, Creating a Global Copy relationship between the primary volume and the secondary volume on page 291, continues until the next consistency group creation process is started.
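The interplay of the out-of-sync and change recording bitmaps across these three steps can be condensed into a short Python sketch. Again, this is a conceptual model with invented names, not microcode behavior:

```python
class GMPrimaryVolume:
    """Toy model of bitmap handling during consistency group formation."""
    def __init__(self):
        self.oos = set()            # out-of-sync bitmap
        self.cr = set()             # change recording bitmap
        self.recording = False

    def app_write(self, track):
        (self.cr if self.recording else self.oos).add(track)

    def serialize(self):            # step 1: new writes go to the CR bitmap
        self.recording = True

    def drain(self):                # step 2: replicate remaining OOS tracks
        drained, self.oos = self.oos, set()
        return drained

    def flash_and_resume(self):     # step 3: after FlashCopy B->C completes,
        self.oos |= self.cr         # map CR onto the empty OOS bitmap
        self.cr = set()
        self.recording = False      # normal Global Copy mode resumes

vol = GMPrimaryVolume()
vol.app_write(1)
vol.serialize()
vol.app_write(2)                    # arrives inside the CG window
assert vol.drain() == {1}           # only pre-serialization data drains
vol.flash_and_resume()
assert vol.oos == {2}               # the held write replicates afterward
```

The sketch shows why the B volumes are consistent at FlashCopy time: every write that arrived after serialization is parked in the CR bitmap and excluded from the drain.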
Figure 23-12 Consistency group formation: maximum coordination time (serialize all Global Copy primary volumes), maximum drain time (drain data from the primary to the secondary site, then perform FlashCopy), and CG interval time
Here are the three parameters:
- Maximum coordination time: In the first step, the serialization step, Global Mirror serializes all related Global Mirror primary volumes across all participating primary storage systems. This parameter dictates, for all of the Global Copy primary volumes that belong to this session and consistency group, the maximum time that is allowed for forming the change recording bitmaps for each volume. This time is measured in milliseconds (ms). The default is 50 ms.
- Maximum drain time: This is the maximum time that is allowed for draining the out-of-sync bitmap after the process to form a consistency group is started and step 1 of Figure 23-12 on page 298 completes successfully. The maximum drain time is specified in seconds. The default is 30 seconds. You might want to increase this time window when you replicate over a longer distance and with limited bandwidth. If the maximum drain time is exceeded, Global Mirror fails to form the consistency group and evaluates the current throughput of the environment. If the evaluation indicates that another drain failure is likely, Global Mirror stays in Global Copy mode while regularly re-evaluating the situation to determine when to form the next consistency group. If this situation persists for a significant period, Global Mirror eventually forces the formation of a new consistency group. In this way, Global Mirror ensures that during periods when the bandwidth is insufficient, production performance is protected, and data is transmitted to the secondary site in the most efficient manner possible. When the peak activity passes, consistency group formation resumes in a timely fashion.
- Consistency Group interval time: After a consistency group (CG) is created, the Consistency Group interval time determines how long to wait before you start the formation of the next consistency group. This interval is specified in seconds, and the default is zero seconds.
Zero seconds means that consistency group formation happens constantly: when a consistency group is created successfully, the process to create the next one starts again immediately. There is no external parameter to limit the time for the FlashCopy operation. Global Mirror uses a distributed approach and a two-phase commit technique for activities between the master and its subordinate LSSs. The communication between the primary and the secondary site is coordinated through the subordinate LSSs, which function partly as a transient component for the Global Mirror activities, all of which are triggered and coordinated by the master. This distributed concept always provides a set of data-consistent volumes at the secondary site, independent of the number of involved storage systems at the primary or secondary site.
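The three tuning parameters and their documented defaults can be captured in a small configuration sketch. The dataclass below is illustrative only; the field names are invented, and only the default values (50 ms, 30 seconds, 0 seconds) come from the text:

```python
from dataclasses import dataclass

@dataclass
class GlobalMirrorTuning:
    """Defaults as stated in the text; note the units differ per field."""
    max_coordination_ms: int = 50   # serialization window, in milliseconds
    max_drain_secs: int = 30        # drain window, in seconds
    cg_interval_secs: int = 0       # 0 = form consistency groups continuously

tuning = GlobalMirrorTuning()
assert (tuning.max_coordination_ms, tuning.max_drain_secs,
        tuning.cg_interval_secs) == (50, 30, 0)

# A long-distance, limited-bandwidth link might justify a longer drain window
relaxed = GlobalMirrorTuning(max_drain_secs=240)
```

Keeping the units explicit in the field names avoids the common mistake of mixing the millisecond-scale coordination window with the second-scale drain and interval settings.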
Figure 23-13 depicts a configuration for the DS8700 and DS8800 with one Global Mirror session per Storage Facility Image (SFI): one session for a DS8700 or DS8800 SFI, or two sessions when a storage system is partitioned into two SFIs.
Figure 23-13 A single Global Mirror session (session 20) that spans a GM master and a subordinate storage system at the primary site
Figure 23-13 shows a Global Mirror master session that controls a Global Mirror session. A Global Mirror session is identified by a Global Mirror session ID, which is the number 20 in Figure 23-13. This session ID is defined in all the involved LSSs at the primary site that contain Global Copy primary volumes that belong to session 20. Note the two storage systems configuration, with the Global Mirror master in the DS8000 at the bottom and a subordinate DS8000 that also contains Global Copy primary volumes that belong to session 20. The Global Mirror master controls the subordinate through PPRC FCP-based paths between both DS8000 storage systems. Consistency is provided across all primary storage systems. With the DS8100 and DS8300, it is not possible to create another Global Mirror session in an SFI that already contains a master session, as shown in Figure 23-13 with session 20.
Potential impacts with such a single Global Mirror session are shown in Figure 23-14. Assume that a disk storage environment is commonly used by various application servers. To provide good performance, all volumes should be spread across primary and remote storage systems. For disaster recovery purposes, a secondary site exists with a corresponding storage system and the data volumes are replicated through a Global Mirror session with the Global Mirror master function in a storage system.
Figure 23-14 A single Global Mirror session (session 20) that is shared by three application servers
When Application 2 fails and its server can no longer access the primary site, the entire Global Mirror session 20 must fail over to the secondary site. Figure 23-15 shows the impact on the other two applications, Application 1 and Application 3. Because only a single Global Mirror session is possible with a DS8100 or a DS8300 SFI, the entire session must be failed over to the secondary site to restart Application 2 on the backup server. In this case, the other two servers, Application 1 and Application 3, must also swap sites when you perform the failover.
Figure 23-15 Failover of the single Global Mirror session 20 to the secondary site affects all three applications
This situation implies that a service interruption occurs not only for the failed server with Application 2, but also that there are service impacts to Application 1 and Application 3, which must shut down at the primary site as well and restart at the secondary site after the Global Mirror session failover process completes.
Figure 23-16 shows the same server configuration, but the DS8100 or DS8300 storage systems are replaced by a DS8700 or DS8800 with Release 6.1 or later firmware. You can have up to 32 dedicated Global Mirror master sessions with this configuration. In Figure 23-16, Application 1 is connected to volumes in LSS 00 to LSS 3F, Application 2 connects to volumes in LSS 40 to LSS 7F, and the server with Application 3 connects to volumes in LSS 80 to LSS BF. Each set of volumes per application server is in its own Global Mirror session, which is controlled by the relevant Global Mirror master session within the same storage system.
Figure 23-16 The DS8700 and DS8800 provide multiple GM master session support
As an example, if the Application 2 server fails, only Global Mirror session 20 is failed over to the secondary site, and the concerned server at Site 2 can start Application 2 after the failover process completes. Different applications might have different RPO requirements. A DS8700 and DS8800 installation also allows one or more test sessions in parallel with one or more production Global Mirror sessions within the same SFI.

Global Mirror session management: The basic management of a Global Mirror session does not change. The Global Mirror session is built upon the existing Global Mirror technology and microcode of the DS8000.
Chapter 24.
Paths are unidirectional; that is, they are defined to operate in one direction or the other. PPRC itself is bidirectional: any particular pair of LSSs can have paths that are defined between them in opposite directions (each LSS holds both primary and secondary volumes of the other LSS). Opposite direction paths can be defined on the same Fibre Channel physical link. For bandwidth and redundancy, more than one path can be created between the same LSSs. PPRC balances the workload across the available paths between the primary and secondary LSSs.

LSS: Remember that the LSS is not a physical construct in the DS8000; it is a logical construct. Volumes in an LSS can come from multiple disk ranks and arrays.
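For example, a pair of opposite-direction paths between LSS 00 on two storage systems can be defined with two DS CLI mkpprcpath commands, one issued in each direction. This is a sketch: the storage image IDs, WWNNs, and I/O port IDs are placeholders.

```
dscli> mkpprcpath -dev IBM.2107-75ABCD1 -remotedev IBM.2107-75WXYZ1 -remotewwnn 5005076303FFC663 -srclss 00 -tgtlss 00 I0100:I0200 I0132:I0232
dscli> mkpprcpath -dev IBM.2107-75WXYZ1 -remotedev IBM.2107-75ABCD1 -remotewwnn 5005076303FFC12A -srclss 00 -tgtlss 00 I0200:I0100 I0232:I0132
```

Each command defines two logical paths over two physical links for redundancy, and the two commands together provide paths in opposite directions over the same links.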
Physical links are bidirectional and can be shared by the other Remote Mirror and Copy functions, such as Metro Mirror, Global Copy, and Global Mirror, although it is preferable not to mix these functions on the same links. For performance aspects, see 26.2, Performance considerations for network connectivity on page 344.
Alternatively, if the volumes in each of the LSSs of DS8000 1 map to volumes in all three secondary LSSs in DS8000 2, there are nine logical paths over the physical link (not fully illustrated in Figure 24-2). You should use a 1:1 LSS mapping as a preferred practice.
Figure 24-2 Logical paths for PPRC
PPRC paths have certain architectural limits:

- A primary LSS can maintain paths to a maximum of four secondary LSSs. Each secondary LSS can be in a separate DS8000.
- A master storage system that communicates with a subordinate storage system uses one path from a primary LSS on the master storage system to an LSS on the subordinate storage system. This configuration reduces the maximum number of secondary LSSs to which that primary LSS can maintain paths from four to three.
- Up to eight logical paths per LSS-to-LSS relationship can be defined, each over a different physical link.
- An FCP port can host up to 1280 logical paths. These paths are the logical and directional paths that are made from LSS to LSS.
- An FCP link (the physical connection from one port to another port) can host up to 256 logical paths.
- An FCP port can accommodate up to 126 different links (DS8000 port to DS8000 port through the SAN).
24.2 Bandwidth
Before you establish your Remote Mirror and Copy solution, you should establish what your peak bandwidth requirement is. This estimate helps ensure that you have enough PPRC links in place to support that requirement. To avoid any impact to production response times and to meet Recovery Point Objectives (RPO), the typical write rate along with the peak write rate should be established to determine how much bandwidth is adequate to cope with this load. Growth should also be considered and allowed for when you determine bandwidth requirements. Remember that only writes are mirrored across to the secondary volumes.
It is not always necessary to have bandwidth in a Global Mirror environment to handle peaks that are high and relatively short in duration. Global Mirror might see its RPO increase during these periods, but it catches up when the peak is over. However, there must be adequate bandwidth to catch up in a reasonable amount of time after such an event. Tools such as Tivoli Storage Productivity Center for Replication can provide the data that is necessary for bandwidth analysis. More information about bandwidth and available connection options can be found in IBM System Storage Business Continuity: Part 1 Planning Guide, SG24-6547.
Figure 24-3 Global Mirror basic configuration with master and subordinate storage systems
The order of commands to create a Global Mirror environment is not absolutely fixed and allows for some variation. To be consistent with other sources and to not confuse the user with a different sequence of commands, use the following steps to create a Global Mirror environment:

1. Define the paths between the local site and the remote site. In Figure 24-3, these paths are the logical communication paths between corresponding LSSs at the local site and the remote site, which are defined over Fibre Channel physical links that are configured over the network. Global Copy source LSSs are represented by the A volumes and their corresponding Global Copy target LSSs by the B volumes. You can also define logical communication paths here between the master and any subordinate storage system that is part of the Global Mirror session. These paths are defined between source storage systems at the local site. They function as regular PPRC paths, with the known limit of up to four secondary LSSs connected from a primary LSS. With more than four subordinate storage systems, you must use more than one primary LSS when you define these communication paths between the master storage system and its subordinate storage systems. With only a single source storage system, you do not need to define paths to connect internal LSSs within the source storage system. The communication between the master and the subordinates within a single source storage system is performed internally.

2. When the communication paths are defined, start the Global Copy pairs that are part of a Global Mirror session. Global Copy pairs are created by running mkpprc -type gcp. You should wait until the first initial copy is complete before you continue to the next step. Taking this time avoids unnecessary FlashCopy background I/Os in the next step.
3. Create FlashCopy relationships between the B and C volumes. You can use the mkflash command (or the mkremoteflash command, which is issued through the local storage system) with the following parameters: -tgtinhibit, -record, and -nocp. The -persist parameter is automatically set when the -record parameter is selected. If you use Track-Space-Efficient volumes as FlashCopy target volumes, also add the -tgtse parameter. For more information about the particular FlashCopy attributes that are required for a Global Mirror FlashCopy, see 23.4.4, Introducing FlashCopy on page 292.

4. With external subordinates, that is, with more than one involved storage system at the local site, you must have paths between the master LSS and any potential subordinate storage system at the local site. If you did not establish these paths in step 1, create them here before you continue with step 5. These communication paths function as regular PPRC paths, with the known limit of up to four secondary LSSs connected from a primary LSS. With more than four subordinate storage systems, you must use more than one primary LSS when you define these communication paths between a master storage system and its subordinate storage systems.

5. Define a token that identifies the Global Mirror session. This token is a session ID with a number 1 - 255. Define this session number to the master storage system and also to all potentially involved source LSSs that are going to be part of this Global Mirror session and contain Global Copy source volumes that belong to the Global Mirror session, including all LSSs in potential subordinate storage systems. For this step, you can run the mksession command. With the DS8100, DS8300, and the DS8800 at the R6.0 microcode level, only one active Global Mirror session is supported within the system.
With the DS8700 at LMC level 6.5.1.xx or later and the DS8800 at LMC 7.6.1.xx or later, up to 32 Global Mirror hardware sessions are supported within the same local DS8700 or DS8800. Regarding the session number, note that Tivoli Storage Productivity Center for Replication uses session number 2 when you create the first Tivoli Storage Productivity Center for Replication Global Mirror session.

6. Populate the session with the Global Copy source volumes. You should put these Global Copy source volumes into the session after they complete their first pass of the initial copy. To accomplish this task, run chsession -action add -volume.

7. Start the session by running mkgmir. This command defines the master LSS. All further session commands must go through this LSS. You can specify the Global Mirror tuning parameters with this command, such as maximum drain time, maximum coordination time, and consistency group interval time.

You can use these steps regardless of the interface that is used.
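The steps above can be sketched with the DS CLI as follows. This is a hedged outline only: the storage image IDs, volume ranges, LSS, and session number 20 are placeholders, and the paths from step 1 are assumed to exist already. The mkpprc command is issued at the local storage system, the mkflash command at the remote storage system against the B:C pairs, and the session commands at the local (master) storage system.

```
dscli> mkpprc -dev IBM.2107-75ABCD1 -remotedev IBM.2107-75WXYZ1 -type gcp 0000-000F:0000-000F
dscli> mkflash -dev IBM.2107-75WXYZ1 -tgtinhibit -record -nocp 0000-000F:0100-010F
dscli> mksession -dev IBM.2107-75ABCD1 -lss 00 20
dscli> chsession -dev IBM.2107-75ABCD1 -lss 00 -action add -volume 0000-000F 20
dscli> mkgmir -dev IBM.2107-75ABCD1 -lss 00 -session 20
```

Between the mkpprc and mkflash commands, you can check the progress of the first pass with lspprc -l and wait until the number of out-of-sync tracks approaches zero.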
Volumes can be added to a session in any state, for example, simplex or pending. Volumes that have not completed their initial copy phase stay in a join pending state until the first initial copy is complete. If a volume in a session is suspended, it causes consistency group formation to fail. As a preferred practice, add only Global Copy primary volumes that completed their initial copy or first pass. Also, wait until the first initial copy is complete before you create the FlashCopy relationship between the B and the C volumes.

Adding primary volumes: You cannot add a Metro Mirror source volume to a Global Mirror session. Global Mirror supports only Global Copy pairs. When Global Mirror detects a volume that, for example, is converted from Global Copy to Metro Mirror, the formation of a consistency group fails.

When you add many volumes at once to an existing Global Mirror session, the available resources for Global Copy within the affected ranks might be used by the initial copy pass. To minimize the impact to the production servers when you add many volumes, consider adding the volumes to an existing Global Mirror session in stages. Suspending a Global Copy pair that belongs to an active Global Mirror session affects the formation of consistency groups.

When you remove Global Copy volumes from an active Global Mirror session, complete the following steps:
1. Remove the volumes from the Global Mirror session.
2. Withdraw the FlashCopy relationship between the B and C volumes.
3. Terminate the Global Copy pair to bring volume A and volume B into simplex mode.

Consistency group failure: When you remove A volumes without removing them from a Global Mirror session, you might see an error condition when you run showgmir -metrics, indicating that the consistency group formation failed.
However, this error does not mean that you lost a consistent copy at the remote site because Global Mirror does not take the FlashCopy (B to C) for the failed consistency group data. This message indicates that one consistency group formation failed, and Global Mirror tries the sequence again.
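The three removal steps can be sketched with the DS CLI as follows. Again, the storage image IDs, volume ranges, LSS, and session number are placeholders: chsession removes the volumes from session 20, rmflash withdraws the B-to-C FlashCopy relationships at the remote storage system, and rmpprc returns the A and B volumes to simplex mode.

```
dscli> chsession -dev IBM.2107-75ABCD1 -lss 00 -action remove -volume 0000-0003 20
dscli> rmflash -dev IBM.2107-75WXYZ1 0000-0003:0100-0103
dscli> rmpprc -dev IBM.2107-75ABCD1 -remotedev IBM.2107-75WXYZ1 0000-0003:0000-0003
```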
If FlashCopy SE is used for the Global Mirror environment, you should also release the repository space that was used for the Space-Efficient volumes by using the -tgtreleasespace parameter. For the options to release Space-Efficient repository space, see 10.4.2, Removing FlashCopy relationships and releasing space on page 111. After the volumes are removed from the session, you can explicitly terminate the corresponding FlashCopy relationship that was tied to this source volume. The termination of FlashCopy relationships might be necessary when you want to change the FlashCopy targets within a Global Mirror configuration and choose, for example, another LSS for the FlashCopy targets. You might do this task to replace the FlashCopy targets because of a skew in the load pattern in the remote storage system. In this situation, you can pause the session before such activity, and then resume the session again after the replacement of the FlashCopy relationships is completed.

Pause versus stop commands: A pause command (pausegmir) completes a consistency group formation in progress. A stop command (rmgmir) immediately ends the formation of consistency groups.
Figure 24-4 Define a Global Mirror session to all potentially involved storage systems
Figure 24-5 Define a master storage system and start a Global Mirror session
Through the start command (mkgmir), you decide which LSS becomes the master LSS and which local storage system becomes the master storage system. This master acts like a server in a client/server environment.
The required communication between the master storage system and the subordinate storage systems is inband, over the defined Global Mirror paths. This communication is efficient, and minimizes any potential application write I/O impact during the coordination phase to a few milliseconds. For more information, see 23.5, Consistency groups on page 296. This communication is performed over FCP links. At least one FCP link is required between the master storage system and each subordinate storage system. Figure 24-6 uses dashed lines to show the Global Mirror paths that are defined over FCP links, between the master storage system and its associated subordinate storage systems. These FCP ports are dedicated to Global Mirror communication between the master and subordinates.
Figure 24-6 Global Mirror paths over FCP links between source storage systems
Also shown in Figure 24-6 is a shared port on the master storage system, and dedicated ports at the subordinates. When you configure links over a SAN network, the same FCP ports of the storage system can be used for the Global Mirror session communication, as well as for the Global Copy communication and for host connectivity. However, for performance reasons, and to prevent host errors from disrupting your Global Mirror environment, you should use separate FCP ports.
Figure 24-7 shows a sample configuration with a mix of dedicated and shared FCP ports. In this example, an FCP port in the master storage system is used as a Global Mirror link to the other two subordinate storage systems and also as a Global Copy link to the target storage system. Likewise, there are ports at the subordinate storage systems that are used both as Global Mirror session links and as Global Copy links.
Figure 24-7 Mixed dedicated and shared FCP ports for Global Mirror and Global Copy links
If possible, a better configuration is the one shown in Figure 24-8. From a performance and throughput viewpoint, you do not need two Global Mirror links between the master and its subordinate storage systems. Still, dedicated ports for Global Mirror control communication between the master and subordinates provide maximum responsiveness and availability.
Figure 24-8 Dedicated Global Mirror links and dedicated Global Copy links
With the DS8700 and DS8800, up to 32 dedicated and individual master Global Mirror sessions within each local DS8700 or DS8800 are supported. The configuration in Figure 24-8 might still be an option for a large configuration that requires all primary volumes to be within the same Global Mirror session and to provide data consistency for all volumes across all primary storage systems at the remote site.
Figure 24-9 Normal Global Mirror operation
Writes from the server are replicated through Global Copy, and consistency groups are created as tertiary copies. The B volumes are Global Copy target volumes, and they are also FlashCopy source volumes. The C volumes are the FlashCopy target volumes. The FlashCopy relationship is a special relationship, which is described in 23.4.4, Introducing FlashCopy on page 292.
Figure 24-10 Production site fails
Your goal is to swap to the remote site and restart the applications. This process requires that you first make the set of consistent volumes at the remote site available to the application before the application can be restarted there. Then, after the local site is operational again, you must return to the local site. Before you return, you must apply to the source volumes the changes that the application made to the target data while it was running at the remote site. Afterward, do a quick swap back to the local site and restart the application.

When the local storage system fails, Global Mirror can no longer form consistency groups. Depending on the state of the local storage system, you might be able to terminate the Global Mirror session. Usually, this action is not possible because the storage system might not respond any longer. Host application I/O might have failed and the application ended. This situation usually prompts messages or SNMP alerts that indicate the problem. With an automation solution in place, for example, Tivoli Storage Productivity Center for Replication, these alert messages trigger the initial recovery actions.

If the formation of a consistency group was in progress, then, most probably, not all FlashCopy relationships between the B and C volumes at the remote site reached the corresponding point in time. Some FlashCopy pairs might have completed the FlashCopy phase to form a new consistency group and committed the changes already. Others might not have completed yet, are in the middle of forming their consistent copy, and remain in a revertible state. There is no master to control and coordinate what might still be going on. This situation requires that you take a closer look at the volumes at the remote site before you can continue to work with them. First, however, you must bring the B volumes into a usable state by using the failover command.
Figure 24-11 Perform Global Copy failover from B to A
You can use the DS GUI, DS CLI, and Tivoli Storage Productivity Center for Replication to run the necessary commands on the remote storage systems.
Doing a failover on the Global Copy secondary volumes changes their status to suspended primaries and activates the out-of-sync (OOS) bitmaps. This configuration sets the stage for change recording when application updates start changing the B volumes. You can use change recording to resynchronize only the changes from the B to the A volumes later, before you return to the local site and resume the application there. Currently, the B volumes do not contain consistent data. In our example, we changed their state from target Copy Pending to suspended. The state of the A volumes remains unchanged. The key element when you run a Global Copy failover is that the B volumes become the new source volumes. This action does not require communication with the other storage system at all, even though the other storage system is specified with the failoverpprc command. When all the failover commands run successfully, you can move on to the next step.
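A sketch of the failover with the DS CLI follows. The command is issued against the remote storage system, with the B volumes listed first in each pair (B:A order); the storage image IDs and volume ranges are placeholders.

```
dscli> failoverpprc -dev IBM.2107-75WXYZ1 -remotedev IBM.2107-75ABCD1 -type gcp 0000-000F:0000-000F
```

Note the reversed pair order compared to the original mkpprc command: the B volumes are now treated as the sources.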
Figure 24-12 The B volumes after failover: suspended Global Copy primary volumes
Each FlashCopy pair needs a FlashCopy query to identify its state. If the local site is still accessible and the source storage system is also accessible, you might consider running these activities at the production site using remote FlashCopy commands. Most likely, the source storage system does not respond any longer. In this case, you must target the query directly at the remote site storage system. When you query a FlashCopy pair, there are two pieces of information that are key to determine whether the C volume set is consistent or needs intervention: the revertible state and the sequence number.
The lsflash command output reports the Revertible state as either Enable or Disable, which indicates whether the state of the FlashCopy is revertible or non-revertible. A non-revertible state means that a FlashCopy process completed successfully and all changes are committed. Global Mirror uses the two-phase FlashCopy establishment operation. This operation allows the storage system to prepare for a new FlashCopy relationship without altering the existing FlashCopy relationship. You can either commit or revert a new FlashCopy relationship in the revertible state by running the commitflash and revertflash commands. During the consistency group formation process, Global Mirror puts all FlashCopy relationships in the revertible state, and after they are in the revertible state, commits all FlashCopy relationships. With this operation, the situation in which some FlashCopy operations have not started while others have completed does not occur.

The Sequence number is an identifier that can be set for FlashCopy establish operations and is then associated with the FlashCopy relationship. Subsequent FlashCopy withdraw operations can be directed to FlashCopy relationships with specific sequence numbers. Global Mirror uses the sequence number to identify a particular consistency group. The sequence number that is used by Global Mirror is the timer from the Global Mirror master storage system (in seconds resolution) at the point when the Global Mirror source components must be coordinated to form a consistency group. This action occurs at a point before the consistency group is transferred to the remote site. If your master storage system platform timer is set to the time of day, then the FlashCopy sequence number for Global Mirror approximates a time stamp for the consistency group.

The best situation is when all the FlashCopy pairs of a Global Mirror session are in the non-revertible state and all their sequence numbers are equal.
No further action is necessary: the set of C volumes is consistent, and the copy is good. Figure 24-13 shows the consistency group creation process. The action that is required depends on the state of the consistency group creation process when the failure occurs.
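The revertible state and sequence number can be obtained with an lsflash query at the remote storage system; with the -l option, the output includes these columns. The storage image ID and volume range below are placeholders.

```
dscli> lsflash -dev IBM.2107-75WXYZ1 -l 0000-000F
```

Compare the Revertible and Sequence number values across all pairs of the session to decide which corrective action, if any, is needed.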
Figure 24-13 Consistency group formation phases: serialization while holding application writes (maximum coordination time, for example, 10 ms; design point 2 - 3 ms), drain of the consistency group to the remote site over Global Copy while writes for the next consistency group are recorded in the change recording bitmap (maximum drain time, for example, 1 minute), and FlashCopy with the revertible option; the consistency group interval can range from 0 seconds to 18 hours
Depending on when the failure occurs, there are some combinations of revertible states and FlashCopy sequence numbers that need different corrective actions. Use Table 24-1 as a guide. This table is a decision table and reads in the following way: When column 2 and column 3 are true, then take the action in column 4. Column 5 contains additional comments. Do this determination for each of the four cases. The cases are described in chronological order, starting from the left.
Table 24-1 Consistency group and FlashCopy validation decision table

Case 1:
- Are all FlashCopy relationships revertible? No.
- Are all FlashCopy sequence numbers equal? Yes.
- Action to take: None.
- Comments: Consistency group formation ended; the set of C volumes is consistent.

Case 2:
- Are all FlashCopy relationships revertible? Some: some FlashCopy pairs are revertible and others are not revertible.
- Are all FlashCopy sequence numbers equal? The revertible FlashCopy pairs sequence numbers are equal. The non-revertible FlashCopy pairs sequence numbers are equal, but do not match the revertible FlashCopy pairs sequence number.
- Action to take: Revert the FlashCopy relationships.
- Comments: Some FlashCopy pairs are running in a consistency group process and some have not yet started their incremental process.

Case 3:
- Are all FlashCopy relationships revertible? Yes.
- Are all FlashCopy sequence numbers equal? Yes.
- Action to take: Revert the FlashCopy relationships.
- Comments: All FlashCopy pairs are in a new consistency group process and none have finished their incremental process.

Case 4:
- Are all FlashCopy relationships revertible? Some: some FlashCopy pairs are revertible and others are not revertible.
- Are all FlashCopy sequence numbers equal? Yes.
- Action to take: Commit the FlashCopy relationships.
- Comments: Some FlashCopy pairs are running in a consistency group process and some have already finished their incremental process.

If you see a situation other than these four cases, the Global Mirror mechanism is corrupted.
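Depending on the case, the corrective action is a revertflash or commitflash command for each affected pair, directed at the FlashCopy source (B) volumes at the remote storage system. This is a sketch with placeholder IDs; in practice you issue only the one command that Table 24-1 calls for.

```
dscli> revertflash -dev IBM.2107-75WXYZ1 0000-000F
dscli> commitflash -dev IBM.2107-75WXYZ1 0000-000F
```

Reverting discards the in-flight consistency group and falls back to the last committed one; committing finalizes the in-flight consistency group.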
Figure 24-14 Set a consistent set of B volumes using the C volumes as source
Although the FRR operation starts the background copy from the C to the B volumes, in the reverseflash command you must specify the B volumes as the FlashCopy sources and the C volumes as the FlashCopy targets. With the reverseflash command, you must use the following parameters:

-fast: With this parameter, the reverseflash command can be issued before the background copy completes. This option is intended for use as part of Global Mirror.

-tgtpprc: After the failover of B to A that is described in 24.8.3, Global Copy failover B volumes on page 320, the B volume became a Global Copy source volume in the suspended state. The -tgtpprc parameter allows the FlashCopy target volume to be a Global Copy source volume. You must specify this parameter because the B volume becomes a FlashCopy target in the reverseflash process.

Because you do not specify the -persist parameter, the FlashCopy relationship ends after the background copy from C to B completes.
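A sketch of the Fast Reverse Restore with the DS CLI follows. The B volumes are specified as sources and the C volumes as targets, even though the data flows from C to B; the storage image ID and volume ranges are placeholders.

```
dscli> reverseflash -dev IBM.2107-75WXYZ1 -fast -tgtpprc 0000-000F:0100-010F
```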
This FRR operation does a background copy of all tracks that changed on B since the last consistency group (CG) formation, which results in the B volume becoming equal to the image that was present on the C volume. This view is the logical one; from the physical data placement point of view, the C volume does not have meaningful data after the FlashCopy relationship ends. You must wait until all FRR operations and their background copies complete successfully before you proceed. Because the FlashCopy relationship ends when the background copy completes, you can check whether any FlashCopy relationships remain to determine when all Fast Reverse Restore operations are completed.
Figure 24-15 Restart applications at a remote site
Note the suspended state of the B volumes, which implies change recording and indicates the changed tracks on the B volumes. When the local site is ready to restart the applications again and resume the operations, prepare the remote site.
Figure 24-16 Failback operation from B to A in preparation for returning to the local site
The failback operation is issued with the B volumes as the source and the A volumes as the target. This command changes the A volume from its previous source Copy Pending state to target Copy Pending and starts the resynchronization of the changes from B to A.

LSS paths: Before you do the failback operation, ensure that paths are defined from the remote site LSS to its corresponding LSS at the local site. With Fibre Channel links, you can define paths in either direction on the same FCP link.

During the failback operation, the application continues running at the remote site to minimize the application outage.

If the A volume is still online to the server at the local site, or it was online when a crash happened so that the SCSI persistent reserve is still set on the previous source disk (the A volume), the Global Copy failback process with the failbackpprc command fails. In this case, the server at the production site locks the target with a SCSI persistent reserve. After this SCSI persistent reserve is reset with the varyoffvg command (in this case, on AIX), the failbackpprc command completes successfully.

There is a -resetreserve parameter for the failbackpprc command. This option resets the reserved state so that the failback operation can complete. In a failback operation after a real disaster, you can use this parameter because the server might have gone down while the SCSI persistent reserve was set on the A volume. In a planned failback operation, you must not use this parameter because the server at the local site still owns the A volume and might be using it, and the failback operation suddenly changes the contents of the volume. This situation can corrupt the server's file system.
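The planned-versus-disaster distinction can be encoded in a small helper so that -resetreserve is never added by accident. This is a hedged sketch: the device IDs and volume pairs are placeholders, not values from a real configuration.

```shell
# Sketch: build the failback command, adding -resetreserve only for the
# disaster case (after a real disaster, the local server may have gone
# down with the SCSI persistent reserve still set on the A volume).
# Device IDs and volume pairs below are illustrative placeholders.
build_failback() {  # $1 = planned | disaster
  cmd="failbackpprc -dev IBM.2107-75ABTV1 -remotedev IBM.2107-7520781 -type gcp"
  if [ "$1" = "disaster" ]; then
    cmd="$cmd -resetreserve"   # never use this for a planned failback
  fi
  echo "$cmd 2000-2001:1000-1001"
}
```

Generating the command text and reviewing it before execution is a simple guard against running a reserve-breaking failback against a volume the local server still owns.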
Figure 24-17 Global Copy failover from A to B
Figure 24-17 shows the action at the local site, that is, the Global Copy failover operation from A to B. This failoverpprc command changes the state of the A volumes from target Copy Pending to source suspended, and keeps a bitmap record of the changes to the A volumes. You issue this command to the storage system at the local site. The state of the B volumes does not change. When the failover completes, a failback operation from A to B is run (see Figure 24-18).
Figure 24-18 Global Copy failback from A to B and resync Global Copy volumes
LSS paths: Before you do the failback operation, ensure that paths are defined from the local site LSS to its corresponding LSS at the remote site.

Figure 24-18 on page 329 shows the failback operation at the local site. The failbackpprc command changes the state of the A volumes from source suspended to source Copy Pending. The state of the B volume changes from source Copy Pending to target Copy Pending. Also, the replication of updates from A to B begins. This replication ends quickly because the application has not yet started at the local site. Finally, if you did not already establish the FlashCopy relationships from B to C during the failover/failback sequence at the remote site, do so now. This action might be an inband FlashCopy, as shown in Figure 24-19.
Figure 24-19 Establish the FlashCopy relationships from B to C
The last step is to start the Global Mirror session again, as shown in Figure 24-20. Then, the application can resume at the local site.
Figure 24-20 Start the Global Mirror session and resume application I/O at the local site
This information is also shown in the output of the lssession command (Example 24-2), where the volumes have a status of Suspended.
Example 24-2 lssession output showing the suspended relationships
dscli> lssession 22-23
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
=================================================================================================================
22     01      Normal 2200   Active       Primary Suspended    Secondary Simplex True              Disable
22     01      Normal 2201   Active       Primary Suspended    Secondary Simplex True              Disable
22     01      Normal 2203   Active       Primary Copy Pending Secondary Simplex True              Disable
22     01      Normal 2204   Active       Primary Copy Pending Secondary Simplex True              Disable
23     01      Normal 2300   Active       Primary Copy Pending Secondary Simplex True              Disable
23     01      Normal 2301   Active       Primary Copy Pending Secondary Simplex True              Disable
Even after the underlying issue between the local and the remote site is resolved, Global Mirror does not restart until the suspended Global Copy volumes are paused and resumed to reset their reason to Host Source. Running pausepprc against the PPRC relationships sets their status to Suspended and their reason to Host Source (see Example 24-3).
Example 24-3 PPRC pair volume reason changed
dscli> lspprc 2000-2500
ID        State        Reason      Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
=======================================================================================================
2200:2200 Suspended    Host Source Global Copy 22        120            Disabled      True
2201:2201 Suspended    Host Source Global Copy 22        120            Disabled      True
2203:2403 Copy Pending -           Global Copy 22        120            Disabled      True
2204:2404 Copy Pending -           Global Copy 22        120            Disabled      True
2300:2300 Copy Pending -           Global Copy 23        120            Disabled      True
2301:2301 Copy Pending -           Global Copy 23        120            Disabled      True
Now the PPRC pairs can be resumed by running resumepprc -type gcp. The Global Mirror consistency groups begin to reform (see Example 24-4).
Example 24-4 resumepprc command
dscli> resumepprc -type gcp 2200:2200 2201:2201
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 2200:2200 relationship successfully resumed. This message is being returned before the copy completes.
CMUC00158I resumepprc: Remote Mirror and Copy volume pair 2201:2201 relationship successfully resumed. This message is being returned before the copy completes.
The output of the lspprc command now shows the PPRC pairs in their normal Global Copy state of Copy Pending, and the Reason column shows a dash (-), as shown in Example 24-5.
Example 24-5 PPRC pair volumes resumed
dscli> lspprc 2000-2500
ID        State        Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
=======================================================================================================
2200:2200 Copy Pending -      Global Copy 22        120            Disabled      True
2201:2201 Copy Pending -      Global Copy 22        120            Disabled      True
2203:2403 Copy Pending -      Global Copy 22        120            Disabled      True
2204:2404 Copy Pending -      Global Copy 22        120            Disabled      True
2300:2300 Copy Pending -      Global Copy 23        120            Disabled      True
2301:2301 Copy Pending -      Global Copy 23        120            Disabled      True
The output of the lssession command now shows that the volumes in the session have the Copy Pending status (Example 24-6).
Example 24-6 PPRC sessions in Copy Pending status
dscli> lssession 22-23
Date/Time: 30 September 2010 3:44:19 PM IBM DSCLI Version: 6.6.0.284 DS: IBM.2107-75TV181
LSS ID Session Status         Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
=========================================================================================================================
22     01      CG In Progress 2200   Active       Primary Copy Pending Secondary Simplex True              Disable
22     01      CG In Progress 2201   Active       Primary Copy Pending Secondary Simplex True              Disable
22     01      CG In Progress 2203   Active       Primary Copy Pending Secondary Simplex True              Disable
22     01      CG In Progress 2204   Active       Primary Copy Pending Secondary Simplex True              Disable
23     01      CG In Progress 2300   Active       Primary Copy Pending Secondary Simplex True              Disable
23     01      CG In Progress 2301   Active       Primary Copy Pending Secondary Simplex True              Disable
The output of the Global Mirror command showgmir shows the Current Time and the CG Time to be within a few seconds of each other (see Example 24-7). Running the showgmir command again shows that the CG time updates every time a consistency group forms.
Example 24-7 Output of the showgmir command
dscli> showgmir 22
ID                          IBM.2107-75TV181/22
Master Count                1
Master Session ID           0x01
Copy State                  Running
Fatal Reason                Not Fatal
CG Interval Time (seconds)  0
Coord. Time (milliseconds)  50
Max CG Drain Time (seconds) 30
Current Time                06/15/2012 10:39:04 BRT
CG Time                     06/15/2012 10:39:04 BRT
Successful CG Percentage    54
FlashCopy Sequence Number   0x4CA494F8
Master ID                   IBM.2107-75TV181
Subordinate Count           1
Master/Subordinate Assoc    IBM.2107-75TV181/22:IBM.2107-7520781/22
The Global Mirror environment is now re-established and forming Global Mirror consistency groups.
A Global Mirror suspension of this kind at the local site is a much simpler recovery process than a local site failure because there is no failover to the remote site, and you do not need to perform recovery actions for the B and C volumes. Where many LSSs or volumes fail to maintain communications, there is some merit in pausing the Global Mirror environment while the Global Copy pairs resynchronize.

Global Mirror is a two-site solution that can bridge any distance between both sites. There are ready-to-use packages and services available to implement a disaster recovery solution for two-site Remote Copy configurations. IBM offers GDPS and Tivoli Storage Productivity Center for Replication to deliver solutions in this area. For more information about these solutions, see Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685, or visit the IBM website at:
http://www.ibm.com/systems/id/storage/software/center/replication/index.html
Chapter 25. Global Mirror interfaces
Tasks and the DS CLI commands that perform them include the following examples:
- Change a Global Mirror session on an LSS: chsession
- Display a Global Mirror session on an LSS: lssession
- Create a complete Global Mirror environment: mkpprcpath, mkpprc, mkflash, mksession, mkgmir

Other DS CLI commands used to manage, recover, and remove a Global Mirror environment include rmgmir, failoverpprc, failbackpprc, lsflash, lspprc, revertflash, commitflash, reverseflash, resumegmir, pausegmir, showgmir, pausepprc, rmsession, rmflash, rmpprc, and rmpprcpath.
For most DS CLI commands, you must know some (or all) of the following information:
- The serial number and device type of the source and target storage disk systems.
- The worldwide node name (WWNN) of the remote storage disk system.
- The LSS numbers of the source and target volumes.
- The port IDs for the source and target. Up to eight port pair IDs can be specified.

A full establishment of a Global Mirror environment using DS CLI commands can take a long time, especially if you must set up an environment that involves many volumes on many LSSs and in several storage disk systems. Detailed examples of how to set up and manage a Global Mirror environment using DS CLI commands can be found in Chapter 27, Global Mirror examples on page 355. The DS CLI commands are documented in IBM System Storage DS8000 Command-Line Interface User's Guide, SC26-7916.
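The establish sequence can be sketched as an ordered list of DS CLI commands. Every ID below (device, WWNN, ports, LSS, volumes, session) is a placeholder borrowed from the examples in this book, and the exact flags should be verified against the DS CLI documentation before use; in a real setup, each step is confirmed with the matching ls* command before the next one runs.

```shell
# Sketch of the establish order for a Global Mirror environment.
# All IDs are illustrative placeholders; verify each step with the
# matching ls* command (lspprcpath, lspprc, lsflash, lssession, showgmir)
# before running the next one.
gm_establish() {
  echo "mkpprcpath -remotedev IBM.2107-75ABTV1 -remotewwnn 5005076303FFC663 -srclss 10 -tgtlss 20 I0143:I0010"
  echo "mkpprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001"
  echo "mkremoteflash -tgtinhibit -nocp -record -conduit IBM.2107-7520781/10 -dev IBM.2107-75ABTV1 2000-2001:2200-2201"
  echo "mksession -lss IBM.2107-7520781/10 -volume 1000-1001 02"
  echo "mkgmir -lss IBM.2107-7520781/10 -session 02"
}
gm_establish
```

Printing the plan first, rather than executing it directly, keeps a reviewable record of the intended order: paths, then Global Copy pairs, then the B-to-C FlashCopy, then the session, and finally the mkgmir start.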
Further information about the DS GUI: For more examples and uses of the DS GUI, see Chapter 27, Global Mirror examples on page 355.
Figure 25-2 Tivoli Storage Productivity Center for Replication Detailed Session for a Global Mirror configuration with Practice Devices
For a detailed description of Tivoli Storage Productivity Center for Replication concepts and usage, see Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685.
Chapter 26.
Figure 26-1 Normal sequence of I/Os within a Global Mirror configuration
Logical view approximation: What Figure 26-1 and Figure 26-2 show is not necessarily the exact sequence of internal I/O events, but a logical view approximation. There are internal microcode optimization and consolidation techniques that make the entire process more efficient.

An I/O in this context is also a Global Copy I/O, which occurs when Global Copy replicates a track from a Global Copy source volume to a Global Copy target volume. This implies an additional read I/O in the source storage system at the local site and a corresponding write I/O to the target storage system at the remote site. In a Global Mirror environment, the Global Copy target volume is also the FlashCopy source volume.

Figure 26-1 on page 342 roughly summarizes what happens between two consistency group creation points when application writes come in. The application write I/O completes immediately at the local site (1). Asynchronously from the application I/O, Global Copy detects that an update was made and transmits the data to the Global Copy secondary volume at the remote site (2). Before the track is updated on the B volume, it is first preserved on the C volume. Figure 26-1 on page 342 shows the normal sequence of I/Os within a Global Mirror configuration. Only those tracks on B1 that are updated for the first time are physically copied to the C1 volume, because of the nocopy option.

There is some potential impact on the Global Copy data replication operation, depending on whether persistent memory or non-volatile cache is overcommitted in the target storage system. In this situation, the FlashCopy source tracks might be preserved on the target volume C1 before the Global Copy write completes (see Figure 26-2).
Figure 26-2 Global Copy with overcommitted NVS at the remote site
Figure 26-2 roughly summarizes what happens when persistent memory or NVS in the remote storage system is overcommitted. A read (3) and a write (4) to preserve the FlashCopy source track and write it to the C1 volume are required before the write (5) can complete. Then, when the track is updated on the B1 volume, the write (5) operation completes. Most of the I/O operations are quick writes to cache and persistent memory, as shown in Figure 26-1 on page 342.

The first write to a FlashCopy source volume track triggers a bitmap update. This bitmap is the one that was created when the FlashCopy volume pair was created with the start change recording attribute. This update allows only the change recording bitmap to be replicated to the corresponding bitmap for the target volume in the course of forming a consistency group. For more information, see 23.5, Consistency groups on page 296.
Companies use different strategies to plan for a disaster recovery (DR) or contingency site in their business continuity management (BCM). Because of unpredictable situations, such as extreme weather or earthquakes, many BCM plans include secondary or tertiary sites in different locations, away from the primary production site.

Some companies use wavelength division multiplexing (WDM) technologies, such as dense wavelength division multiplexing (DWDM), and others use their existing IP networking infrastructure with Fibre Channel over IP (FC-IP). A DWDM Fibre Channel connection can provide high bandwidth, but within a limited distance range. It can also carry different protocols over the same fiber connection. The technology is ideal for many configurations and is commonly used between locations near each other, for example, a site in a nearby city. The distance range for this technology can be extended further with various additional techniques and methods.

Because DWDM has a high implementation cost and in some cases might not be available, some companies prefer to use their existing IP networking infrastructure, such as ATM and other solutions. Customers that use IP networking typically do so between facilities that are a great distance from each other, such as a recovery site in a different state or a nearby country.

For better management control and a balanced distribution over the connections, SAN switches should be part of these solutions. They communicate with different devices and can handle heavy traffic demands, such as trunks, ISLs, and tunneling. This does not mean, for example, that a DS8800 or DS8700 storage system cannot be directly attached to another one using DWDM. Check interoperability with IBM Support before you implement such a solution.
Using FC-IP over an existing IP networking infrastructure might be ideal because you can use it to link geographically dispersed SANs at relatively low cost. FC-IP is also known as Fibre Channel tunneling or storage tunneling. It is a method that allows Fibre Channel information to be tunneled through the IP network. FC-IP encapsulates Fibre Channel block data and then transports it over a TCP socket, or tunnel. TCP/IP services are used to establish connectivity between remote SANs. Any congestion control and management, and data error and data loss recovery, is handled by TCP/IP services and does not affect FC fabric services.

The major point about FC-IP is that it does not replace FC with IP; it simply allows deployments of FC fabrics using IP tunneling. The main advantages of FC-IP are that it overcomes the distance limitations of native Fibre Channel and enables geographically distributed SANs to be linked using the existing IP infrastructure, while keeping the fabric services intact. Some SAN switches can handle multiple FC-IP ports and can also trunk these ports, allowing the formation of tunnels and load balancing across the connections.

Before you create the connectivity between your DS8000 storage systems, review the following considerations regarding long-distance fabrics with FC and FC-IP. More information about bandwidth and available connection options can be found in IBM System Storage Business Continuity: Part 1 Planning Guide, SG24-6547.

Attention: For redundancy, use two or more physical paths between the local and remote storage systems. These physical paths can carry multiple PPRC paths.
Some tools for bottleneck detection might also be available for FC-IP ports. Typically, FC-IP ports have a commit rate; adjust this setting to reflect the available bandwidth of your connection from one site to the other. Enable write acceleration (also known as fast write) and compression on FC-IP tunnels. These features help with the higher latency of long-distance networks.

Long-distance fabrics and switches: For more information about long-distance fabric and switch mechanisms, see Fabric Resiliency Best Practices, REDP-4722, and Introduction to Storage Area Networks and System Networking, SG24-5470.
For the FlashCopy targets (journals), use disks of the same size and RPM as the primary disks. Do not use disks larger than twice the size of the primary disks, even with the same RPM.
Use extent pools with extent pool striping and Easy Tier to maximize performance. For more information about logical configuration performance considerations and Easy Tier functions, see DS8800 Performance Monitoring and Tuning, SG24-8013. Ensure that extent pools are provisioned from both storage servers. Spread an equal number of B and C volumes across all extent pools:
- LSS 00: B volumes on extent pool P0, C volumes on extent pool P2.
- LSS 02: B volumes on extent pool P2, C volumes on extent pool P0.
- LSS 01: B volumes on extent pool P1, C volumes on extent pool P3.
- LSS 03: B volumes on extent pool P3, C volumes on extent pool P1.
B and C volumes are controlled by the same storage server (CPC), which provides significant benefit to FlashCopy creation and maintenance. B and C volume ranks are controlled by separate DA (Device Adapter) pairs.
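The alternating pool assignment above follows a simple parity pattern, which can be expressed as a small helper when generating volume-creation scripts. This is only a sketch of the pattern in the list above; the pool names P0 through P3 are the placeholders from that list, and real pool names depend on your configuration.

```shell
# Sketch: derive the B/C extent pool pair for an LSS, following the
# alternating pattern above (even LSSs use pools P0/P2, odd LSSs use
# P1/P3, with the B and C pools swapped on every second LSS of each
# parity). Pool names are illustrative placeholders.
pools_for_lss() {  # $1 = LSS number (decimal)
  if [ $(($1 % 2)) -eq 0 ]; then a=P0; b=P2; else a=P1; b=P3; fi
  if [ $((($1 / 2) % 2)) -eq 0 ]; then
    echo "B=$a C=$b"
  else
    echo "B=$b C=$a"
  fi
}

pools_for_lss 0   # B=P0 C=P2
pools_for_lss 2   # B=P2 C=P0
```

Encoding the pattern once avoids transcription mistakes when the same layout is repeated across many LSSs.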
Figure: Consistency group formation. The maximum coordination time bounds the first phase (serialize all Global Copy source volumes), and the maximum drain time bounds the second phase (drain the data, then perform the FlashCopy) across the A1/A2 source, B1/B2 target, and C1/C2 tertiary volumes.
The coordination time can be limited by specifying a number in milliseconds. This value is the maximum time that is allowed for forming the change recording bitmaps; the Out-of-Sync bitmap is then used for the second phase of the consistency group formation, that is, draining the data to the remote site. The intention is to keep this time window as small as possible; the default of 50 ms is an appropriate value for most environments.
The required communication between the master storage system and the subordinate storage systems is inband, over the paths between the master and the subordinates. This communication is performed over FCP links and is highly optimized, which limits the potential application write I/O impact to about 3 ms. At least one FCP link is required between a master storage system and a subordinate; for redundancy, use two FCP links.

The following example illustrates the impact of the coordination time when consistency group formation starts, and whether this impact has the potential to be significant. Assume a total aggregate of 5000 write I/Os per second over two source storage systems, with 2500 write I/Os per second to each storage system. Each write I/O takes 0.5 ms. Assume further that a consistency group is created every 3 seconds. In summary:
- There are 5000 write I/Os per second, with a 0.5-ms response time for each write I/O.
- The coordination time is 3 ms.
- Every 3 seconds, a consistency group is created.

5000 write I/Os per second is 5 I/Os every millisecond, or 15 I/Os within the 3-ms coordination window. Each of these 15 write I/Os experiences a 3-ms delay, which happens every 3 seconds. You should therefore observe an average response time delay of approximately 0.003 ms, as outlined in the following formula:

(15 I/Os * 0.003 sec) / (3 sec * 5000 I/O/sec) = 0.000003 sec, or 0.003 ms

The response time increases on average from 0.5 ms to 0.503 ms.
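The estimate can be checked with a short calculation; the figures (15 delayed I/Os, 3-ms delay each, 15,000 I/Os per interval) come directly from the example above.

```shell
# Reproduce the coordination-time estimate: 15 delayed write I/Os, each
# delayed 3 ms, spread over the 15,000 I/Os issued between two
# consistency groups.
awk 'BEGIN {
  delayed  = 5 * 3        # 5 I/Os per millisecond during the 3-ms window
  extra_ms = delayed * 3  # each delayed I/O waits 3 ms
  total_io = 3 * 5000     # I/Os issued in the 3-second interval
  printf "%.3f ms average added per I/O\n", extra_ms / total_io
}'
# prints: 0.003 ms average added per I/O
```

Substituting your own write rate, coordination time, and consistency group interval gives a quick sanity check for a specific workload.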
Figure: Extent pool layout with Easy Tier enabled on all extent pools. The A source volumes, B target volumes, and C journal volumes are spread across SSD, Enterprise (ENT), and NL SAS ranks in the extent pools of both CPCs.
There are two different options for journal volumes available with a Global Mirror setup: a configuration that uses Space Efficient FlashCopy (FlashCopy SE) and one that uses traditional IBM FlashCopy. The performance characteristics of FlashCopy SE and traditional FlashCopy are different. For considerations about volume types using thin provisioning for the DS8800 and DS8700, see Table 26-2. Since Release 6.3, thin provisioning is fully supported in Copy Services for Open Systems, but ESE volumes are not supported for IBM eServer zSeries volumes.
Table 26-2 Volume type considerations for thin provisioning in a Global Mirror environment

Volume  Volume type source  Volume type target  Considerations and comments
A/B     Standard            Standard            A volumes and B volumes are standard volumes.
A/B     ESE                 ESE                 A volumes and B volumes are Extent Space Efficient (ESE) volumes.
B/C     Standard            ESE                 C volumes release space only with the initial FlashCopy.
B/C     Standard            TSE                 A Track Space Efficient (TSE) C volume is economical and is preferred for space-saving Global Mirror configurations. Be careful with the performance planning and sizing.
B/C     ESE                 Standard            C volumes are standard, which leads to the maximum capacity on the target. This configuration is a good one for performance aspects.
B/C     ESE                 ESE                 C volumes release space only with the initial FlashCopy.
FlashCopy SE is optimized for use cases where only a few tracks on the source are updated during the life of the relationship. Often, Global Mirror is configured to schedule consistency group creation at an interval of a few seconds, which means that a small amount of data is copied to the FlashCopy targets. From this point of view, Global Mirror is an application where FlashCopy SE could be one of the options considered for the C volumes. In contrast, standard FlashCopy generally has superior performance to FlashCopy SE.

The FlashCopy SE repository is critical regarding performance. When you provision a repository, Storage Pool Striping is automatically used with a multi-rank extent pool to balance the load across the available disks. One important point to remember is that TSE repositories are not supported by Easy Tier, which means that the workload planning in this case must be done carefully and should be coordinated with your existing planned extent pools. If you previously planned your Easy Tier extent pools, you might want to create your repositories at the beginning of your logical configuration; if so, consider using multiple repositories.

The repository can be in the same extent pool with other volumes if the number of ranks in the pool is limited to eight or less. The repository extent pool can also contain additional non-repository volumes, which means that your previously planned extent pool does not need to be entirely dedicated to TSE. It is still possible to create separate extent pools with fewer ranks and dedicate them only to your TSE repositories; in this case, use a minimum of four ranks and a maximum of eight ranks. Contention can arise if the extent pool is shared. Easy Tier handles the standard and ESE volumes on extent pools, but no movement across tiers is done for TSE volumes. After the repository is defined, it cannot be expanded, so it is important to plan so that it is large enough.
If the repository fills, the FlashCopy SE relationship fails and the Global Mirror is not able to create consistency groups successfully. For more information, see 27.3.7, Recovering from a suspended state after a repository fills on page 378.
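Because a full repository suspends Global Mirror, a simple threshold check can give early warning. The sketch below assumes you extract the percent-allocated value for the repository yourself (for example, from lssestg -l output; the exact column name varies by DS CLI level), so the helper takes the raw integer percentage.

```shell
# Sketch: threshold alert for a filling TSE repository. The percentage
# is assumed to be extracted beforehand from DS CLI output; the 80%
# threshold below is an illustrative value, not a recommendation.
repo_alert() {  # $1 = percent allocated, $2 = warning threshold
  if [ "$1" -ge "$2" ]; then
    echo "WARNING: repository ${1}% allocated (threshold ${2}%)"
  else
    echo "OK: repository ${1}% allocated"
  fi
}

repo_alert 85 80
```

Running such a check on a schedule gives time to react before the repository fills and the FlashCopy SE relationships fail.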
Example 26-1 Out-of-sync tracks that are shown by the showgmiroos command

dscli> showgmiroos -dev IBM.2107-7520781 -lss 10 -scope si 02
Scope IBM.2107-7520781
Session 02
OutOfSyncTracks 73125
dscli> showgmiroos -dev IBM.2107-7520781 -lss 10 -scope lss 02
Scope IBM.2107-7520781/10
Session 02
OutOfSyncTracks 0
dscli> showgmiroos -dev IBM.2107-7520781 -lss 11 -scope lss 02
Scope IBM.2107-7520781/11
Session 02
OutOfSyncTracks 67847
dscli>

Example 26-2 shows the lspprc command, which reports the out-of-sync tracks at the volume level. Here you see that two volumes in LSS 10 have no out-of-sync tracks and two volumes in LSS 11 have some out-of-sync tracks.
Example 26-2 Out-of-sync tracks that are shown by the lspprc command
When the load distribution in your configuration is unknown, you could develop a rudimentary script that regularly issues the showgmiroos command (as shown in Example 26-1 on page 352) and the lspprc -l command (as shown in Example 26-2). You can then process the output of these commands to better understand the write load distribution over the Global Copy source volumes. The numbers in Example 26-1 on page 352 might show only a brief peak period. It is still feasible to use the conventional approach with I/O performance reports, such as running iostat in the UNIX environment, to investigate the write workload. Tivoli Storage Productivity Center for Disk could also be used to analyze the storage system performance.
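Such a rudimentary script might look as follows. The profile name, device ID, LSS, and session number are placeholders; oos_tracks parses the OutOfSyncTracks line from captured showgmiroos output.

```shell
# Sketch of the monitoring script suggested above. All IDs are
# illustrative placeholders from the examples in this chapter.
oos_tracks() {  # extract OutOfSyncTracks from captured showgmiroos output
  printf '%s\n' "$1" | awk '/OutOfSyncTracks/ { print $2; exit }'
}

# Log the out-of-sync track count with a timestamp once per minute.
poll_oos() {
  while :; do
    out=$(dscli -cfg local.profile showgmiroos -dev IBM.2107-7520781 -lss 10 -scope si 02)
    echo "$(date '+%H:%M:%S') OutOfSyncTracks=$(oos_tracks "$out")"
    sleep 60
  done
}
```

Collecting these samples over a representative period shows whether the out-of-sync counts represent a brief peak or a sustained drain backlog.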
Chapter 27. Global Mirror examples
Figure 27-1 DS8000 configuration in the Global Mirror example
In this configuration, different LSS and LUN numbers are used across the A, B, and C components, so that you can easily identify every element when they are referenced. Symmetrical configuration: In a real environment, to simplify the management of your Global Mirror environment, it is better to maintain a symmetrical configuration in terms of both physical and logical elements.
This procedure is outlined in Example 27-1, where the sequence of commands and the corresponding results are shown.
Example 27-1 Create Global Copy pairs relationships (A to B) << Determine the available fibre links >> dscli> lsavailpprcport -l -remotedev IBM.2107-75ABTV1 -remotewwnn 5005076303FFC663 10:20
Local Port Attached Port Type Switch ID Switch Port
===================================================
I0143      I0010         FCP  NA        NA
I0213      I0140         FCP  NA        NA
dscli>
<< Create paths >> dscli> mkpprcpath -remotedev IBM.2107-75ABTV1 -remotewwnn 5005076303FFC663 -srclss 10 -tgtlss 20 I0143:I0010 I0213:I0140
CMUC00149I mkpprcpath: Remote Mirror and Copy path 10:20 successfully established. dscli>
dscli> mkpprcpath -remotedev IBM.2107-75ABTV1 -remotewwnn 5005076303FFC663 -srclss 11 -tgtlss 21 I0143:I0010 I0213:I0140
CMUC00149I mkpprcpath: Remote Mirror and Copy path 11:21 successfully established. dscli>
<< Create Global Copy pairs >> dscli> mkpprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001 1100-1101:2100-2101
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1000:2000 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1001:2001 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1100:2100 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 1101:2101 successfully created.
<< wait until the Out Of Sync Tracks column shows 0 >>
dscli> lspprc -l 1000-1001 1100-1101
ID        State        Reason Type        Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS
=====================================================================================================================
1000:2000 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              10
1001:2001 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              10
1100:2100 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              11
1101:2101 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              11
dscli>
<< some columns were suppressed in the lspprc output to fit the screen >>
The tasks that you must perform to create the Global Copy relationships in a Global Mirror environment are similar to the tasks shown in Part 5, Global Copy on page 225. For more information, see 21.1, Setting up a Global Copy environment on page 258.
CMUC00173I mkremoteflash: Remote FlashCopy volume pair 2001:2201 successfully created. Use the lsremoteflash command to determine copy completion. dscli> mkremoteflash -tgtinhibit -nocp -record -conduit IBM.2107-7520781/11 -dev IBM.2107-75ABTV1 2100-2101:2300-2301 CMUC00173I mkremoteflash: Remote FlashCopy volume pair 2100:2300 successfully created. Use the lsremoteflash command to determine copy completion. CMUC00173I mkremoteflash: Remote FlashCopy volume pair 2101:2301 successfully created. Use the lsremoteflash command to determine copy completion.
Because the -nocp parameter is specified and the Global Copy initial copy (first pass) completed, no FlashCopy background copy occurs. FlashCopy relationship: You can create this FlashCopy relationship before the initial copy of Global Copy occurs. However, because it leads to unnecessary FlashCopy background I/Os, it is not a preferred practice.
dscli> lssession -l IBM.2107-7520781/10
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
========================================================================================================
10     02      -
dscli> lssession -l IBM.2107-7520781/11
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
========================================================================================================
11     02      -
With the chsession command, you specify the Global Mirror session ID (02 in our example) and the volumes that are to be part of the Global Mirror environment. After you add the A volumes to the session, running lssession shows the volume IDs with their states in the LSS. The volumes' state is Join Pending: the session is not started yet, and the volumes are not yet connected in the Global Mirror relationship.

Adding volumes: At this step, we do not have to do anything to add the B and C volumes to the Global Mirror session. They are automatically recognized by the Global Mirror mechanism through the Global Copy and FlashCopy relationships. As an alternative to the chsession command, you can also add the A volumes by running mksession when you define the Global Mirror session on an LSS (see Example 27-5).
Example 27-5 Add the A volumes when you create a Global Mirror session
dscli> mksession -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -volume 1000-1001 02
CMUC00145I mksession: Session 02 opened successfully.
dscli> mksession -dev IBM.2107-7520781 -lss IBM.2107-7520781/11 -volume 1100-1101 02
CMUC00145I mksession: Session 02 opened successfully.
dscli> lssession -l IBM.2107-7520781/10
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
=================================================================================================================
10     02      Normal 1000   Join Pending Primary Copy Pending Secondary Simplex True    Disable
10     02      Normal 1001   Join Pending Primary Copy Pending Secondary Simplex True    Disable
dscli> lssession -l IBM.2107-7520781/11
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
=================================================================================================================
11     02      Normal 1100   Join Pending Primary Copy Pending Secondary Simplex True    Disable
11     02      Normal 1101   Join Pending Primary Copy Pending Secondary Simplex True    Disable
dscli> mkgmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02
CMUC00162I mkgmir: Global Mirror for session 02 successfully started.
dscli> showgmir -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID                          IBM.2107-7520781/10
Master Count                1
Master Session ID           0x02
Copy State                  Running
Fatal Reason                Not Fatal
CG Interval Time (seconds)  0
Coord. Time (milliseconds)  50
Max CG Drain Time (seconds) 30
Current Time                06/15/2012 10:39:04 BRT
CG Time                     06/15/2012 10:39:04 BRT
Successful CG Percentage    100
FlashCopy Sequence Number   0x43723196
Master ID                   IBM.2107-7520781
Subordinate Count           0
Master/Subordinate Assoc

In the mkgmir command, the LSS specified with the -lss parameter becomes the master. In our example, this master is LSS 10. With this command, we also specify the Global Mirror session ID of the session that we are starting. When you start the Global Mirror session by running mkgmir, you can also set the Global Mirror tuning parameters of the session. Here are some of the parameters you can set:
-cginterval: Specifies how long to wait between the formation of consistency groups. If this number is not specified or is set to zero, consistency groups are formed continuously.
-coordinate: Indicates the maximum time that Global Mirror processing can hold host I/Os in the source disk system to start forming a consistency group.
-drain: Specifies the maximum amount of time in seconds that is allowed for the data to drain to the remote site before the current consistency group fails.
For more information about these tuning parameters, see 23.5.2, Consistency Group parameters on page 298 and Chapter 26, Global Mirror performance and scalability on page 341.
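The way these three tuning parameters bound one consistency-group cycle can be sketched with a small helper. This is an illustration only (the function name and the simple additive model are assumptions, not part of the DS CLI): under the simplifying assumption that a cycle waits the full interval, holds host I/O for the full coordination window, and uses the full drain window, the sum gives a rough upper bound on the cycle time.

```python
def max_cg_cycle_seconds(cginterval_s: float, coordinate_ms: float, drain_s: float) -> float:
    """Rough upper bound on one consistency-group formation cycle.

    Simplifying assumption (illustrative only): a cycle waits cginterval
    seconds, may hold host I/O for up to coordinate milliseconds, and may
    spend up to drain seconds draining data to the remote site.
    """
    return cginterval_s + coordinate_ms / 1000.0 + drain_s

# Values from the showgmir output in this section: interval 0, coord 50 ms, drain 30 s
print(max_cg_cycle_seconds(0, 50, 30))
```

In practice, a cycle that forms consistency groups continuously (interval 0) completes far faster than this bound; the bound is only useful for reasoning about worst-case RPO growth.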
This showgmir command shows the current Global Mirror status. The Copy State field indicates Running, which means that Global Mirror is operating normally. A Fatal state indicates that Global Mirror failed, and the Fatal Reason field shows the reason for the failure. The showgmir command also shows the current time in the Current Time field, which is the time when the DS8000 received this command. The time when the last successful consistency group formed is shown in the CG Time field. You can calculate the current Recovery Point Objective (RPO) for this Global Mirror session from the difference between the Current Time and the CG Time. In the output of the lssession command in Example 27-7, you can see that after you start the Global Mirror session, the VolumeStatus of the A volumes changes from Join Pending to Active.
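The RPO calculation described above (Current Time minus CG Time) is easy to script against showgmir output. The following sketch is a hypothetical helper, not a DS CLI feature; it assumes both timestamps use the showgmir format shown in this section (for example, 06/15/2012 10:39:04 BRT) and the same time zone, so the trailing zone token is ignored.

```python
from datetime import datetime

def rpo_seconds(current_time: str, cg_time: str) -> float:
    """Estimate the current RPO as Current Time minus CG Time.

    Assumes both showgmir timestamps share one time zone; the trailing
    zone token (for example, 'BRT') is dropped before parsing.
    """
    fmt = "%m/%d/%Y %H:%M:%S"

    def parse(ts: str) -> datetime:
        # Keep only the date and time tokens, discard the zone token.
        return datetime.strptime(" ".join(ts.split()[:2]), fmt)

    return (parse(current_time) - parse(cg_time)).total_seconds()

print(rpo_seconds("06/15/2012 10:39:04 BRT", "06/15/2012 10:38:56 BRT"))  # 8.0
```

A monitoring script could poll showgmir, feed the two fields to this function, and alert when the value exceeds the RPO that your service level requires.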
Example 27-7 The A volumes status after you start the Global Mirror session dscli> lssession 10-11
LSS ID Session Status         Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
========================================================================================================================
10     02      CG In Progress 1000   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      CG In Progress 1001   Active       Primary Copy Pending Secondary Simplex True              Disable
11     02      CG In Progress 1100   Active       Primary Copy Pending Secondary Simplex True              Disable
11     02      CG In Progress 1101   Active       Primary Copy Pending Secondary Simplex True              Disable
If you use the -metrics parameter with the showgmir command, you can obtain more detailed metrics for Global Mirror after you start the session (see Example 27-8).
Example 27-8 The showgmir command with the -metrics parameter
dscli> showgmir -metrics -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID                            IBM.2107-7520781/10
Total Failed CG Count         0
Total Successful CG Count     83947
Successful CG Percentage      100
Failed CG after Last Success  0
Last Successful CG Form Time  06/15/2012 10:57:32 BRT
Coord. Time (milliseconds)    50
CG Interval Time (seconds)    0
Max CG Drain Time (seconds)   30
First Failure Control Unit    -
First Failure LSS             -
First Failure Status          No Error
First Failure Reason          -
First Failure Master State    -
Last Failure Control Unit     -
Last Failure LSS              -
Last Failure Status           No Error
Last Failure Reason           -
Last Failure Master State     -
Previous Failure Control Unit -
Previous Failure LSS          -
Previous Failure Status       No Error
Previous Failure Reason       -
Previous Failure Master State -
In Example 27-8 on page 362, the Total Failed CG Count field indicates the number of attempts to form a consistency group that did not complete successfully after you started Global Mirror. The Total Successful CG Count field indicates the total number of consistency groups that completed successfully. First Failure indicates the first failure after you started this session, Last Failure indicates the latest failure, and Previous Failure indicates the failure before the latest one. All this failure information is cleared when you stop Global Mirror and start it again; pausing and resuming the Global Mirror operation does not reset this information.

Depending on the Global Mirror parameters you set and your system environment, the consistency group formation can occasionally fail, and running showgmir -metrics shows the error messages. A typical case is seeing Max Drain Time Exceeded in the showgmir output when the data of the out-of-sync bitmap cannot be drained within the specified time. However, this failure does not mean that you lose consistent data at the remote site, because Global Mirror does not perform the FlashCopy (B to C) for the failed consistency group data. Global Mirror continues to attempt to form more consistency groups without external intervention. If failures continue repeatedly (no more consistency groups are formed), the percentage of successful consistency groups is unacceptable (many failures occur), or the frequency of consistency groups is not meeting your Recovery Point Objective (RPO) requirements, then the failures are a problem and must be addressed.

Another command that is related to Global Mirror is the showgmiroos command. It reports the number of out-of-sync tracks that Global Mirror still must transmit to the remote site at a given moment (the size of the logical track on a DS8000 FB volume is 64 KB).
With the -scope parameter, you select either the Storage Image scope or the LSS scope for the information to be reported (see Example 27-9).
Example 27-9 The showgmiroos command
dscli> showgmiroos -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -scope si 02
Scope           IBM.2107-7520781
Session         02
OutOfSyncTracks 1138
dscli> showgmiroos -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -scope lss 02
Scope           IBM.2107-7520781/10
Session         02
OutOfSyncTracks 303
dscli> showgmiroos -dev IBM.2107-7520781 -lss IBM.2107-7520781/11 -scope lss 02
Scope           IBM.2107-7520781/11
Session         02
OutOfSyncTracks 0
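Because the logical track of a DS8000 FB volume is 64 KB, the OutOfSyncTracks count can be turned into a data backlog and a rough drain estimate. The following sketch is illustrative: the helper names and the naive throughput model are assumptions, and a real drain time also depends on ongoing write activity and link sharing.

```python
def oos_bytes(tracks: int, track_kib: int = 64) -> int:
    """Out-of-sync backlog in bytes (DS8000 FB logical track = 64 KB)."""
    return tracks * track_kib * 1024

def drain_estimate_seconds(tracks: int, link_mib_per_s: float) -> float:
    """Naive drain-time estimate at a given effective link throughput.

    Hypothetical helper: assumes the whole backlog moves at a constant
    effective rate with no new writes arriving.
    """
    return oos_bytes(tracks) / (link_mib_per_s * 1024 * 1024)

# 1138 tracks is the storage-image scope value from the showgmiroos output above.
print(oos_bytes(1138))
print(round(drain_estimate_seconds(1138, 100), 2))
```

Comparing this estimate against the Max CG Drain Time setting (30 seconds in our examples) gives a quick feel for whether a backlog is likely to trigger Max Drain Time Exceeded failures.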
Figure 27-2 shows the example DS8000 configuration where you have a total of eight A volumes (four on DS8000 #1 and four on DS8000 #3) at the local site that participate in the Global Mirror session. You have one DS8000 (DS8000 #2) at the remote site with the corresponding B and C volumes.
[Figure 27-2 diagram: the A volumes on DS8000 #1 (-dev IBM.2107-7520781; LSS10 with volumes 1000-1001 as the Global Mirror master, and LSS11 with 1100-1101) and on the subordinate DS8000 #3, with a Global Mirror control path from DS8000 #1 to DS8000 #3 and Global Copy pairs to the B volumes on DS8000 #2, which in turn are FlashCopy sources for the C volumes.]
Figure 27-2 Start Global Mirror session with a subordinate (DS8000 #3)
Example 27-10 shows how to start a Global Mirror configuration with a subordinate. The example does not show how to set up the Global Copy and FlashCopy relationships because these steps are the same as in a non-subordinate situation.
Example 27-10 Start a Global Mirror session when there is a subordinate
<< Create Global Mirror control path between DS8000#1 and DS8000#3 >>
dscli> lsavailpprcport -l -remotedev IBM.2107-7503461 -remotewwnn 5005076303FFC08F 10:90
Local Port Attached Port Type Switch ID Switch Port
===================================================
I0001      I0031         FCP  NA        NA
I0101      I0101         FCP  NA        NA
dscli> mkpprcpath -remotedev IBM.2107-7503461 -remotewwnn 5005076303FFC08F -srclss 10 -tgtlss 90 i0001:i0031 i0101:i0101
CMUC00149I mkpprcpath: Remote Mirror and Copy path 10:90 successfully established.
dscli> lspprcpath -fullid 10
Src                 Tgt                 State   SS   Port                   Attached Port          Tgt WWNN
===================================================================================================================
IBM.2107-7520781/10 IBM.2107-75ABTV1/20 Success FF20 IBM.2107-7520781/I0143 IBM.2107-75ABTV1/I0010 5005076303FFC663
IBM.2107-7520781/10 IBM.2107-75ABTV1/20 Success FF20 IBM.2107-7520781/I0213 IBM.2107-75ABTV1/I0140 5005076303FFC663
IBM.2107-7520781/10 IBM.2107-7503461/90 Success FF90 IBM.2107-7520781/I0001 IBM.2107-7503461/I0031 5005076303FFC08F
IBM.2107-7520781/10 IBM.2107-7503461/90 Success FF90 IBM.2107-7520781/I0101 IBM.2107-7503461/I0101 5005076303FFC08F
<< Setup the Global Mirror environment between DS8000#3 and DS8000#2 (these steps are NOT shown here) >>
<< Start Global Mirror with a subordinate (DS8000#3) >>
dscli> mkgmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02 IBM.2107-7520781/10:IBM.2107-7503461/90
CMUC00162I mkgmir: Global Mirror for session 02 successfully started.
dscli> showgmir -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID                          IBM.2107-7520781/10
Master Count                1
Master Session ID           0x02
Copy State                  Running
Fatal Reason                Not Fatal
CG Interval Time (seconds)  0
Coord. Time (milliseconds)  50
Max CG Drain Time (seconds) 30
Current Time                06/15/2012 10:39:04 BRT
CG Time                     06/15/2012 10:39:04 BRT
Successful CG Percentage    100
FlashCopy Sequence Number   0x4373384A
Master ID                   IBM.2107-7520781
Subordinate Count           1
Master/Subordinate Assoc    IBM.2107-7520781/10:IBM.2107-7503461/90
Example 27-11 shows how to end Global Mirror session 02 processing. Although this command might interrupt the formation of a consistency group, every attempt is made to preserve the previous consistent copy of the data on the FlashCopy target volumes. If, because of failures, this command cannot complete without compromising the consistent copy, the command stops processing and an error code is issued. If this error occurs, rerun rmgmir with the -force parameter to force the command to stop the Global Mirror process.
Example 27-11 Terminate Global Mirror
dscli> showgmir -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID                          IBM.2107-7520781/10
Master Count                1
Master Session ID           0x02
Copy State                  Running
Fatal Reason                Not Fatal
CG Interval Time (seconds)  0
Coord. Time (milliseconds)  50
Max CG Drain Time (seconds) 30
Current Time                06/15/2012 10:39:04 BRT
CG Time                     06/15/2012 10:39:04 BRT
Successful CG Percentage    100
FlashCopy Sequence Number   0x4373481A
Master ID                   IBM.2107-7520781
Subordinate Count           0
Master/Subordinate Assoc
dscli> rmgmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02
CMUC00166W rmgmir: Are you sure you want to stop the Global Mirror session 02:? [y/n]:y
CMUC00165I rmgmir: Global Mirror for session 02 successfully stopped.
dscli> showgmir -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID                          IBM.2107-7520781/10
Master Count
Master Session ID
Copy State
Fatal Reason
CG Interval (seconds)
XDC Interval(milliseconds)
CG Drain Time (seconds)
Current Time
CG Time
Successful CG Percentage
FlashCopy Sequence Number
Master ID
Subordinate Count
Master/Subordinate Assoc

If the Global Mirror configuration had subordinates, then to end Global Mirror, you must also specify the Global Mirror control path information when you run rmgmir. Otherwise, the command fails, as shown in Example 27-12.
Example 27-12 Terminate Global Mirror when there is a subordinate
dscli> rmgmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02
CMUC00166W rmgmir: Are you sure you want to stop the Global Mirror session 02:? [y/n]:y
CMUN03067E rmgmir: Copy Services operation failure: configuration does not exist
dscli> rmgmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02 IBM.2107-7520781/10:IBM.2107-7503461/90
CMUC00166W rmgmir: Are you sure you want to stop the Global Mirror session 02:? [y/n]:y
CMUC00165I rmgmir: Global Mirror for session 02 successfully stopped.
Before you close the Global Mirror session on the LSS, remove all the A volumes from the Global Mirror session on that LSS, or the rmsession command fails (see Example 27-15).
Example 27-15 The rmsession command fails when Global Mirror volumes were not previously removed
dscli> lssession 10-11
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
===========================================================================================================
10     02      Normal 1000   Join Pending Primary Copy Pending Secondary Simplex True              Disable
10     02      Normal 1001   Join Pending Primary Copy Pending Secondary Simplex True              Disable
11     02      Normal 1100   Join Pending Primary Copy Pending Secondary Simplex True              Disable
11     02      Normal 1101   Join Pending Primary Copy Pending Secondary Simplex True              Disable
dscli> rmsession -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 02
CMUC00148W rmsession: Are you sure you want to close session 02? [y/n]:y
CMUN03107E rmsession: Copy Services operation failure: volumes in session
dscli> lspprcpath 10-11
Src Tgt State   SS   Port  Attached Port Tgt WWNN
=========================================================
10  20  Success FF20 I0143 I0010         5005076303FFC663
10  20  Success FF20 I0213 I0140         5005076303FFC663
11  21  Success FF21 I0143 I0010         5005076303FFC663
11  21  Success FF21 I0213 I0140         5005076303FFC663
dscli> rmpprcpath -quiet -remotedev IBM.2107-75ABTV1 10:20 11:21
CMUC00150I rmpprcpath: Remote Mirror and Copy path 10:20 successfully removed.
CMUC00150I rmpprcpath: Remote Mirror and Copy path 11:21 successfully removed.
dscli> pausegmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02
CMUC00163I pausegmir: Global Mirror for session 02 successfully paused.
dscli> showgmir -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID                          IBM.2107-7520781/10
Master Count                1
Master Session ID           0x02
Copy State                  Paused
Fatal Reason                Not Fatal
CG Interval Time (seconds)  0
Coord. Time (milliseconds)  50
Max CG Drain Time (seconds) 30
Current Time                06/15/2012 10:39:04 BRT
CG Time                     06/15/2012 10:39:04 BRT
Successful CG Percentage    100
FlashCopy Sequence Number   0x43736656
Master ID                   IBM.2107-7520781
Subordinate Count           0
Master/Subordinate Assoc
dscli> lssession 10-11
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
=================================================================================================================
10     02      Normal 1000   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      Normal 1001   Active       Primary Copy Pending Secondary Simplex True              Disable
11     02      Normal 1100   Active       Primary Copy Pending Secondary Simplex True              Disable
11     02      Normal 1101   Active       Primary Copy Pending Secondary Simplex True              Disable
dscli> lsremoteflash
370
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled Background
========================================================================================================================
2000:2200 20     43736656    Disabled   Enabled   Enabled    Disabled   Enabled            Disabled           Disabled
2001:2201 20     43736656    Disabled   Enabled   Enabled    Disabled   Enabled            Disabled           Disabled
dscli> lsremoteflash
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled Background
========================================================================================================================
2100:2300 21     43736656    Disabled   Enabled   Enabled    Disabled   Enabled            Disabled           Disabled
2101:2301 21     43736656    Disabled   Enabled   Enabled    Disabled   Enabled            Disabled           Disabled
dscli>
dscli> lsremoteflash
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled Background
========================================================================================================================
2000:2200 20     43736656    Disabled   Enabled   Enabled    Disabled   Enabled            Disabled           Disabled
2001:2201 20     43736656    Disabled   Enabled   Enabled    Disabled   Enabled            Disabled           Disabled
dscli> lsremoteflash
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled Background
========================================================================================================================
2100:2300 21     43736656    Disabled   Enabled   Enabled    Disabled   Enabled            Disabled           Disabled
2101:2301 21     43736656    Disabled   Enabled   Enabled    Disabled   Enabled            Disabled           Disabled
dscli>
Data transfer: The pausegmir command does not influence the Global Copy data transfer.

When you pause a Global Mirror session by running pausegmir, the command completes the consistency group formation that is in progress before it pauses the session. This behavior is slightly different from that of the rmgmir command, which is described in 27.3.3, Stopping and starting Global Mirror on page 373. The Status that is shown by the lssession command changes from CG In Progress, which means that a consistency group for the session is in progress, to Normal, which means that the session is in a normal Global Copy state. In fact, you also see this state (Normal) between the time when a FlashCopy is taken and the next Global Copy consistency group formation time.

FlashCopy sequence number: The FlashCopy sequence number is not changed after you pause Global Mirror because the FlashCopy at the remote site is not done (see Example 27-18).

The resumegmir command resumes Global Mirror processing for a specified session (see Example 27-19). Consistency group formation is resumed.
Example 27-19 Resume Global Mirror processing
dscli> resumegmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02
CMUC00164I resumegmir: Global Mirror for session 02 successfully resumed.
dscli> showgmir -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID                          IBM.2107-7520781/10
Master Count                1
Master Session ID           0x02
Copy State                  Running
Fatal Reason                Not Fatal
CG Interval Time (seconds)  0
Coord. Time (milliseconds)  50
Max CG Drain Time (seconds) 30
Current Time
CG Time
Successful CG Percentage
FlashCopy Sequence Number
Master ID
Subordinate Count
Master/Subordinate Assoc
<< some columns were suppressed in lspprc output to fit the screen >>
dscli> lspprc -l 1002
ID        State        Reason Type        Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS
========================================================================================================================
1002:2002 Copy Pending -      Global Copy 0                  Disabled Disabled    invalid     -              10
<< some columns were suppressed in lspprc output to fit the screen >>
dscli> mkremoteflash -tgtinhibit -nocp -record -conduit IBM.2107-7520781/10 -dev IBM.2107-75ABTV1 2002:2202
CMUC00173I mkremoteflash: Remote FlashCopy volume pair 2002:2202 successfully created. Use the lsremoteflash command to determine copy completion.
dscli> lsremoteflash -conduit IBM.2107-7520781/10 -dev IBM.2107-75ABTV1 2002:2202
ID        SrcLSS SequenceNum ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled
===========================================================================================================
2002:2202 20     0           Disabled   Enabled   Enabled    Disabled   Enabled            Disabled
<< Add the A volume to Global Mirror >>
dscli> lssession 10
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
===========================================================================================================
10     02      Normal 1000   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      Normal 1001   Active       Primary Copy Pending Secondary Simplex True              Disable
dscli> chsession -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -action add -volume 1002 02
CMUC00147I chsession: Session 02 successfully modified.
dscli> lssession 10
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
===========================================================================================================
10     02      Normal 1000   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      Normal 1001   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      Normal 1002   Join Pending Primary Copy Pending Secondary Simplex True              Disable
dscli> lssession 10
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
===========================================================================================================
10     02      Normal 1000   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      Normal 1001   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      Normal 1002   Active       Primary Copy Pending Secondary Simplex True              Disable
Attention: To be added to a Global Mirror session, the A volumes can be in any state, such as simplex (no relationship), Copy Pending, or suspended. Volumes that have not completed their initial copy phase (also called the first pass) stay in the Join Pending state until the first pass is complete. You can check the first pass status by running lspprc -l, as shown in Example 27-23. The First Pass Status field reports this information, where True means that the Global Copy first pass is complete.
Example 27-23 Check the first pass completion for the Global Copy initial copy
dscli> lspprc -l -fmt stanza 1002
ID                 1002:2002
State              Copy Pending
Reason             -
Type               Global Copy
Out Of Sync Tracks 0
Tgt Read           Enabled
Src Cascade        Disabled
Tgt Cascade        Invalid
Date Suspended     -
SourceLSS          B1
Timeout (secs)     60
Critical Mode      Disabled
First Pass Status  True
Incremental Resync Disabled
Tgt Write          Disabled
GMIR CG            N/A
PPRC CG            Disabled
isTgtSE            Unknown
DisableAutoResync  False
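The stanza format shown in Example 27-23 is convenient to parse in a script, for example to wait until First Pass Status reports True before adding a volume to the session. The following sketch is a hypothetical helper (not a DS CLI feature); it assumes each stanza line ends with a single value token, which holds for flag fields such as First Pass Status.

```python
def parse_stanza(output: str) -> dict:
    """Parse 'lspprc -l -fmt stanza' output into a field/value map.

    Assumption: each line is 'Field Name   Value' and the value is the
    last whitespace-separated token, which is sufficient for flag
    fields such as 'First Pass Status'.
    """
    fields = {}
    for line in output.strip().splitlines():
        parts = line.rsplit(None, 1)  # split off the last token only
        if len(parts) == 2:
            fields[parts[0].strip()] = parts[1]
    return fields

sample = """ID 1002:2002
First Pass Status True
Out Of Sync Tracks 0"""
info = parse_stanza(sample)
print(info["First Pass Status"])
```

A polling loop could run lspprc -l -fmt stanza, feed the output to this function, and proceed only once the field equals "True".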
To remove an A volume from the Global Mirror environment, run chsession -action remove -volume. First, remove the A volume from the Global Mirror session and then remove its Global Copy and FlashCopy relationships (see Example 27-24).
Example 27-24 Remove an A volume from the Global Mirror environment
dscli> lssession 10-11
LSS ID Session Status         Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
========================================================================================================================
10     02      CG In Progress 1000   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      CG In Progress 1001   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      CG In Progress 1002   Active       Primary Copy Pending Secondary Simplex True              Disable
11     02      CG In Progress 1100   Active       Primary Copy Pending Secondary Simplex True              Disable
11     02      CG In Progress 1101   Active       Primary Copy Pending Secondary Simplex True              Disable
dscli>
Attention: Suspending or removing even one Global Copy pair that belongs to an active Global Mirror session impacts the formation of consistency groups. If you suspend or remove the Global Copy relationship from the A volume without removing the volume from the Global Mirror session, consistency group formation fails, and periodic SNMP alerts are issued.
<< some columns were suppressed in lspprc output to fit the screen >>
<< some columns were suppressed in lspprc output to fit the screen >>
dscli> mkremoteflash -tgtinhibit -nocp -record -conduit IBM.2107-7520781/10 -dev IBM.2107-75ABTV1 2400:2600
CMUN03044E mkremoteflash: 2400:2600: Copy Services operation failure: path not available
dscli> mkremoteflash -tgtinhibit -nocp -record -conduit IBM.2107-7520781/12 -dev IBM.2107-75ABTV1 2400:2600
CMUC00173I mkremoteflash: Remote FlashCopy volume pair 2400:2600 successfully created. Use the lsremoteflash command to determine copy completion.
<< Add an LSS and an A volume to the Global Mirror >>
dscli> lssession 12
CMUC00234I lssession: No Session found.
dscli> lssession 12
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
=================================================================================================================
12     02      Normal 1200   Join Pending Primary Copy Pending Secondary Simplex True              Disable
When you remove an LSS from a Global Mirror environment, you must first remove all the A volumes on the LSS by running chsession and then remove the LSS by running rmsession (see Example 27-26).
Example 27-26 Remove an LSS from a Global Mirror session
dscli> lssession 10-12
LSS ID Session Status         Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete AllowCascading
========================================================================================================================
10     02      CG In Progress 1000   Active       Primary Copy Pending Secondary Simplex True              Disable
10     02      CG In Progress 1001   Active       Primary Copy Pending Secondary Simplex True              Disable
11     02      CG In Progress 1100   Active       Primary Copy Pending Secondary Simplex True              Disable
11     02      CG In Progress 1101   Active       Primary Copy Pending Secondary Simplex True              Disable
12     02      CG In Progress 1200   Active       Primary Copy Pending Secondary Simplex True              Disable
dscli>
e. Resume the Global Copy relationship by running resumepprc on the DS CLI.
f. Check the Global Mirror session status and failures by running showgmir.

Failover to secondary site

This option is suggested for preserving the consistency group that is already at the secondary site. The application does not have to be quiesced to perform this recovery, but if updates are allowed to continue at the primary site, the Recovery Point Objective (RPO) between the local and remote sites grows. This situation might be acceptable if quiescing the local systems is not an option. The steps are similar to the process that is described in 27.4.1, Summary of the recovery scenario on page 380, except that the FlashCopy is performed on different volumes and a failback from the primary site to the secondary site is required, which is accomplished by completing the following steps:
a. Perform a Global Copy failover from B to A.
b. Verify that there is a valid consistency group state.
c. Create consistent data on the B volumes (perform a Reverse FlashCopy from B to C).
d. Perform a Global Copy failback from A to B.
e. Re-establish the FlashCopy relationship from B to C.
f. Resume Global Copy.
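The failover-to-secondary steps above can be captured as an ordered checklist for scripting or review. The following sketch is purely illustrative: the function name is hypothetical, the dscli command names attached to steps b, c, and e are our assumptions based on the examples in this chapter, and no real device or volume IDs are included.

```python
def failover_to_secondary_plan():
    """Ordered steps of the 'failover to secondary site' option.

    Illustrative checklist only: the third element of each tuple is the
    dscli command that this chapter's examples suggest for the step (an
    assumption, not an official mapping); fill in real device and
    volume IDs before acting on any step.
    """
    return [
        ("a", "Perform a Global Copy failover from B to A", "failoverpprc"),
        ("b", "Verify that there is a valid consistency group state", "lsflash"),
        ("c", "Create consistent data on the B volumes", "revertflash/commitflash"),
        ("d", "Perform a Global Copy failback from A to B", "failbackpprc"),
        ("e", "Re-establish the FlashCopy relationship from B to C", "mkremoteflash"),
        ("f", "Resume Global Copy", "resumepprc"),
    ]

for step, action, cmd in failover_to_secondary_plan():
    print(f"{step}. {action} [{cmd}]")
```

Keeping the plan as data makes it easy to log progress step by step during a recovery drill.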
27.4 Recovery scenario after a local site failure using the DS CLI
The example that is presented in this section illustrates how to perform the required steps to recover from a production site failure using DS CLI commands. For a detailed description of the internal Global Mirror considerations, see 24.8, Recovery scenario after a site failure on page 318. For this example, we use the configuration that we set up in 27.1, Setting up a Global Mirror environment using the DS CLI on page 356.
[Figure 27-3 diagram: on DS8000 #1 (-dev IBM.2107-7520781), the A volumes 1000-1001 (LSS10, the GM master) and 1100-1101 (LSS11) are Global Copy sources in the Copy Pending state. On DS8000 #2 (-dev IBM.2107-75ABTV1), the B volumes 2000-2001 (LSS20) and 2100-2101 (LSS21) are Global Copy targets in the Copy Pending state and FlashCopy sources for the C volumes 2200-2201 (LSS22) and 2300-2301 (LSS23).]
Figure 27-3 Global Mirror example before an unplanned production site failure
[Diagram: after failoverpprc from B to A, the B volumes 2000-2001 (LSS20) and 2100-2101 (LSS21) on DS8000 #2 (-dev IBM.2107-75ABTV1) become the Global Copy source in the Suspended state while remaining FlashCopy sources for the C volumes 2200-2201 (LSS22) and 2300-2301 (LSS23); the A volumes on DS8000 #1 (-dev IBM.2107-7520781) still show Global Copy source Copy Pending.]
Example 27-28 shows the command for this operation. You can check the result with the lspprc command.
Example 27-28 failoverpprc command example
dscli> lspprc 2000-2001 2100-2101
ID        State               Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
1000:2000 Target Copy Pending -      Global Copy 10        unknown        Disabled      Invalid
1001:2001 Target Copy Pending -      Global Copy 10        unknown        Disabled      Invalid
1100:2100 Target Copy Pending -      Global Copy 11        unknown        Disabled      Invalid
1101:2101 Target Copy Pending -      Global Copy 11        unknown        Disabled      Invalid
dscli> failoverpprc -remotedev IBM.2107-7520781 -type gcp 2000-2001:1000-1001 2100-2101:1100-1101
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2000:1000 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2001:1001 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2100:1100 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2101:1101 successfully reversed.
dscli> lspprc 2000-2001 2100-2101
ID        State     Reason      Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
2000:1000 Suspended Host Source Global Copy 20        unknown        Disabled      True
2001:1001 Suspended Host Source Global Copy 20        unknown        Disabled      True
2100:1100 Suspended Host Source Global Copy 21        unknown        Disabled      True
2101:1101 Suspended Host Source Global Copy 21        unknown        Disabled      True
Example 27-30 shows a hypothetical situation where all FlashCopy relationships are in the revertible state and have the same sequence number. In this case, run revertflash for all the FlashCopy relationships.
Example 27-30 If all pairs are revertible and the SeqNum values are equal, then perform a revertflash
dscli> lsflash 2000-2001 2100-2101
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
2000:2200 20     437895B2    300     Disabled   Enabled   Enabled    Enabled    Disabled           Disabled           Disabled
2001:2201 20     437895B2    300     Disabled   Enabled   Enabled    Enabled    Disabled           Disabled           Disabled
2100:2300 21     437895B2    300     Disabled   Enabled   Enabled    Enabled    Disabled           Disabled           Disabled
2101:2301 21     437895B2    300     Disabled   Enabled   Enabled    Enabled    Disabled           Disabled           Disabled
If some FlashCopy pairs are revertible and others are not revertible while their sequence numbers are equal, you should run commitflash for the FlashCopy relationships that have the revertible status (see Example 27-31). When a FlashCopy relationship is not in a revertible state, the commit operation is not possible; when you issue this command to FlashCopy pairs that are non-revertible, you see only an error message, and no action is performed. To speed up this process, you can issue a commitflash command to all FlashCopy pairs. In Example 27-31, volumes 2000 and 2001 are not in the revertible state, so we see error messages.
Example 27-31 If some pairs are revertible and the SeqNum are equal, then perform a commitflash

dscli> lsflash 2000-2001 2100-2101
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
2000:2200 20 437895B2 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2001:2201 20 437895B2 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2100:2300 21 437895B2 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled
2101:2301 21 437895B2 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled
Example 27-32 shows a hypothetical situation where some FlashCopy pairs are revertible and others are non-revertible. The revertible FlashCopy pairs' sequence numbers are equal among themselves, and the non-revertible pairs' sequence numbers are also equal, but the two groups do not match. In this case, issue a revertflash command to the FlashCopy relationships that have the revertible status. When a FlashCopy relationship is non-revertible, the revert operation is not possible: if you issue this command against non-revertible pairs, you see only an error message and no action is performed. To speed up the process, you can therefore issue a revertflash command against all FlashCopy pairs. In Example 27-32, 2000 and 2001 are not in the revertible state, so you see error messages for them.
Example 27-32 If some are revertible and the SeqNum are not equal, then perform a revertflash

dscli> lsflash 2000-2001 2100-2101
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
2000:2200 20 437895B2 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2001:2201 20 437895B2 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2100:2300 21 437895B3 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled
2101:2301 21 437895B3 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
2000:2200 20 437895B2 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2001:2201 20 437895B2 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2100:2300 21 437895B2 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2101:2301 21 437895B2 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
After these actions complete, all FlashCopy pairs are non-revertible and all sequence numbers are equal, so you can proceed to the next step.
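The decision logic above (revert when all relationships are revertible or when the two groups' sequence numbers differ, commit when revertible and non-revertible relationships share a sequence number) can be captured in a short script. The sketch below is not part of the DS CLI; it assumes you have already parsed the lsflash output into (pair ID, revertible flag, sequence number) tuples with your own wrapper around dscli.

```python
# Hypothetical helper: decide which FlashCopy cleanup action the Global
# Mirror recovery procedure calls for, based on the Revertible flag and
# SequenceNum of each relationship as reported by lsflash.
from typing import List, Tuple

def flashcopy_cleanup_action(pairs: List[Tuple[str, bool, str]]) -> str:
    """pairs: (pair_id, revertible, sequence_num) per relationship.

    Returns "none", "revertflash", or "commitflash" following the
    rules in the text:
      - all pairs revertible, same SeqNum        -> revertflash
      - mixed, revertible SeqNum == the others'  -> commitflash
      - mixed, revertible SeqNum != the others'  -> revertflash
    """
    revertible = [p for p in pairs if p[1]]
    non_revertible = [p for p in pairs if not p[1]]
    if not revertible:
        return "none"  # nothing to commit or revert
    rev_seqs = {p[2] for p in revertible}
    if len(rev_seqs) != 1:
        raise ValueError("revertible pairs with differing SeqNum")
    if not non_revertible:
        return "revertflash"  # all revertible with one SeqNum
    nonrev_seqs = {p[2] for p in non_revertible}
    # Same SeqNum on both sides means the consistency group completed,
    # so commit it; otherwise revert to the previous consistent state.
    return "commitflash" if rev_seqs == nonrev_seqs else "revertflash"
```

You would then issue the returned command against the revertible relationships only; issuing it against non-revertible pairs merely produces error messages.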
Figure 27-5 shows the remote DS8000 environment after reverseflash is run. After you run this command and before the C to B background copy is completed, the C volumes become the FlashCopy source and the B volumes become the FlashCopy target.
Figure 27-5 Site swap scenario after the reverseflash command is run (reverseflash B to C: on DS8000#1, A volumes 1000-1001 and 1100-1101 are GC Source Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Source Suspended and FC Target; C volumes 2200-2201 and 2300-2301 are FC Source, with a background copy running from C to B)
Example 27-33 shows the results of the reverseflash command. The lsflash command shows volume 2200 (C volume) as the FlashCopy source.
Example 27-33 The reverseflash from B to C

dscli> reverseflash -fast -tgtpprc 2000-2001:2200-2201 2100-2101:2300-2301
CMUC00169I reverseflash: FlashCopy volume pair 2000:2200 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2001:2201 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2100:2300 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2101:2301 successfully reversed.
dscli> lsflash 2000-2001 2100-2101
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
2200:2000 22 4374ABB7 300 Enabled Disabled Disabled Disabled Enabled Enabled Enabled
2201:2001 22 4374ABB7 300 Enabled Disabled Disabled Disabled Enabled Enabled Enabled
2300:2100 23 4374ABB7 300 Enabled Disabled Disabled Disabled Enabled Enabled Enabled
2301:2101 23 4374ABB7 300 Enabled Disabled Disabled Disabled Enabled Enabled Enabled
dscli>
The Fast Reverse Restore (FRR) operation performs a background copy of all tracks that changed on the B volumes since the last consistency group was formed, which makes the B volumes equal to the image that was present on the C volumes. This is the logical view; from the physical data placement point of view, the C volumes no longer contain meaningful data after the FlashCopy relationship ends. Because you do not specify the -persist parameter, the FlashCopy relationship ends after the background copy from C to B completes, as shown in Figure 27-6.
[Figure 27-6: after the background copy completes, on DS8000#1 the A volumes remain GC Source Copy Pending; on DS8000#2 the B volumes are GC Source Suspended with FC None, and the C volumes show FC None]
You must wait until all FRR operations and their background copy complete successfully before you proceed with the next step. When the background copy completes, the FlashCopy relationship ends. Therefore, you should check whether any FlashCopy relationships remain in order to determine when all FRR operations are complete, as shown in Example 27-34. This example shows the result of the lsflash command after the reverseflash background copy completes.
Example 27-34 The lsflash command to confirm the completion of the background copy
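As a minimal sketch of this wait, the loop below polls until no FlashCopy relationships remain. It is hypothetical helper code, not an IBM utility: query stands in for whatever function you use to run lsflash and return its result rows.

```python
# Sketch: wait until the Fast Reverse Restore background copy has
# finished. When the copy completes the FlashCopy relationships end,
# so an empty lsflash result means the FRR operations are done.
import time

def wait_for_frr(query, interval_s: float = 30.0, max_polls: int = 120) -> bool:
    """Return True once no FlashCopy relationships remain.

    query: callable returning the current list of lsflash rows.
    """
    for _ in range(max_polls):
        if not query():          # no rows left: background copy finished
            return True
        time.sleep(interval_s)
    return False                  # still copying after the polling budget
```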
[Figure: mkflash B to C — on DS8000#1, A volumes 1000-1001 and 1100-1101 are GC Source Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Source Suspended and FC Source; C volumes are FC Target]
This step prepares for returning production to the local site. The mkflash command that is used in this step is illustrated in Example 27-35, and is the same FlashCopy command you might have used when you initially created the Global Mirror environment in 23.4.4, Introducing FlashCopy on page 292. In a disaster situation, you might prefer not to use the -nocp option for the FlashCopy from B to C: performing the full background copy removes the FlashCopy I/O processing impact when the application starts.
Example 27-35 Re-establish the FlashCopy relationships from B to C

dscli> mkflash -tgtinhibit -nocp -record 2000-2001:2200-2201 2100-2101:2300-2301
CMUC00137I mkflash: FlashCopy pair 2000:2200 successfully created.
CMUC00137I mkflash: FlashCopy pair 2001:2201 successfully created.
CMUC00137I mkflash: FlashCopy pair 2100:2300 successfully created.
CMUC00137I mkflash: FlashCopy pair 2101:2301 successfully created.
dscli> lsflash 2000-2001 2100-2101
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
2000:2200 20 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2001:2201 20 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2100:2300 21 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2101:2301 21 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
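If you script this step, you can confirm that every pair reported CMUC00137I before continuing. The helper below is a hypothetical sketch; it compares the pairs you expected to create against the success messages in the captured dscli output.

```python
# Sketch: check captured mkflash output for the CMUC00137I success
# message of every expected FlashCopy pair, and return the pairs that
# did not report success.
import re

def mkflash_failures(output: str, expected_pairs):
    created = set(re.findall(
        r"CMUC00137I mkflash: FlashCopy pair (\S+) successfully created",
        output))
    return sorted(set(expected_pairs) - created)
```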
[Figure: Start application I/O at the remote site — on DS8000#1, A volumes 1000-1001 and 1100-1101 are GC Source Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Source Suspended and FC Source; C volumes are FC Target]
dscli> mkpprcpath -remotedev IBM.2107-7520781 -remotewwnn 5005076303FFC1A5 -srclss 20 -tgtlss 10 I0010:I0143 I0140:I0213
CMUC00149I mkpprcpath: Remote Mirror and Copy path 20:10 successfully established.
dscli> mkpprcpath -remotedev IBM.2107-7520781 -remotewwnn 5005076303FFC1A5 -srclss 21 -tgtlss 11 I0010:I0143 I0140:I0213
CMUC00149I mkpprcpath: Remote Mirror and Copy path 21:11 successfully established.
This process changes the A volume from its previous state (source) Copy Pending to target Copy Pending (see Figure 27-9). You must use the -type gcp parameter with the failbackpprc command to request Global Copy mode.
Figure 27-9 Site swap scenario after Global Copy failback from B to A (failbackpprc B to A, application running at the remote site: on DS8000#1, A volumes 1000-1001 and 1100-1101 are GC Target Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Source Copy Pending and FC Source; C volumes are FC Target)
The failbackpprc initialization mode resynchronizes the volumes in this manner:
1. If a volume at the production site is in the simplex state (no relationship), all of the data for that volume is sent from the recovery site to the production site.
2. If a volume at the production site is in the Copy Pending or suspended state without changed tracks, only the modified data on the volume at the recovery site is sent to the volume at the production site.
3. If a volume at the production site is in a suspended state and has tracks on which data was written, the volume at the recovery site determines which tracks were modified at either site and sends both the tracks that changed at the production site and the tracks that are marked at the recovery site.
The volume at the production site becomes a write-inhibited target volume. This action is performed on an individual volume basis. Example 27-37 shows the commands that are performed in our example. We list the status of the B volumes and then perform the Global Copy failback operation on the DS HMC connected to the remote DS8000 #2.
Example 27-37 Perform Global Copy failback from B to A

<< Before the failbackpprc B to A >>
<< B volume status >>
dscli> lspprc 2000-2001 2100-2101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
2000:1000 Suspended Host Source Global Copy 20 unknown Disabled True
2001:1001 Suspended Host Source Global Copy 20 unknown Disabled True
2100:1100 Suspended Host Source Global Copy 21 unknown Disabled True
2101:1101 Suspended Host Source Global Copy 21 unknown Disabled True

<< The failbackpprc B to A >>
dscli> failbackpprc -remotedev IBM.2107-7520781 -type gcp 2000-2001:1000-1001 2100-2101:1100-1101
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2000:1000 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2001:1001 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2100:1100 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2101:1101 successfully failed back.

<< B volume status >>
dscli> lspprc 2000-2001 2100-2101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
2000:1000 Copy Pending Global Copy 20 unknown Disabled False
2001:1001 Copy Pending Global Copy 20 unknown Disabled False
2100:1100 Copy Pending Global Copy 21 unknown Disabled False
2101:1101 Copy Pending Global Copy 21 unknown Disabled False

<< A volume status >>
dscli> lspprc 1000-1001 1100-1101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
2000:1000 Target Copy Pending Global Copy 20 unknown Disabled Invalid
2001:1001 Target Copy Pending Global Copy 20 unknown Disabled Invalid
2100:1100 Target Copy Pending Global Copy 21 unknown Disabled Invalid
2101:1101 Target Copy Pending Global Copy 21 unknown Disabled Invalid
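The resynchronization rules above can be summarized in a small classifier. This is an illustrative sketch only; the state strings and the returned descriptions are our own labels, not DS CLI terms.

```python
# Sketch: the failbackpprc resynchronization rules from the text,
# expressed as a classifier over the production-site volume state.
# "changed_tracks" records whether writes occurred at the production
# site while the pair was suspended.
def failback_copy_scope(state: str, changed_tracks: bool = False) -> str:
    if state == "simplex":
        return "full copy"            # no relationship: send everything
    if state in ("copy pending", "suspended") and not changed_tracks:
        return "recovery-site changes only"
    if state == "suspended" and changed_tracks:
        return "changes from both sites"
    raise ValueError(f"unexpected state: {state}")
```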
You can query this status by running lspprc, as shown in Example 27-38. The First Pass Status field indicates the status of the first pass, where True means that the first pass completed.
Example 27-38 Query for the Global Copy first pass completion
dscli> lspprc -l 2000-2001 2100-2101
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status
===========================================================================================================================================================
2000:1000 Copy Pending Global Copy 0 Disabled Disabled invalid 20 unknown Disabled True
2001:1001 Copy Pending Global Copy 0 Disabled Disabled invalid 20 unknown Disabled True
2100:1100 Copy Pending Global Copy 0 Disabled Disabled invalid 21 unknown Disabled True
2101:1101 Copy Pending Global Copy 0 Disabled Disabled invalid 21 unknown Disabled True
<< some columns were suppressed in lspprc output to fit the screen >>
27.5.5 Querying the out-of-sync tracks until the result shows zero
After you quiesce the application, wait until the out-of-sync track count for the Global Copy pairs reaches zero to ensure that all data is written to the B volumes. You can check this status by running lspprc -l, as shown in Example 27-39. Run this command at the DS HMC connected to the remote DS8000 (DS8000 #2).
Example 27-39 Query the Global Copy out-of-sync tracks until the result shows zero
dscli> lspprc -l 2000-2001 2100-2101
ID State Reason Type OutOfSyncTracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status
===========================================================================================================================================================
2000:1000 Copy Pending Global Copy 0 Disabled Disabled invalid 20 unknown Disabled True
2001:1001 Copy Pending Global Copy 0 Disabled Disabled invalid 20 unknown Disabled True
2100:1100 Copy Pending Global Copy 0 Disabled Disabled invalid 21 unknown Disabled True
2101:1101 Copy Pending Global Copy 0 Disabled Disabled invalid 21 unknown Disabled True
<< some columns were suppressed in lspprc output to fit the screen >>
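A scripted wait for this condition must read the out-of-sync track count from each lspprc -l row. The sketch below assumes rows shaped like the abridged output in Example 27-39; the parsing trick (locating the field after the "Global Copy" type token) is our own and would need adjusting if you display different columns.

```python
# Sketch: report whether every Global Copy pair has drained to zero
# out-of-sync tracks, given lspprc -l rows such as:
# "2000:1000 Copy Pending Global Copy 0 Disabled ..."
def all_drained(rows):
    for row in rows:
        fields = row.split()
        # The Reason column may be empty, so locate the track count as
        # the first field after the "Global Copy" type token.
        idx = fields.index("Copy", fields.index("Global")) + 1
        if int(fields[idx]) != 0:
            return False
    return True
```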
In our example, the paths are still available (see Example 27-40). If there are no available paths, you must define them now by running mkpprcpath. Example 27-1 on page 357 shows the commands to accomplish this task.
Example 27-40 Check the available paths from A to B

dscli> lspprcpath -fullid 10-11
Src Tgt State SS Port Attached Port Tgt WWNN
===================================================================================================================
IBM.2107-7520781/10 IBM.2107-75ABTV1/20 Success FF20 IBM.2107-7520781/I0143 IBM.2107-75ABTV1/I0010 5005076303FFC663
IBM.2107-7520781/10 IBM.2107-75ABTV1/20 Success FF20 IBM.2107-7520781/I0213 IBM.2107-75ABTV1/I0140 5005076303FFC663
IBM.2107-7520781/11 IBM.2107-75ABTV1/21 Success FF21 IBM.2107-7520781/I0143 IBM.2107-75ABTV1/I0010 5005076303FFC663
IBM.2107-7520781/11 IBM.2107-75ABTV1/21 Success FF21 IBM.2107-7520781/I0213 IBM.2107-75ABTV1/I0140 5005076303FFC663
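When you script this check, you can flag any path that is not in the Success state and rerun mkpprcpath for the affected LSS pairs. The helper below is a hypothetical sketch that assumes the lspprcpath rows are already split into fields.

```python
# Sketch: return the source/target LSS pairs whose logical path is not
# healthy, given lspprcpath rows parsed into
# (src, tgt, state, port, attached_port) tuples.
def failed_paths(rows):
    return sorted({(src, tgt) for (src, tgt, state, *_) in rows
                   if state != "Success"})
```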
Figure 27-10 Site swap scenario after Global Copy failover from A to B (failoverpprc A to B, application running: on DS8000#1, A volumes 1000-1001 and 1100-1101 are GC Source Suspended; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Source Copy Pending and FC Source; C volumes 2200-2201 and 2300-2301 are FC Target)
Example 27-41 shows the result of the failoverpprc command we used in our example, and the volume state after this command is issued. You must perform this command on the DS HMC connected to the local DS8000 (DS8000 #1).
Example 27-41 Global Copy failover from A to B

<< DS8000 #1 >>
dscli> failoverpprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001 1100-1101:2100-2101
dscli> lspprc 1000-1001 1100-1101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
1000:2000 Suspended Host Source Global Copy 10 unknown Disabled True
1001:2001 Suspended Host Source Global Copy 10 unknown Disabled True
1100:2100 Suspended Host Source Global Copy 11 unknown Disabled True
1101:2101 Suspended Host Source Global Copy 11 unknown Disabled True

<< DS8000 #2 >>
dscli> lspprc 2000-2001 2100-2101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
2000:1000 Copy Pending Global Copy 20 unknown Disabled True
2001:1001 Copy Pending Global Copy 20 unknown Disabled True
2100:1100 Copy Pending Global Copy 21 unknown Disabled True
2101:1101 Copy Pending Global Copy 21 unknown Disabled True
Figure 27-11 Site swap scenario after you run failbackpprc for A to B (failbackpprc A to B, application running: on DS8000#1, A volumes 1000-1001 and 1100-1101 are GC Source Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Target Copy Pending and FC Source; C volumes 2200-2201 and 2300-2301 are FC Target)
Example 27-42 shows the result of the failbackpprc command that is used in our example, and the volume state after this command is issued. You must run this command on the DS HMC connected to the local DS8000 (DS8000 #1).
Example 27-42 Global Copy failback from A to B

<< DS8000 #1 >>
dscli> failbackpprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001 1100-1101:2100-2101
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1000:2000 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1001:2001 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1100:2100 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1101:2101 successfully failed back.

dscli> lspprc 1000-1001 1100-1101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending Global Copy 10 unknown Disabled True
1001:2001 Copy Pending Global Copy 10 unknown Disabled True
1100:2100 Copy Pending Global Copy 11 unknown Disabled True
1101:2101 Copy Pending Global Copy 11 unknown Disabled True

<< DS8000 #2 >>
dscli> lspprc 2000-2001 2100-2101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
1000:2000 Target Copy Pending Global Copy 10 unknown Disabled Invalid
1001:2001 Target Copy Pending Global Copy 10 unknown Disabled Invalid
1100:2100 Target Copy Pending Global Copy 11 unknown Disabled Invalid
1101:2101 Target Copy Pending Global Copy 11 unknown Disabled Invalid
[Figure 27-12: mkgmir — GM Master started on LSS10 of DS8000#1; A volumes 1000-1001 and 1100-1101 are GC Source Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Target Copy Pending and FC Source; C volumes 2200-2201 and 2300-2301 are FC Target]
The last step before you start the application at the production site is to start the Global Mirror session again, as shown in Figure 27-12. Before you start Global Mirror, create the FlashCopy relationships from the B to the C volumes. To start the Global Mirror session, run mkgmir. Before you start Global Mirror, you can check the status of the Global Mirror session on each LSS by running lssession; after you start it, you can run showgmir to check the LSS status. Example 27-43 shows the commands we used in our example and the corresponding results. We run these commands on the DS HMC connected to the local DS8000 (DS8000 #1).
Example 27-43 Start Global Mirror

dscli> lssession 10-11
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
===========================================================================================================
10 02 Normal 1000 Active Primary Copy Pending Secondary Simplex True Disable
10 02 Normal 1001 Active Primary Copy Pending Secondary Simplex True Disable
11 02 Normal 1100 Active Primary Copy Pending Secondary Simplex True Disable
11 02 Normal 1101 Active Primary Copy Pending Secondary Simplex True Disable
dscli> mkgmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02
CMUC00162I mkgmir: Global Mirror for session 02 successfully started.
dscli> showgmir -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID IBM.2107-7520781/10
Master Count 1
Master Session ID 0x02
Copy State Running
Fatal Reason Not Fatal
CG Interval Time (seconds) 0
Coord. Time (milliseconds) 50
Max CG Drain Time (seconds) 30
Current Time 06/15/2012 10:39:04 BRT
CG Time 06/15/2012 10:39:04 BRT
Successful CG Percentage 91
FlashCopy Sequence Number 0x4374B72B
Master ID IBM.2107-7520781
Subordinate Count 0
Master/Subordinate Assoc -
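To automate the post-start check, you can parse the label/value lines of showgmir and verify that the session is healthy. This is a hypothetical sketch, not an IBM tool; it recognizes only the handful of fields it needs.

```python
# Sketch: pull selected fields out of captured showgmir output and
# confirm that the Global Mirror session is running without a fatal
# condition. The labels match the example output in the text.
def parse_showgmir(text: str) -> dict:
    fields = {}
    labels = ("Copy State", "Fatal Reason", "Successful CG Percentage",
              "Master Session ID", "Subordinate Count")
    for line in text.splitlines():
        for label in labels:
            if line.startswith(label):
                fields[label] = line[len(label):].strip()
    return fields

def session_running(text: str) -> bool:
    f = parse_showgmir(text)
    return f.get("Copy State") == "Running" and f.get("Fatal Reason") == "Not Fatal"
```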
[Figure: Start application I/O at the local site — GM Master on LSS10 of DS8000#1; A volumes 1000-1001 and 1100-1101 are GC Source Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Target Copy Pending and FC Source; C volumes 2200-2201 and 2300-2301 are FC Target]
The typical scenario for this activity is the following one:
1. Query the Global Mirror environment to see its status.
2. Pause Global Mirror and check its completion.
3. Pause the Global Copy pairs.
4. Perform Global Copy failover from B to A.
5. Create consistent data on the B volumes (perform reverse FlashCopy from B to C).
6. Wait for the FlashCopy background copy to complete.
7. Re-establish the FlashCopy pairs (B to C with the original Global Mirror options).
8. Take a FlashCopy from B to the (newly created) D volumes.
9. Perform the disaster recovery testing by using the D volumes.
10. Perform Global Copy failback from A to B.
11. Resume Global Mirror.
Many of the steps in this scenario are also described in 27.4, Recovery scenario after a local site failure using the DS CLI on page 379 and 27.5, Returning to the local site on page 388. For steps that are similar, we provide pointers to those sections here.
dscli> pausegmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02
CMUC00163I pausegmir: Global Mirror for session 02 successfully paused.
dscli> showgmir -dev IBM.2107-7520781 -session 02 IBM.2107-7520781/10
ID IBM.2107-7520781/10
Master Count 1
Master Session ID 0x02
Copy State Paused
Fatal Reason Not Fatal
CG Interval Time (seconds) 0
Coord. Time (milliseconds) 50
Max CG Drain Time (seconds) 30
Current Time 06/15/2012 10:39:04 BRT
CG Time 06/15/2012 10:39:04 BRT
Successful CG Percentage 100
FlashCopy Sequence Number 0x43785DC7
Master ID IBM.2107-7520781
Subordinate Count 0
Master/Subordinate Assoc -
dscli> lspprc 1000-1001 1100-1101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
1000:2000 Suspended Host Source Global Copy 10 unknown Disabled True
1001:2001 Suspended Host Source Global Copy 10 unknown Disabled True
1100:2100 Suspended Host Source Global Copy 11 unknown Disabled True
1101:2101 Suspended Host Source Global Copy 11 unknown Disabled True
<< DS8000 #2 >>
dscli> lspprc 2000-2001 2100-2101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
===========================================================================================================
1000:2000 Target Suspended Update Target Global Copy 10 unknown Disabled Invalid
1001:2001 Target Suspended Update Target Global Copy 10 unknown Disabled Invalid
1100:2100 Target Suspended Update Target Global Copy 11 unknown Disabled Invalid
1101:2101 Target Suspended Update Target Global Copy 11 unknown Disabled Invalid
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2001:1001 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2100:1100 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2101:1101 successfully reversed.
dscli> lspprc 2000-2001 2100-2101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
2000:1000 Suspended Host Source Global Copy 20 unknown Disabled True
2001:1001 Suspended Host Source Global Copy 20 unknown Disabled True
2100:1100 Suspended Host Source Global Copy 21 unknown Disabled True
2101:1101 Suspended Host Source Global Copy 21 unknown Disabled True
dscli>
dscli> reverseflash -fast -tgtpprc 2000-2001:2200-2201 2100-2101:2300-2301
CMUC00169I reverseflash: FlashCopy volume pair 2000:2200 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2001:2201 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2100:2300 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2101:2301 successfully reversed.
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
2000:2200 20 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2001:2201 20 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2100:2300 21 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2101:2301 21 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
[Figure: mkflash B to D — on DS8000#1, A volumes 1000-1001 and 1100-1101 are GC Source Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Source Suspended and FC Source; C volumes 2300-2301 are FC Target; D volumes 2400-2401 (LSS24) and 2500-2501 (LSS26) are FC Target]
Example 27-50 shows the DS CLI log for this operation. We use the -nocp option for the FlashCopy. You can also use the copy option.
Example 27-50 Take FlashCopy from B to D

dscli> lsflash 2000-2001 2100-2101
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
2000:2200 20 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2001:2201 20 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2100:2300 21 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled
2101:2301 21 0 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled

dscli> mkflash -nocp 2000-2001:2400-2401 2100-2101:2500-2501
CMUC00137I mkflash: FlashCopy pair 2000:2400 successfully created.
CMUC00137I mkflash: FlashCopy pair 2001:2401 successfully created.
CMUC00137I mkflash: FlashCopy pair 2100:2500 successfully created.
CMUC00137I mkflash: FlashCopy pair 2101:2501 successfully created.
[Figure: failbackpprc A to B, test scenario — on DS8000#1, A volumes 1000-1001 and 1100-1101 are GC Source Copy Pending; on DS8000#2, B volumes 2000-2001 and 2100-2101 are GC Target Copy Pending and FC Source; C volumes 2300-2301 are FC Target; D volumes 2400-2401 and 2500-2501 are FC Target]
Example 27-51 Perform Global Copy failback from A to B - test scenario

<< Before failbackpprc >>
<< DS8000 #1 >>
dscli> lspprc 1000-1001 1100-1101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
1000:2000 Suspended Host Source Global Copy 10 unknown Disabled True
1001:2001 Suspended Host Source Global Copy 10 unknown Disabled True
1100:2100 Suspended Host Source Global Copy 11 unknown Disabled True
1101:2101 Suspended Host Source Global Copy 11 unknown Disabled True

<< DS8000 #2 >>
dscli> lspprc 2000-2001 2100-2101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
2000:1000 Suspended Host Source Global Copy 20 unknown Disabled True
2001:1001 Suspended Host Source Global Copy 20 unknown Disabled True
2100:1100 Suspended Host Source Global Copy 21 unknown Disabled True
2101:1101 Suspended Host Source Global Copy 21 unknown Disabled True

<< DS8000 #1 >>
dscli> failbackpprc -remotedev IBM.2107-75ABTV1 -type gcp 1000-1001:2000-2001 1100-1101:2100-2101
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1001:2001 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1100:2100 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 1101:2101 successfully failed back.

dscli> lspprc 1000-1001 1100-1101
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
1000:2000 Copy Pending Global Copy 10 unknown Disabled True
1001:2001 Copy Pending Global Copy 10 unknown Disabled True
1100:2100 Copy Pending Global Copy 11 unknown Disabled True
1101:2101 Copy Pending Global Copy 11 unknown Disabled True
Important: Do not specify the B volume as a source when you run failbackpprc (to the DS8000 #2); otherwise, data on the B volume is copied to the A volume. If the A volume does not have reserve status, data on the A volume might be overwritten.
dscli> lssession 10-11
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
=================================================================================================================
10 02 Normal 1000 Active Primary Copy Pending Secondary Simplex True Disable
10 02 Normal 1001 Active Primary Copy Pending Secondary Simplex True Disable
11 02 Normal 1100 Active Primary Copy Pending Secondary Simplex True Disable
11 02 Normal 1101 Active Primary Copy Pending Secondary Simplex True Disable
dscli> resumegmir -dev IBM.2107-7520781 -lss IBM.2107-7520781/10 -session 02
CMUC00164I resumegmir: Global Mirror for session 02 successfully resumed.
dscli> showgmir -dev IBM.2107-7520781 IBM.2107-7520781/10
ID IBM.2107-7520781/10
Master Count 1
Master Session ID 0x02
Copy State Running
Fatal Reason Not Fatal
CG Interval Time (seconds) 0
Coord. Time (milliseconds) 50
Max CG Drain Time (seconds) 30
Current Time 06/15/2012 10:39:04 BRT
CG Time 06/15/2012 10:39:04 BRT
Successful CG Percentage 99
FlashCopy Sequence Number 0x43785DC7
Master ID IBM.2107-7520781
Subordinate Count 0
Master/Subordinate Assoc -
dscli> showgmir -dev IBM.2107-7520781 IBM.2107-7520781/10
ID IBM.2107-7520781/10
Master Count 1
Master Session ID 0x02
Copy State Running
Fatal Reason Not Fatal
CG Interval Time (seconds) 0
Coord. Time (milliseconds) 50
Max CG Drain Time (seconds) 30
Current Time 06/15/2012 10:39:04 BRT
CG Time 06/15/2012 10:39:04 BRT
Successful CG Percentage 99
FlashCopy Sequence Number 0x437883A2
Master ID IBM.2107-7520781
Subordinate Count 0
Master/Subordinate Assoc -
[Figure: Global Mirror configuration — GM Master on LSS B1, A volume B100 on DS8000#1 (local); Global Copy paths over FCP links to the B volume B100 and C volume B110 on DS8000#2 (remote)]
Because there are no paths to the DS8000 named DS8k05_3Tier in LSS B1, you must configure them. In our example, we want to configure two paths from LSS B1 on the DS8K-ATS - ATS_04 to LSS B1 on the DS8k05_3Tier. To do so, complete the following steps:
1. Select the correct Storage Image and click Action → Create.
2. In the input window, specify an LSS on the source and on the target side.
3. From the auto-populated I/O port list, select an I/O port on both the source and the target side.
4. Check Define as consistency group if you want to manage the path as part of a consistency group. This option is not mandatory because the Global Mirror session handles the consistency of the data across the set of volumes. Click Add to move the definition to the Create Mirroring Connectivity Verification field.
5. Repeat the previous steps for any additional logical paths that you want to add (see Figure 27-18). When you are finished, click Create.
After the paths are created, you can verify the results by selecting LSS B1 from the menu (see Figure 27-19).
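The same paths can also be created with the DS CLI mkpprcpath command. The following lines are a sketch only; the remote device ID, the WWNN, and the I/O port pairs are hypothetical placeholders that you must replace with values reported by lsavailpprcport for your environment:

dscli> lsavailpprcport -remotedev IBM.2107-75XXXXX -remotewwnn 50050763XXXXXXXX b1:b1
dscli> mkpprcpath -remotedev IBM.2107-75XXXXX -remotewwnn 50050763XXXXXXXX -srclss b1 -tgtlss b1 I0001:I0101 I0002:I0102

As in the GUI, the -consistgrp option can be added, but it is not mandatory because the Global Mirror session handles data consistency.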
2. When the creation wizard displays the Volume Pairing Method, you must select the source and target storage systems first, then the volume type of the source volume (FB or CKD). The volume type of the target volume adapts to the source automatically. You can choose to filter the list by Host, LSS, Storage Allocation Method, or Volume Group. In this example, the list is filtered by the LSS called B1. You can select a single pair by clicking a source and a target volume or you can select multiple pairs by pressing and holding the Ctrl key while you click the volumes. Note the Showing 1 item | Selected 1 item entry at the bottom of the volume list. Click Add to move the definitions to the Create Metro Mirror Verification pane.
The DS GUI automatically determines the relationships for you (see Figure 27-21).
Different LSSs: If you have volumes in different LSSs, click Add for the LSS you are currently working on, then select the other LSSs and perform the same action. You might have to increase the size of the Create Global Copy window by expanding it (use the mouse to drag the lower-right corner down and to the right).

Before you click Create, click Advanced. In the dialog box that opens, you can choose how the Global Copy relationship should be set up. Check Permit read access from target and Perform initial copy (see Figure 27-22).
3. Click OK to save the options and then click Create to establish the Global Copy relationship. The Metro Mirror/Global Copy window opens again. You may filter again according to your LSS, volume group, or other criteria. After this filtering is complete, the list of volume pairs that were just created appears. You can see that the state of the volumes is Copy Pending, which indicates that the initial copy from the source to the target volumes is still in progress (see Figure 27-23).
Figure 27-23 Create Global Copy pair - volume pairs that are created and in the Copy Pending state
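The equivalent DS CLI operation is a mkpprc command with the gcp (Global Copy) copy type. This is a sketch only; the remote device ID and the volume ranges are hypothetical placeholders:

dscli> mkpprc -remotedev IBM.2107-75XXXXX -type gcp b100-b10f:b100-b10f

The pairs then report the Copy Pending state in lspprc output while the initial copy runs.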
1. In the left navigation pane, click Copy Services → FlashCopy. The FlashCopy window opens. In the Select Action menu, click Create (see Figure 27-24).
2. In the window that opens, check Change Recording and clear Initiate background copy. You can filter the list by Host, LSS, Storage Allocation Method, or Volume Group. In our example, the list is filtered by an LSS called B1. You can select a single pair by clicking a source and a target volume or you can select multiple pairs by pressing and holding the Ctrl key while you click the volumes. Note the Showing 8 item | Selected 1 item entry at the bottom of the volume list. Click Add to move the definitions to the Create FlashCopy Verification pane. The DS GUI automatically determines the relationships for you (see Figure 27-21 on page 409).
Multiple LSSs: This process was done for only one volume, but if you have multiple volumes in different LSSs, click Add for the LSS you are currently working on, then select the other LSSs and perform the same action.
3. Click Create to create your FlashCopy relationships. After you complete this step, you see the relationship (see Figure 27-26).
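On the DS CLI, FlashCopy relationships with the same options (change recording enabled, no background copy) are created with mkflash, as Example 27-54 on page 426 shows for another scenario. A sketch with hypothetical volume IDs:

dscli> mkflash -record -persist -tgtinhibit -nocp b100-b10f:b110-b11f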
2. The Create Global Mirror window opens. Select Create a new Global Mirror Session and specify the Master Storage Image. In our case, the master is DS8K-ATS - ATS_04. After you select it, specify the master LSS 0xb1, which corresponds to LSS B1, and session ID 0x01, which is an unused Global Mirror session. The default values for a Global Mirror session under Select Options are used. After you select the options, click Next (see Figure 27-28). For more information about multiple sessions, see 27.10, Multiple Global Mirror sessions within DS8700 and DS8800 systems on page 425.
Figure 27-28 Create Global Mirror - define the master LSS and session ID
3. In the window that opens, add the Global Copy relationship that was created in 27.8.2, Creating Global Copy pairs on page 408. Click Add Existing Global Copy, specify the Storage Image of the source, filter by LSS, and choose the LSS that corresponds to your source volume. In our case, the LSS is 75TV181:b1. The DS GUI refreshes the list of existing volumes in the specified LSS. Then, select the previously created relationship (see Figure 27-29). Click Add, and if you are satisfied with the Global Copy relationship, click Finish.
Figure 27-29 Global Mirror - add an existing Global Copy and review the Global Copy relationship
Adding multiple volumes: This process was done for one volume, but if you have multiple volumes that are in different LSSs, click Add for the volume you are currently working on. Then, select the other LSSs and perform the same action.

When the creation wizard finishes, you return to the Global Mirror window and see your most recently created session, which is session ID 0x01 in our case (see Figure 27-30).
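The DS CLI equivalent of this wizard is the mksession, chsession, and mkgmir sequence, which is shown with real output in Examples 27-55 through 27-57. A sketch that uses the LSS and session ID from this GUI example; the volume range is a hypothetical placeholder:

dscli> mksession -lss b1 01
dscli> chsession -lss b1 -action add -volume b100-b10f 01
dscli> mkgmir -lss b1 -session 01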
27.9.1 Viewing settings and error information of the Global Mirror session
To see session information through the DS GUI, in the left navigation pane, click Copy Services → Global Mirror. Select the session ID for the Global Mirror session and, in the Select Action menu, click Properties (see Figure 27-31).
In the Overview tab shown in Figure 27-31 on page 417, you can review the settings for this session. In the Metrics tab, you can review the consistency group attempts and the Last Successful formation time (see Figure 27-32).
You can click the Failures tab to view any failures (see Figure 27-33). When you are done reviewing the information, click Close to go back to the main Global Mirror window.
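The same settings, state, and error information can be queried with the DS CLI showgmir command, as shown with real output earlier in this chapter. A sketch, assuming the device ID from the earlier examples and a hypothetical master LSS:

dscli> showgmir -dev IBM.2107-7520781 IBM.2107-7520781/B1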
27.9.2 Viewing the information of the volumes in the Global Mirror session
To see session information by using the DS GUI, in the left navigation pane, click Copy Services → Global Mirror. Select your session ID for the Global Mirror session, and in the Select Action menu, click View Relationships (see Figure 27-34).
The Global Mirror session volumes: Real-time window opens and shows information about the volumes that are associated with the Global Mirror session. You can either download or print this information table. Click OK to finish and go back to the main Global Mirror window.
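The corresponding DS CLI query is lssession, which lists the volumes of a session at the LSS level (Example 27-58 on page 428 shows the full output for another scenario). A sketch, assuming the LSS from this example:

dscli> lssession -l b1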
When the Global Mirror Real-time window displays the warning message (see Figure 27-36), either click Cancel to return to the main Global Mirror window without resuming the Global Mirror session, or confirm the values for your session and click Resume to resume the Global Mirror session and return to the main Global Mirror window. When the main Global Mirror window opens, the state of the Global Mirror session shows Running.
Figure 27-36 Resume Global Mirror - confirm the resumption of the Global Mirror session
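The DS CLI equivalents of the GUI pause and resume actions are pausegmir and resumegmir (resumegmir is shown with real output in Example 27-62 on page 432). A sketch with a hypothetical LSS and session ID:

dscli> pausegmir -lss b1 -session 01
dscli> resumegmir -lss b1 -session 01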
To add Global Copy relationships to an existing Global Mirror session, see 27.8.4, Creating the Global Mirror session on page 414, but in this case, select Add to an existing Global Mirror Session (see Figure 27-37).
Figure 27-37 Global Mirror window - add a relationship to an existing Global Mirror Session
The Manage LSSs window opens (see Figure 27-39). Scroll down until you see the LSS that you want to modify, click it once to highlight it, and then click Action → Properties. In the Single LSS properties dialog box, you can change the options of the LSS. For example, you can check the Consistency Group Enabled box or enter a Long Busy Timeout Value. For a description of consistency groups with Global Mirror, see Chapter 23, Global Mirror overview on page 283.
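These LSS options can also be set through the DS CLI with the chlss command. The following line is a sketch only; verify the parameter names against the DS CLI help for your code level before use, and replace the hypothetical LSS ID and timeout value with your own:

dscli> chlss -pprcconsistgrp enable -extlongbusy 60 B1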
Bandwidth: Deleting paths reduces bandwidth. For a description of this topic, see 26.2, Performance considerations for network connectivity on page 344.

Click Delete and the deletion completes. The Mirroring Connectivity window opens again, where you can confirm that the path is no longer there.
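The DS CLI equivalent for removing a path is rmpprcpath. A sketch; the remote device ID and the source:target LSS pair are hypothetical placeholders:

dscli> rmpprcpath -remotedev IBM.2107-75XXXXX b1:b1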
27.10 Multiple Global Mirror sessions within DS8700 and DS8800 systems
Multiple Global Mirror hardware sessions are supported by DS8700 and DS8800 firmware Release 6.1 or later (for DS8100 and DS8300 systems, independent of the firmware level, only one session is supported). This section covers an example of two Global Mirror master sessions within a DS8800. One of the two sessions is swapped to Site 2 without impacting the other Global Mirror master session within the same DS8800. This example is performed through DS CLI commands, but Copy Services TSO commands (z/OS environments) and Tivoli Storage Productivity Center for Replication can be used as well; they provide the same functional support as the DS CLI based commands.
Figure 27-41 shows two Global Mirror sessions, session number 20 and session number 30. It does not matter which interface is used to create a Global Mirror session.
Figure 27-41 Two Global Mirror sessions within one DS8800: session 20 (GM master, primary LSS 20; secondary LSS 20 with FlashCopy targets in LSS 21) and session 30 (primary LSS 30 and LSS 40; secondary LSS 30/31 and LSS 40/41)
After you create PPRC paths and Global Copy pairs finish their first pass, create a FlashCopy relationship for Global Mirror, as shown in Example 27-54.
Example 27-54 Create a FlashCopy relationship for a Global Mirror session
dscli> mkflash -tgtinhibit -record -persist -nocp 2100-2109:2200-2209
CMUC00137I mkflash: FlashCopy pair 2100:2200 successfully created.
CMUC00137I mkflash: FlashCopy pair 2101:2201 successfully created.
CMUC00137I mkflash: FlashCopy pair 2102:2202 successfully created.
CMUC00137I mkflash: FlashCopy pair 2103:2203 successfully created.
CMUC00137I mkflash: FlashCopy pair 2104:2204 successfully created.
CMUC00137I mkflash: FlashCopy pair 2105:2205 successfully created.
CMUC00137I mkflash: FlashCopy pair 2106:2206 successfully created.
CMUC00137I mkflash: FlashCopy pair 2107:2207 successfully created.
CMUC00137I mkflash: FlashCopy pair 2108:2208 successfully created.
CMUC00137I mkflash: FlashCopy pair 2109:2209 successfully created.

After FlashCopy relationships are created and after all Global Copy pairs finish their first pass, it is possible to establish the Global Mirror sessions. Logically, the next step is to establish a Global Mirror session within all primary site LSSs that potentially contain Global Copy primary volumes that might belong to Global Mirror sessions.
Example 27-55 Create GM sessions within each concerned source LSS
dscli> mksession -lss 20 20
CMUC00145I mksession: Session 20 opened successfully.
dscli> mksession -lss 30 30
CMUC00145I mksession: Session 30 opened successfully.
dscli> mksession -lss 40 30
CMUC00145I mksession: Session 30 opened successfully.

Example 27-55 on page 426 shows three mksession commands that open a GM session at the source LSS level. LSS 20 receives GM session number 20; LSS 30 and LSS 40 receive GM session 30. So, GM session 20 contains Global Copy primary volumes from LSS 20 and GM session 30 contains Global Copy primary volumes from LSS 30 and LSS 40 (see Figure 27-41 on page 426). After all the relevant LSSs receive a Global Mirror session number, populate the GM sessions with Global Copy primary volumes. On the primary site, there are volumes in LSS 20 for session 20 and volumes in LSS 30 and LSS 40 for session 30.
Example 27-56 Populate Global Mirror session number 20 and number 30 with volumes
dscli> chsession -lss 20 -action add -volume 2000-2009 20
CMUC00147I chsession: Session 20 successfully modified.
dscli> chsession -lss 30 -action add -volume 3000-3009 30
CMUC00147I chsession: Session 30 successfully modified.
dscli> chsession -lss 40 -action add -volume 4000-4009 30
CMUC00147I chsession: Session 30 successfully modified.

Example 27-56 has three chsession commands.
The first chsession command adds 10 Global Copy primary volumes from LSS 20 to session number 20. The second chsession command adds 10 Global Copy primary volumes from LSS 30 to session number 30. The third chsession command adds another 10 Global Copy primary volumes from LSS 40 to session number 30. Figure 27-41 on page 426 shows the LSS numbers at the secondary site. These LSSs are LSS 20 for Global Copy secondary volumes in Session 20 and LSS 21, which contains the FlashCopy targets for the Global Copy secondary volumes within session 20. LSS 30 and LSS 40 on Site 2 correspond to the same LSS numbers on the primary site. LSS 31 and LSS 41 contain the FlashCopy targets within the session number 30. To start Global Mirror processing, run the two mkgmir commands that are shown in Example 27-57 to start each Global Mirror session within the same DS8800.
Example 27-57 Start Global Mirror processing
dscli> mkgmir -lss 20 -session 20 -cginterval 0 -coordinate 10 -drain 60
CMUC00162I mkgmir: Global Mirror for session 20 successfully started.
dscli> mkgmir -lss 30 -session 30 -cginterval 0 -coordinate 05 -drain 600
CMUC00162I mkgmir: Global Mirror for session 30 successfully started.

Listing, for example, session number 30 shows both LSSs 30 and 40 for the primary site.
Example 27-58 shows two lssession commands; it also shows that sessions exist at an LSS level. Global Mirror session number 30 contains volumes of two primary LSSs, that is, LSS 30 and LSS 40. Therefore, you must list both LSS numbers to query the involved Global Copy primary volumes.
Example 27-58 List volumes in session number 30

dscli> lssession -l 30
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete
==========================================================================================================
30  30 CG In Progress 3000   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3001   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3002   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3003   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3004   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3005   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3006   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3007   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3008   Active       Primary Copy Pending Secondary Simplex True
30  30 CG In Progress 3009   Active       Primary Copy Pending Secondary Simplex True
dscli> lssession -l 40
LSS ID Session Status Volume VolumeStatus PrimaryStatus        SecondaryStatus   FirstPassComplete
===========================================================================================================
40  30 CG In Progress 4000   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4001   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4002   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4003   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4004   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4005   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4006   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4007   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4008   Active       Primary Copy Pending Secondary Simplex True
40  30 CG In Progress 4009   Active       Primary Copy Pending Secondary Simplex True
Figure 27-42 Two Global Mirror sessions in two DS8800s: session 20 fails over to the secondary DS8800 while the GM master for session 30 continues at the primary DS8800
Before the session fails over to the secondary site, stop the workload to the active volumes in Global Mirror session 20 on the primary site.

Stopping the workload: This example uses a planned scenario. In this case, pause the Global Mirror session after stopping the workload on the primary volumes by running pausegmir with the DS CLI. In an unplanned scenario, the Global Mirror session would already be stopped because of a failed condition.

After the application workload is stopped for the volumes in session 20, fail over to the secondary site for session 20. Fail over all the relevant Global Copy secondary volumes, as shown in Example 27-59. This action happens on the secondary site.
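The pause of the Global Mirror session that is described above can be sketched as follows, issued at the primary site before the failover (session and LSS numbers are from this scenario):

dscli> pausegmir -lss 20 -session 20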
Example 27-59 Global Copy failover for session 20

dscli> failoverpprc -type gcp 2000-2009:2000-2009
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2000:2000 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2001:2001 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2002:2002 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2003:2003 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2004:2004 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2005:2005 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2006:2006 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2007:2007 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2008:2008 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2009:2009 successfully reversed.
After the failoverpprc command completes, all secondary site Global Copy volumes in LSS 20 are in a PRIMARY SUSPENDED state and are now accessible. They do not contain consistent data yet. You must apply the changed data since the last consistency group creation from the FlashCopy target volumes in session 20. To accomplish this task, run the reverseflash command shown in Example 27-60.
Example 27-60 Make Global Copy volumes on Site 2 consistent

dscli> reverseflash -fast -tgtpprc 2000-2009:2100-2109
CMUC00169I reverseflash: FlashCopy volume pair 2000:2100 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2001:2101 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2002:2102 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2003:2103 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2004:2104 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2005:2105 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2006:2106 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2007:2107 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2008:2108 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair 2009:2109 successfully reversed.
From now on, the application may be restarted at the secondary site by accessing the Global Copy volumes, which are now consistent and in a PRIMARY SUSPENDED state, as shown in Figure 27-42 on page 429. Before you restart the application at the secondary site, you may create the FlashCopy relationship for the Global Mirror session between LSS 20 and LSS 21 on the secondary site. The application that connects to Site 1 volumes in session 30 is not affected at all.
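Creating that FlashCopy relationship at the secondary site can be sketched as follows, using the volume ranges from this scenario and the same options as in Example 27-54:

dscli> mkflash -tgtinhibit -record -persist -nocp 2000-2009:2100-2109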
Figure 27-43 Failback from the secondary site to the primary site for session 20: the Global Copy direction is reversed (primary pending volumes at Site 2, secondary pending volumes at Site 1), while the GM master for session 30 continues at the primary DS8800
You may keep the application at the secondary site that connects to volumes in session 20 active and prepare to resynchronize the changed data from the secondary site to the primary site. Example 27-61 shows the failbackpprc command that is used to establish Global Copy relationships between the secondary site and the primary site. The Global Copy primary volumes are still at the secondary site and the Global Copy secondary volumes are at the primary site, which requires you to connect to the secondary site DS8000 and issue failbackpprc there.
Example 27-61 Failback from a secondary site to a primary site

dscli> failbackpprc -type gcp 2000-2009:2000-2009
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2000:2000 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2001:2001 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2002:2002 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2003:2003 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2004:2004 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2005:2005 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2006:2006 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2007:2007 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2008:2008 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2009:2009 successfully failed back.
This configuration does not provide consistent data at the primary site because of the simple Global Copy relationship for the volumes in this former session 20 configuration. Consider either returning to the primary site and reestablishing the original Global Mirror configuration for session 20, or creating a Global Mirror configuration from the secondary site to the primary site. In our example, we return the application from Site 2 to Site 1.
Figure 27-44 shows what is required to fail back from the secondary site to the primary site and restart Global Mirror session number 20. The required commands are issued at the primary site once the Global Mirror FlashCopy relationships are reestablished after the failover process to the secondary site completes.
Figure 27-44 Failback to the primary site and returning to the original GM session configuration
Example 27-62 contains the DS CLI commands that are used to fail over to the primary site and reestablish Global Mirror session number 20.
Example 27-62 Reestablish a Global Mirror session for session 20
dscli> failoverpprc -type gcp 2000-2009:2000-2009
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2000:2000 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2001:2001 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2002:2002 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2003:2003 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2004:2004 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2005:2005 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2006:2006 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2007:2007 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2008:2008 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 2009:2009 successfully reversed.
dscli> failbackpprc -type gcp 2000-2009:2000-2009
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2000:2000 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2001:2001 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2002:2002 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2003:2003 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2004:2004 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2005:2005 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2006:2006 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2007:2007 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2008:2008 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 2009:2009 successfully failed back.
dscli> resumegmir -lss 20 -session 20
CMUC00164I resumegmir: Global Mirror for session 20 successfully resumed.
The failoverpprc and failbackpprc commands are required to reverse the Global Copy replication direction back to running from the primary site to the secondary site for session 20. After these two commands complete successfully, I/O may be restarted at the primary site and Global Mirror session 20 is reestablished there. This brief scenario shows that more than one Global Mirror session may exist within the same DS8700 or DS8800, and that managing one session does not impact the other.
Part 7

Chapter 28.
Figure: Metro/Global Mirror topology - servers issue normal application I/Os to the A volumes; Metro Mirror copies A to B; Global Mirror (asynchronous, long distance) copies B to C over the Global Mirror network, with an incremental NOCOPY FlashCopy from C to D at the remote site (Site C).
Global Mirror: An asynchronous operation that supports long-distance replication for disaster recovery. The Global Mirror methodology has no impact on applications at the local site. It provides a recoverable, restartable, and consistent image at the remote site with a Recovery Point Objective (RPO) of 3 - 5 seconds.

This chapter provides a high-level overview of Metro/Global Mirror. It does not go into the details of the individual processes and elements of the solution, as they are described in great detail in the other parts of this book.
Figure: Metro/Global Mirror data flow - (1) normal application I/Os to the A volumes, (2) Metro Mirror from A to B, (3) Global Mirror from B (asynchronous, long distance), (4) FlashCopy at the remote site. Global Mirror consistency group (CG) formation:
a. Write updates to the B volumes are paused (< 3 ms) to create the CG.
b. The CG updates to the B volumes are drained to the C volumes.
c. After all updates are drained, FlashCopy copies the changed data from the C to the D volumes.
The local site (site A) to intermediate site (site B) component is identical to Metro Mirror. Application writes are synchronously copied to the intermediate site before write complete is signaled to the application. All writes to the local site volumes in the mirror are treated the same way, as explained in Chapter 14, Metro Mirror overview on page 147. The intermediate site (site B) to remote site (site C) component is identical to Global Mirror, except that:
- The writes to intermediate site volumes are Metro Mirror secondary writes and not application primary writes.
- The intermediate site volumes are both source (GM) and target (MM) at the same time.
The intermediate site storage systems are collectively paused by the Global Mirror master storage system to create the consistency group (CG) set of updates. This pause normally takes 3 ms every 3 - 5 seconds. After the CG set is formed, the Metro Mirror writes, from local site (site A) volumes to intermediate site (site B) volumes, continue. Also, the CG updates continue to drain to remote site (site C) volumes. The intermediate site to remote site drain takes only a few seconds to complete (as few as 2 - 3 seconds). When all updates are drained to the remote site, all changes since the last FlashCopy from the C volumes to the D volumes are logically (NOCOPY) copied by FlashCopy to the D volumes. After the logical FlashCopy is complete, the intermediate site to remote site Global Copy data transfer is resumed until the next formation of a Global Mirror consistency group. The process described is repeated every 3 - 5 seconds if the interval for consistency group formation is set to zero; otherwise, it is repeated at the specified interval plus 3 - 5 seconds. The Global Mirror processes are described in Chapter 23, Global Mirror overview on page 283.

IBM offers services and solutions for the automation and management of the Metro/Global Mirror environment, which include GDPS for System z and Tivoli Storage Productivity Center for Replication (see Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685). More details about GDPS can be found at the following website:
http://www-03.ibm.com/systems/z/gdps/
Chapter 29.
Figure 29-1 Metro/Global Mirror with an additional Global Mirror from the remote to intermediate sites
Figure: Metro/Global Mirror with multiple storage systems - subordinate storage systems at the local and intermediate sites.
The Metro Mirror is set up from multiple storage systems at the local site to the Global Mirror primary systems at the intermediate site. The paths of the Metro Mirror relationships must be established with Consistency Group enabled. The data consistency at the intermediate site can be provided in two steps.

If there is a Metro Mirror write failure at the intermediate site, the LSS at the primary site to which the failing write was going is frozen for the consistency group timeout. The timeout is 60 seconds for FB volumes and 120 seconds for CKD volumes. During this time, a freeze of all LSSs in each Metro Mirror primary storage system must take place; otherwise, writes continue after the timeout, which compromises the consistency at the intermediate site and at the remote site as well.
445
An automation mechanism is required that detects the extended long busy state and initiates a freeze and run, where all LSSs of the Metro Mirror must be frozen and released afterward to let production continue at the primary site. This functionality is provided either with Tivoli Storage Productivity Center for Replication or Geographically Dispersed Parallel Sysplex (GDPS). It is a preferred practice to use one of these automation solutions.
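At the DS CLI level, a freeze and run maps to the freezepprc and unfreezepprc commands, issued against every Metro Mirror LSS pair. The following is a sketch only; the remote device ID and the LSS pair are hypothetical placeholders, and in practice an automation product issues these commands rather than an operator:

dscli> freezepprc -remotedev IBM.2107-75XXXXX 42:64
dscli> unfreezepprc -remotedev IBM.2107-75XXXXX 42:64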
Figure 29-3 Metro/Global Mirror site connections - DS8000 storage systems at each site connected through Fibre Channel switches
Site connections: Figure 29-3 shows only one Fibre Channel director per site. In a real implementation, the connections between the sites should be realized by two redundant fabrics across all locations.

When an application uses a Metro Mirror from the primary production site to the secondary production site, the primary production site is the Metro/Global Mirror local site and the secondary site is the intermediate site, and vice versa for applications that run in the other direction.
Using this setup, you can configure clustered systems that span both production sites, which offers more flexibility for high availability setups. A failure of a clustered server at the production site can be taken over automatically at the other production site using the automatic takeover feature of the cluster software while the primary storage is still being used. Otherwise, if a single host system, as shown in Figure 29-3 on page 447, fails at the primary production site, the only way to start the production site again is through a failover of the storage and a recovery of the secondary server at the remote site.

A failure of the storage can be seen as a failure of an infrastructure component, and this situation can be categorized as a partial disaster. In this case, recovery of storage at the intermediate or remote site must be performed. The storage at the intermediate site can be accessed by the server at the local site, if the bandwidth between both production sites does not compromise performance. This situation means that a takeover of the servers is not necessarily required, which offers you more flexibility about how to start the applications.

For cost effectiveness, it is possible to consolidate the needed disk capacity at the remote site. Because of the distance, a stretched cluster environment might not be possible, so single host systems or local clustered systems are implemented at the remote site. A large-scale server that can provide multiple logical partitions to run the multiple applications from the production sites could be used. It is also possible to equip the storage system at the remote site with disk drive modules of higher capacity to reduce the number of installed storage systems.
Figure 29-4 Metro/Global Mirror setup - local site A (75-ABTV1), intermediate site B (75-03461), and remote site C (75-TV181), with the numbered setup steps
Figure 29-4 on page 448 shows the steps you must complete to set up Metro/Global Mirror:
1. Set up all Metro Mirror and Global Mirror paths.
2. Set up Global Copy with nocopy from the intermediate site to the remote site.
3. Set up Metro Mirror between the local and intermediate sites. Let the initial Metro Mirror copy complete before you proceed to the next step.
4. Set up FlashCopy at the remote site.
5. Create a Global Mirror session and add volumes to the session at the intermediate site.
6. Start Global Mirror at the intermediate site.
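Expressed as DS CLI commands, these six steps can be sketched as follows. This is an outline only: the device IDs, WWNNs, port pairs, LSS IDs, and volume ranges are placeholders that must be replaced with the values for your environment (the real commands for this scenario are shown in the following sections):

# Step 1: paths (repeat for each site pair)
dscli> mkpprcpath -remotedev IBM.2107-75XXXXX -remotewwnn 50050763XXXXXXXX -srclss 42 -tgtlss 64 -consistgrp Ixxxx:Iyyyy
# Step 2: cascaded Global Copy with nocopy, intermediate to remote
dscli> mkpprc -remotedev IBM.2107-75XXXXX -type gcp -cascade -mode nocp 6400-640f:6400-640f
# Step 3: Metro Mirror, local to intermediate
dscli> mkpprc -remotedev IBM.2107-75XXXXX -type mmir 4200-420f:6400-640f
# Step 4: FlashCopy at the remote site
dscli> mkflash -tgtinhibit -record -persist -nocp 6400-640f:6500-650f
# Step 5: GM session at the intermediate site
dscli> mksession -lss 64 01
dscli> chsession -lss 64 -action add -volume 6400-640f 01
# Step 6: start Global Mirror at the intermediate site
dscli> mkgmir -lss 64 -session 01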
Figure 29-5 List of PPRC ports and LSSs to define PPRC paths
Figure 29-5 on page 449 shows that the PPRC paths for each LSS from the local site to the intermediate site must use the port relationships I0240:I0320 and I0032:I0001. The other ports at the intermediate site have a relationship to the ports at the remote site. The same LSS IDs are assigned at each site. At the intermediate site, use different ports for the communication to the local site and to the remote site.
dscli> lsavailpprcport -fullid -remotedev IBM.2107-7503461 -remotewwnn 5005076303FFC08F 64:42
Local Port Attached Port Type
==================================================
IBM.2107-75ABTV1/I0011 IBM.2107-7503461/I0142 FCP
IBM.2107-75ABTV1/I0012 IBM.2107-7503461/I0141 FCP
dscli>

This command uses the -fullid option, which shows the corresponding device ID of the storage systems for each LSS ID in the output. This information helps you identify on which storage systems the PPRC ports are seen. If this output does not show the correct port IDs as defined in the connection table (see Creating a connection table on page 449), verify and correct the zoning. The following sections give detailed information about the steps outlined in Figure 29-4 on page 448.
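A minimal sketch of that verification, assuming the two-column `lsavailpprcport -fullid` layout shown above. The expected pairs and sample output are taken from the example; `verify_pprc_ports` is a hypothetical helper name, not a DS CLI command.

```python
# Hedged sketch: compare `lsavailpprcport -fullid` output against the
# port pairs planned in the connection table. Parsing assumes the
# "Local Port  Attached Port  Type" layout shown in the example.

def verify_pprc_ports(output, expected_pairs):
    """Return the planned (local, attached) pairs missing from the output."""
    seen = set()
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2] == "FCP":
            # Keep only the port ID, dropping the device serial prefix.
            local = parts[0].split("/")[-1]
            attached = parts[1].split("/")[-1]
            seen.add((local, attached))
    return [p for p in expected_pairs if p not in seen]

sample = """IBM.2107-75ABTV1/I0011 IBM.2107-7503461/I0142 FCP
IBM.2107-75ABTV1/I0012 IBM.2107-7503461/I0141 FCP"""

missing = verify_pprc_ports(sample, [("I0011", "I0142"), ("I0012", "I0141")])
# An empty result means the zoning exposes every planned path.
```

If `missing` is not empty, check the SAN zoning for the listed pairs before creating the PPRC paths.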
CMUC00149I mkpprcpath: Remote Mirror and Copy path 42:64 successfully established.
dscli>
#
# At intermediate site:
dscli> mkpprcpath -remotedev IBM.2107-75ABTV1 -remotewwnn 5005076303FFC663 -srclss 64 -tgtlss 42 -consistgrp I0233:I0033 I0301:I0102
CMUC00149I mkpprcpath: Remote Mirror and Copy path 64:42 successfully established.
dscli>
Step 2: Setting up Global Copy with the NOCOPY option from intermediate to remote sites
The Global Copy relationship is cascaded from a Metro Mirror relationship. To enable the cascade, use the -cascade option with the mkpprc command. Example 29-3 shows the complete command. In Step 3: Setting up Metro Mirror between local and intermediate sites, the Metro Mirror is created and copies all data from local site volumes to intermediate site volumes. The data is then forwarded by the Global Copy to remote site volumes. To avoid copying the data twice, initiate the setup of the Global Copy with NOCOPY mode, which is provided with the -mode nocp option.
Example 29-3 Set up Global Copy from intermediate to remote sites
dscli> mkpprc -remotedev IBM.2107-75TV181 -mode nocp -cascade -type gcp 6400-6403:e400-e403
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6400:E400 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6401:E401 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6402:E402 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6403:E403 successfully created.
dscli>
The volumes at the intermediate site are target volumes for Metro Mirror and source volumes for Global Copy at the same time. When the lspprc command is run against these volumes, it shows the pair status of both the Metro Mirror and Global Copy relationships (see Example 29-5).
Example 29-5 Query the PPRC relationship at the intermediate site
dscli> lspprc -remotedev IBM.2107-75TV181 6400-6403
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================
4200:6400 Target Copy Pending Metro Mirror 42 unknown Disabled Invalid
4201:6401 Target Copy Pending Metro Mirror 42 unknown Disabled Invalid
4202:6402 Target Copy Pending Metro Mirror 42 unknown Disabled Invalid
4203:6403 Target Copy Pending Metro Mirror 42 unknown Disabled Invalid
6400:E400 Copy Pending Global Copy 64 unknown Disabled True
6401:E401 Copy Pending Global Copy 64 unknown Disabled True
6402:E402 Copy Pending Global Copy 64 unknown Disabled True
6403:E403 Copy Pending Global Copy 64 unknown Disabled True
The Metro Mirror is established as a full copy, while the Global Copy is established in nocopy mode. The Metro Mirror starts its initial copy, and each track that is copied from the local site to the intermediate site is forwarded to the remote site as well. Usually, the distance between the intermediate and remote sites is greater, and the bandwidth lower, than between the local and intermediate sites. The out-of-sync track count of the Global Copy therefore increases while the Metro Mirror has not finished its initial copy. Afterward, the Global Copy drains the out-of-sync tracks as fast as its bandwidth capacity allows.
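A toy simulation of this behavior, with illustrative rates rather than DS8000 measurements: the out-of-sync count grows while the Metro Mirror initial copy outruns the intermediate-to-remote bandwidth, then drains to zero.

```python
# Hedged sketch of the draining behavior described above. While the
# Metro Mirror initial copy runs, tracks arrive at the intermediate
# site faster than the narrower B->C link can forward them, so the
# Global Copy out-of-sync count rises first and drains afterward.
# All rates are invented for illustration.

def simulate_oos(initial_copy_intervals, inbound_rate, outbound_rate):
    """Return the Global Copy out-of-sync track count after each interval."""
    oos, history, interval = 0, [], 0
    while interval < initial_copy_intervals or oos > 0:
        if interval < initial_copy_intervals:
            oos += inbound_rate            # tracks copied A->B this interval
        oos = max(0, oos - outbound_rate)  # tracks drained B->C this interval
        history.append(oos)
        interval += 1
    return history

counts = simulate_oos(initial_copy_intervals=5, inbound_rate=100, outbound_rate=60)
# The count rises while the initial copy runs, then falls back to 0.
```

The shape of `counts` (rise, peak, drain to zero) is what you should expect to see when you query the out-of-sync tracks during the initial copy.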
Space-Efficient target: Uses Space-Efficient volumes as FlashCopy targets, which means that FlashCopy SE is used in the Global Mirror setup. Virtual capacity is allocated in a Space-Efficient repository when these volumes are created. One repository volume per extent pool provides physical storage for all Space-Efficient volumes in that extent pool. Background copy is not allowed if Space-Efficient targets are used.

Target out-of-space: Indicates the action to be taken if the Space-Efficient repository runs out of space. Use the value fail for this parameter, which fails the FlashCopy pair relationship if the repository runs out of space.

For a detailed description of FlashCopy SE, see Chapter 10, IBM FlashCopy SE on page 95. Example 29-6 shows how to create a standard FlashCopy for the Global Mirror at the remote site.
Example 29-6 Create the FlashCopy for the Global Mirror at the remote site
dscli> mkflash -tgtinhibit -record -persist -nocp e400-e403:d400-d403 e500-e503:d500-d503
CMUC00137I mkflash: FlashCopy pair E400:D400 successfully created.
CMUC00137I mkflash: FlashCopy pair E401:D401 successfully created.
CMUC00137I mkflash: FlashCopy pair E402:D402 successfully created.
CMUC00137I mkflash: FlashCopy pair E403:D403 successfully created.
dscli>
Step 5: Creating a Global Mirror session and adding volumes to the session
This action is done at the intermediate site. The session is created with the mksession command. For each LSS, a session must be created. The session is denoted by a session number. Example 29-7 shows a series of commands. The first command creates an empty session. The second command populates the session with the primary volumes of the Global Copy. If Global Mirror is not started and no consistency group is formed, the status of the volumes is Join Pending.
Example 29-7 Create the sessions and add the volumes
dscli> mksession -lss 64 2
CMUC00145I mksession: Session 2 opened successfully.
dscli> chsession -lss 64 -action add -volume 6400-6403 2
CMUC00147I chsession: Session 2 successfully modified.
dscli> lssession -dev IBM.2107-7503461 64 2
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
=====================================================================================================================
64 02 Normal 6400 Join Pending Primary Copy Pending Secondary Full Duplex True Enable
64 02 Normal 6401 Join Pending Primary Copy Pending Secondary Full Duplex True Enable
64 02 Normal 6402 Join Pending Primary Copy Pending Secondary Full Duplex True Enable
64 02 Normal 6403 Join Pending Primary Copy Pending Secondary Full Duplex True Enable
dscli>
Run showgmir to verify whether the Global Mirror was successfully created. The Copy State should show Running (see Example 29-9). An indication of ongoing consistency group formation is an increasing FlashCopy Sequence Number: when you run showgmir repeatedly, you can see the sequence number rising.
Example 29-9 Monitor Global Mirror
dscli> showgmir 64
ID IBM.2107-7503461/64
Master Count 1
Master Session ID 0x02
Copy State Running
Fatal Reason Not Fatal
CG Interval Time (seconds) 0
Coord. Time (milliseconds) 50
Max CG Drain Time (seconds) 30
Current Time 06/19/2012 14:07:16 CEST
CG Time 06/19/2012 14:07:16 CEST
Successful CG Percentage 99
FlashCopy Sequence Number 0x4FE06B74
Master ID IBM.2107-7503461
Subordinate Count 0
Master/Subordinate Assoc -
When the -metrics option is supplied with the showgmir command, the progress of the consistency group formation can be monitored. The Total Successful CG Count entry shows the current number of successfully created consistency groups. While Global Mirror is running, the count grows steadily each time the showgmir command is run (see Example 29-10).
Example 29-10 Show progress of consistency formation
dscli> showgmir -metrics 64
ID
Total Failed CG Count
Total Successful CG Count
Successful CG Percentage
Failed CG after Last Success
Last Successful CG Form Time
Coord. Time (milliseconds)
CG Interval Time (seconds)
Max CG Drain Time (seconds)
First Failure Control Unit
First Failure LSS
First Failure Status
First Failure Reason
First Failure Master State
Last Failure Control Unit
Last Failure LSS
Last Failure Status
Last Failure Reason
Last Failure Master State
Previous Failure Control Unit
Previous Failure LSS
Previous Failure Status
Previous Failure Reason
Previous Failure Master State
dscli> showgmir -metrics 65
(the same metrics are reported for LSS 65)
dscli>
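A hedged sketch of such a health check: poll `showgmir -metrics` twice and confirm that Total Successful CG Count grew between polls. The parser assumes simple "label value" lines, and the sample counts are invented for illustration; the helper names are not DS CLI commands.

```python
# Hedged sketch: judge Global Mirror health by polling
# `showgmir -metrics` and checking that Total Successful CG Count
# increases between two polls. Sample values are illustrative.
import re

def total_successful_cgs(metrics_output):
    """Extract Total Successful CG Count from showgmir -metrics text."""
    for line in metrics_output.splitlines():
        m = re.match(r"Total Successful CG Count\s+(\d+)", line.strip())
        if m:
            return int(m.group(1))
    return None

def forming_consistency_groups(first_poll, second_poll):
    """True when the CG count grew between the two polls."""
    a = total_successful_cgs(first_poll)
    b = total_successful_cgs(second_poll)
    return a is not None and b is not None and b > a

poll1 = "Total Failed CG Count      0\nTotal Successful CG Count  1520"
poll2 = "Total Failed CG Count      0\nTotal Successful CG Count  1529"
```

A count that does not grow between polls spaced well apart is a hint to inspect the failure fields of `showgmir -metrics`.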
The showgmiroos command displays the number of tracks that are out of synchronization. Example 29-11 shows the OutOfSyncTracks of LSS 64.
Example 29-11 Display the number of out-of-sync tracks
Figure 29-6 Extending an existing Metro Mirror to a Metro/Global Mirror configuration
As shown in Figure 29-6, these steps are required to extend an existing Metro Mirror to a 3-site setup using Metro/Global Mirror:
1. Set up PPRC paths from the intermediate site to the remote site.
2. Set up Global Copy with COPY from the intermediate site to the remote site. In contrast to the initial Metro/Global Mirror setup, the Global Copy is established with copy mode to ensure that all data is copied from the intermediate site to the remote site.
3. Set up FlashCopy at the remote site.
4. Create a Global Mirror session and add volumes to the session at the intermediate site.
5. Start Global Mirror at the intermediate site.
The steps are described in detail in 29.3, Initial setup of Metro/Global Mirror on page 448.
Chapter 30.
30.1 Overview
This chapter includes some general operations that are preferred practices for a Metro/Global Mirror environment. Metro/Global Mirror has its own functionality in the DS8000 Copy Services portfolio, with characteristics beyond a simple combination of Metro Mirror and Global Mirror. Some of the topics described here are preferred practices that require special care because of the complexity of Metro/Global Mirror.

Here are some terms that are used in this chapter:

Host: A server where applications or components of applications are running. Hosts can be implemented as single servers or can be clustered with other servers.

Applications: All the software components that are used to build a self-contained solution for the user's business. The different software components run on one or more servers.

Application takeover: Applications that run on clustered servers can take advantage of the takeover procedures that are offered by the cluster software. In a failure situation, the cluster software stops the application where it is running and starts it automatically at the other clustered server.

Primary storage: In a Remote Copy relationship, the primary storage is where the regular production is; it is the source for the data replication.

Remote storage: In a Remote Copy relationship, the remote storage is where the data is replicated to; it is the target of the copy relationship.

Storage failover: A storage failover changes the access point of the data from the primary to the remote storage system. The application is started using the remote storage system.
It is important to understand the status of the whole Metro/Global Mirror environment before you act. Otherwise, your actions can result in a situation where the Metro Mirror or Global Copy environments cannot be recovered to their original statuses. In this case, the relationships must be re-created from scratch, which results in a full copy of the data. The following examples show the status of the whole Metro/Global Mirror environment during normal operations. In Example 30-1, all volumes of the Metro Mirror are in Full Duplex mode.
Example 30-1 Full Duplex mode of a Metro Mirror
dscli> lspprc -remotedev IBM.2107-75ABTV1 6000-6003
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================
6000:6200 Full Duplex Metro Mirror IBM.2107-7520781/60 unknown Disabled Invalid
6001:6201 Full Duplex Metro Mirror IBM.2107-7520781/60 unknown Disabled Invalid
6002:6202 Full Duplex Metro Mirror IBM.2107-7520781/60 unknown Disabled Invalid
6003:6203 Full Duplex Metro Mirror IBM.2107-7520781/60 unknown Disabled Invalid
The Global Copy volume status always shows Copy Pending. To determine that all tracks are copied to the secondary site, you must obtain the Out of Sync Tracks value. The Global Copy environment is synchronized when the Out of Sync Tracks value is zero for all volumes, as shown in Example 30-2.
Example 30-2 Synchronized Global Copy relationship
dscli> lspprc -remotedev IBM.2107-75ABTV1 -l 6400-6403
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================================================
6400:6200 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
6401:6201 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
6402:6202 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
6403:6203 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
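The synchronization check described above can be sketched as follows. The regex assumes the out-of-sync count is the first number after the words "Global Copy" in each `lspprc -l` row, as in Example 30-2; the helper names are hypothetical.

```python
# Hedged sketch: Global Copy pairs always report Copy Pending, so
# "synchronized" must be judged from the Out Of Sync Tracks column of
# `lspprc -l`. Parsing assumption: the track count is the first number
# after "Global Copy" in each row.
import re

def out_of_sync_counts(lspprc_l_output):
    """Collect the out-of-sync track count of every Global Copy row."""
    return [int(m.group(1))
            for m in re.finditer(r"Global Copy\s+(\d+)", lspprc_l_output)]

def globalcopy_synchronized(lspprc_l_output):
    """True when every Global Copy pair reports zero out-of-sync tracks."""
    counts = out_of_sync_counts(lspprc_l_output)
    return bool(counts) and all(c == 0 for c in counts)

rows = """6400:6200 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
6401:6201 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True"""
```

Run this check against the full volume list before any action that assumes the Global Copy leg is drained.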
In a Metro/Global Mirror environment, the volumes at the intermediate site are in a different state than in a conventional Metro Mirror or Global Mirror environment. Because the Global Mirror environment is cascaded to the Metro Mirror environment, the volumes at the intermediate site are a target and source at the same time. Thus, the lspprc command that is run at the intermediate site shows both the Metro Mirror and the Global Mirror environments, as shown in Example 30-3.
Example 30-3 Status of the intermediate volumes in a Metro/Global Mirror
dscli> lspprc -remotedev IBM.2107-7520781 6200-6203
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================
6000:6200 Target Full Duplex Metro Mirror 60 unknown Disabled Invalid
6001:6201 Target Full Duplex Metro Mirror 60 unknown Disabled Invalid
6002:6202 Target Full Duplex Metro Mirror 60 unknown Disabled Invalid
6003:6203 Target Full Duplex Metro Mirror 60 unknown Disabled Invalid
6200:6400 Copy Pending Global Copy 62 unknown Disabled True
6201:6401 Copy Pending Global Copy 62 unknown Disabled True
6202:6402 Copy Pending Global Copy 62 unknown Disabled True
6203:6403 Copy Pending Global Copy 62 unknown Disabled True
The unfreezepprc command removes the long busy status from the primary volumes and I/O continues. The pair status of the primary volumes is still Suspended, as shown in Example 30-5.
Example 30-5 Unfreezepprc after the freezepprc
dscli> unfreezepprc -remotedev IBM.2107-75ABTV1 68:62
CMUC00198I unfreezepprc: Remote Mirror and Copy pair 68:62 successfully thawed.
dscli> lspprc -remotedev IBM.2107-75ABTV1 6800-6803
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
================================================================================================
6800:6200 Suspended Freeze Metro Mirror 68 unknown Disabled Invalid
6801:6201 Suspended Freeze Metro Mirror 68 unknown Disabled Invalid
6802:6202 Suspended Freeze Metro Mirror 68 unknown Disabled Invalid
6803:6203 Suspended Freeze Metro Mirror 68 unknown Disabled Invalid
A freezepprc command can be issued against the volumes of a running application, although it impacts the application: after the freezepprc command runs, the applications wait to continue with their I/O. The freeze is released by the unfreezepprc command or when the consistency group timeout value is exceeded. This method provides a consistent copy of the data at the secondary site without stopping the applications at the primary site.
The consistency of the data is provided in each phase of the consistency group formation process, as shown in Figure 30-1.
Figure 30-1 summarizes the phases of consistency group formation:
1. Create the consistency group by holding application writes while creating a bitmap that contains the updates for this consistency group on all volumes. The design point is 2-3 ms; the maximum coordination time is, for example, 10 ms.
2. Transmit the updates in Global Copy mode while between consistency groups. The consistency group interval ranges from 0 seconds to 18 hours.
3. Drain the consistency group and send it to the remote DS8000 using Global Copy. Application writes for the next consistency group are recorded in the change recording bitmap. The maximum drain time is, for example, 1 minute.
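As a back-of-the-envelope sketch (an assumption, not a DS8000 formula), the worst-case lag of the remote consistent copy is roughly the sum of the three phase times above. The figures used are the example values from Figure 30-1, not measurements.

```python
# Hedged arithmetic sketch: the three consistency-group phases bound
# how far the remote consistent copy can lag behind production.
# Worst case, the lag is roughly the consistency group interval plus
# the coordination time plus the drain time.

def worst_case_lag_seconds(cg_interval_s, coordination_ms, drain_s):
    """Approximate worst-case data lag of the remote consistent copy."""
    return cg_interval_s + coordination_ms / 1000.0 + drain_s

# CG interval 0 s, coordination 10 ms, drain 60 s -> about 60 s of lag.
lag = worst_case_lag_seconds(cg_interval_s=0, coordination_ms=10, drain_s=60)
```

This kind of estimate is useful when you tune the CG interval and maximum drain time against your recovery point objective.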
If the failure occurs during the coordination time or while the data is draining to the remote site, consistency is still available on the FlashCopy volumes because the new FlashCopy has not started. If the failure happens while the FlashCopy command is running, a manual intervention is required to either revert or commit the consistency group before you continue with the recovery or restart Global Mirror. The action that is needed depends on the status of the FlashCopy, where the sequence numbers and the revertible flag are of special interest:

- When the sequence numbers of the FlashCopy are different, the copy process has not started for all the volumes. In this case, the recent FlashCopy is inconsistent and cannot be used. You must roll back by running revertflash, which removes all uncommitted sequences from the FlashCopy target.

- When the sequence numbers are all equal and there is a mix of revertible and nonrevertible volumes, the copy to the FlashCopy targets has taken place but the process is not finished for some volumes. In this case, the recent FlashCopy targets are usable and the process must be committed manually by running commitflash.

Example 30-6 shows a commit situation. All volumes that are shown in the lsflash command output have the same sequence number. Volume 50E8 and the volumes that follow it have the Revertible flag enabled, while the volumes before 50E8 have the flag disabled. In this case, the commitflash command must be issued to volume 50E8 and the following volumes to re-create the consistency.
Example 30-6 Commit situation
dscli> lsflash -l 5000-50ff 5100-51ff 5300-5344
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy OutOfSyncTracks DateCreated DateSynced
==========================================================================================================================================
5000:5A00 50 437DABED 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 11:24:44 CET 2005
5001:5A01 50 437DABED 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 11:24:44 CET 2005
.....
50E7:5AE7 50 437DABED 300 Disabled Enabled Enabled Disabled Enabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 11:24:44 CET 2005
50E8:5AE8 50 437DABED 300 Disabled Enabled Enabled Enabled Enabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 11:24:44 CET 2005
50E9:5AE9 50 437DABED 300 Disabled Enabled Enabled Enabled Enabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 11:24:44 CET 2005
.....
Example 30-7 shows a revertible situation. The sequence numbers have two different values, and the revertible flag is disabled for the lower sequence number. These values show that the FlashCopy process has not started for these volumes. The volumes with the higher sequence number have the revertible flag enabled, which means that the relationship exists but is not committed. The correct action to bring back consistency is to issue a revertflash command.
Example 30-7 Revertible situation
dscli> lsflash -l 5000-50ff 5100-51ff 5300-5344
ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy OutOfSyncTracks DateCreated DateSynced
==========================================================================================================================================
5000:5A00 50 437DC7BD 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:24 CET 2005
5001:5A01 50 437DC7BD 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:24 CET 2005
.....
50E3:5AE3 50 437DC7BD 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:24 CET 2005
50E4:5AE4 50 437DC7BC 300 Disabled Enabled Enabled Disabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:23 CET 2005
50E5:5AE5 50 437DC7BD 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:24 CET 2005
50E6:5AE6 50 437DC7BC 300 Disabled Enabled Enabled Disabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:23 CET 2005
50E7:5AE7 50 437DC7BD 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:24 CET 2005
50E8:5AE8 50 437DC7BC 300 Disabled Enabled Enabled Disabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:23 CET 2005
50E9:5AE9 50 437DC7BC 300 Disabled Enabled Enabled Disabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:23 CET 2005
50EA:5AEA 50 437DC7BD 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:24 CET 2005
50EB:5AEB 50 437DC7BD 300 Disabled Enabled Enabled Enabled Disabled Disabled Disabled 15259 Fri Nov 18 09:22:09 CET 2005 Fri Nov 18 13:23:24 CET 2005
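The commit-versus-revert decision described above can be sketched as a small function over the sequence number and revertible flag of each FlashCopy relationship. This mirrors the rules in the text; it is not a DS CLI interface, and the function name is hypothetical.

```python
# Hedged sketch of the commit-versus-revert decision: inputs are the
# (SequenceNum, Revertible) values reported by lsflash for each pair.

def recovery_action(volumes):
    """volumes: list of (sequence_number, revertible) tuples.

    Returns 'revertflash' when sequence numbers differ (the newest
    consistency group never started on all volumes), 'commitflash'
    when they are equal but only some volumes are committed, and
    'none' when the group is already consistent.
    """
    seqs = {seq for seq, _ in volumes}
    revertible = [rev for _, rev in volumes]
    if len(seqs) > 1:
        return "revertflash"   # roll back the uncommitted sequence
    if any(revertible) and not all(revertible):
        return "commitflash"   # finish the partially committed group
    return "none"

# Mixed revertible flags, one sequence number -> commit (Example 30-6).
assert recovery_action([("437DABED", False), ("437DABED", True)]) == "commitflash"
# Two sequence numbers -> revert (Example 30-7).
assert recovery_action([("437DC7BD", True), ("437DC7BC", False)]) == "revertflash"
```

Automation products apply the same rule; the sketch is only meant to make the two cases in the text concrete.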
When the -revertible option is supplied with the lsflash command, only the revertible volumes are listed.

A host failover would gain access to an inconsistent set of secondary volumes because the most recent data is on the FlashCopy target volumes. Before you give the server access to the volumes, the FlashCopy target volumes must be reversed to their sources, which are also the Global Copy secondary volumes, by running reverseflash, as shown in Example 30-8 on page 466.

Data consistency: When the application is stopped and two consistency groups are formed, you can assume that the data on the FlashCopy source and on the FlashCopy target are the same. In this case, a fast reverse restore is not necessary.
dscli> reverseflash -fast -tgtpprc 6400-6403:6600-6603
CMUC00169I reverseflash: FlashCopy volume pair 6400:6600
CMUC00169I reverseflash: FlashCopy volume pair 6401:6601
CMUC00169I reverseflash: FlashCopy volume pair 6402:6602
CMUC00169I reverseflash: FlashCopy volume pair 6403:6603
Figure 30-2 Set up an additional Global Mirror path after a failover to the remote site
The setup of the additional Global Mirror path consists of the following steps:
1. Clean up Metro Mirror.
2. Create or fail back from the remote site to the intermediate site.
3. Establish a FlashCopy path to the additional volumes at the intermediate site.
4. Create a session and start the Global Mirror environment.
Step 2: Creating a Global Copy path from the remote site to the intermediate site
The volumes at C (see Figure 30-2 on page 466) are a source for the Global Mirror environment, which is established from the remote site to the intermediate site. The first step is to set up the Global Copy path from the remote volumes to the intermediate volumes. Example 30-9 shows how to set up the Global Copy environment. Because the data at the remote site is the same as it is at the intermediate site, the Global Copy environment can be established with the -mode nocp option to avoid background copy. The -cascade option must be omitted because the Global Copy environment is no longer a cascaded relationship.
Example 30-9 Set up Global Copy from the remote site to the intermediate site
dscli> mkpprc -remotedev IBM.2107-75ABTV1 -type gcp -mode nocp 6400-6403:6200-6203
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6400:6200 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6401:6201 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6402:6202 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6403:6203 successfully created.
dscli>
dscli> lspprc -remotedev IBM.2107-75ABTV1 -l 6400-6403
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================================================
6400:6200 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
6401:6201 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
6402:6202 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
6403:6203 Copy Pending Global Copy 0 Disabled Enabled invalid 64 unknown Disabled True
Step 4: Creating a session and a Global Mirror environment at the remote site
To complete the setup of the Global Mirror, create a session at the remote site and add the Global Copy source volumes in to the session. Finally, start the Global Mirror environment. Example 30-11 shows the setup of the session and the Global Mirror environment and how to check if the Global Mirror environment is running properly.
Example 30-11 Create a session and start the Global Mirror environment
dscli> mksession -lss 64 1
CMUC00145I mksession: Session 1 opened successfully.
dscli> chsession -lss 64 -action add -volume 6400-6403 1
CMUC00147I chsession: Session 1 successfully modified.
dscli> lssession 64
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
=========================================================================================================================
64 01 CG In Progress 6400 Active Primary Copy Pending Secondary Simplex True Enable
64 01 CG In Progress 6401 Active Primary Copy Pending Secondary Simplex True Enable
64 01 CG In Progress 6402 Active Primary Copy Pending Secondary Simplex True Enable
64 01 CG In Progress 6403 Active Primary Copy Pending Secondary Simplex True Enable
dscli> mkgmir -lss 64 -session 1
CMUC00162I mkgmir: Global Mirror for session 1 successfully started.
dscli> showgmir 64
ID IBM.2107-75ABTV2/64
Master Count 1
Master Session ID 0x01
Copy State Running
Fatal Reason Not Fatal
CG Interval (seconds) 0
XDC Interval(milliseconds) 50
CG Drain Time (seconds) 30
Current Time 11/14/2005 10:38:21 CET
CG Time 11/14/2005 10:38:21 CET
Successful CG Percentage 10
FlashCopy Sequence Number 0x43785B0D
Master ID IBM.2107-75ABTV2
Subordinate Count 0
Master/Subordinate Assoc -
Chapter 31.
31.1 Overview
Planned recovery scenarios are series of operations, initiated by you, that are based on the failover/failback Copy Services functions. A storage failover always impacts the production environment. For this reason, all failover operations must be planned carefully in terms of the integrity of the procedures, the time schedule, and the availability of the applications. In a planned failover, the application must be stopped before recovery at the intermediate site or the remote site can occur. The host is then given access to the volumes and the application is started using these volumes.

A reason for a planned recovery at the intermediate site might be planned maintenance activities at the local site that might impact the production environment (for more information, see 30.2, General considerations for storage failover on page 460). A recovery at the intermediate site minimizes the impact to the production environment that normally runs at the local site.

In a large data center, many applications running on different servers can participate in the Metro/Global Mirror. In this case, it is most likely that failover operations are not accomplished against the complete environment, but rather against specific applications. For this purpose, the Global Mirror can be set up with multiple Global Mirror sessions (for detailed information, see 23.5, Consistency groups on page 296).

All the scenarios that are presented in this chapter were tested and represent the preferred practice for the situations they address. However, additional or alternative scenarios are possible, depending on the particular circumstances within your data center. The scenarios in a Metro/Global Mirror are complex and sometimes difficult to handle with DS CLI commands. A certain level of automation, where the steps are processed in the correct sequence, is needed. This automation is provided by Tivoli Storage Productivity Center for Replication (for more information, see Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685).

Other scenarios: If you require other scenarios, test them extensively before you implement them in your production environment.
Here are the steps (Figure 31-1) to fail over the production environment to the intermediate site:
1. Stop the production environment at the local site.
2. Suspend Metro Mirror.
3. Fail over Metro Mirror to the intermediate site.
4. Start the application at the intermediate site.
Figure 31-1 Failover of the production environment to the intermediate site (local site A 75-ABTV1, intermediate site B 75-03461, remote site C 75-TV181)
dscli> lspprc -remotedev IBM.2107-7503461 4200-4203
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=====================================================================================================
4200:6400 Suspended Host Source Metro Mirror 42 60 Disabled Invalid
4201:6401 Suspended Host Source Metro Mirror 42 60 Disabled Invalid
4202:6402 Suspended Host Source Metro Mirror 42 60 Disabled Invalid
4203:6403 Suspended Host Source Metro Mirror 42 60 Disabled Invalid
#
# At the intermediate site:
#
dscli> lspprc -remotedev IBM.2107-75ABTV1 6400-6403
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
===========================================================================================================
4200:6400 Target Suspended Update Target Metro Mirror 42 unknown Disabled Invalid
4201:6401 Target Suspended Update Target Metro Mirror 42 unknown Disabled Invalid
4202:6402 Target Suspended Update Target Metro Mirror 42 unknown Disabled Invalid
4203:6403 Target Suspended Update Target Metro Mirror 42 unknown Disabled Invalid
6400:E400 Copy Pending Global Copy 64 unknown Disabled True
6401:E401 Copy Pending Global Copy 64 unknown Disabled True
6402:E402 Copy Pending Global Copy 64 unknown Disabled True
6403:E403 Copy Pending Global Copy 64 unknown Disabled True
After a failover, the status of the Metro Mirror secondary volumes that are used in Metro/Global Mirror differs from the status after a failover in conventional Metro Mirror. In conventional Metro Mirror, the secondary volumes have the Suspended status with the Host Source reason. Because in Metro/Global Mirror the Metro Mirror secondary volumes are still cascaded to the remote volumes, the status of these volumes is not Suspended Host Source; it is Target Suspended with the Host Target reason.

Failback and synchronization: In an MGM configuration, after you perform the failover at the intermediate site (failover from B to A) while production is still running on A, you cannot fail back from A to B to resynchronize the B volumes with the A volumes without removing the cascaded Global Copy relationship.
31.3 Returning the production environment to the local site from the intermediate site
This section describes how to return the production environment to the local site from the intermediate site. It applies to the scenario described in 31.2, Recovering the production environment at the intermediate site on page 470, or to the case where the production environment was taken over by the intermediate site after a failure at the local site. Returning the production environment from the intermediate site to the local site means that the Global Copy must be suspended, because the volumes cannot be the source for the remote and the local sites at the same time. Therefore, there are two things to consider about the applications at the intermediate site:
1. When the data at the local site is too old because the production environment ran at the intermediate site for a relatively long time, the data at the remote site should be kept as a safe copy. It is also possible to create an additional FlashCopy relationship between the C and D volumes. For the failback process, the applications must be stopped for the whole duration of the scenario.
2. If the application downtime must be as short as possible, the applications can be stopped after the failback of the Metro Mirror to the local site occurs and all the volumes are in the Full Duplex mode (for more information, see Step 4: Failing back Metro Mirror to the local site and waiting for the Full Duplex state on page 475). This situation implies that during all the subsequent steps, the production data has no other valid copy because the Global Copy is suspended.
Figure 31-2 illustrates the steps to return the production environment to the local site from the intermediate site where the application is stopped first to ensure that there is always a valid copy of the data available.
Figure 31-2 Return of the production environment from the intermediate site to the local site
Here are the steps to return the production environment to the local site from the intermediate site, as described in Figure 31-2:
1. Stop I/O at the intermediate site.
2. Terminate Global Mirror.
3. Suspend Global Copy.
4. Fail back Metro Mirror to the local site and wait for the Full Duplex volume status.
5. Fail over to the local site.
6. Fail back Metro Mirror from the local site to the intermediate site.
7. Resume Global Copy.
8. Start I/O at the local site.
9. Start Global Mirror or add volumes to the session.
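For reference, the nine steps can be collected into a single DS CLI sequence. The following sketch reuses the device and volume IDs from the examples in this section; the failoverpprc invocation in step 5 and the resumepprc command in step 7 are assumptions, because they are not shown verbatim in this chapter.

```
# Step 1: stop I/O at the intermediate site (host-side action, no DS CLI command)
# Step 2: terminate Global Mirror (run at the intermediate site)
dscli> rmgmir -quiet -lss 64 -session 2
# Step 3: suspend the cascaded Global Copy (run at the intermediate site)
dscli> pausepprc -remotedev IBM.2107-75TV181 6400-6403:e400-e403
# Step 4: fail back Metro Mirror to the local site, then wait for Full Duplex
dscli> failbackpprc -remotedev IBM.2107-75ABTV1 -type mmir 6400-6403:4200-4203
dscli> lspprc -remotedev IBM.2107-75ABTV1 6400-6403
# Step 5: fail over to the local site (run at the local site; invocation assumed)
dscli> failoverpprc -remotedev IBM.2107-7503461 -type mmir 4200-4203:6400-6403
# Step 6: fail back Metro Mirror from the local site to the intermediate site
dscli> failbackpprc -remotedev IBM.2107-7503461 -type mmir 4200-4203:6400-6403
# Step 7: resume Global Copy (run at the intermediate site; resumepprc assumed)
dscli> resumepprc -remotedev IBM.2107-75TV181 -type gcp 6400-6403:e400-e403
# Step 8: start I/O at the local site (host-side action)
# Step 9: restart Global Mirror or add the volumes to the session
dscli> mkgmir -lss 64 -session 2
```

Verify the state transitions with lspprc after each failover and failback before you proceed to the next step.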
dscli> rmgmir -quiet -lss 64 -session 2
CMUC00165I rmgmir: Global Mirror for session 2 successfully stopped.
Step 4: Failing back Metro Mirror to the local site and waiting for the Full Duplex state
Now the failback to the local site can be started. Run failbackpprc at the intermediate site, as shown in Example 31-5. The secondary volumes are specified as the source volumes and the primary volumes as the target volumes. It is a preferred practice to check that the Metro Mirror volumes are in the Full Duplex mode as well before you proceed to the next step.
Example 31-5 Failback to the local site and checking the status
dscli> failbackpprc -remotedev IBM.2107-75ABTV1 -type mmir 6400-6403:4200-4203
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6400:4200 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6401:4201 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6402:4202 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6403:4203 successfully failed back.
dscli>
dscli> lspprc -remotedev IBM.2107-75ABTV1 6400-6403
ID        State       Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
6400:4200 Full Duplex -      Metro Mirror 64        120            Disabled      Invalid
6401:4201 Full Duplex -      Metro Mirror 64        120            Disabled      Invalid
6402:4202 Full Duplex -      Metro Mirror 64        120            Disabled      Invalid
6403:4203 Full Duplex -      Metro Mirror 64        120            Disabled      Invalid
Step 6: Failing back Metro Mirror from the local site to the intermediate site
In this step, Metro Mirror is failed back to the original direction. If the production environment was started after Step 5: Failing over to the local site on page 475, all changes to the volumes are now copied to the intermediate site. Example 31-7 shows the failback to the intermediate site. This command is run at the local site.
Example 31-7 Failback from the local site to the intermediate site
dscli> failbackpprc -remotedev IBM.2107-7503461 -type mmir 4200-4203:6400-6403
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4200:6400 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4201:6401 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4202:6402 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4203:6403 successfully failed back.
dscli>
31.4 Recovery of the production environment at the remote site

Figure 31-3 Recovery of the production environment at the remote site
Figure 31-3 illustrates the steps to perform a recovery at the remote site. An additional option for this scenario is to set up Global Mirror from the remote site to the intermediate site to provide disaster protection for the production environment while it is at the remote site. This setup requires more volumes as FlashCopy targets at the intermediate site. For more information, see 30.5, Setting up an additional Global Mirror from the remote site on page 466.

Important: In this scenario, Global Copy between the intermediate and remote site is terminated and re-created in the opposite direction as a nocopy relationship. To avoid data corruption during the whole scenario, do not allow any I/O to any host volume.

Here are the steps to perform the recovery of the production environment at the remote site (Figure 31-3):
1. Stop I/O at the local site.
2. Terminate Global Mirror.
3. Terminate Global Copy.
4. Fail over Metro Mirror to the intermediate site.
5. Establish Global Copy from the remote site to the intermediate site.
6. Start I/O at the remote site.
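Collected as a single DS CLI sequence, the recovery looks like the following sketch. The device and volume IDs are taken from the examples in this section.

```
# Step 1: stop I/O at the local site (host-side action, no DS CLI command)
# Step 2: terminate Global Mirror (run at the intermediate site)
dscli> rmgmir -quiet -lss 64 -session 2
# Step 3: terminate Global Copy (run at the intermediate site)
dscli> rmpprc -remotedev IBM.2107-75TV181 6400-6403:e400-e403
# Step 4: fail over Metro Mirror to the intermediate site, cascaded and
#         converted to Global Copy (run at the intermediate site)
dscli> failoverpprc -remotedev IBM.2107-75ABTV1 -type gcp -cascade 6400-6403:4200-4203
# Step 5: establish Global Copy from the remote site to the intermediate site
#         (run at the remote site)
dscli> mkpprc -remotedev IBM.2107-7503461 -mode nocp -type gcp e400-e403:6400-6403
# Step 6: start I/O at the remote site (host-side action)
```

Each command is explained in detail in the steps that follow.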
dscli> rmgmir -quiet -lss 64 -session 2
CMUC00165I rmgmir: Global Mirror for session 2 successfully stopped.
Example 31-11 Remove Global Copy
dscli> rmpprc -remotedev IBM.2107-75TV181 6400-6403:e400-e403
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 6400:E400 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 6401:E401 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 6402:E402 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 6403:E403 relationship successfully withdrawn.
dscli>
The Metro Mirror failover can also be seen as preparation for the failback to the local site. The reversed Metro Mirror acts as a cascade from the reversed Global Copy between the remote and intermediate sites. For this reason, the failover of the Metro Mirror secondary volumes is run with the -cascade option. Also, to avoid extended response times at the host side, a cascaded copy relationship should not be set up as a synchronous relationship. Thus, the failoverpprc command for Metro Mirror is run with the -type gcp option. Example 31-12 shows the failover of Metro Mirror.

Specifying volumes: To fail over to the intermediate site, you must specify the secondary volumes as the source and the primary volumes as the targets in the failoverpprc command.
Example 31-12 Failover of the Metro Mirror
dscli> failoverpprc -remotedev IBM.2107-75ABTV1 -type gcp -cascade 6400-6403:4200-4203
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6400:4200 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6401:4201 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6402:4202 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6403:4203 successfully reversed.
dscli>
dscli> lspprc -remotedev IBM.2107-75ABTV1 6400-6403
ID        State     Reason      Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================
6400:4200 Suspended Host Source Global Copy 64        120            Disabled      True
6401:4201 Suspended Host Source Global Copy 64        120            Disabled      True
6402:4202 Suspended Host Source Global Copy 64        120            Disabled      True
6403:4203 Suspended Host Source Global Copy 64        120            Disabled      True
If you run lspprc at the intermediate site, you can see that the type of the volume pairs changed from Metro Mirror to Global Copy.
Step 5: Establishing Global Copy from the remote site to the intermediate site
In anticipation of returning the production environment to the local site, and if the volumes at the intermediate site and the links are still available, it is now possible to establish Global Copy from the remote site to the intermediate site. If, at the intermediate site, more volumes can be provided for a new FlashCopy relationship to the intermediate volumes, it is possible to set up an additional Global Mirror. Assuming that the production environment might remain for a long period at the remote site, this approach provides consistent data at the intermediate site, and protects the production environment against a possible disaster at the remote site. For more details about this topic, see 30.5, Setting up an additional Global Mirror from the remote site on page 466.
Example 31-13 shows how to establish Global Copy. You must omit the -cascade option because the Global Copy is no longer in a cascaded relationship. Because the volumes at the remote and intermediate sites are identical, use the -mode nocp option to avoid needing to make a full copy.
Example 31-13 Establish Global Copy to the intermediate site
dscli> mkpprc -remotedev IBM.2107-7503461 -mode nocp -type gcp e400-e403:6400-6403
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship E400:6400 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship E401:6401 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship E402:6402 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship E403:6403 successfully created.
dscli>
31.5 Returning the production environment to the local site from the remote site
The scenario that is described in this section includes the return of the production environment from the remote site after a planned recovery, which is described in 31.4, Recovery of the production environment at the remote site on page 477. This scenario also applies after a recovery at the remote site takes place because of a failure at the local site. If the recovery at the remote site was accomplished because of a failure at the local site, the return of the production environment to the local site can be done only when all required resources are available again. As part of the recovery at the remote site scenario, a Global Copy is established from the remote site to the intermediate site. If this step was omitted, it must be done now. Ensure that all the data is drained to the intermediate site and then to the local site before the applications are started again at the local site. If an additional Global Mirror is set up as described in 30.5, Setting up an additional Global Mirror from the remote site on page 466, then it must be removed before you return the production environment to the local site.
Figure 31-4 illustrates the steps to return the production environment to the local site.
Figure 31-4 Return of the production environment from the remote site to the local site
Important: In this scenario, the Global Copy between the intermediate and remote sites is terminated and re-created in the opposite direction as a nocopy relationship. To avoid data corruption, no I/O to any host volume can occur during the whole scenario.

Here are the steps to return the production environment from the remote site to the local site:
1. Stop I/O at the remote site.
2. Fail back Metro Mirror from the intermediate site to the local site and wait until the pairs are in the Full Duplex state.
3. Terminate Global Copy from the remote site to the intermediate site.
4. Fail over to the local site.
5. Fail back Metro Mirror from the local site to the intermediate site.
6. Establish Global Copy from the intermediate site to the remote site.
7. Start I/O at the local site.
8. Start Global Mirror or add volumes to a session.
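As a single DS CLI sketch, the sequence is as follows. The device and volume IDs come from the examples in this section; the failoverpprc invocation in step 4 is an assumption, because the command for that step is not shown verbatim in this chapter.

```
# Step 1: stop I/O at the remote site (host-side action, no DS CLI command)
# Step 2: fail back Metro Mirror from the intermediate site to the local site,
#         still cascaded and in Global Copy mode; wait for the data to drain
dscli> failbackpprc -remotedev IBM.2107-75ABTV1 -type gcp -cascade 6400-6403:4200-4203
# Step 3: terminate Global Copy from the remote site to the intermediate site
dscli> rmpprc -quiet -remotedev IBM.2107-7503461 e400-e403:6400-6403
# Step 4: fail over to the local site (run at the local site; invocation assumed)
dscli> failoverpprc -remotedev IBM.2107-7503461 -type gcp 4200-4203:6400-6403
# Step 5: fail back Metro Mirror from the local site to the intermediate site
dscli> failbackpprc -remotedev IBM.2107-7503461 -type mmir 4200-4203:6400-6403
# Step 6: re-create Global Copy from the intermediate site to the remote site
dscli> mkpprc -remotedev IBM.2107-75TV181 -mode nocp -cascade -type gcp 6400-6403:e400-e403
# Step 7: start I/O at the local site (host-side action)
# Step 8: restart Global Mirror or add the volumes to the session
dscli> mkgmir -lss 64 -session 2
```

Remember that no host I/O is permitted between steps 3 and 7 because of the nocopy relationship.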
Step 2: Failing back Metro Mirror from the intermediate site to the local site
The running application's data at the remote site is replicated to the intermediate site because a Global Copy was established during the failover procedure (see Step 5: Establishing Global Copy from the remote site to the intermediate site on page 479). When the local site becomes available again, Metro Mirror can be failed back to transfer the changed data back to the local site. The Metro Mirror failover that is described in Step 4: Failing over Metro Mirror to the intermediate site on page 478 must be performed with the -cascade option because, with the failback to the local site, the Metro Mirror is again cascaded to the Global Copy. This situation implies that a cascaded copy relationship to the local site must not be a synchronous relationship. Use the -type gcp option to change Metro Mirror into Global Copy mode. If the failover was not run with these two options during the failover to the remote site scenario, these options can be supplied now by running failbackpprc. In Example 31-14, the failbackpprc command is run with the -cascade and -type gcp options. It is run at the intermediate site. Before you take any further action, make sure that all data is replicated to the local site by running lspprc.
Example 31-14 Fail back Metro Mirror to the local site
dscli> failbackpprc -remotedev IBM.2107-75ABTV1 -type gcp -cascade 6400-6403:4200-4203
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6400:4200 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6401:4201 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6402:4202 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6403:4203 successfully failed back.
dscli>
dscli> lspprc -remotedev IBM.2107-75ABTV1 6400-6403
ID        State               Reason Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
6400:4200 Copy Pending        -      Global Copy 64        unknown        Disabled      True
6401:4201 Copy Pending        -      Global Copy 64        unknown        Disabled      True
6402:4202 Copy Pending        -      Global Copy 64        unknown        Disabled      True
6403:4203 Copy Pending        -      Global Copy 64        unknown        Disabled      True
E400:6400 Target Copy Pending -      Global Copy E4        unknown        Disabled      Invalid
E401:6401 Target Copy Pending -      Global Copy E4        unknown        Disabled      Invalid
E402:6402 Target Copy Pending -      Global Copy E4        unknown        Disabled      Invalid
E403:6403 Target Copy Pending -      Global Copy E4        unknown        Disabled      Invalid
Step 3: Terminating Global Copy from the remote site to the intermediate site
Reversing the Global Copy by using the failover and failback functions would result in the intermediate volumes becoming sources for both the reversed Metro Mirror and the Global Copy, which is not permitted. For this reason, you must terminate Global Copy now. In a later step (see Step 6: Creating Global Copy from the intermediate site to the remote site on page 484), Global Copy is re-created from the intermediate site to the remote site in nocopy mode.

Attention: Make sure that between this step and step 6 no I/O to the volumes occurs. Otherwise, data can become inconsistent because of the nocopy option, which means that you must perform a full copy.
Example 31-15 shows the DS CLI command to remove Global Copy.

Important: Before you remove Global Copy, check whether all the tracks are copied to the secondary site. Run lspprc to inspect the out-of-sync tracks.
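One way to check the remaining out-of-sync tracks is a sketch like the following. The -l option and the Out Of Sync Tracks column name are assumptions that are based on the long output format of lspprc; verify them against your DS CLI level.

```
# Run at the remote site; -l adds the long output columns, including
# Out Of Sync Tracks (assumed column name)
dscli> lspprc -l -remotedev IBM.2107-7503461 e400-e403
# Proceed with rmpprc only when the Out Of Sync Tracks value is 0
# for every pair in the list.
```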
Example 31-15 Remove the Global Copy from the remote site to the intermediate site
dscli> rmpprc -quiet -remotedev IBM.2107-7503461 e400-e403:6400-6403
CMUC00155I rmpprc: Remote Mirror and Copy volume pair E400:6400 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair E401:6401 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair E402:6402 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair E403:6403 relationship successfully withdrawn.
dscli>
Step 5: Failing back Metro Mirror from the local site to the intermediate site
The failback enables tracks to be copied from the local site to the intermediate site. The Metro Mirror is no longer in a cascaded relationship, and the copy type is now mmir again. Because the applications are not started, the contents of the volumes at the local and the intermediate sites are the same, so the status of the Metro Mirror volumes is Full Duplex. Example 31-17 shows the failback to the intermediate site, which is run at the local site.
Example 31-17 Fail back Metro Mirror from the local site to the intermediate site
dscli> failbackpprc -remotedev IBM.2107-7503461 -type mmir 4200-4203:6400-6403
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4200:6400 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4201:6401 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4202:6402 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4203:6403 successfully failed back.
dscli>
dscli> lspprc -remotedev IBM.2107-7503461 4200-4203
ID        State       Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
4200:6400 Full Duplex -      Metro Mirror 42        60             Disabled      Invalid
4201:6401 Full Duplex -      Metro Mirror 42        60             Disabled      Invalid
4202:6402 Full Duplex -      Metro Mirror 42        60             Disabled      Invalid
4203:6403 Full Duplex -      Metro Mirror 42        60             Disabled      Invalid
Step 6: Creating Global Copy from the intermediate site to the remote site
After the Metro Mirror is prepared for production at the local site, the Global Copy must be created from the intermediate site to the remote site with the -cascade option. Because the volumes at the remote and intermediate sites are identical, the Global Copy should be established with the -mode nocp option (no copy).

Important: Be sure that there is no active I/O to the volumes between steps 3 - 7. Otherwise, data can become inconsistent because of the nocopy option and you must run a full copy.

Example 31-18 shows the DS CLI command to create the Global Copy.
Example 31-18 Create Global Copy from the intermediate site to the remote site
dscli> mkpprc -remotedev IBM.2107-75TV181 -mode nocp -cascade -type gcp 6400-6403:e400-e403
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6400:E400 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6401:E401 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6402:E402 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6403:E403 successfully created.
dscli>
dscli> mkgmir -lss 64 -session 2
CMUC00162I mkgmir: Global Mirror for session 2 successfully started.
Chapter 32.
32.1 Overview
Disaster recovery test scenarios are used to test readiness for a disaster. While testing, the production environment must not be affected. Also, the replication of the production data to the disaster sites should be affected as little as possible. The general goal for enabling disaster recovery testing at the intermediate or the remote site is to provide consistent data according to the required level of consistency of the applications that must be tested. The requirements for a disaster recovery test scenario are summarized in the following list:
1. All volumes that belong to the application that is subject to the test must be considered.
2. Data consistency for all volumes must be provided at the testing site.
3. Replication must continue as quickly as possible to keep the Recovery Point Objective (RPO) for the production environment at the local site as low as possible.

It is a preferred practice to use extra FlashCopy volumes for the disaster recovery tests, and all the scenarios in this chapter use these volumes. The target volumes of Metro Mirror or Global Mirror should not be used during the tests for the following reasons:
1. When a target volume is used for the test, no replication of the production data is possible, which means that the RPO increases continually during the whole disaster recovery test.
2. The scenarios are more complicated, especially when replication is re-established in the original direction after the test.
3. There is a potential risk of data corruption because of user errors.

Tip: Tivoli Storage Productivity Center for Replication has an automation solution that provides practice scenarios that reflect real disaster scenarios. For more information, see Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685.

The scenarios that are presented in this chapter have a minimal impact on the existing Metro/Global Mirror and no impact on the production environment at the local site.
We describe two disaster recovery test scenarios: one at the intermediate site and one at the remote site. Both scenarios were tested and represent the preferred practices for the situations they address. However, additional or alternative scenarios remain possible depending on the particular circumstances within your data center.
32.2.1 Disaster recovery test at the intermediate site

Figure 32-1 illustrates the steps that are required to perform a failover to the intermediate site while the production environment remains at the local site.
Figure 32-1 Disaster recovery test at the intermediate site with a Metro Mirror freeze
Here are the steps to fail over to the intermediate site for a DR test with Metro Mirror:
1. Prepare the failover by issuing a freeze and unfreeze of the Metro Mirror.
   Copying data: During the freeze/unfreeze interval, no data is copied to the remote site.
2. Establish FlashCopy to the additional volumes at the intermediate site.
3. Re-establish the PPRC paths from the local site to the intermediate site.
4. Resume Metro Mirror.
5. Start I/O at the disaster recovery host.
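As a single DS CLI sketch, the sequence is as follows. The FlashCopy targets 7000-7003 are the additional test volumes shown in the lsflash output later in this section, and the resumepprc invocation in step 4 is an assumption, because it is not shown verbatim here.

```
# Step 1: freeze and immediately unfreeze Metro Mirror (run at the local site)
dscli> freezepprc -remotedev IBM.2107-7503461 42:64
dscli> unfreezepprc -remotedev IBM.2107-7503461 42:64
# Step 2: FlashCopy the consistent B volumes to the additional test volumes
#         (run at the intermediate site)
dscli> mkflash 6400-6403:7000-7003
# Step 3: re-establish the PPRC paths from the local to the intermediate site
dscli> mkpprcpath -remotedev IBM.2107-7503461 -remotewwnn 5005076303FFC08F -srclss 42 -tgtlss 64 -consistgrp I0011:I0142 I0012:I0141
# Step 4: resume Metro Mirror (resumepprc assumed; run at the local site)
dscli> resumepprc -remotedev IBM.2107-7503461 -type mmir 4200-4203:6400-6403
# Step 5: start I/O at the disaster recovery host (host-side action)
```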
Example 32-1 shows the usage of the freezepprc and unfreezepprc commands for Metro Mirror. These commands are issued at the local site. A subsequent lspprc command at the local storage system shows that the primary volumes went into a Suspended state as a result of the Metro Mirror freeze. An lspprc command that is issued at the intermediate site shows the status of the secondary volumes, which are still in the Target Full Duplex mode.
Example 32-1 Freeze and unfreeze before the failover
#
# At the local site:
dscli> freezepprc -remotedev IBM.2107-7503461 42:64
CMUC00161I freezepprc: Remote Mirror and Copy consistency group 42:64 successfully created.
dscli>
dscli> lspprc -remotedev IBM.2107-7503461 4200-4203
ID        State     Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
================================================================================================
4200:6400 Suspended Freeze Metro Mirror 42        60             Disabled      Invalid
4201:6401 Suspended Freeze Metro Mirror 42        60             Disabled      Invalid
4202:6402 Suspended Freeze Metro Mirror 42        60             Disabled      Invalid
4203:6403 Suspended Freeze Metro Mirror 42        60             Disabled      Invalid
#
# At the intermediate site:
dscli> lspprc -remotedev IBM.2107-75ABTV1 6400-6403
ID        State              Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
4200:6400 Target Full Duplex -      Metro Mirror 42        unknown        Disabled      Invalid
4201:6401 Target Full Duplex -      Metro Mirror 42        unknown        Disabled      Invalid
4202:6402 Target Full Duplex -      Metro Mirror 42        unknown        Disabled      Invalid
4203:6403 Target Full Duplex -      Metro Mirror 42        unknown        Disabled      Invalid
6400:E400 Copy Pending       -      Global Copy  64        unknown        Disabled      True
6401:E401 Copy Pending       -      Global Copy  64        unknown        Disabled      True
6402:E402 Copy Pending       -      Global Copy  64        unknown        Disabled      True
6403:E403 Copy Pending       -      Global Copy  64        unknown        Disabled      True
dscli>
To quickly re-enable I/O to the primary volumes, run unfreezepprc immediately after you run freezepprc. Example 32-2 illustrates the unfreezepprc command.
Example 32-2 Unfreeze the primary volumes and re-create the PPRC paths
dscli> unfreezepprc -remotedev IBM.2107-7503461 42:64 CMUC00198I unfreezepprc: Remote Mirror and Copy pair 42:64 successfully thawed. dscli>
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled BackgroundCopy
====================================================================================================================================
6400:7000 64     0           120     Disabled   Disabled  Disabled   Disabled   Enabled            Enabled            Enabled
6401:7001 64     0           120     Disabled   Disabled  Disabled   Disabled   Enabled            Enabled            Enabled
6402:7002 64     0           120     Enabled    Disabled  Disabled   Disabled   Enabled            Enabled            Enabled
6403:7003 64     0           120     Enabled    Disabled  Disabled   Disabled   Enabled            Enabled            Enabled
For two of the volumes, the ActiveCopy flag is still enabled, which indicates that the background copy for these volumes is still ongoing.
Step 3: Re-establishing the PPRC paths from the local site to the intermediate site
After the FlashCopy for the practice scenario is created, re-establish the PPRC paths by running mkpprcpath (Example 32-4).
Example 32-4 Set up PPRC paths from the local site to the intermediate site and check the paths
dscli> lspprcpath -fullid 42
IBM.2107-75ABTV1/42 IBM.2107-7503461/64 Failed FF64 5005076303FFC08F
dscli>
dscli> mkpprcpath -remotedev IBM.2107-7503461 -remotewwnn 5005076303FFC08F -srclss 42 -tgtlss 64 -consistgrp I0011:I0142 I0012:I0141
CMUC00149I mkpprcpath: Remote Mirror and Copy path 42:64 successfully established.
dscli>
dscli> lspprcpath -fmt default -fullid -dev IBM.2107-75ABTV1 -hdr off 42
IBM.2107-75ABTV1/42 IBM.2107-7503461/64 Success FF64 IBM.2107-75ABTV1/I0011 IBM.2107-7503461/I0142 5005076303FFC08F
IBM.2107-75ABTV1/42 IBM.2107-7503461/64 Success FF64 IBM.2107-75ABTV1/I0012 IBM.2107-7503461/I0141 5005076303FFC08F
dscli>
dscli> lspprc -remotedev IBM.2107-75ABTV1 6400-6403
ID        State              Reason Type         SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
4200:6400 Target Full Duplex -      Metro Mirror 42        unknown        Disabled      Invalid
4201:6401 Target Full Duplex -      Metro Mirror 42        unknown        Disabled      Invalid
4202:6402 Target Full Duplex -      Metro Mirror 42        unknown        Disabled      Invalid
4203:6403 Target Full Duplex -      Metro Mirror 42        unknown        Disabled      Invalid
6400:E400 Copy Pending       -      Global Copy  64        unknown        Disabled      True
6401:E401 Copy Pending       -      Global Copy  64        unknown        Disabled      True
6402:E402 Copy Pending       -      Global Copy  64        unknown        Disabled      True
6403:E403 Copy Pending       -      Global Copy  64        unknown        Disabled      True
dscli>
Figure 32-2 illustrates the steps that are used to perform a failover to the remote site while the production environment remains at the local site.
Figure 32-2 Disaster recovery test at the remote site with a Metro Mirror freeze
Here are the steps to fail over to the remote site for a DR test with Metro Mirror:
1. Freeze and unfreeze the Metro Mirror.
   Data copying: During the freeze/unfreeze interval, no data is copied to the remote site.
2. Establish FlashCopy to the additional volumes at the remote site.
3. Re-establish the PPRC paths from the local site to the intermediate site.
4. Resume Metro Mirror.
5. Start I/O at the disaster recovery host.
For this scenario, assume that an additional host is available to start the application from the additional volumes that were copied with FlashCopy at the remote site. These steps are explained in detail in 32.2.1, Disaster recovery test at the intermediate site on page 486. The only difference is that steps 2 and 5 are performed at the remote site.
This scenario requires a few more steps than when you use Metro Mirror, but it is a fairly easy task to achieve, as shown in Figure 32-3.
Figure 32-3 Disaster recovery test at the remote site with Global Mirror
Here are the steps to fail over to the remote site for a DR test with Global Mirror:
1. Stop Global Mirror.
2. Pause Global Copy from the intermediate site to the remote site.
3. Fail over Global Copy from the remote site to the intermediate site.
4. Fast reverse FlashCopy from the D to C volumes.
5. Establish FlashCopy from the C to E volumes.
6. Fail back Global Copy from the intermediate site to the remote site.
7. Restart Global Mirror.
8. Start I/O at the disaster recovery host.
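As a single DS CLI sketch, the sequence is as follows; the device and volume IDs come from the examples in this section.

```
# Step 1: stop Global Mirror (run at the intermediate site)
dscli> rmgmir -quiet -lss 64 -session 2
# Step 2: pause Global Copy from the intermediate site to the remote site
dscli> pausepprc -remotedev IBM.2107-75TV181 6400-6403:e400-e403
# Step 3: fail over Global Copy (run at the remote site)
dscli> failoverpprc -remotedev IBM.2107-7503461 -type gcp e400-e403:6400-6403
# Step 4: fast reverse FlashCopy from the D to the C volumes
dscli> reverseflash -fast -tgtpprc e400-e403:d400-d403
# Step 5: establish FlashCopy from the C to the E (test) volumes
dscli> mkflash e400-e403:d410-d413
# Step 6: fail back Global Copy to the original direction
#         (run at the intermediate site)
dscli> failbackpprc -remotedev IBM.2107-75TV181 -type gcp 6400-6403:e400-e403
#         ...and re-create the Global Mirror FlashCopy pairs
dscli> mkflash -tgtinhibit -record -persist -nocp e400-e403:d400-d403
# Step 7: restart Global Mirror
dscli> mkgmir -lss 64 -session 2
# Step 8: start I/O at the disaster recovery host (host-side action)
```

Between steps 1 and 7, Global Mirror is stopped and the RPO increases, so complete the sequence promptly.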
RPO: Between step 1 and step 7, Global Mirror is stopped and cannot provide consistency at the remote site. During this time, the RPO increases.
Step 2: Suspending Global Copy from the intermediate site to the remote site
Suspend the Global Copy from the intermediate site to the remote site to prepare for the failback in Step 6: Failing back Global Copy from the intermediate site to the remote site on page 494. In that step, the failbackpprc command is issued in the original direction, and the command requires that the Global Copy primary volume is in the Suspended state. Example 32-7 shows the command that is used to suspend Global Copy from the intermediate site to the remote site.
Example 32-7 Suspend Global Copy from the intermediate site to the remote site
dscli> pausepprc -remotedev IBM.2107-75TV181 6400-6403:e400-e403
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6400:E400 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6401:E401 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6402:E402 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6403:E403 relationship successfully paused.
dscli>
Step 3: Failing over Global Copy from the remote site to the intermediate site
Now the Global Copy must be failed over to the remote site. Issue the failoverpprc command that is shown in Example 32-8 at the remote site. Remember that the C volumes are not consistent because the Metro Mirror constantly sends data from the running production site while this command is running.
Example 32-8 Fail over Global Copy from the remote site to the intermediate site
dscli> failoverpprc -remotedev IBM.2107-7503461 -type gcp e400-e403:6400-6403
CMUC00196I failoverpprc: Remote Mirror and Copy pair E400:6400 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E401:6401 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E402:6402 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E403:6403 successfully reversed.
dscli>
dscli> reverseflash -fast -tgtpprc e400-e403:d400-d403
CMUC00169I reverseflash: FlashCopy volume pair E400:D400 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair E401:D401 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair E402:D402 successfully reversed.
CMUC00169I reverseflash: FlashCopy volume pair E403:D403 successfully reversed.
dscli>
The FlashCopy must be re-created to start Global Mirror in Step 7: Restarting Global Mirror on page 495. You can re-create the FlashCopy by running the command that is shown in Example 32-10.
Example 32-10 Re-create the Global Mirror FlashCopy pairs
dscli> mkflash -tgtinhibit -record -persist -nocp e400-e403:d400-d403
CMUC00137I mkflash: FlashCopy pair E400:D400 successfully created.
CMUC00137I mkflash: FlashCopy pair E401:D401 successfully created.
CMUC00137I mkflash: FlashCopy pair E402:D402 successfully created.
CMUC00137I mkflash: FlashCopy pair E403:D403 successfully created.
dscli>
dscli> mkflash e400-e403:d410-d413
CMUC00137I mkflash: FlashCopy pair E400:D410 successfully created.
CMUC00137I mkflash: FlashCopy pair E401:D411 successfully created.
CMUC00137I mkflash: FlashCopy pair E402:D412 successfully created.
CMUC00137I mkflash: FlashCopy pair E403:D413 successfully created.
dscli>
While the background copy continues, the data can be accessed by the test servers at the remote site. However, to bring Global Mirror back into production as quickly as possible, finish the scenario before you start testing.
Step 6: Failing back Global Copy from the intermediate site to the remote site
Global Copy must be failed back to the original direction. To accomplish this task, run the failbackpprc command at the intermediate site. Make sure that the -cascade option is supplied because the Global Copy is still cascaded to the Metro Mirror.
Example 32-12 Failbackpprc from the intermediate site to the remote site
dscli> failbackpprc -remotedev IBM.2107-75TV181 -type gcp 6400-6403:e400-e403
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6400:E400 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6401:E401 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6402:E402 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6403:E403 successfully failed back.
dscli>
dscli> mkgmir -lss 64 -session 2
CMUC00162I mkgmir: Global Mirror for session 2 successfully started.
dscli>

Run showgmir to verify that the Global Mirror continues with consistency group formation.
Chapter 33. Metro/Global Mirror with Incremental Resync
33.1 Overview
In a Metro/Global Mirror environment, data is copied from the local site to the intermediate site and then cascaded to the remote site. Obviously, if there is a storage failure (or disaster) at the intermediate site or the connectivity between the local and intermediate sites fails, data cannot be replicated to the remote site. However, when there is additional physical connectivity between the local and remote sites, Global Mirror can be established from the local site to the remote site, as shown in Figure 33-1. You can use Incremental Resync to establish the Global Mirror relationship between the local and remote sites without replicating all the data again.
Figure 33-1 Global Mirror established from the local site to the remote site, bypassing the intermediate site
A prerequisite for Incremental Resync is that you must have paths from the local site to the remote site. These paths are required for the Global Mirror that is created when the failing intermediate site is bypassed. Typically in a Metro/Global Mirror environment, the distance between the intermediate site and the remote site is larger than the distance of the Metro Mirror between the local and the intermediate sites. Therefore, the distance between the local and the remote sites is similar to that of the existing Global Mirror. The topology of a Metro/Global Mirror with Incremental Resync is thus a full mesh of all three sites.
Figure 33-2 shows an example of a possible architecture for a 3-site solution with Incremental Resync.
Figure 33-2 Possible architecture for a 3-site solution with Incremental Resync
When this topology is set up, the link between the local and remote sites should have similar bandwidth and latency characteristics. The Global Mirror between the local and the remote sites then shows a Recovery Point Objective (RPO) behavior that is similar to the one for the existing Global Mirror from the intermediate to the remote site. With Metro/Global Mirror using incremental resynchronization, the following scenarios are possible:
- The intermediate site fails. Set up a Global Mirror between the local and the remote sites, where only the changes that occurred since the failure at the intermediate site are resynchronized. Production continues at the local site. When the intermediate site is back, it is synchronized from the remote site, and the original replication direction from the local site through the intermediate site to the remote site can be restored.
- The local site fails. Fail over production to the intermediate site while Global Mirror continues. When the local site is back, it is resynchronized from the remote site. When this task is done, the Metro Mirror is switched incrementally from the intermediate site to the local site. A normal Metro/Global Mirror operation can be maintained as it is, meaning production remains at the intermediate site and the replication to the remote site is cascaded through the former local site.
- Swap production between the local and intermediate sites. The copy direction can be swapped incrementally from local, intermediate, remote to intermediate, local, remote, and vice versa.
Figure 33-3 Incremental Resync change recording bitmaps (N and N+1) toggling at the local site
If there is a failure at the intermediate site, the Metro Mirror primary volumes go into suspend mode. The query to the Global Mirror does not return any results, which means that the change recording bitmaps are not toggled anymore. The tracks that could not be copied to the intermediate site are recorded in the out-of-sync bitmap of the Metro Mirror at the primary site. To bypass the failed intermediate site, you establish a new Global Copy relationship from the local volumes to the Global Copy target volumes at the remote site. You must establish the new Global Copy with the Incremental Resync option. Only the tracks that are recorded in the change recording bitmaps and in the out-of-sync bitmap are copied to the remote site. When all of the out-of-sync tracks are sent from the local site to the remote site, a Global Mirror can be started at the local site with the remote volumes as the Global Mirror target volumes.
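The bitmap mechanics described above can be illustrated with a small simulation. This is a conceptual sketch only; the class and method names are invented for illustration and are not part of the DS8000 microcode or the DS CLI. Two change recording bitmaps (N and N+1) toggle on each successful consistency group query, and after a failure the union of the recorded tracks and the Metro Mirror out-of-sync bitmap gives the set of tracks that an incremental resynchronization must copy.

```python
# Conceptual simulation of Incremental Resync bitmap handling.
# All names are illustrative; this is not the DS8000 implementation.

class IncrementalResyncState:
    def __init__(self):
        self.bitmaps = [set(), set()]   # change recording bitmaps N and N+1
        self.active = 0                 # index of the bitmap currently recording
        self.out_of_sync = set()        # Metro Mirror out-of-sync bitmap

    def record_write(self, track):
        """Every production write is noted in the active change recording bitmap."""
        self.bitmaps[self.active].add(track)

    def toggle(self):
        """On a successful Global Mirror consistency group query, the bitmaps
        toggle: the other bitmap is cleared and becomes the new recorder."""
        self.active = 1 - self.active
        self.bitmaps[self.active] = set()

    def suspend(self, unreplicated_tracks):
        """On an intermediate-site failure, tracks that could not be copied
        land in the Metro Mirror out-of-sync bitmap; toggling stops."""
        self.out_of_sync |= set(unreplicated_tracks)

    def tracks_to_resync(self):
        """Union of both change recording bitmaps and the out-of-sync bitmap:
        only these tracks must be sent to the remote site."""
        return self.bitmaps[0] | self.bitmaps[1] | self.out_of_sync

state = IncrementalResyncState()
for t in (1, 2, 3):
    state.record_write(t)
state.toggle()            # consistency group formed; bitmaps toggle
for t in (3, 4):
    state.record_write(t)
state.suspend([4, 5])     # intermediate site fails mid-transfer
print(sorted(state.tracks_to_resync()))   # → [1, 2, 3, 4, 5]
```

The point of the sketch is that only tracks touched since the last safe consistency point are copied, which is why Incremental Resync avoids a full copy.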
The -incrementalresync option accepts the following values: enable, enablenoinit, disable, recover, and override.
Incremental Resync: When you enable the -incrementalresync option, the N-bitmaps are initialized with ones, and at least two toggles must occur before data can be copied incrementally. Allow Incremental Resync to run for at least 10 - 15 minutes before you initiate an incremental resynchronization; otherwise, a full copy occurs instead of an incremental copy.
When the Metro Mirror is already established without Incremental Resync enabled, the -mode nocp option (which prevents a full copy) is required to enable Incremental Resync to an existing Metro Mirror. Example 33-1 shows how to set up incremental resynchronization for Metro/Global Mirror.
Example 33-1 Setup of Metro/Global Mirror with Incremental Resync
dscli> mkpprc -remotedev IBM.2107-1301261 -type mmir -mode nocp -incrementalresync enable 2000-2007:2000-2007
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 2000:2000 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 2001:2001 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 2002:2002 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 2003:2003 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 2004:2004 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 2005:2005 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 2006:2006 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 2007:2007 successfully created.
dscli>
dscli> lspprc -l 2000-2007
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG
==========================================================================================================================================
2000:2000 Full Duplex Metro Mirror 0 Disabled Disabled Invalid 20 300 Disabled Invalid Enabled Disabled Disabled Enabled
2001:2001 Full Duplex Metro Mirror 0 Disabled Disabled Invalid 20 300 Disabled Invalid Enabled Disabled Disabled Enabled
2002:2002 Full Duplex Metro Mirror 0 Disabled Disabled Invalid 20 300 Disabled Invalid Enabled Disabled Disabled Enabled
2003:2003 Full Duplex Metro Mirror 0 Disabled Disabled Invalid 20 300 Disabled Invalid Enabled Disabled Disabled Enabled
2004:2004 Full Duplex Metro Mirror 0 Disabled Disabled Invalid 20 300 Disabled Invalid Enabled Disabled Disabled Enabled
2005:2005 Full Duplex Metro Mirror 0 Disabled Disabled Invalid 20 300 Disabled Invalid Enabled Disabled Disabled Enabled
2006:2006 Full Duplex Metro Mirror 0 Disabled Disabled Invalid 20 300 Disabled Invalid Enabled Disabled Disabled Enabled
2007:2007 Full Duplex Metro Mirror 0 Disabled Disabled Invalid 20 300 Disabled Invalid Enabled Disabled Disabled Enabled
dscli>
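When multi-volume commands such as mkpprc are scripted, it helps to verify that every expected pair actually reported success. The sketch below is a hypothetical helper, not part of the DS CLI: it scans captured dscli output for CMUC00153I messages and returns the pairs that are missing.

```python
import re

# Hypothetical helper: confirm that every expected PPRC pair reported
# "successfully created" in captured dscli output.
def verify_mkpprc(output, expected_pairs):
    created = set(re.findall(
        r"CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship "
        r"(\w+:\w+) successfully created\.", output))
    # Return the pairs that did NOT report success (empty list means all good).
    return [p for p in expected_pairs if p not in created]

sample = (
    "CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship "
    "2000:2000 successfully created.\n"
    "CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship "
    "2001:2001 successfully created.\n"
)
print(verify_mkpprc(sample, ["2000:2000", "2001:2001", "2002:2002"]))
# → ['2002:2002']
```

A wrapper like this lets automation stop immediately when one pair in a range fails to establish, instead of discovering the gap later with lspprc.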
33.2.2 Migrating from Global Mirror to Metro/Global Mirror with Incremental Resync
You can initialize Metro/Global Mirror with Incremental Resync if production is already running in a Global Mirror environment. The transition from the Global Mirror environment to a Metro/Global Mirror with Incremental Resync environment introduces a new intermediate site. This new intermediate site is synchronized with the remote site.
Figure 33-4 illustrates the steps to migrate from a Global Mirror environment to a Metro/Global Mirror with Incremental Resync environment.
Figure 33-4 Moving from Global Mirror to Metro/Global Mirror with Incremental Resync
The general approach to this scenario is that the new intermediate site is initially synchronized with the remote site through a cascaded Global Copy relationship. When the initial copy phase of this relationship is complete, Incremental Resync is enabled at the local site. Then, the original Global Mirror is terminated, the new Global Copy is reversed, and the Metro Mirror is established by using the contents of the change recording bitmap of Incremental Resync.

Here are the steps for setting up and initializing Metro/Global Mirror with Incremental Resync if Global Mirror is already running. Assume that there is an existing Global Mirror relationship between the A and C volumes, and an intermediate site (B volumes) is added to the configuration.
1. Set up all PPRC paths.
2. Start Global Copy from the remote site to the new intermediate site.
3. Start incremental resynchronization from the local site.
4. Terminate Global Mirror and suspend Global Copy from the local site to the remote site.
5. Terminate Global Copy from the local site to the remote site at the remote site.
6. Reverse Global Copy to run from the intermediate site to the remote site.
7. Set up Metro Mirror from the local site to the intermediate site.
8. Set up Global Mirror at the intermediate site.
When the PPRC ports are identified, you can create PPRC paths from the local site to the intermediate site, and also from the intermediate site to the remote site. It is a preferred practice to create the PPRC paths in the opposite directions as well, that is, from the intermediate site to the local site and from the remote site to the intermediate site. The prerequisites for creating a successful PPRC path and the reasons for creating PPRC paths in both directions are explained in Step 1: Setting up all Metro Mirror and Global Mirror paths on page 450. To create the PPRC paths, run mkpprcpath on each of the sites.
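As a sketch of the path directions involved, the snippet below enumerates the source/target site pairs for which mkpprcpath would be run: the three forward directions of the full mesh (including the local-to-remote paths that Incremental Resync requires) plus the recommended reverse directions. The site names and the function name are placeholders, not DS CLI artifacts.

```python
# Sketch: enumerate the directed PPRC path pairs for a Metro/Global Mirror
# full-mesh topology of three sites. All identifiers are placeholders.
def pprc_path_directions(local, intermediate, remote):
    forward = [
        (local, intermediate),   # Metro Mirror leg
        (intermediate, remote),  # Global Mirror leg
        (local, remote),         # required for Incremental Resync bypass
    ]
    # Preferred practice: also create the paths in the opposite directions.
    reverse = [(dst, src) for (src, dst) in forward]
    return forward + reverse

paths = pprc_path_directions("LOCAL", "INTER", "REMOTE")
print(len(paths))   # → 6
```

Six directed paths result, one mkpprcpath invocation per direction, which matches the "full mesh of all three sites" topology described earlier.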
Step 2: Starting Global Copy from the remote site to the intermediate site
To migrate the data from the remote site to the intermediate site, establish a Global Copy relationship. This relationship transmits the data to the new intermediate site without impacting the current production environment and the current Global Mirror configuration. Run mkpprc with the -type gcp option at the remote site. This Global Copy relationship must be a cascading relationship, which is why the -cascade option is used. Example 33-2 shows the DS CLI command that is run to create the Global Copy. The lspprc command that is then run shows the cascading relationship from the local site to the remote site to the intermediate site. The First Pass column shows False; wait until all the volumes complete the first pass, which is indicated by the True entry.
Example 33-2 Start Global Copy from the remote site to the intermediate site
dscli> mkpprc -remotedev IBM.2107-7520781 -type gcp -mode full -cascade 6000-6003:6600-6603
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6000:6600 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6001:6601 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6002:6602 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6003:6603 successfully created.
dscli>
dscli> lspprc -fullid 6000-6003
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
====================================================================================================================================
IBM.2107-7503461/6200:IBM.2107-75DNXC1/6000 Target Copy Pending Global Copy IBM.2107-7503461/62 unknown Disabled Invalid
IBM.2107-7503461/6201:IBM.2107-75DNXC1/6001 Target Copy Pending Global Copy IBM.2107-7503461/62 unknown Disabled Invalid
IBM.2107-7503461/6202:IBM.2107-75DNXC1/6002 Target Copy Pending Global Copy IBM.2107-7503461/62 unknown Disabled Invalid
IBM.2107-7503461/6203:IBM.2107-75DNXC1/6003 Target Copy Pending Global Copy IBM.2107-7503461/62 unknown Disabled Invalid
IBM.2107-75DNXC1/6000:IBM.2107-7520781/6600 Copy Pending Global Copy IBM.2107-75DNXC1/60 unknown Disabled False
IBM.2107-75DNXC1/6001:IBM.2107-7520781/6601 Copy Pending Global Copy IBM.2107-75DNXC1/60 unknown Disabled False
IBM.2107-75DNXC1/6002:IBM.2107-7520781/6602 Copy Pending Global Copy IBM.2107-75DNXC1/60 unknown Disabled False
IBM.2107-75DNXC1/6003:IBM.2107-7520781/6603 Copy Pending Global Copy IBM.2107-75DNXC1/60 unknown Disabled False
dscli>
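Waiting for the first pass to complete can be automated by parsing captured lspprc output. The following sketch assumes the column layout shown in the examples of this chapter (First Pass Status as the last column); the function name is invented for illustration.

```python
# Sketch: check whether every pair in captured `lspprc` output has
# completed its first pass. Data rows are recognized by the pair ID
# (a token containing ":") in the first column; the First Pass Status
# is assumed to be the last column, as in the examples above.
def first_pass_complete(lspprc_output):
    statuses = []
    for line in lspprc_output.splitlines():
        fields = line.split()
        if fields and ":" in fields[0]:
            statuses.append(fields[-1])
    return bool(statuses) and all(s == "True" for s in statuses)

sample = """\
ID        State        Type        First Pass Status
6000:6600 Copy Pending Global Copy False
6001:6601 Copy Pending Global Copy True
"""
print(first_pass_complete(sample))   # → False (6000:6600 has not finished)
```

A script would call this in a polling loop with a sleep between lspprc invocations, proceeding to the next step only when the function returns True.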
Step 3: Starting incremental resynchronization at the local site

Run mkpprc with the -incrementalresync enablenoinit option at the local site against the current Global Mirror relationship from the local site to the remote site (Example 33-3).
Example 33-3 Start incremental resynchronization from the local site
dscli> mkpprc -remotedev IBM.2107-75DNXC1 -type gcp -mode nocp -incrementalresync enablenoinit 6200-6203:6000-6003
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6200:6000 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6201:6001 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6202:6002 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6203:6003 successfully created.
dscli>
dscli> lspprc -l 6200-6203
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG
==========================================================================================================================================
6200:6000 Copy Pending Global Copy 0 Disabled Enabled Invalid 62 120 Disabled True Enabled Disabled Disabled Disabled
6201:6001 Copy Pending Global Copy 0 Disabled Enabled Invalid 62 120 Disabled True Enabled Disabled Disabled Disabled
6202:6002 Copy Pending Global Copy 0 Disabled Enabled Invalid 62 120 Disabled True Enabled Disabled Disabled Disabled
6203:6003 Copy Pending Global Copy 0 Disabled Enabled Invalid 62 120 Disabled True Enabled Disabled Disabled Disabled
dscli>
Step 4: Terminating Global Mirror and suspending Global Copy at the local site
Terminate the Global Mirror at the local site, and suspend the Global Copy from the local site to the remote site by running pausepprc. By suspending Global Copy from the local site to the remote site, data at the intermediate site is no longer in sync with data at the local site. Therefore, the data at the intermediate site begins to age as production continues running at the local site. However, the change recording bitmaps and the out-of-sync bitmaps at the local site track all updates from production. In Example 33-4, the Global Mirror and the sessions are removed at the local site. Finally, the Global Copy from the local site to the remote site is suspended.
Example 33-4 Terminate Global Mirror and suspend Global Copy at the local site
dscli> rmgmir -quiet -lss 62 -session 2
CMUC00165I rmgmir: Global Mirror for session 2 successfully stopped.
dscli> chsession -dev IBM.2107-7503461 -lss 62 -action remove -volume 6200-6203
CMUC00147I chsession: Session 10 successfully modified.
dscli> rmsession -dev IBM.2107-7503461 -quiet -lss 62 2
CMUC00146I rmsession: Session 2 closed successfully.
dscli>
dscli> pausepprc -remotedev IBM.2107-75DNXC1 6200-6203:6000-6003
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6200:6000 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6201:6001 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6202:6002 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6203:6003 relationship successfully paused.
dscli>
# At the local site
dscli> lspprc -fullid 6200-6203
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================================================
IBM.2107-7503461/6200:IBM.2107-75DNXC1/6000 Suspended Host Source Global Copy IBM.2107-7503461/62 120 Disabled True
IBM.2107-7503461/6201:IBM.2107-75DNXC1/6001 Suspended Host Source Global Copy IBM.2107-7503461/62 120 Disabled True
IBM.2107-7503461/6202:IBM.2107-75DNXC1/6002 Suspended Host Source Global Copy IBM.2107-7503461/62 120 Disabled True
IBM.2107-7503461/6203:IBM.2107-75DNXC1/6003 Suspended Host Source Global Copy IBM.2107-7503461/62 120 Disabled True
dscli>
# At the remote site
dscli> lspprc -fullid 6000-6003
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==========================================================================================================================================
IBM.2107-75DNXC1/6000:IBM.2107-7520781/6600 Copy Pending Global Copy IBM.2107-75DNXC1/60 120 Disabled True
IBM.2107-75DNXC1/6001:IBM.2107-7520781/6601 Copy Pending Global Copy IBM.2107-75DNXC1/60 120 Disabled True
IBM.2107-75DNXC1/6002:IBM.2107-7520781/6602 Copy Pending Global Copy IBM.2107-75DNXC1/60 120 Disabled True
IBM.2107-75DNXC1/6003:IBM.2107-7520781/6603 Copy Pending Global Copy IBM.2107-75DNXC1/60 120 Disabled True
dscli>
Step 6: Reversing Global Copy to run from the intermediate site to the remote site
The Global Copy relationship that is initially established from the remote site to the intermediate site can be reversed. To reverse the Global Copy relationship, you must first perform a failover and failback. To fail over to the intermediate site, run failoverpprc with the -type gcp option at the intermediate site. Because the Global Copy that is running from the remote site to the intermediate site is cascaded, specify the -cascade option as well.
Example 33-6 Fail over Global Copy at the intermediate site
dscli> failoverpprc -remotedev IBM.2107-7520781 -type gcp -cascade 6000-6003:6600-6603
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6000:6600 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6001:6601 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6002:6602 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6003:6603 successfully reversed.
dscli>
Fail back Global Copy at the intermediate site by running failbackpprc with the -type gcp option at the intermediate site (see Example 33-7). Once again, because Global Copy is in a cascaded relationship, specify the -cascade option.
Example 33-7 Fail back Global Copy at the intermediate site
dscli> failbackpprc -remotedev IBM.2107-1300561 -type gcp -cascade 6000-6003:6600-6603
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6000:6600 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6001:6601 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6002:6602 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6003:6603 successfully failed back.
dscli>
Step 7: Setting up Metro Mirror from the local site to the intermediate site
In this step, the Metro Mirror from the local site to the intermediate site is established. Run mkpprc with the -type mmir option at the local site. In Step 3: Starting incremental resynchronization at the local site on page 504, change recording bitmaps were created to track all the updates from the production environment. To recover and restore these updates, specify the -incrementalresync recover option with the mkpprc command. The recover parameter establishes the Metro Mirror relationship after checking for a relationship at the intermediate site (the Metro Mirror secondary). The change recording bitmaps that were created in Step 3: Starting incremental resynchronization at the local site on page 504 are then merged with the out-of-sync bitmaps at the local site.

Disaster preparation: To prepare for a disaster, start Metro Mirror with Incremental Resync enabled. Run mkpprc with the -type mmir and -incrementalresync enable options at the local site. This command creates change recording bitmaps for all the Metro Mirror primary volumes.

Statuses: Metro Mirror must be in Full Duplex mode, and Global Copy must complete a first pass. To query the status of Metro Mirror and Global Copy, run lspprc -l at the local site to query Metro Mirror, and at the intermediate site to query Global Copy.
In Example 33-8, we create the Metro Mirror with the -incrementalresync recover option, which initiates a copy of all tracks that are marked in the change recording bitmap at the local site. When all the tracks are drained to the intermediate site, the incremental resynchronization restarts.
Example 33-8 Set up Metro Mirror from the local site to the intermediate site
dscli> mkpprc -remotedev IBM.2107-7520781 -type mmir -mode full -incrementalresync recover 6200-6203:6600-6603
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6200:6600 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6201:6601 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6202:6602 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6203:6603 successfully created.
dscli>
dscli> lspprc -l -fullid 6200-6203
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG
==========================================================================================================================================
IBM.2107-7503461/6200:IBM.2107-7520781/6600 Full Duplex Metro Mirror 0 Disabled Enabled Invalid IBM.2107-7503461/62 120 Disabled Invalid Disabled Disabled Disabled Disabled
IBM.2107-7503461/6201:IBM.2107-7520781/6601 Full Duplex Metro Mirror 0 Disabled Enabled Invalid IBM.2107-7503461/62 120 Disabled Invalid Disabled Disabled Disabled Disabled
IBM.2107-7503461/6202:IBM.2107-7520781/6602 Full Duplex Metro Mirror 0 Disabled Enabled Invalid IBM.2107-7503461/62 120 Disabled Invalid Disabled Disabled Disabled Disabled
IBM.2107-7503461/6203:IBM.2107-7520781/6603 Full Duplex Metro Mirror 0 Disabled Enabled Invalid IBM.2107-7503461/62 120 Disabled Invalid Disabled Disabled Disabled Disabled
dscli>
dscli> mkpprc -remotedev IBM.2107-7520781 -type mmir -mode nocp -incrementalresync enable 6200-6203:6600-6603
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6200:6600 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6201:6601 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6202:6602 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6203:6603 successfully created.
dscli>
dscli> lspprc -l 6200-6203
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG
==========================================================================================================================================
6200:6600 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 62 120 Disabled Invalid Enabled Disabled Disabled Disabled
6201:6601 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 62 120 Disabled Invalid Enabled Disabled Disabled Disabled
6202:6602 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 62 120 Disabled Invalid Enabled Disabled Disabled Disabled
6203:6603 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 62 120 Disabled Invalid Enabled Disabled Disabled Disabled
dscli>
dscli> showgmir -metrics 66
ID
Total Failed CG Count
Total Successful CG Count
Successful CG Percentage
Failed CG after Last Success
Last Successful CG Form Time
Coord. Time (seconds)
Interval Time (seconds)
Max Drain Time (seconds)
First Failure Control Unit
First Failure LSS
First Failure Status
First Failure Reason
First Failure Master State
Last Failure Control Unit
Last Failure LSS
Last Failure Status
Last Failure Reason
Last Failure Master State
Previous Failure Control Unit
Previous Failure LSS
Previous Failure Status
Previous Failure Reason
Previous Failure Master State
dscli>
Figure 33-5 Transitioning to a 2-site environment after the local site fails (local site: 75-ABTV1, intermediate site: 75-03461, remote site: 75-TV181)
You must complete the following steps to transition to a 2-site environment after the local site fails:
1. Fail over to the remote site.
2. Terminate Metro Mirror at the intermediate site.
3. Start applications at the intermediate site.
Example 33-10 shows the running of the failoverpprc command with the -force and -cascade options. Running lspprc shows that the Global Copy is now cascaded to the Global Copy from the intermediate site to the remote site. All commands are run at the remote site.
Example 33-10 Forced failoverpprc command from the local site to the remote site
dscli> failoverpprc -remotedev IBM.2107-75ABTV1 -type gcp -cascade -force e400-e403:4200-4203
CMUC00196I failoverpprc: Remote Mirror and Copy pair E400:4200 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E401:4201 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E402:4202 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E403:4203 successfully reversed.
dscli>
dscli> lspprc e400-e403
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==============================================================================================================
6400:E400 Target Copy Pending Global Copy 64 unknown Disabled Invalid
6401:E401 Target Copy Pending Global Copy 64 unknown Disabled Invalid
6402:E402 Target Copy Pending Global Copy 64 unknown Disabled Invalid
6403:E403 Target Copy Pending Global Copy 64 unknown Disabled Invalid
E400:4200 Suspended Host Source Global Copy E4 unknown Disabled True
E401:4201 Suspended Host Source Global Copy E4 unknown Disabled True
E402:4202 Suspended Host Source Global Copy E4 unknown Disabled True
E403:4203 Suspended Host Source Global Copy E4 unknown Disabled True
dscli>
Figure 33-6 Reintroducing the local site to return to a Metro/Global Mirror with Incremental Resync environment (local site: 75-ABTV1, intermediate site: 75-03461, remote site: 75-TV181)
Here are the steps that reintroduce the local site and move it back to a Metro/Global Mirror with Incremental Resync environment:
1. Fail back Global Copy from the remote site to the local site.
2. Start incremental resynchronization at the intermediate site.
3. Terminate Global Mirror and suspend Global Copy from the intermediate site to the remote site.
4. Suspend Global Copy from the remote site to the local site.
5. Terminate Global Copy at the remote site.
6. Reverse Global Copy to run from the local site to the remote site.
7. Start Metro Mirror from the intermediate site to the local site.
8. Start Global Mirror at the local site.
Step 1: Failing back Global Copy from the remote site to the local site
When the local site is available again and all relationships are cleaned up, you can start resynchronizing the local site by starting a Global Copy from the remote site to the local site. You can perform this action because you ran the forced failover from the remote site to the local site in Step 1: Failing over to the remote site on page 510. The resynchronization of the local site is provided by a failbackpprc command with the -cascade option because the Global Mirror from the intermediate site to the remote site is still active.

Draining the out-of-sync tracks: The remote site to local site out-of-sync tracks must be drained before you continue with Step 2: Starting incremental resynchronization at the intermediate site. This action ensures that the local site is fully resynchronized.

Tip: When the local site is back, in addition to cleaning up the remaining relationships, you might need to clear SCSI reservations that remain from host access before the failure occurred.

Example 33-12 shows the command execution.
Example 33-12 Fail back Global Copy from the remote site to the local site
dscli> failbackpprc -remotedev IBM.2107-75ABTV1 -type gcp -cascade e400-e403:4200-4203
CMUC00197I failbackpprc: Remote Mirror and Copy pair E400:4200 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E401:4201 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E402:4202 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E403:4203 successfully failed back.
dscli>
dscli> lspprc e400-e403
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
=========================================================================================================
6400:E400 Target Copy Pending Global Copy 64 unknown Disabled Invalid
6401:E401 Target Copy Pending Global Copy 64 unknown Disabled Invalid
6402:E402 Target Copy Pending Global Copy 64 unknown Disabled Invalid
6403:E403 Target Copy Pending Global Copy 64 unknown Disabled Invalid
E400:4200 Copy Pending Global Copy E4 unknown Disabled True
E401:4201 Copy Pending Global Copy E4 unknown Disabled True
E402:4202 Copy Pending Global Copy E4 unknown Disabled True
E403:4203 Copy Pending Global Copy E4 unknown Disabled True
dscli>
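Draining can be monitored by polling the Out Of Sync Tracks column of lspprc -l output until it reaches zero for every pair. The sketch below is a hypothetical helper; it assumes the column layout of the lspprc -l examples in this chapter, where the track count follows the two-word State and Type columns.

```python
# Sketch: sum the "Out Of Sync Tracks" column across all pairs in captured
# `lspprc -l` output; a total of 0 means the tracks have drained.
def out_of_sync_total(lspprc_l_output):
    total = 0
    for line in lspprc_l_output.splitlines():
        fields = line.split()
        if fields and ":" in fields[0]:
            # Assumed layout, e.g. "E400:4200 Copy Pending Global Copy 1375 ..."
            # where the track count is the sixth whitespace-separated token.
            total += int(fields[5])
    return total

sample = """\
E400:4200 Copy Pending Global Copy 1375 Disabled Enabled
E401:4201 Copy Pending Global Copy 0 Disabled Enabled
"""
print(out_of_sync_total(sample))   # → 1375
```

A polling loop around this helper gives an objective "fully drained" signal before you proceed to the incremental resynchronization step.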
To enable incremental resynchronization, run mkpprc with the -mode nocp and -incrementalresync enablenoinit options. Example 33-13 shows the command to enable incremental resynchronization at the intermediate site. To verify this action, run lspprc -l.
Example 33-13 Start incremental resynchronization at the intermediate site
dscli> mkpprc -remotedev IBM.2107-75TV181 -mode nocp -type gcp -incrementalresync enablenoinit 6400-6403:e400-e403
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6400:E400 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6401:E401 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6402:E402 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6403:E403 successfully created.
dscli>
dscli> lspprc -l 6400-6403 5700-5703
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
6400:E400 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 120 Disabled True Enabled Disabled N/A Disabled Unknown False
6401:E401 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 120 Disabled True Enabled Disabled N/A Disabled Unknown False
6402:E402 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 120 Disabled True Enabled Disabled N/A Disabled Unknown False
6403:E403 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 120 Disabled True Enabled Disabled N/A Disabled Unknown False
dscli>
Step 4: Suspending Global Copy from the remote site to the local site
In preparation for reversing Global Copy at the remote site, suspend the Global Copy from the remote site to the local site that was created in Step 1: Failing back Global Copy from the remote site to the local site on page 513. Do this task only after the remote-to-local out-of-sync bitmaps are drained of the last of the updates at the remote site. Draining the out-of-sync tracks: Remote site to local site out-of-sync tracks must be drained before you continue with Step 5: Terminating Global Copy from the intermediate site to the remote site at the remote site. This action ensures that the local site is fully resynchronized. Example 33-15 shows the command that is used to suspend the Global Copy from the remote site to the local site.
Example 33-15   Suspend Global Copy from the remote site to the local site
dscli> pausepprc -remotedev IBM.2107-75ABTV1 e400-e403:4200-4203
CMUC00157I pausepprc: Remote Mirror and Copy volume pair E400:4200 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair E401:4201 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair E402:4202 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair E403:4203 relationship successfully paused.
dscli>
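Waiting for the out-of-sync tracks to drain is easy to automate. The sketch below assumes the long (lspprc -l) output format shown in this chapter and simply sums the Out Of Sync Tracks column; the helper name and the idea of polling it in a loop are illustrative, not part of the DS CLI (in a real script, lspprc with -fmt delim gives more stable output to parse):

```python
# Sketch: sum the Out Of Sync Tracks column of `lspprc -l` output.
# Assumes the long format, where the first all-digit token in each
# data row is the out-of-sync track count.
def out_of_sync(lspprc_lines):
    total = 0
    for line in lspprc_lines:
        tokens = line.split()
        if not tokens or ":" not in tokens[0]:
            continue                      # skip headers and separator lines
        for tok in tokens:
            if tok.isdigit():
                total += int(tok)         # first numeric field is OOS tracks
                break
    return total

rows = [
    "E401:6401 Copy Pending Global Copy 9076 Disabled Enabled Invalid E4",
    "E402:6402 Copy Pending Global Copy 5331 Disabled Enabled Invalid E4",
]
drained = out_of_sync(rows) == 0
```

A wrapper would re-run lspprc -l until out_of_sync() returns 0 before continuing to the next step.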
Step 5: Terminating Global Copy from the intermediate site to the remote site at the remote site
Although Global Copy is suspended from the intermediate to the remote site, the remote site volumes still have the status of Global Copy target devices. For the remote site devices to lose knowledge of being a target of the intermediate site, terminate the Global Copy relationship from the intermediate site to the remote site at the remote site. This termination does not change the intermediate site's state, so the out-of-sync bitmaps can remain in operation at the intermediate site. The state of the intermediate site remains primary suspended, while the remote site no longer shows as a suspended target and is terminated. This step is necessary so that the failback of the intermediate site to the remote site can occur when you reverse Global Copy in Step 6: Reversing Global Copy to run from the local site to the remote site on page 515. Example 33-16 shows how to terminate the Global Copy at the remote site.
Example 33-16   Terminate Global Copy from the intermediate site to the remote site at the remote site
dscli> rmpprc -quiet -unconditional -at tgt e400-e403
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :E400 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :E401 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :E402 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :E403 relationship successfully withdrawn.
dscli>
Step 6: Reversing Global Copy to run from the local site to the remote site
When the Global Copy out-of-sync tracks are transferred and drained to the local site, you can reverse the Global Copy relationship from the remote site to the local site. To do so, perform a failover and then a failback, which creates the Global Copy in the reverse direction.
The local-to-remote failover is done at the local site by using the cascading Global Copy mode options. The failover changes the local site volumes' status to primary suspended. Then, run the failback of the Global Copy from the local site to the remote site at the local site, with cascading allowed and in Global Copy mode. A successful failover and failback reverses the Global Copy, which now runs from the local site to the remote site. Both the failoverpprc command and the failbackpprc command must include the -cascade option. Example 33-17 shows how to reverse the direction of the Global Copy between the local site and the remote site.
Example 33-17   Reverse the direction of the Global Copy between the local site and the remote site
dscli> failoverpprc -remotedev IBM.2107-75TV181 -type gcp -cascade 4200-4203:e400-e403
CMUC00196I failoverpprc: Remote Mirror and Copy pair 4200:E400 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 4201:E401 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 4202:E402 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 4203:E403 successfully reversed.
dscli>
dscli> failbackpprc -remotedev IBM.2107-75TV181 -type gcp -cascade 4200-4203:e400-e403
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4200:E400 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4201:E401 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4202:E402 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 4203:E403 successfully failed back.
dscli>
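The failover/failback pattern used throughout these steps always has the same two-phase effect on pair state. A minimal Python model (illustrative only, reusing the state names that lspprc reports) makes the reversal explicit:

```python
# Conceptual model of the failoverpprc/failbackpprc sequence.
# This is a sketch of the state transitions, not DS CLI code.
def failover(pair):
    # failoverpprc turns the (former) target into a suspended primary.
    pair["state"] = "Suspended"
    pair["role"] = "primary"
    return pair

def failback(pair):
    # failbackpprc resumes copying in the new direction.
    pair["state"] = "Copy Pending"
    return pair

pair = {"id": "4200:E400", "state": "Target Copy Pending", "role": "target"}
failback(failover(pair))
```

The order matters: failback without the preceding failover has no suspended primary to resume from, which is why both commands appear in every reversal example in this chapter.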
Step 7: Starting Metro Mirror from the intermediate site to the local site
Metro Mirror is now established in the direction from the intermediate site to the local site with the Incremental Resync and -override options and with Incremental Resync initialization. The command to establish Metro Mirror is run at the intermediate site. The Metro Mirror is incrementally resynchronized by using the change recording bitmaps that were created in Step 2: Starting incremental resynchronization at the intermediate site on page 513. When the Metro Mirror is in Full Duplex mode and all the tracks of the Global Copy are drained, Incremental Resync is enabled again, which creates change recording bitmaps for Metro Mirror at the intermediate site that can be used in case of a failure. Important: Metro Mirror must be in Full Duplex mode and Global Copy must complete a first pass before you continue with Step 8: Starting Global Mirror at the local site on page 517. In Example 33-18, you establish the Metro Mirror from the intermediate site to the local site with the -incrementalresync override option, which starts copying all the tracks that are recorded in the change recording bitmap at the intermediate site. This process can be monitored by running lspprc -l. When all the tracks are drained, the incremental resynchronization is enabled with the -incrementalresync enable option.
Example 33-18 Start Metro Mirror from the intermediate site to the local site with incremental resynchronization
dscli> mkpprc -remotedev IBM.2107-75ABTV1 -type mmir -mode nocp -incrementalresync override 6400-6403:4200-4203
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6400:4200 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6401:4201 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6402:4202 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6403:4203 successfully created.
dscli>
dscli> lspprc -l 4200-4203 5300-5303
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS
Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:E400 Copy Pending Global Copy 33 Disabled Enabled Invalid 42 60 Disabled True Disabled Disabled N/A Disabled Unknown False
4201:E401 Copy Pending Global Copy 38 Disabled Enabled Invalid 42 60 Disabled True Disabled Disabled N/A Disabled Unknown False
4202:E402 Copy Pending Global Copy 24 Disabled Enabled Invalid 42 60 Disabled True Disabled Disabled N/A Disabled Unknown False
4203:E403 Copy Pending Global Copy 102 Disabled Enabled Invalid 42 60 Disabled True Disabled Disabled N/A Disabled Unknown False
6400:4200 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 64 unknown Disabled Invalid Enabled Disabled N/A N/A Unknown
6401:4201 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 64 unknown Disabled Invalid Enabled Disabled N/A N/A Unknown
6402:4202 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 64 unknown Disabled Invalid Enabled Disabled N/A N/A Unknown
6403:4203 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 64 unknown Disabled Invalid Enabled Disabled N/A N/A Unknown
dscli>
#
# Wait until Full Duplex of Metro Mirror and Global Copy first pass has completed!
#
dscli> mkpprc -dev IBM.2107-7503461 -remotedev IBM.2107-75ABTV1 -type mmir -mode nocp -incrementalresync enable 6400-6403:4200-4203 5700-5703:5300-5303
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6400:4200 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6401:4201 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6402:4202 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 6403:4203 successfully created.
dscli>
dscli> lspprc -l 6400-6403
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS
Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
6400:4200 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 64 120 Disabled Invalid Enabled Disabled N/A Enabled Unknown
6401:4201 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 64 120 Disabled Invalid Enabled Disabled N/A Enabled Unknown
6402:4202 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 64 120 Disabled Invalid Enabled Disabled N/A Enabled Unknown
6403:4203 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 64 120 Disabled Invalid Enabled Disabled N/A Enabled Unknown
dscli>
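The Important note above defines a gate: Metro Mirror pairs must be Full Duplex and Global Copy pairs must have completed their first pass before the next step. A sketch of that check over parsed pair records (the dictionary field names are illustrative; the values mirror the lspprc columns):

```python
# Sketch: gate the next step on Metro Mirror reaching Full Duplex and
# Global Copy completing its first pass, per the Important note.
def ready_for_next_step(pairs):
    for p in pairs:
        if p["type"] == "Metro Mirror" and p["state"] != "Full Duplex":
            return False                 # Metro Mirror still synchronizing
        if p["type"] == "Global Copy" and p["first_pass"] != "True":
            return False                 # Global Copy first pass incomplete
    return True

pairs = [
    {"type": "Metro Mirror", "state": "Full Duplex", "first_pass": "Invalid"},
    {"type": "Global Copy", "state": "Copy Pending", "first_pass": "True"},
]
```

Note that a Global Copy pair normally stays Copy Pending; only its First Pass Status matters here, which is why the two pair types are tested on different columns.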
Figure 33-7 illustrates the steps to recover from a failure at the intermediate site while production is able to continue at the local site.
Here are the steps for recovery after a failure at the intermediate site:
1. Suspend Metro Mirror at the local site.
2. Clean up the remaining components of Global Mirror (if possible).
3. Fail over Global Copy at the remote site.
4. Verify the Global Mirror consistency group.
5. Start Global Copy from the local site to the remote site.
6. Create a session and start Global Mirror at the local site.
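As an ordering aid, the numbered steps above can be captured as data and walked in sequence. The commands are abridged from the examples in this section; steps 2 - 4 involve site-specific checks and appear only as placeholders (this is a hedged sketch, not a complete runbook):

```python
# The recovery steps above as an ordered checklist. Commands are
# abridged from this section's examples; placeholder entries mark
# steps whose exact commands depend on the site configuration.
recovery_steps = [
    (1, "pausepprc -remotedev IBM.2107-7503461 -unconditional -at src 4200-4203"),
    (2, "clean up remaining Global Mirror components (site-specific)"),
    (3, "fail over Global Copy at the remote site"),
    (4, "verify the Global Mirror consistency group (lsflash -revertible)"),
    (5, "mkpprc -type gcp -mode full -incrementalresync recover 4200-4203:e400-e403"),
    (6, "mksession / chsession -action add / mkgmir at the local site"),
]

def next_step(done):
    """Return the first step number not yet completed, or None when finished."""
    for num, _command in recovery_steps:
        if num not in done:
            return num
    return None
```

A driver script would execute the command for next_step(), verify its result (for example, that out-of-sync tracks drained), and only then mark the step done.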
dscli> pausepprc -remotedev IBM.2107-7503461 -unconditional -at src 4200-4203
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4200: relationship successfully paused.
IBM System Storage DS8000 Copy Services for Open Systems
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4201: relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4202: relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4203: relationship successfully paused.
dscli>
To verify the Global Mirror consistency, see 30.4, Checking consistency at the remote site on page 463, which describes how to verify the consistency groups and determine whether any action must be taken. Tip: The output of the lsflash -revertible command shows a query of the FlashCopy pairs with the revertible bit enabled. This output is helpful when you verify the Global Mirror consistency group.
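When checking consistency as described above, any pair that still appears in the lsflash -revertible output needs commit or revert handling. A small parser sketch (the row layout is assumed from typical lsflash output, and the pair ID shown is hypothetical):

```python
# Sketch: collect the FlashCopy pair IDs reported by `lsflash -revertible`.
# Assumes data rows start with a src:tgt volume ID, as in lsflash output.
def revertible_pairs(lsflash_lines):
    pairs = []
    for line in lsflash_lines:
        tokens = line.split()
        if tokens and ":" in tokens[0]:   # skip the header row
            pairs.append(tokens[0])
    return pairs

out = [
    "ID SrcLSS SequenceNum ActiveCopy Recording",  # header
    "E400:E500 E4 4411120 Disabled Enabled",       # hypothetical revertible pair
]
```

An empty result means no pair has the revertible bit on, so no commit/revert action is pending for the consistency group.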
Step 5: Starting Global Copy from the local site to the remote site
To establish Incremental Resync with the copy option from the local site to the remote site, run mkpprc with the -incrementalresync recover option. The recover parameter checks whether there was a former relationship at the remote site. In Step 2: Cleaning up the remaining components of Global Mirror (if possible) on page 519, the local site was failed over and is no longer in a relationship. When Global Copy with Incremental Resync is established, the Incremental Resync function that was previously running at the local site is stopped. If there was a former Incremental Resync relationship at the remote site, use the override parameter with the -incrementalresync option when you establish the Global Copy. Important: Now, all writes are transferred from the local site to the remote site, and all the out-of-sync tracks must be drained before you continue with Step 6: Creating a session and starting Global Mirror at the local site. To query out-of-sync tracks, run lspprc -l at the local site. In Example 33-21, the lspprc command that is run at the local site shows the Metro Mirror relationship from the local site to the intermediate site. When you run mkpprc with the -incrementalresync recover option, the Global Copy relationship from the local site to the remote site is started.
Example 33-21 Start Global Copy from the local site to the remote site with incremental resynchronization
dscli> mkpprc -remotedev IBM.2107-75TV181 -type gcp -mode full -incrementalresync recover 4200-4203:e400-e403
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4200:E400 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4201:E401 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4202:E402 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4203:E403 successfully created.
dscli>
dscli> lspprc -l 4200-4203
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS
Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:E400 Copy Pending Global Copy 5 Disabled Enabled Invalid 42 60 Disabled True Disabled Disabled N/A Disabled Unknown False
4201:E401 Copy Pending Global Copy 8 Disabled Enabled Invalid 42 60 Disabled True Disabled Disabled N/A Disabled Unknown False
4202:E402 Copy Pending Global Copy 18 Disabled Enabled Invalid 42 60 Disabled True Disabled Disabled N/A Disabled Unknown False
4203:E403 Copy Pending Global Copy 2 Disabled Enabled Invalid 42 60 Disabled True Disabled Disabled N/A Disabled Unknown False
dscli>
Step 6: Creating a session and starting Global Mirror at the local site
For the Global Copy relationship from the local site to the remote site, create a Global Mirror session and add volumes to the session. Run mksession to create the session and run chsession with the -action add option at the local site to add the volumes to the session.
You can now start the Global Mirror session by running mkgmir at the local site. This configuration remains unchanged until the intermediate site is available again for Metro/Global Mirror. Production continues to run at the local site without interruption while the original Metro/Global Mirror configuration of local site to intermediate site to remote site moves to local site to remote site. Example 33-22 shows the steps to create the session, start Global Mirror, and check the result.
Example 33-22 Create a session and start Global Mirror at the local site
dscli> mksession -lss 42 2
CMUC00145I mksession: Session 2 opened successfully.
dscli>
dscli> chsession -lss 42 -action add -volume 4200-4203 2
CMUC00147I chsession: Session 2 successfully modified.
dscli>
dscli> lssession -dev IBM.2107-75ABTV1 42 2
LSS ID Session Status Volume VolumeStatus PrimaryStatus SecondaryStatus FirstPassComplete AllowCascading
=================================================================================================================
42 02 Normal 4200 Active Primary Copy Pending Secondary Simplex True Enable
42 02 Normal 4201 Active Primary Copy Pending Secondary Simplex True Enable
42 02 Normal 4202 Active Primary Copy Pending Secondary Simplex True Enable
42 02 Normal 4203 Active Primary Copy Pending Secondary Simplex True Enable
dscli>
dscli> mkgmir -dev IBM.2107-75ABTV1 -lss 42 -session 2
CMUC00162I mkgmir: Global Mirror for session 2 successfully started.
dscli>
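Before relying on the new session, verify that every volume in the lssession output is Active with FirstPassComplete True. A sketch of that check over parsed rows (the field names mirror the columns above; in a real script, running lssession with -fmt delim gives output that is easier to parse):

```python
# Sketch: verify session health from parsed lssession rows.
# Every volume must be Active and have completed its first pass.
def session_ready(rows):
    for r in rows:
        if r["VolumeStatus"] != "Active" or r["FirstPassComplete"] != "True":
            return False
    return True

rows = [
    {"Volume": "4200", "VolumeStatus": "Active", "FirstPassComplete": "True"},
    {"Volume": "4201", "VolumeStatus": "Active", "FirstPassComplete": "True"},
]
```

If the check fails, wait and re-query rather than starting Global Mirror: consistency group formation depends on the first pass being complete.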
Figure 33-8 shows the steps to clean up the intermediate site after it is fully recovered.
Here are the steps in more detail:
1. Clean up the remaining components at the intermediate site.
2. Fail back Global Copy from the remote site to the intermediate site.
3. Start Incremental Resync at the local site.
4. Stop Global Mirror and suspend Global Copy from the local site to the remote site.
5. Remove Global Copy from the local site to the remote site at the remote site.
6. Reverse Global Copy to run from the intermediate site to the remote site.
7. Create Metro Mirror to run from the local site to the intermediate site.
8. Start Global Mirror.
Step 1: Cleaning up the remaining components at the intermediate site
Run rmpprc with the -unconditional -at tgt options at the intermediate site, as shown in Example 33-23. Tip: To remove the Metro Mirror pairs, the communication between the local and intermediate sites must work. Check the PPRC paths in both directions when the intermediate site becomes available again.
Example 33-23 Remove the Metro Mirror target relationship at the intermediate site
dscli> lspprc 6400-6403
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==============================================================================================================
4200:6400 Target Suspended Update Target Metro Mirror 42 unknown Disabled Invalid
4201:6401 Target Suspended Update Target Metro Mirror 42 unknown Disabled Invalid
4202:6402 Target Suspended Update Target Metro Mirror 42 unknown Disabled Invalid
4203:6403 Target Suspended Update Target Metro Mirror 42 unknown Disabled Invalid
6400:E400 Suspended Host Source Global Copy 64 unknown Disabled True
6401:E401 Suspended Host Source Global Copy 64 unknown Disabled True
6402:E402 Suspended Host Source Global Copy 64 unknown Disabled True
6403:E403 Suspended Host Source Global Copy 64 unknown Disabled True
dscli>
dscli> rmpprc -quiet -unconditional -at tgt 6400-6403 5700-5703
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :6400 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :6401 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :6402 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :6403 relationship successfully withdrawn.
dscli>
dscli> lspprc 6400-6403
ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status
==================================================================================================
6400:E400 Suspended Host Source Global Copy 64 unknown Disabled True
6401:E401 Suspended Host Source Global Copy 64 unknown Disabled True
6402:E402 Suspended Host Source Global Copy 64 unknown Disabled True
6403:E403 Suspended Host Source Global Copy 64 unknown Disabled True
dscli>
Suspending Global Copy relationships from the intermediate site to the remote site
The volumes at the intermediate site still have Global Copy relationships with the remote site. Do not remove these relationships, because they still correspond to the failover status at the remote site. In Step 2: Failing back Global Copy from the remote site to the intermediate site on page 524, the Global Copy fails back from the remote site, for which these relationships are still required. After the storage at the intermediate site is back, the Global Copy relationships are likely already in a suspended state. If not, suspend the Global Copy now by running pausepprc with the -unconditional -at src options. Global Copy relationships at the intermediate site must be in the host suspended state, as shown in Example 33-23 on page 523.
Step 2: Failing back Global Copy from the remote site to the intermediate site
Now that the intermediate storage is clear, you can run the failbackpprc command at the remote site, which begins the process of copying data from the remote site to the intermediate site. Note: Waiting for the initial pass of the resynchronization before you restart incremental resynchronization is a preferred practice to reduce the number of updates that are sent later when Metro Mirror is started with incremental resynchronization and force at the local site. To query the out-of-sync status, run lspprc -l at the intermediate site. Example 33-24 shows the command to fail back the Global Copy.
Example 33-24 Fail back Global Copy from the remote site to the intermediate site
dscli> failbackpprc -remotedev IBM.2107-7503461 -type gcp -cascade e400-e403:6400-6403
CMUC00197I failbackpprc: Remote Mirror and Copy pair E400:6400 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E401:6401 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E402:6402 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E403:6403 successfully failed back.
dscli>
dscli> lspprc -l e400-e403
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS
Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:E400 Target Copy Pending Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4201:E401 Target Copy Pending Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4202:E402 Target Copy Pending Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4203:E403 Target Copy Pending Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
E400:6400 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
E401:6401 Copy Pending Global Copy 9076 Disabled Enabled Invalid E4 unknown Disabled False Disabled Disabled N/A Enabled Unknown False
E402:6402 Copy Pending Global Copy 5331 Disabled Enabled Invalid E4 unknown Disabled False Disabled Disabled N/A Enabled Unknown False
E403:6403 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
dscli>
Step 3: Starting Incremental Resync at the local site
Important: This step is necessary so that the Metro Mirror relationship at the local site can be restored in a later step with the -incrementalresync override option.
Example 33-25 Start Incremental Resync at the local site
dscli> mkpprc -remotedev IBM.2107-75TV181 -type gcp -mode nocp -incrementalresync enablenoinit 4200-4203:e400-e403
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4200:E400 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4201:E401 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4202:E402 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4203:E403 successfully created.
dscli>
dscli> lspprc -l 4200-4203 5300-5303
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS
Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:E400 Copy Pending Global Copy 0 Disabled Enabled Invalid 42 60 Disabled True Enabled Disabled N/A Disabled Unknown False
4201:E401 Copy Pending Global Copy 0 Disabled Enabled Invalid 42 60 Disabled True Enabled Disabled N/A Disabled Unknown False
4202:E402 Copy Pending Global Copy 0 Disabled Enabled Invalid 42 60 Disabled True Enabled Disabled N/A Disabled Unknown False
4203:E403 Copy Pending Global Copy 0 Disabled Enabled Invalid 42 60 Disabled True Enabled Disabled N/A Disabled Unknown False
dscli>
Step 4: Stopping Global Mirror at the local site and suspending Global Copy
Global Mirror was running from the local site to the remote site while the intermediate site was being recovered. Before you restore the original configuration, terminate the Global Mirror with Incremental Resync from the local site to the remote site. To do so, run rmgmir at the local site. As a result, the remote FlashCopy target begins to age while the transition back to the original configuration is in progress. Swap back: The swap back to the intermediate site can be done at any time, but is normally done at a planned time. To stop data from being copied to the remote site and to allow the resynchronization to complete between the remote and intermediate sites, suspend Global Copy at the local site. Run pausepprc at the local site, which places the local site volumes in the primary suspended state and suspends the corresponding remote site volumes. Important: Out-of-sync tracks must be drained to the remote site before you continue with Step 5: Stopping Global Copy from the local site to the remote site at the remote site. To query the out-of-sync tracks, run lspprc -l at the remote site. Example 33-26 shows the DS CLI command to suspend the Global Copy.
Example 33-26   Suspend Global Copy from the local site to the remote site
dscli> pausepprc -remotedev IBM.2107-75TV181 4200-4203:e400-e403
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4200:E400 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4201:E401 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4202:E402 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4203:E403 relationship successfully paused.
dscli>
Step 5: Stopping Global Copy from the local site to the remote site at the remote site
The Global Copy is terminated only at the remote site because the pair relationship at the local site is still required. However, the pair relationship must be removed at the remote site to proceed with Step 6: Reversing Global Copy to run from the intermediate site to the remote site on page 526. Run rmpprc with the -unconditional -at tgt options at the remote site. This termination does not affect the local site's state, so the out-of-sync bitmaps remain in operation at the local site. The state of the local site remains primary suspended, while the remote site no longer shows as a suspended target. This step is necessary to allow the failback of the intermediate site to the remote site in a later step. Note: The local site has updates for the intermediate and remote sites that are being recorded in the incremental resync change recording and out-of-sync bitmaps. Example 33-27 shows the command to remove the Global Copy at the remote site. The lspprc command, which is issued at the local site, shows that the relationship at the local site is still there. It is still required for the incremental resynchronization in Step 7: Creating Metro Mirror with Incremental Resync at the local site on page 527.
Example 33-27 Stop the Global Copy from the local site to the remote site at the remote site
dscli> rmpprc -quiet -unconditional -at tgt e400-e403 e500-e503
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :E400 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :E401 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :E402 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair :E403 relationship successfully withdrawn.
dscli>
dscli> lspprc -l e400-e403 e500-e503
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS
Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
E400:6400 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 60 Disabled True Disabled Disabled N/A Enabled Unknown False
E401:6401 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 60 Disabled True Disabled Disabled N/A Enabled Unknown False
E402:6402 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 60 Disabled True Disabled Disabled N/A Enabled Unknown False
E403:6403 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 60 Disabled True Disabled Disabled N/A Enabled Unknown False
dscli>
Step 6: Reversing Global Copy to run from the intermediate site to the remote site
In this step, the Global Copy between the intermediate site and the remote site is now reversed back to the initial direction from the intermediate site to the remote site. Run failoverpprc and failbackpprc at the intermediate site. Because the volumes at the intermediate site are cascaded, you must use the -cascade option. This option sets the intermediate site to the primary suspended state. Example 33-28 shows the appropriate DS CLI commands to reverse the Global Copy.
Example 33-28 Reverse Global Copy from the remote site to the intermediate site
dscli> failoverpprc -remotedev IBM.2107-75TV181 -type gcp -cascade 6400-6403:e400-e403
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6400:E400 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6401:E401 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6402:E402 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6403:E403 successfully reversed.
dscli>
dscli> failbackpprc -remotedev IBM.2107-75TV181 -type gcp -cascade 6400-6403:e400-e403
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6400:E400 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6401:E401 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6402:E402 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6403:E403 successfully failed back.
dscli>
Step 7: Creating Metro Mirror with Incremental Resync at the local site
The command to create Metro Mirror with the Incremental Resync option is run twice in this step, with two different parameters. First, to stop Incremental Resync from the local site to the remote site and to be able to move it to the intermediate site, establish Metro Mirror with Incremental Resync by running mkpprc with the -incrementalresync override option. With the override parameter, Metro Mirror with Incremental Resync is established without a check at the intermediate site (the Metro Mirror secondary). The change recording bitmaps are also merged with the out-of-sync bitmaps at the local site during this step. Querying Metro Mirror and Global Copy statuses: Both Metro Mirror and Global Copy might still be in the first pass. To query the status, run lspprc -l at the local site for Metro Mirror and at the intermediate site for Global Copy. Example 33-29 shows the DS CLI command to create the Metro Mirror with the -incrementalresync override option. During this phase, all tracks that are marked in the change recording bitmap are copied; incremental resynchronization is not yet enabled. When all tracks are drained to the intermediate site, the incremental resynchronization is enabled.
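The merge of the change recording bitmaps with the out-of-sync bitmaps described above is, conceptually, a bitwise OR: a track must be recopied if it is flagged in either bitmap. A minimal model of that merge (plain Python lists standing in for the per-track hardware bitmaps):

```python
# Conceptual model: merging a change recording bitmap with an
# out-of-sync bitmap. A track is copied if it changed on either side.
def merge_bitmaps(change_recording, out_of_sync):
    return [c or o for c, o in zip(change_recording, out_of_sync)]

cr = [1, 0, 0, 1, 0]      # tracks written since Incremental Resync was enabled
oos = [0, 0, 1, 1, 0]     # tracks not yet transferred by the copy relationship
merged = merge_bitmaps(cr, oos)
```

The merged bitmap is why the override path avoids a full copy: only the union of changed and untransferred tracks is sent to the intermediate site.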
Example 33-29 Create Metro Mirror with Incremental Resync at the local site
dscli> mkpprc -remotedev IBM.2107-7503461 -type mmir -mode nocp -incrementalresync override 4200-4203:6400-6403 CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4200:6400 successfully created. CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4201:6401 successfully created. CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4202:6402 successfully created. CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4203:6403 successfully created. dscli> dscli> lspprc 4200-4203 ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status ================================================================================================== 4200:6400 Full Duplex Metro Mirror 42 60 Disabled Invalid 4201:6401 Full Duplex Metro Mirror 42 60 Disabled Invalid 4202:6402 Full Duplex Metro Mirror 42 60 Disabled Invalid 4203:6403 Full Duplex Metro Mirror 42 60 Disabled Invalid dscli> #
# At intermediate site #
dscli> lspprc -l 6400-6403
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:6400 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown
4201:6401 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown
4202:6402 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown
4203:6403 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown
6400:E400 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 unknown Disabled True Disabled Disabled N/A Disabled Unknown False
6401:E401 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 unknown Disabled True Disabled Disabled N/A Disabled Unknown False
6402:E402 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 unknown Disabled True Disabled Disabled N/A Disabled Unknown False
6403:E403 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 unknown Disabled True Disabled Disabled N/A Disabled Unknown False
dscli>
Next, to monitor and track data as it is written to the primary volumes at the local site, create a Metro Mirror relationship with Incremental Resync from the local site to the intermediate site by running mkpprc with the -incrementalresync enable option. The enable parameter initializes Incremental Resync by creating a change recording bitmap at the local site.
dscli> mkgmir -dev IBM.2107-1301261 -lss 20 -session 2
CMUC00162I mkgmir: Global Mirror for session 2 successfully started.
dscli>
This section assumes that the local site failed and was recovered according to the process described in 33.3.2, Local site is back on page 512. It describes the steps for moving production back to the local site from the intermediate site and then restoring the original Metro/Global Mirror with Incremental Resync environment. The same scenario also applies in the reverse direction. This scenario includes many steps that must be accomplished manually and carefully, so it is a preferred practice to use Tivoli Storage Productivity Center for Replication or GDPS to automate and manage this process. For more information, see Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685. Figure 33-9 illustrates the steps to move Metro/Global Mirror with Incremental Resync from the intermediate site to the local site.
Figure 33-9 Move Metro/Global Mirror with Incremental Resync back to the local site
Here are the steps to move the production environment back to the local site and restore Metro/Global Mirror with Incremental Resync to its original configuration:
1. Stop applications at the intermediate site.
2. Suspend Metro Mirror from the intermediate site to the local site.
3. Fail over Global Copy from the remote site to the intermediate site.
4. Terminate Metro Mirror at the local site.
5. Start applications at the local site.
6. Fail back Global Copy from the remote site to the intermediate site.
7. Start Incremental Resync at the local site.
8. Terminate Global Mirror at the local site.
9. Suspend and terminate Global Copy from the local site to the remote site.
10. Reverse Global Copy to run from the intermediate site to the remote site.
11. Establish Metro Mirror from the local site to the intermediate site.
12. Start Global Mirror at the intermediate site.
Step 2: Suspending Metro Mirror from the intermediate site to the local site
Before you move the production environment back to the local site, suspend Metro Mirror from the intermediate site to the local site, which stops data from being copied to the local site. Example 33-31 shows the DS CLI command to suspend the Metro Mirror.
Example 33-31 Suspend Metro Mirror from the intermediate site to the local site
dscli> pausepprc -remotedev IBM.2107-75ABTV1 6400-6403:4200-4203
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6400:4200 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6401:4201 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6402:4202 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 6403:4203 relationship successfully paused.
dscli>
Step 3: Failing over Global Copy from the remote site to the intermediate site
Run failoverpprc with the -force and -cascading options from the remote site to the intermediate site. The -force option bypasses validation at the remote site to determine whether the remote site is a secondary of the intermediate site, thus allowing the failover to be successful. The failover changes the state of the remote site devices to suspended primary, where the relationship at the remote site is now cascaded to the Global Copy from the intermediate site. Example 33-32 shows the DS CLI command that is run.
Example 33-32 Fail over Global Copy from the remote site to the intermediate site
dscli> failoverpprc -remotedev IBM.2107-7503461 -type gcp -cascade -force e400-e403:6400-6403
CMUC00196I failoverpprc: Remote Mirror and Copy pair E400:6400 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E401:6401 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E402:6402 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E403:6403 successfully reversed.
dscli> lspprc e400-e403
ID        State               Reason      Type        SourceLSS Timeout (secs) Critical Mode First Pass Status
==============================================================================================================
4200:E400 Target Copy Pending             Global Copy 42        unknown        Disabled      Invalid
4201:E401 Target Copy Pending             Global Copy 42        unknown        Disabled      Invalid
4202:E402 Target Copy Pending             Global Copy 42        unknown        Disabled      Invalid
4203:E403 Target Copy Pending             Global Copy 42        unknown        Disabled      Invalid
E400:6400 Suspended           Host Source Global Copy E4        unknown        Disabled      True
E401:6401 Suspended           Host Source Global Copy E4        unknown        Disabled      True
E402:6402 Suspended           Host Source Global Copy E4        unknown        Disabled      True
E403:6403 Suspended           Host Source Global Copy E4        unknown        Disabled      True
dscli>
dscli> rmpprc -quiet -unconditional -at tgt 4200-4203
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 6400:4200 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 6401:4201 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 6402:4202 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 6403:4203 relationship successfully withdrawn.
dscli>
Step 6: Failing back from the remote site to the intermediate site
The production environment is running at the local site and Global Copy is still running from the local site to the remote site. Now, you prepare to reinstate Global Copy at the intermediate site. Run a failback from the remote site to the intermediate site. The local site's out-of-sync bitmaps can be obtained for resync changes that might have occurred since the swap. The remote site's out-of-sync bitmaps contain the changes that were made after the production environment moved back to the local site. The intermediate site's Incremental Resync change recording bitmaps are released during the failback. There might still be changes on the remote site that have not made it to the intermediate site and must be updated.

Note: The writes from the remote site to the intermediate site must be drained, or the first pass completed, before you continue with Step 7: Starting incremental resynchronization at the local site on page 532. All changes at the local site should already be updated to the remote site, which is also being updated to the intermediate site with the failover.

Example 33-34 shows how to issue the failbackpprc command. To monitor the out-of-sync tracks, run lspprc -l at the remote site. The output shows that the Global Copy from the remote site to the intermediate site is still a cascaded relationship of the Global Copy from the local site to the remote site.
Example 33-34 Failback from the remote site to the intermediate site
dscli> failbackpprc -remotedev IBM.2107-7503461 -type gcp -cascade e400-e403:6400-6403 e500-e503:5700-5703
CMUC00197I failbackpprc: Remote Mirror and Copy pair E400:6400 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E401:6401 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E402:6402 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair E403:6403 successfully failed back.
dscli> lspprc -l e400-e403
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:E400 Target Copy Pending Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4201:E401 Target Copy Pending Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4202:E402 Target Copy Pending Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4203:E403 Target Copy Pending Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
E400:6400 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
E401:6401 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
E402:6402 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
E403:6403 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
dscli>
dscli> rmgmir -quiet -lss 42 -session 2
CMUC00165I rmgmir: Global Mirror for session 2 successfully stopped.
dscli>
Step 9: Suspending and removing Global Copy from the local site to the remote site
Now that the Global Mirror from the local site to the remote site is terminated, suspend and terminate Global Copy running from the local site to the remote site. By suspending Global Copy, data is no longer copied to the remote site. Resynchronization can now complete from the remote site to the intermediate site. When the resynchronization of the intermediate site is complete, you can terminate Global Copy from the local site to the remote site at the remote site. By terminating Global Copy only at the remote site, the status at the remote site is no longer a Global Copy secondary, which allows for a failback from the intermediate site to the remote site later. In addition, the local site continues to have out-of-sync bitmaps in operation with its status as suspended primary.
Example 33-37 shows the suspension of the Global Copy from the local site to the remote site. Then, the Global Copy from the remote site to the intermediate site is queried to check that all out-of-sync tracks are drained. If so, the Global Copy relationship at the remote site is removed.
Example 33-37 Suspend and remove Global Copy from the local site to the remote site at the remote site
dscli> pausepprc -dev IBM.2107-75ABTV1 -remotedev IBM.2107-75TV181 4200-4203:e400-e403 5300-5303:e500-e503
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4200:E400 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4201:E401 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4202:E402 relationship successfully paused.
CMUC00157I pausepprc: Remote Mirror and Copy volume pair 4203:E403 relationship successfully paused.
dscli>
#
# Wait until all OOS has drained from remote to intermediate; lspprc at remote site
#
dscli> lspprc -l e400-e403
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:E400 Target Suspended Update Target Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4201:E401 Target Suspended Update Target Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4202:E402 Target Suspended Update Target Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
4203:E403 Target Suspended Update Target Global Copy 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown False
E400:6400 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
E401:6401 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
E402:6402 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
E403:6403 Copy Pending Global Copy 0 Disabled Enabled Invalid E4 unknown Disabled True Disabled Disabled N/A Enabled Unknown False
dscli>
#
# Now remove Global Copy at remote site
#
dscli> rmpprc -quiet -unconditional -at tgt e400-e403
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4200:E400 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4201:E401 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4202:E402 relationship successfully withdrawn.
CMUC00155I rmpprc: Remote Mirror and Copy volume pair 4203:E403 relationship successfully withdrawn.
dscli>
Step 10: Reversing Global Copy to run from the intermediate site to the remote site
To reverse the Global Copy relationship, you must perform a failover and then a failback before you create the Global Copy in the reverse direction. Fail over Global Copy with cascading allowed from the intermediate site to the remote site. The failover sets the intermediate site volumes' status to suspended primary and prepares the local site to intermediate site and intermediate site to remote site connections. Next, fail back Global Copy from the intermediate site to the remote site at the intermediate site with cascading allowed and in Global Copy mode. The failover and failback reverse the Global Copy relationship to run from the intermediate site to the remote site. You can now establish Global Copy from the intermediate site to the remote site with the Global Copy mode and cascading options.
dscli> failoverpprc -remotedev IBM.2107-75TV181 -type gcp -cascade 6400-6403:e400-e403
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6400:E400 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6401:E401 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6402:E402 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair 6403:E403 successfully reversed.
dscli> failbackpprc -remotedev IBM.2107-75TV181 -type gcp -cascade 6400-6403:e400-e403
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6400:E400 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6401:E401 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6402:E402 successfully failed back.
CMUC00197I failbackpprc: Remote Mirror and Copy pair 6403:E403 successfully failed back.
dscli>
Step 11: Establishing Metro Mirror from the local site to the intermediate site
Metro Mirror is now established from the local site to the intermediate site, first with the Incremental Resync override option and then again with Incremental Resync initialization. The command to establish Metro Mirror is run at the local site. The override also causes the change recording bitmaps that were created in Step 7: Starting incremental resynchronization at the local site on page 532 to merge with the out-of-sync bitmaps at the local site. Then, Metro Mirror is established with Incremental Resync initialized, which creates change recording bitmaps for Metro Mirror at the local site.

Metro Mirror state: Metro Mirror must be in the Full Duplex state now, and Global Copy must complete its first pass before you continue with Step 12: Starting Global Mirror on page 535.
Tip: If there is a failure of the local site (see 33.3.1, Local site fails on page 509), a freezepprc command is issued and the paths from the local site to the intermediate site are removed. To succeed with the failback from the local site to the intermediate site, the paths must be re-established now.

Example 33-39 shows that the -incrementalresync override option is used to copy the tracks marked in the change recording bitmap at the local site. All out-of-sync tracks are drained to the intermediate site. When this drain finishes, the incremental resynchronization is started with empty bitmaps at the local site.
Example 33-39 Establish Metro Mirror from the local site to the intermediate site
dscli> mkpprc -remotedev IBM.2107-7503461 -type mmir -mode nocp -incrementalresync override 4200-4203:6400-6403
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4200:6400 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4201:6401 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4202:6402 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4203:6403 successfully created.
dscli>
#
# Wait until all OOS has drained
#
dscli> lspprc -l 6400-6403
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:6400 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown
4201:6401 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown
4202:6402 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown
4203:6403 Target Full Duplex Metro Mirror 0 Disabled Invalid Enabled 42 unknown Disabled Invalid Disabled Disabled N/A N/A Unknown
6400:E400 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 unknown Disabled True Disabled Disabled N/A Disabled Unknown False
6401:E401 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 unknown Disabled True Disabled Disabled N/A Disabled Unknown False
6402:E402 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 unknown Disabled True Disabled Disabled N/A Disabled Unknown False
6403:E403 Copy Pending Global Copy 0 Disabled Enabled Invalid 64 unknown Disabled True Disabled Disabled N/A Disabled Unknown False
dscli>
dscli> mkpprc -dev IBM.2107-75ABTV1 -remotedev IBM.2107-7503461 -type mmir -mode nocp -incrementalresync enable 4200-4203:6400-6403 5300-5303:5700-5703
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4200:6400 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4201:6401 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4202:6402 successfully created.
CMUC00153I mkpprc: Remote Mirror and Copy volume pair relationship 4203:6403 successfully created.
dscli> lspprc -l 4200-4203
ID State Reason Type Out Of Sync Tracks Tgt Read Src Cascade Tgt Cascade Date Suspended SourceLSS Timeout (secs) Critical Mode First Pass Status Incremental Resync Tgt Write GMIR CG PPRC CG isTgtSE DisableAutoResync
==========================================================================================================================================
4200:6400 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 42 60 Disabled Invalid Enabled Disabled N/A Enabled Unknown
4201:6401 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 42 60 Disabled Invalid Enabled Disabled N/A Enabled Unknown
4202:6402 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 42 60 Disabled Invalid Enabled Disabled N/A Enabled Unknown
4203:6403 Full Duplex Metro Mirror 0 Disabled Enabled Invalid 42 60 Disabled Invalid Enabled Disabled N/A Enabled Unknown
dscli>
Chapter 34.
34.1 Overview
With Metro/Global Mirror and Metro/Global Mirror Incremental Resync, the DS8000 storage system has powerful copy functions. This section introduces an additional function, which is an option of the Incremental Resync. Incremental Resync has been successfully deployed in certain data migrations in a Metro Mirror or a Global Mirror environment where the remote storage system was replaced. Using Incremental Resync, you can replace the remote storage system without needing to perform a full copy of the data from the primary. During the synchronization of the new storage system, the existing Metro Mirror (or Global Mirror) relationship is maintained, so if there is a disaster, you still have Metro Mirror protection. As shown in Figure 34-1, the new storage system is cascaded by a Global Copy relationship to the existing Metro Mirror. When the initial copy phase completes, Incremental Resync is enabled at the local site and a new Metro Mirror or Global Mirror relationship, using the change recording bitmaps, is established to the new storage system.
Figure 34-1 The new system C cascaded by Global Copy to the existing Metro Mirror between primary system A (local site) and system B (remote site)
The following sections explain the detailed steps for a migration where the target of a Metro Mirror is replaced by a new storage system. Complete the following steps:
1. Create a Global Copy between B and C.
2. Enable Incremental Resync from A to C.
3. Suspend Metro Mirror.
4. Fail over Global Copy from C to B.
5. Establish a new Metro Mirror from A to C.
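The five steps can be laid out as a dry-run plan of DS CLI commands. The device IDs and volume ranges are placeholders (anything in angle brackets is an assumption, not a value from the examples), and the option syntax follows the commands shown in this chapter:

```python
# Dry-run sketch of the five migration steps as DS CLI commands.
# Placeholders in <angle brackets> are illustrative, not real device IDs.
steps = [
    ("Create Global Copy B -> C",
     "mkpprc -remotedev <C-system> -type gcp -cascade <B-vols>:<C-vols>"),
    ("Enable Incremental Resync on the A -> B pairs (enablenoinit)",
     "mkpprc -remotedev <B-system> -type mmir -mode nocp "
     "-incrementalresync enablenoinit <A-vols>:<B-vols>"),
    ("Suspend Metro Mirror A -> B",
     "pausepprc -remotedev <B-system> <A-vols>:<B-vols>"),
    ("Fail over Global Copy from C to B",
     "failoverpprc -remotedev <B-system> -type gcp -cascade <C-vols>:<B-vols>"),
    ("Establish the new Metro Mirror A -> C",
     "mkpprc -remotedev <C-system> -type mmir -mode nocp "
     "-incrementalresync override <A-vols>:<C-vols>"),
]
for number, (title, command) in enumerate(steps, start=1):
    print(f"{number}. {title}\n   dscli> {command}")
```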
The -cascade parameter allows the B volumes to be a target and a source for two different relationships simultaneously. Before you can proceed, you must wait until the initial copy phase from the B to C volumes completes.
Example 34-2 on page 539 specified the -mode nocp -incrementalresync enablenoinit parameters. In a Metro/Global Mirror relationship, you specify -incrementalresync enable. The -incrementalresync option determines how the Incremental Resync bitmaps are initialized.

When you use the -incrementalresync enable option, the bitmap is initialized so that everything is copied. The Global Mirror function at the B side informs the A side about updates that are sent to the C side, so Global Mirror eventually resets the A side's bitmaps to all data copied. When updates to the A volumes occur, the Incremental Resync bitmaps at A reflect this change, and when the updates arrive at the C volumes, Global Mirror resets these bits for the A volumes. In our example, because we have Global Copy running at the B side and not Global Mirror, there is no instance that resets the Incremental Resync bits for the A volumes.

When -incrementalresync enablenoinit is used, the bitmaps for the A volumes are initialized to indicate that nothing must be copied. When updates to the A volumes occur, these updates are reflected in the bitmaps, and the corresponding tracks are transmitted to the C volumes if you do an Incremental Resync from A to C. These bits are not reset; change bits accumulate until you do an Incremental Resync.

Therefore, timing is crucial when you do an Incremental Resync in a Metro/Global Copy relationship. You must modify the existing relationship right before you switch from the A to B pairs to the A to C pairs. If you modify the existing Metro Mirror relationship too far ahead of the switch, the amount of data that must be resynchronized is high. When you run the command to modify the Metro Mirror relationship with the -incrementalresync enablenoinit parameter, you tell the system to assume that all A, B, and C volumes contain the same data.
But there might be some data in flight, so you should wait until all the tracks are copied before you switch the pairs to ensure that the in-flight updates reached the C volumes. You can check the out-of-sync tracks by running lspprc -l.
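The difference between the two initialization modes can be sketched with a toy model; the track-per-bit representation is an illustrative assumption, not DS8000 internals:

```python
# Sketch of how "-incrementalresync enable" and "enablenoinit" initialize the
# change recording bitmap (illustrative model, not DS8000 internals).
# 1 = track must be copied on the next incremental resync.

def init_bitmap(tracks: int, mode: str) -> list[int]:
    if mode == "enable":
        # Everything marked: in Metro/Global Mirror, Global Mirror at the
        # B side later resets bits as data arrives at the C side.
        return [1] * tracks
    if mode == "enablenoinit":
        # Nothing marked: only subsequent host writes set bits, and they
        # accumulate until an incremental resync is performed.
        return [0] * tracks
    raise ValueError(f"unknown mode: {mode}")

bitmap = init_bitmap(8, "enablenoinit")
for updated_track in (2, 5):        # host writes to the A volume
    bitmap[updated_track] = 1
print(bitmap)  # → [0, 0, 1, 0, 0, 1, 0, 0]
```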
dscli> failoverpprc -remotedev IBM.2107-7503461 -type gcp -cascade e400-e403:6400-6403
CMUC00196I failoverpprc: Remote Mirror and Copy pair E400:6400 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E401:6401 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E402:6402 successfully reversed.
CMUC00196I failoverpprc: Remote Mirror and Copy pair E403:6403 successfully reversed.
dscli>
The B volumes are no longer needed. The volumes, however, are still in a Metro Mirror state and paths exist from the B system to the C system. You should delete the pairs and remove the paths before you remove the system.
Now, you have Metro Mirror pairs between A and C volumes and you can remove the old storage system.
Part 8
Chapter 35.
Figure 35-1 General differences between full provisioning and thin provisioning
A volume that supports thin provisioning is referred to as a Space-Efficient volume. At the time the volume is created, only a small amount of real capacity is physically allocated for metadata. The DS8000 storage system uses specific metadata to manage and determine when capacity must be allocated for write operations.
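The allocate-on-write behavior can be sketched with a toy model; the extent size, backing store, and class shape below are illustrative assumptions, not the DS8000 implementation:

```python
# Sketch of allocate-on-write for a Space-Efficient volume: real capacity is
# consumed only when an extent is first written (illustrative model).

class SpaceEfficientVolume:
    def __init__(self, virtual_extents: int):
        self.virtual_extents = virtual_extents
        self.allocated: dict[int, bytearray] = {}  # extent index -> backing store

    def write(self, extent: int, data: bytes) -> None:
        if not 0 <= extent < self.virtual_extents:
            raise IndexError("write beyond virtual capacity")
        # Real capacity is allocated only on the first write to this extent.
        store = self.allocated.setdefault(extent, bytearray(16))
        store[: len(data)] = data

    @property
    def real_extents(self) -> int:
        return len(self.allocated)

vol = SpaceEfficientVolume(virtual_extents=1000)  # host sees 1000 extents
vol.write(7, b"abc")
vol.write(7, b"xyz")                              # rewrite: no new allocation
print(vol.real_extents)  # → 1
```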
A Space-Efficient volume is created with virtual capacity rather than with physical capacity in the form of allocated extents. When a Space-Efficient volume is assigned to a host, the host sees the whole (virtual) capacity of the volume, as though it were a fully provisioned volume. All I/O activities that are performed by the storage system to allocate space when needed are fully transparent to the host. The amount of real capacity in the storage system is shared among all Space-Efficient volumes. The amount of real capacity is by definition less than or equal to the total virtual capacity. The ratio between virtual capacity and real capacity represents the storage oversubscription or storage over-commitment.
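The over-commitment ratio can be computed directly; the capacity figures below are made-up examples:

```python
# Over-subscription ratio of an extent pool: total virtual capacity presented
# to hosts divided by the real capacity behind it. Figures are illustrative.

def oversubscription_ratio(virtual_gib: float, real_gib: float) -> float:
    if real_gib <= 0:
        raise ValueError("real capacity must be positive")
    if virtual_gib < real_gib:
        raise ValueError("virtual capacity is at least the real capacity")
    return virtual_gib / real_gib

# Ten Space-Efficient volumes of 100 GiB virtual capacity backed by 400 GiB
# of real capacity give a 2.5:1 over-commitment.
print(oversubscription_ratio(10 * 100, 400))  # → 2.5
```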
Extents for standard volumes and dynamically allocated extents for ESE volumes are allocated independently from the same extent pool. When the extent pool has more than one rank, the dynamic allocation of extents follows the usual extent allocation methods, that is, either rotate extents or rotate volumes, depending on what was specified when the ESE volume was created.
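The rotate extents method can be sketched as a round-robin over the ranks of the pool; the rank names and extent counts are illustrative assumptions:

```python
# Sketch of the "rotate extents" allocation method: successive extents of a
# volume are striped across the ranks of the extent pool (illustrative model).

from itertools import cycle

def allocate_rotate_extents(ranks: list[str], extents_needed: int) -> list[str]:
    """Return the rank that supplies each successive extent."""
    rank_cycle = cycle(ranks)
    return [next(rank_cycle) for _ in range(extents_needed)]

# An ESE volume growing by five extents in a three-rank pool:
print(allocate_rotate_extents(["R0", "R1", "R2"], 5))
# → ['R0', 'R1', 'R2', 'R0', 'R1']
```

With rotate volumes, by contrast, each volume's extents would be taken from one rank before moving to the next; the round-robin above models only the rotate extents case.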
Out-of-space conditions in an extent pool should be avoided in any case because this situation leads to a loss of application access. Extent pools provide parameters and functions for monitoring their capacity and for notifying administrators when a threshold is reached.
Chapter 36.
Here are the main differences between FlashCopy with thin provisioned Extent-Space-Efficient (ESE) volumes and FlashCopy SE with Track-Space-Efficient (TSE) volumes:
- FlashCopy SE uses small tracks from a repository instead of extents from an extent pool, as thin provisioning does.
- To use FlashCopy SE, you must have a FlashCopy SE repository established.
- Track-Space-Efficient volumes are only supported as FlashCopy SE target volumes.
When you establish Global Mirror relationships with ESE volumes, all space on the thin provisioned (ESE) secondary and journal volumes is released. For Metro/Global Mirror, the behavior to release space depends on whether you first establish the Metro Mirror (MM) or the Global Copy (GC) relationship, as shown in Figure 36-1.

Note: With LMC 6.6.30.nnn for DS8700 and 7.6.30.nnn for DS8800, thin provisioned (ESE) volumes can also be used for all other Copy Services operations. However, because a thin provisioned (ESE) FlashCopy target volume, as part of Global Mirror, releases space only at initial creation, you should not use thin provisioned (ESE) volumes as FlashCopy target volumes in a Global Mirror or Metro/Global Mirror environment if standard volumes are used as primary and secondary volumes.
Establishing a relationship releases the allocated space on the target and sets bits in the source's out-of-sync bitmap (OOS BM) for the extents that are allocated on the source: establishing MM to B releases space on B and sets bits in A's OOS BM; establishing GC to C releases space on C and sets bits in B's OOS BM.

Figure 36-1 Release space behavior of Metro/Global Mirror
Global Mirror or Metro/Global Mirror copies only the allocated extents from the thin provisioned primary volume to the thin provisioned secondary volume.
Part 9
Chapter 37. IBM i overview
This chapter provides insight into IBM i business continuity solutions that are based on the IBM System Storage DS8000 Copy Services. IBM i supports each of the following connections to the DS8000 series:
- Native attachment: a connection to a DS8000 using adapters in the IBM i server. IBM i runs in a partition of an IBM POWER system or in a former IBM System i model.
- Attachment through Virtual I/O Server Node Port ID Virtualization (VIOS NPIV).
- Attachment through Virtual I/O Server (VIOS) using virtual SCSI adapters.

Not all of these connection modes are available for all IBM i hardware and software levels. For more information, see IBM System Storage DS8000 Host Attachment and Interoperability, SG24-8887, and the System Storage Interoperation Center (SSIC) at:
http://www.ibm.com/systems/support/storage/ssic/interoperability.wss

This chapter covers the following topics:
- Introduction
- IBM i architecture and external storage
- DS8000 Copy Services with IBM i
- Managing solutions with DS8000 Copy Services and IBM i
- Supported solutions and management
- References
37.1 Introduction
IBM i servers have used Copy Services provided by external disk storage since 2001. Over time, with the evolution of IBM i technologies and new external storage offerings, more clients with an IBM i environment have addressed their business continuity goals by relying on external storage. Today, IBM i clients have various choices for DS8000 based solutions. This chapter presents and describes solutions that are designed for high availability, disaster recovery, and minimizing downtime during backups. When we describe business resiliency solutions for IBM i servers, it is helpful to understand the definitions of high availability, disaster recovery, and offline backup:
- High availability (HA): The ability of a system to provide access to applications regardless of local failures in hardware, software, facilities, or processes.
- Disaster recovery (DR): The capability to recover a data center at a different site if the primary site becomes inoperable.
- Offline backup: The ability to perform a backup of production data from a backup system that has a second copy of the user's data. This backup minimizes the production downtime that is needed for the backup.
Similarly, writing a new record or updating an existing record is done in main memory, and the affected pages are marked as changed. A changed page normally remains in main memory until it is written to disk as a result of a page fault. Pages are also written to disk when a file is closed or when write-to-disk is forced by a user through commands and parameters.
37.2.5 Clusters
Many of the IBM i solutions that use DS8000 Copy Services are based on IBM i clusters and independent disk pools (independent auxiliary storage pools, or IASPs). For example, the solutions with IBM PowerHA SystemMirror for IBM i require an IBM i cluster and an IASP. Therefore, you should understand the cluster and IASP structure before you work with IBM i and DS8000 Copy Services. An IBM i cluster is a group of systems or logical partitions that work together as a single system. The basic concepts that are related to a cluster include cluster nodes, domains, and cluster resource groups:
- A cluster node is a system or logical partition that is a member of the cluster. A cluster consists of a minimum of two nodes and a maximum of 128 nodes. The nodes must be connected with an IP connection that provides a communication path between cluster services on each node in the cluster.
- A device domain is a subset of nodes in a cluster that share device resources. It is a logical construct within Cluster Resource Services that is used to ensure that there are no configuration conflicts that prevent a switchover or failover of an IASP.
- A cluster resource group (CRG) is an object in IBM i that represents a set of cluster resources that are used to manage events that occur in a clustered environment. Different CRG types represent different resources; for example, device CRGs are used for devices, such as IASPs.
- A recovery domain defines the role of each node in the CRG. When you create a CRG in a cluster, the CRG object is created on all nodes that are specified to be included in the recovery domain. A recovery domain specifies the order of recovery for the nodes in the cluster.
Figure 37-2 Elements of an IBM i cluster
If a system outage or a site loss occurs, the functions that are provided on a system or partition within a cluster can be accessed through other systems or partitions that are defined in the cluster. When maintenance is needed on the production partition, another node in a cluster can handle resources of the production partition and continue production work. This functionality is achieved through cluster events, such as failover, switchover, replication, and rejoin. For more information about IBM i clusters, see the IBM i and System i Information Center at: http://publib.boulder.ibm.com/iseries/
The following IBM i disk pools are available:
- System disk pool (disk pool 1): The system disk pool contains the load source and all configured disks that are not assigned to any other disk pool. The system disk pool is also referred to as auxiliary storage pool 1 (ASP1).
- Basic disk pools (disk pools 2 - 32): Basic disk pools can be used to separate objects from the system disk pool. Basic disk pools are also referred to as ASP2 - ASP32.
- Primary disk pool: This pool is an independent disk pool that defines a collection of directories and libraries and might have other secondary disk pools that are associated with it. It is also referred to as a primary IASP.
- Secondary disk pool: This pool is an independent disk pool that must be associated with a primary disk pool.

ASPs and IASPs in IBM i are shown in Figure 37-3.
Figure 37-3 ASPs and IASPs in IBM i
Only the system disk pool is needed for an IBM i installation. The other pools are optional and are implemented depending on your needs and preferences. The following implementations of disk pools can be found in IBM i:
- System disk pool only. This implementation is usually used by smaller customers.
- System disk pool and a number of basic disk pools. This is the traditional implementation for larger customers that have not decided to use IASPs.
- System disk pool and primary disk pool. Many small and medium customers use a primary disk pool for their applications and keep the system data in ASP1.
- System disk pool, basic disk pools, and primary disk pool. This implementation is typical for larger customers who decide to set up one of their applications in the primary disk pool and keep the other application programs and data in basic pools.
- System disk pool, primary disk pool, and a number of secondary disk pools: This implementation is typical for larger customers who decide to run all of their applications in the IASPs and keep only the system data in the system disk pool.
- System disk pool, basic disk pools, primary disk pool, and secondary disk pools: Large customers who want to keep some of their applications in basic pools and run the other applications in IASPs implement all the disk pools.
37.4.2 Advanced Copy Services for PowerHA on i and Copy Services Tool Kit
Advanced Copy Services for PowerHA on i (ACS) and the Copy Services Tool Kit are offerings from IBM Lab Services that provide both software for managing the solution and services to set up the environment and perform installation and maintenance. They provide high-level automation for the DS8000 Copy Services based solutions and comprehensive tools for checking and troubleshooting. They also include IBM services to set up both the DS8000 and IBM i environments for the solution, install the management software, and provide regular maintenance. The Copy Services Tool Kit was available for IBM i environments for several years before the announcement of PowerHA for IBM i. Advanced Copy Services for PowerHA on IBM i is the newer version of the toolkit; it is based on PowerHA for IBM i and provides additional functions for automation and troubleshooting. IBM i customers use either the Copy Services Tool Kit or ACS.
The supported environments and management software products are listed in Table 37-1.
Table 37-1 Supported environments and management software products DS8000 attachment Native VIOS_NPIV VIOS VSCSI Managing software PowerHA for i ACS for PowerHA on i Tivoli Storage Productivity Center for Replication
Copy Services of IASPs FlashCopy Metro Mirror Metro Mirror and FlashCopya Global Mirror Global Mirror and FlashCopy
a
Yes Yes
Yes Yes
Yes Yes
Yes Yes
Yes Yes
Metro/ Global Mirror Copy Services of full system FlashCopy Metro Mirror Metro Mirror and FlashCopy
a
Yes
Yes
Yes
Yes Yes
Yes Yes
Yes b Yes b
Yes c Yes c
Yes
Yes
Yes b
Yes c
a. FlashCopy of Remote Copy source LUNs or FlashCopy of Remote Copy target LUNs. b. Not tested. c. The specifications in this table are valid for the PowerHA SystemMirror for i Version 7.1 and for Advanced Copy Services for PowerHA on i Version 1.3. If you are planning the solution with previous versions of these products, you should check whether the needed function is supported. If the FlashCopy part of a combined solution with Remote Copy and FlashCopy is managed by Full System FlashCopy (FSFC), you should not use Tivoli Storage Productivity Center for Replication to manage the DS8000 part.
37.6 References
For planning and implementing PowerHA SystemMirror for i, see the following publications:
- PowerHA SystemMirror for IBM i Cookbook, SG24-7994
- Implementing PowerHA for IBM i, SG24-7405

For Advanced Copy Services for PowerHA on i and the Copy Services Tool Kit, go to the Client Technology Center at:
http://www.ibm.com/servers/eserver/services/iseriesservices.html

For sizing of Metro Mirror and Global Mirror links for IBM i, see IBM i and IBM System Storage: A Guide to Implementing External Disks on IBM i, SG24-7120. For detailed sizing based on IBM i Performance Explorer (PEX) analysis, go to the Client Technology Center at:
http://www.ibm.com/servers/eserver/services/iseriesservices.html

For step-by-step instructions to implement full system Copy Services, see the following publications:
- IBM System Storage Copy Services and IBM i: A Guide to Planning and Implementation, SG24-7103
- DS8000 Copy Services for IBM i with VIOS, REDP-4584
566
38
Chapter 38.
IBM i options
This chapter provides information about IBM i and DS8000 Copy Services options.

Important: When you use Copy Services functions such as Metro Mirror, Global Mirror, or FlashCopy for the replication of the load source unit or other IBM i disk units within the same DS8000 or between two or more DS8000 systems, the source volume and the target volume characteristics must be identical. The target and source must have matching capacities and protection types. Additionally, after a volume is assigned to an IBM i partition and added to that partition's configuration, its characteristics must not be changed. If there is a requirement to change some characteristics of a configured volume, it must be removed from the IBM i configuration. After the characteristics are changed, the volume can be reassigned to the IBM i configuration. To simplify the configuration, use a symmetrical configuration between the two IBM storage systems, creating the same volumes with the same volume IDs (which determine the LSS ID).

This chapter covers the following topics:
- Metro Mirror for independent disk pools
- Global Mirror for independent disk pools
- FlashCopy for independent disk pools
- Full system Metro Mirror
- Full system Global Mirror
- Full System FlashCopy
- Solutions with Remote Copy and FlashCopy
- FlashCopy SE with IBM i
- Metro/Global Mirror with IBM i
Figure 38-1 Metro Mirror of IASP
This solution provides continuous availability in case of planned or unplanned outages. During a planned or unplanned outage, the Metro Mirror copy of the production independent disk pools is made available to the remote partition, which continues to run the production applications from the data in the mirrored IASPs. For more information about the steps that are performed at switchover to the DR site, see Chapter 39, IBM i implementation on page 593.
Careful sizing of the Metro Mirror links is needed. To size the required Metro Mirror link bandwidth for an IBM i disk pool, complete the following steps:
1. Collect IBM i performance data. Collect it over one week and, if needed, during heavy workload, such as when you run end-of-month jobs.
2. If you already established the independent disk pool, observe, in the performance reports, the number of writes that go to the IASP. If the IASP is not set up yet, observe the number of writes that go to the database. Then complete the following steps:
   a. Multiply the writes to the disk pool by the reported transfer size to get the write rate (MB per second) for the entire period of collecting performance data.
   b. Look for the highest reported write rate.
3. Calculate the needed bandwidth by completing the following steps:
   a. Assume 10 bits per byte for network processing impact.
   b. If the compression of the devices for the remote links is known, apply it. If it is not known, assume a 2:1 or 1.5:1 compression.
   c. Assume a maximum 80% usage of the network.
   d. Apply a 10% uplift factor to the result to account for peaks within the 5-minute intervals.

The Client Technology Center can perform an in-depth analysis of bandwidth requirements by using IBM i Performance Explorer (PEX) data if more accurate sizing is needed.
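As a quick sanity check, the steps above can be sketched as a small calculation. This is an illustration only; the function name and the 40 MB/s sample peak are hypothetical, and the 2:1 compression, 80% utilization, and 10% uplift figures are the assumptions named in the text.

```python
def metro_mirror_bandwidth_mbps(peak_write_mbs, compression=2.0,
                                bits_per_byte=10, max_utilization=0.8,
                                peak_uplift=1.10):
    """Estimate the Metro Mirror link bandwidth in megabits per second.

    peak_write_mbs: highest write rate observed in the performance data
    (MB per second). Assumes 10 bits per byte for network processing
    impact, optional link-device compression, a maximum 80% usage of the
    network, and a 10% uplift for peaks hidden inside 5-minute intervals.
    """
    mbps = peak_write_mbs * bits_per_byte   # MB/s -> Mbit/s incl. overhead
    mbps /= compression                     # apply remote-link compression
    mbps /= max_utilization                 # plan for 80% maximum usage
    mbps *= peak_uplift                     # 10% uplift for hidden peaks
    return mbps

# Hypothetical example: a 40 MB/s peak write rate with 2:1 compression
print(round(metro_mirror_bandwidth_mbps(40), 1))  # 275.0
```

With a 40 MB/s peak, the estimate works out to 40 * 10 / 2 / 0.8 * 1.1 = 275 Mbit/s of link bandwidth.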
The solution provides continuous availability in case of planned and unplanned outages at the local partition. The needed actions for recovery at planned or unplanned outages are provided by the management products with minimal user interaction. An overview of IASP Global Mirror is shown in Figure 38-2.
Figure 38-2 Global Mirror of IASP
Figure 38-3 FlashCopy of IASP
38.4.1 Description
Because of the IBM i single-level storage architecture, all disk units for the system disk pool and base disk pools must be on external storage and all of them must be replicated (with Metro Mirror) to the remote site. For more information about single-level storage, see 37.2, IBM i architecture and external storage on page 558. This solution is depicted in Figure 38-4.
Figure 38-4 Metro Mirror of an entire disk space
If there is a planned or unplanned outage at the production site, the stand-by partition at the recovery site uses Boot from SAN to start the system from the Metro Mirror copy of the production volumes. After the boot completes, the recovery system (the stand-by partition) has access to an exact clone of the production system, including database files, application programs, user profiles, job queues, data areas, and so on. Critical applications can continue to run on this clone of the production system. After the planned outage is over or the unplanned outage is fixed, a Metro Mirror copy of all the disk space is performed from the stand-by partition back to the local partition. This action copies back to the production system all the updates that occurred during the outage, and the original local partition can then be rebooted (Boot from SAN) from the updated primary volumes.
Important: When you use Copy Services functions such as Metro Mirror, Global Mirror, or FlashCopy for the replication of the load source unit or other IBM i disk units within the same DS8000 or between two or more DS8000 systems, the source volume and the target volume characteristics must be identical. The target and source must have matching capacities and matching protection types. Also, after a volume is assigned to an IBM i partition and added to that partition's configuration, its characteristics must not be changed. If there is a requirement to change some characteristic of a configured volume, it must first be removed from the IBM i configuration. After the characteristic changes are made, for example, to the protection type and capacity, the volume can be reassigned to the IBM i configuration. To simplify the configuration, use a symmetrical configuration between the two IBM storage systems, creating the same volumes with the same volume IDs (which determine the LSS ID).
Figure 38-5 Global Mirror of the entire disk space
Important: When you use Copy Services functions such as Metro Mirror, Global Mirror, or FlashCopy for the replication of the load source unit or other IBM i disk units within the same DS8000 or between two or more DS8000 systems, the source volume and the target volume characteristics must be identical. The target and source must have matching capacities and protection types. After a volume is assigned to an IBM i partition and added to that partition's configuration, its characteristics must not be changed. If there is a requirement to change some characteristic of a configured volume, it first must be removed from the IBM i configuration. After the changes are made, for example, to the protection type and capacity, the volume can be reassigned to the IBM i configuration. To simplify the configuration, use a symmetrical configuration between the two IBM storage systems, creating the same volumes with the same volume IDs (which determine the LSS ID).
Figure 38-6 Full System FlashCopy
If this solution is implemented together with BRMS, you must plan the correct setup of BRMS to ensure that it works correctly with FlashCopy. For more information about this topic, see IBM System Storage Copy Services and IBM i: A Guide to Planning and Implementation, SG24-7103. A Full System FlashCopy solution can be automated by using the Full System FlashCopy (FSFC) tool, which is provided by IBM Systems Lab Services.

Important: When you use Copy Services functions such as Metro Mirror, Global Mirror, or FlashCopy for the replication of the load source unit or other IBM i disk units within the same DS8000 or between two or more DS8000 systems, the source volume and the target volume characteristics must be identical. The target and source must have matching capacities and protection types. After a volume is assigned to an IBM i partition and added to that partition's configuration, its characteristics must not be changed. If there is a requirement to change some characteristic of a configured volume, it must first be removed from the IBM i configuration. After the characteristic changes are made, for example, to the protection type and capacity, by destroying and re-creating the volume or by using the DS CLI, the volume can be reassigned to the IBM i configuration. To simplify the configuration, use a symmetrical configuration between the two IBM storage systems, creating the same volumes with the same volume IDs (which determine the LSS ID).
- FlashCopy on either site can be managed by the Full System FlashCopy tool. In this case, the Metro Mirror must be managed by user procedures.
- Metro Mirror and FlashCopy can be managed by Tivoli Storage Productivity Center for Replication. In this case, the IBM i part must be managed by user procedures.

Global Mirror and FlashCopy for the full system:
- FlashCopy on the Global Mirror primary site
- FlashCopy on the Global Mirror secondary site
- FlashCopy on both the primary and secondary sites in the same scenario

The DS8000 part of the solution can be managed by Tivoli Storage Productivity Center for Replication. In this case, the IBM i part must be managed by user procedures. FlashCopy on the primary site or on the secondary site can be managed by the Full System FlashCopy tool. In this case, the Global Mirror must be managed by user procedures.
Use the following formula to calculate the needed capacity for the FlashCopy SE repository:

writes per second * 0.67 * FlashCopy SE active time (sec) * 64 * 1.5 = repository capacity (KB)

Here is a short explanation of this formula:
- Whenever a write occurs, FlashCopy SE allocates one track of repository capacity (one track is 64 KB for fixed block volumes). If multiple writes occur to the same track, FlashCopy SE allocates only one track, regardless of how many subsequent writes go to that track. With random workloads, you can estimate about 33% of such rewrites, so take into account that 67% of the writes per second result in track allocation (writes per second * 0.67).
- To calculate the capacity that is required, multiply the writes per second by the FlashCopy SE active time to obtain the total number of allocating writes while FlashCopy SE is active (writes per second * 0.67 * FlashCopy SE active time).
- Multiply the number of writes by the track size of 64 KB to obtain the capacity that is needed, and add 50% for contingency (writes per second * 0.67 * FlashCopy SE active time * 64 * 1.5).

The resulting capacity is in kilobytes; divide it by 1 million to express the capacity in GB.

Here is the sizing of the capacity for a FlashCopy SE repository for the IBM i partition that we set up for the scenarios in this book:
- The disk space consists of 32 * 35-GB LUNs on a DS8000, for a total capacity of 1125 GB assigned to the System i partition.
- For the System i workload, we made the following assumptions: AD = 1, 50% read and 50% write, and 33% rewrites.
- We estimate that FlashCopy SE is active for 2 hours, that is, 7200 seconds.

To size the repository, we calculate the percentage of production capacity that is needed for the repository, using the formula in "Sizing guidelines when writes/sec are not known".
After we insert the assumed values into the formula, the calculation is as follows:

1 * 0.5 * 0.67 * 7200 * 64 * 1.5 = 231,552 KB per GB of production capacity, which is about 23%

So, we must allocate about 23% of the capacity in the System i partition for the repository. For our tests, we allocate 280 GB for the repository, which is about 25% of the 1125-GB System i production capacity.
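The repository arithmetic can be expressed as a small helper. This is only a sketch of the book's formula with hypothetical function and parameter names; it reproduces the roughly 23% result for the scenario above.

```python
def se_repository_fraction(access_density, write_ratio, active_seconds,
                           rewrite_factor=0.67, track_kb=64,
                           contingency=1.5):
    """Fraction of production capacity needed for the FlashCopy SE
    repository, per the formula in the text.

    access_density: IOs per second per GB of production capacity (AD).
    write_ratio: fraction of those IOs that are writes.
    Each allocating write consumes one 64 KB track; about 33% of writes
    are assumed to hit already-allocated tracks (hence the 0.67 factor),
    and 50% is added for contingency.
    """
    writes_per_sec_per_gb = access_density * write_ratio
    kb_per_gb = (writes_per_sec_per_gb * rewrite_factor * active_seconds
                 * track_kb * contingency)
    return kb_per_gb / 1_000_000  # KB -> GB, as in the text

# The book's scenario: AD = 1, 50% writes, FlashCopy SE active 2 hours
frac = se_repository_fraction(1, 0.5, 7200)
print(f"{frac:.0%}")  # 23%
```

The same helper can be reused for other workloads, for example a 70% write ratio or a longer backup window, before committing repository capacity.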
Here is a short explanation of the formula:
- Each write on the production system, less the expected percentage of rewrites (* 0.67), results in a write operation to the repository.
- In RAID 5, each write results in four disk operations. Assuming a maximum of 130 disk operations per second for a 15K RPM disk drive, that is 130 / 4 = about 33 writes/sec per drive. Assuming a maximum of 100 disk operations per second for a 10K RPM disk drive, that is 100 / 4 = 25 writes/sec per drive.
- In RAID 10, each write results in two disk operations. Thus, 130 / 2 = 65 writes/sec for a 15K RPM disk drive and 100 / 2 = 50 writes/sec for a 10K RPM disk drive.

For example, if the workload is 1000 writes/sec, consider 1000 * 0.67 / 33 = about 20 15K RPM disk drives in RAID 5.

Sizing: If you are sizing a FlashCopy SE repository for a solution with independent disk pools, you must take into account only the writes per second that go to the IASP.
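This rule of thumb can be sketched as follows. The function and table names are hypothetical; the per-drive figures are the assumptions stated in the text.

```python
# Writes/sec that one disk drive can absorb, per the text's rules of thumb:
# RAID 5 turns each host write into 4 disk operations, RAID 10 into 2;
# a 15K RPM drive sustains about 130 disk ops/sec, a 10K RPM drive about 100.
WRITES_PER_DRIVE = {
    ("RAID5", "15K"): 33, ("RAID5", "10K"): 25,
    ("RAID10", "15K"): 65, ("RAID10", "10K"): 50,
}

def repository_drives(writes_per_sec, raid="RAID5", rpm="15K",
                      rewrite_factor=0.67):
    """Approximate number of drives needed behind the SE repository."""
    effective_writes = writes_per_sec * rewrite_factor
    return effective_writes / WRITES_PER_DRIVE[(raid, rpm)]

# The text's example: 1000 writes/sec on RAID 5 with 15K RPM drives
print(round(repository_drives(1000), 1))  # 20.3 (the text rounds to 20 drives)
```

Switching the same 1000 writes/sec to RAID 10 halves the per-write disk operations, so roughly 1000 * 0.67 / 65, or about 10 drives, would suffice.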
38.8.2 Implementation
The solutions with FlashCopy SE require the same implementation steps as the solutions with classic FlashCopy, with the difference that DS8000 track space-efficient (TSE) volumes are used as FlashCopy targets. To provide the best possible performance during FlashCopy SE, consider setting up the extent pools for the IBM i volumes and for the repository in the following way: Set up the production IBM i volumes (FlashCopy SE sources) and the FlashCopy SE targets in four extent pools, each of them containing two RAID 5 ranks in the DS8000. Two of the extent pools belong to rank group 0, and the other two belong to rank group 1. Define both FlashCopy SE source and target volumes in each extent pool; source volumes have their corresponding targets in another extent pool that belongs to the same rank group as the extent pool with the sources.
Figure 38-7 Extent pools for FlashCopy SE source and target volumes
When you use FlashCopy SE for daily backups, you must remove the FlashCopy relationship and release space in the repository after the backup is finished to keep the space occupied in the repository at a minimum level.
38.8.3 Monitoring the usage of the repository space with workload CPW
How quickly the available repository space is used depends on the amount and characteristic of writes in both the production and backup partitions while the FlashCopy SE relationship is in effect. When you use FlashCopy SE to take backups, expect a limited number of writes in the backup partition. Therefore, it is essentially the workload in the production partition that affects the space that is occupied in the repository.
To illustrate how the occupied repository space grows in a typical System i environment, we ran a test with the IBM i Commercial Processing Workload (CPW) during a FlashCopy SE operation. (We use CPW because it has the same workload patterns that are experienced by many IBM i installations.)
Test description
Our production partition is on an IBM i 570, and we allocate four processors and 12 GB of memory. For this partition, we set up 1125 GB of disk capacity on the DS8000, and we create a FlashCopy SE repository of 280 GB. The layout of both the production and Space-Efficient volumes is described in 38.8.2, Implementation on page 583. In the production partition, we set up CPW with a 20,000-user workload, which automatically allocates an IBM i memory pool of 5 GB. At the beginning of the test, the repository allocation was 0. We made a FlashCopy SE of all the production volumes and, at the same time, started CPW and let the test run for about 4 hours. We used various performance monitoring tools to observe the results.
Results
CPW produced an average of 1500 IOPS, as can be seen in the graph in Figure 38-8 (the capture interval was 15 minutes).
During the workload, an average of about 900 writes/sec were experienced, as shown in Figure 38-9.
Figure 38-10 shows how the space occupied in the repository grows over time (at 15-minute intervals). This graph was created by the FlashCopy Space-Efficient sizing tool. Observe that initially the occupied space grew faster because of the higher number of writes per second at the beginning of the CPW run. After the CPW normalized, the growth became almost linear, with an increase of about 5 - 6 GB every 15 minutes.
The percentage of used repository space during CPW is shown in Figure 38-11. After CPW normalized, the occupied repository space grew at a rate of 2% per 15 minutes; in about 3 hours and 20 minutes, it reached 25% of the repository capacity.
Summary
In our test, we experienced the following conditions:
- Production partition disk capacity: 1125 GB.
- Repository size: 280 GB.
- CPW running in the production partition produced about 1500 IO/sec, about 60% of which were writes (about 900 writes/sec).
- During FlashCopy SE, the repository occupied space grew from 0 to 70 GB in 3 hours and 20 minutes. This space represents about 25% of the available repository capacity.
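From figures like these, you can project roughly how long a FlashCopy SE relationship can stay active before the repository fills, assuming growth stays linear after the workload normalizes. This is a hypothetical back-of-the-envelope helper, not a tool from the book:

```python
def hours_until_full(repository_gb, observed_growth_gb, observed_minutes):
    """Linear projection of the time until the repository is exhausted.

    Assumes the repository fills at the same average rate that was
    observed over the measurement window, which only holds after the
    workload has normalized.
    """
    growth_gb_per_min = observed_growth_gb / observed_minutes
    return repository_gb / growth_gb_per_min / 60

# The test's figures: a 280 GB repository, 70 GB consumed in 3 h 20 m
print(round(hours_until_full(280, 70, 200), 1))  # 13.3
```

Under this workload, the 280 GB repository would last roughly 13 hours, comfortably more than the planned 2-hour backup window.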
To receive SNMP alerts, the DS8000 and SNMP manager must be correctly set up. To observe SNMP alerts, connect directly or through the remote desktop to the workstation where the SNMP manager is and open the SNMP Trap Watcher, as shown in Figure 38-12.
In the SNMP Trap Watcher window, click the alert that you want to look at in the upper pane. The description is shown in the bottom pane (see Figure 38-13).
For more information about configuring the DS8000, installing the SNMP agent, and handling alerts, see IBM System Storage DS8000: Architecture and Implementation, SG24-8886.
Figure 38-14 shows the SNMP alert when the system reaches the default 85% threshold for the repository. Observe that the extent pool number is written in hexadecimal notation.
If the repository fills up during a FlashCopy SE operation, the FlashCopy SE relationship fails. Figure 38-15 shows that the repository occupied space reached 100%, and the failed FlashCopy SE relationship that resulted.
dscli> lssestg -l
Date/Time: November 6, 2007 5:31:05 PM CET IBM DSCLI Version: 5.3.0.977 DS: IBM.2107-7520781
extentpoolID stgtype datastate configstate repcapstatus %repcapthreshold repcap (2^30B) vircap repcapalloc vircapalloc
=====================================================================================================================
P14          fb      Normal    Normal      below        0                70.0           282.0  70.0        264.0
P15          fb      Normal    Normal      below        0                70.0           282.0  70.0        264.0
P34          fb      Normal    Normal      below        0                70.0           282.0  70.0        264.0
P47          fb      Normal    Normal      below        0                70.0           282.0  70.0        264.0
dscli> lsflash 1000-15ff
Date/Time: November 6, 2007 5:31:16 PM CET IBM DSCLI Version: 5.3.0.977 DS: IBM.2107-7520781
ID        SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled TargetWriteEnabled
=====================================================================================================================
1000:1200
1001:1201
1002:1202
1003:1203
1004:1204
1005:1205
1006:1206
1209:1009
120A:100A
A FlashCopy SE failure does not affect the IBM i production partition; there is no interruption or message. However, the backup partition that uses disk space on the FlashCopy SE target volumes stops when the FlashCopy relationship fails. In our testing, the backup partition stopped with SRC code A6020266 even slightly before the repository occupation reached 100%.
The backup partition with the critical SRC code is shown in Figure 38-16.
39
Chapter 39.
IBM i implementation
The implementation of a solution for IBM i with DS8000 Copy Services includes establishing the Copy Services, setting up the IBM i part, and setting up the procedures to follow during planned outages, disasters, or offline backups. The IBM i setup differs depending on the disk pool that is being copied:
- Copy Services for primary and secondary disk pools (independent disk pools) require that the IASPs be on the DS8000, while the system pool and basic pools (sysbas) can be on the DS8000, on POWER internal disks, or on another type of storage. With this solution, it is necessary to implement an IBM i cluster and independent disk pools. Also, you should use one of the management software products, such as IBM PowerHA for i or ACS, to manage the solution. The DS8000 must be connected to IBM i natively or through VIOS_NPIV (a VIOS virtual SCSI connection does not support Copy Services of IASPs).
- Copy Services for the system disk pool and basic disk pools (full system Copy Services) require that all disk capacity be on the DS8000 and that Boot from SAN is used. The relevant disaster recovery and backup procedures require a boot of the stand-by IBM i partition. With this type of solution, the DS8000 can be connected to IBM i natively, through VIOS_NPIV, or through VIOS with virtual SCSI adapters. Tivoli Storage Productivity Center for Replication can be used to manage the DS8000 part of the scenario, while the IBM i setup can be automated by user-written scripts and programs. The Copy Services Tool Kit offering by IBM Systems Lab Services provides automation of the IBM i part for the Full System FlashCopy solution.

This chapter describes the important points to consider when you implement either one of these solutions, and provides references to detailed step-by-step instructions for implementing particular parts of the scenarios.
Important: When you use Copy Services functions such as Metro Mirror, Global Mirror, or FlashCopy for the replication of the load source unit or other IBM i disk units within the same DS8000 or between two or more DS8000 systems, the source volume and the target volume characteristics must be identical. The target and source must have matching capacities and protection types. Additionally, after a volume is assigned to an IBM i partition and added to that partition's configuration, its characteristics must not be changed. If you must change some characteristic of a configured volume, it must first be removed from the IBM i configuration. After the characteristic changes are made, the volume can be reassigned to the IBM i configuration. To simplify the configuration, use a symmetrical configuration between the two IBM storage systems, creating the same volumes with the same volume IDs (which determine the LSS ID).

This chapter covers the following topics:
- Copy Services with independent disk pools
- Full System Copy Services
An example of creating IASP with Systems Director Navigator for i is shown in Figure 39-1.
Figure 39-1 Creating a disk pool with Systems Director Navigator for i
Next, you migrate the IBM i applications to the independent disk pools. You must choose which application objects to move to the IASP, save them, restore them to the IASP, make the libraries in the disk pool available to the users, and test the application. Guidelines for setting up an application in an independent disk pool can be found in IBM eServer iSeries Independent ASPs: A Guide to Moving Applications to IASPs, SG24-6802, and IBM i 6.1 Independent ASPs: A Guide to Quick Implementation of Independent ASPs, SG24-7811. If you do not feel comfortable with setting up your application in the IASP, you might want to engage the services of IBM Systems Lab Services for this task. For solutions with DS8000 Copy Services, the IASP is implemented on an IBM i production partition. You must define a copy description of the IASP on all other partitions that are included in the cluster. To create the description of the IASP, use the CRTDEVASP command.
To create a cluster, you can use Systems Director Navigator for i or the IBM i Control Language (CL) command CRTCLU. After the cluster is created, start the cluster communication on each node of the cluster, that is, start the cluster nodes. Depending on which Copy Services are used in the solution, you might need to set up the cluster device domain and define and start the cluster resource group (CRG). For these tasks, you can use Systems Director Navigator for i or the CL commands ADDDEVDMNE, CRTCRG, and STRCRG. For more information about how to create a cluster, start the cluster nodes, create a device domain, and create and start the CRG, go to the IBM i Information Center at: http://publib.boulder.ibm.com/eserver/ibmi.html
Figure 39-2 Copy description for Metro Mirror primary storage
The PowerHA session indicates the copy descriptions and DS8000 Copy Services that are being used in the solution. For example, the session that is shown in Figure 39-3 joins the copy descriptions for Metro Mirror primary and secondary devices in the Metro Mirror session. The Display ASP Session panel on node PRODUCT shows session PWRHA_MMS with type *METROMIR.
Step-by-step instructions to set up PowerHA for i with DS8000 Copy Services and use it for disaster recovery actions and FlashCopy are described in PowerHA SystemMirror for IBM i Cookbook, SG24-7994. Chapter 40, IBM i examples on page 607 also presents step-by-step instructions to set up the solution with Metro Mirror and FlashCopy on the remote site that is managed by PowerHA for i.
ACS also provides options to display and check the environment and to manage DS8000 Copy Services from the IBM i command interface. For example: The CHECKPPRC command checks the status of the Copy Services Environment CRG, nodes, and hardware resources to determine whether a switch of Metro Mirror or Global Mirror (SWPPRC) can be performed successfully. The WRKCSE command is used to view details of the Metro Mirror, Global Mirror, or FlashCopy configuration and the scripts and profiles that are used for operations. This command also provides a menu to perform some of the Copy Services functions. Figure 39-4 shows a display of the option 12 Work with Metro Mirror environment of the WRKCSE command. The Work with MMIR Environment panel shows environment PYSHT with direction Reversed and status Running.
Select one of the following:
2. Pause
3. Resume
4. Failover
6. Start Replication after failover
12. Work with Volumes
13. Display Out of Sync sectors
14. List Stream files
Figure 39-5 shows the display of option 13 Display Out of Sync sectors, which is useful when you create a Metro Mirror connection, add a disk, or catch up after a failover. The panel shows environment name PYSHT, Copy Service type MMIR, 0 out-of-sync sectors, and no pending results.
Figure 39-5 Display Out of Sync Sectors
Figure 39-6 shows part of the output of the option Display PPRC Environment for Global Mirror of the WRKCSE command. The display shows a CG interval of 0; Symmetric, D-Copy Flash normal, and D-Copy Flash reversed all set to *YES; PROD PPRC volumes 1200-1202 and BACKUP PPRC volumes 1200-1202; BACKUP CG Flash volumes 1203-1205 and PROD CG Flash volumes 1203-1205; and Extra CG Flash, Spc Eff CG Flashes, and Spc Eff Reversed CG Flashes all set to *NO.
For more information about Advanced Copy Services for PowerHA on IBM i and for placing queries, go to the IBM Systems Lab Services and Training website at: http://www.ibm.com/systems/services/labservices/
Because of the IBM i single-level storage architecture, the write operations for a database update might be in IBM i memory for some time before they are swapped to disk. However, the journal entry for the database update is immediately written to disk. Therefore, journaling provides the current transaction and database status in journal receivers on disk. With disaster recovery solutions that use Copy Services for external storage, the data is replicated to the DR site at a disk level. Thus, the data on disk must reflect a consistent and current transaction status. If journals and commitment control are not used and a disaster occurs at the production site, the system and application data at the DR site can experience damage because the transaction data is not yet swapped from IBM i main memory to disk at the time of the disaster. In some cases, the damage on objects that are not journaled prevents the boot or vary on of an independent disk pool at the DR site. For more information about IBM i journaling and commitment control, see Implementing PowerHA for IBM i, SG24-7405 and the IBM i Information Center at: http://publib.boulder.ibm.com/eserver/ibmi.html
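The consistency argument above can be illustrated with a minimal model (ours, not IBM i internals): database page updates may sit in main memory, but each journal entry is forced to disk immediately, so after a crash the on-disk journal still describes every journaled change, while non-journaled changes are lost.

```python
# Minimal model (an illustration, not the IBM i implementation) of why
# journaling protects consistency at the DR site: journal entries go to
# disk synchronously, page updates may stay in memory until swapped out.

class System:
    def __init__(self):
        self.memory_pages = {}   # updates not yet written to disk
        self.disk_pages = {}     # page images on disk (what Copy Services replicates)
        self.disk_journal = []   # journal receiver on disk, written immediately

    def update(self, obj, value, journaled=True):
        self.memory_pages[obj] = value              # change stays in memory for now
        if journaled:
            self.disk_journal.append((obj, value))  # journal entry written to disk at once

    def crash(self):
        self.memory_pages.clear()  # main-memory content is lost in a disaster

host = System()
host.update("ORDERS", "row42", journaled=True)
host.update("WORKFILE", "scratch", journaled=False)
host.crash()
# The journaled change survives on disk and can be replayed at the DR site;
# the non-journaled change is gone, which is what damages objects there.
assert ("ORDERS", "row42") in host.disk_journal
assert "WORKFILE" not in host.disk_pages
```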
An example of using the CHGASPACT command is shown in Figure 39-7. The Change ASP Activity (CHGASPACT) prompt shows ASP device PWRHA_IASP, option *SUSPEND, suspend timeout 30, and suspend timeout action *CONT.
When you run the CHGASPACT *SUSPEND command, the following events occur:
1. As much data as possible is flushed to disk, without distinguishing between database and non-database files or between transaction and non-transaction activity.
2. The system waits for the specified timeout to get all current transactions to their next commit boundary and does not let them continue past that commit boundary. When you run CHGASPACT *SUSPEND, commitment control is informed not to allow any new transactions to start. It maintains a count of the open transactions that are being monitored. As transactions are committed, that count is decremented. When the count reaches zero, no open transactions exist, and the suspend processing continues as described in step 5, even if the timeout has not expired.
3. If a timeout occurs (the wait time has passed and the transaction count is not zero) and the timeout action *END is specified, the transactions are released. A message is sent indicating that the CHGASPACT *SUSPEND command failed, and the CHGASPACT is marked as failed.
4. If a timeout occurs and the timeout action *CONT is specified, a diagnostic message is sent that indicates that not all transactions were successfully suspended, and processing continues as described in step 5.
5. If all transactions were successfully suspended within the time limit, or a timeout occurred and the timeout action *CONT is specified, the system proceeds with the following actions:
a. Non-transaction-based operations are suspended.
b. Non-pinned data in memory is flushed to disk to write any changes that occurred while the suspend operation was running.
When you want the transactions to resume, for example, after you perform FlashCopy, run CHGASPACT *RESUME.
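The suspend decision logic described in these steps can be sketched as follows (our simplified model, not the actual IBM i implementation):

```python
# Sketch of the CHGASPACT *SUSPEND outcome logic (a simplified model;
# the parameter names and shapes here are assumptions for illustration).

def suspend(open_transactions: int, seconds_to_commit: int,
            timeout: int, timeout_action: str) -> str:
    """Returns 'SUSPENDED' or 'FAILED'.
    open_transactions reach their commit boundaries after seconds_to_commit;
    timeout_action is '*CONT' or '*END'."""
    # Step 1: flush as much data as possible to disk (modeled implicitly).
    # Step 2: wait for the open-transaction count to reach zero.
    if open_transactions == 0 or seconds_to_commit <= timeout:
        return "SUSPENDED"      # count reached zero in time; continue with step 5
    # A timeout occurred with transactions still open.
    if timeout_action == "*END":
        return "FAILED"         # step 3: transactions released, command fails
    return "SUSPENDED"          # step 4: *CONT continues after a diagnostic message

assert suspend(5, 10, 30, "*CONT") == "SUSPENDED"   # commits within the timeout
assert suspend(5, 60, 30, "*END") == "FAILED"       # timeout with action *END
assert suspend(5, 60, 30, "*CONT") == "SUSPENDED"   # timeout with action *CONT
```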
For more information about Boot from SAN, see IBM i and IBM System Storage: A Guide to Implementing External Disks on IBM i, SG24-7120.
In particular, when you activate a clone, you must ensure that it does not connect to the network automatically, because doing so can cause substantial problems within both the clone and its parent system. Step-by-step instructions to implement solutions with Full System Copy Services can be found in IBM System Storage Copy Services and IBM i: A Guide to Planning and Implementation, SG24-7103 and IBM i and IBM System Storage: A Guide to Implementing External Disks on IBM i, SG24-7120. The scenario with a Global Mirror of an FSCS that uses Tivoli Storage Productivity Center for Replication to manage the Copy Services is described in 40.3, Full system Global Mirror with Tivoli Storage Productivity Center for Replication on page 630.
Chapter 40. IBM i examples
This chapter contains IBM i related examples and details for DS8000 Copy Services with PowerHA SystemMirror for IBM i and IBM Tivoli Storage Productivity Center for Replication.
Tip: Our examples use IBM i Control Language (CL) commands for the IBM i setup, switchover, and failover, and DS CLI commands for the DS8000 setup. However, you may use the IBM Systems Director Navigator for i GUI for the IBM i setup and the DS8000 GUI for the DS8000 setup.
This chapter covers the following topics:
Metro Mirror with PowerHA for i
Metro Mirror and FlashCopy on the remote site with PowerHA for i
Full system Global Mirror with Tivoli Storage Productivity Center for Replication
Planning
40.1.1 Environment
In our example, the setup has two IBM i partitions on a POWER server model 770, a DS8800 system with Metro Mirror primary LUNs, and a DS8800 system with Metro Mirror secondary LUNs. The LPARs use the names Production and Backup. The Metro Mirror primary LUNs are connected to the Production partition through VIOS NPIV using two VIO Servers, and the Metro Mirror secondary LUNs are connected to the Backup partition through VIOS NPIV using the same two VIO Servers. For our test, we implemented both the Production and Backup LPARs on the same POWER server, connected through the same two VIO Servers. Obviously, in a real environment, we expect customers to use IBM i partitions on different POWER servers and have them connected to the storage systems through different VIO Servers.
5. In the Backup partition, create the description of the primary disk pool by running the following command:
CRTDEVASP DEVD(PWRHA_IASP) RSRCNAME(PWRHA_IASP) RDB(*GEN)
6. In the Production partition, create the cluster resource group and specify the role of the nodes in the recovery domain by running the following command:
CRTCRG CLUSTER(PWRHA_CLU) CRG(PWRHA_CRG) CRGTYPE(*DEV) EXITPGM(*NONE) USRPRF(*NONE) RCYDMN((PRODUCT *PRIMARY *LAST SITE1) (BACKUP *BACKUP *LAST SIT2)) CFGOBJ((PWRHA_IASP *DEVD *ONLINE))
7. On the primary DS8000, establish a Metro Mirror path to the secondary DS8000 by running the following DS CLI commands:
lsavailpprcport -remotedev IBM.2107-75ACV21 -remotewwnn 5005076303FFD18E 58:58
mkpprcpath -dev IBM.2107-75TV181 -remotedev IBM.2107-75ACV21 -remotewwnn 5005076303FFD18E -srclss 58 -tgtlss 58 I0201:I0207 I0331:I0137
These DS CLI commands establish the path for all LSSs that contain the IBM i Production LUNs.
8. On the secondary DS8000, establish the Metro Mirror path to the primary DS8000 for the LUNs of the independent disk pool in the Backup partition. Use the same DS CLI commands that are shown in step 7.
9. On the primary DS8000, establish a Metro Mirror for the LUNs of the Production primary disk pool by running the following DS CLI command:
mkpprc -remotedev IBM.2107-75ACV21 -type mmir -mode full 5801:5801 5901:5901
Wait until the initial Metro Mirror synchronization finishes.
10. Create the PowerHA copy description for the Production and Backup nodes, in which you insert the specifications of the DS8000 system that is accessed by PowerHA in the particular partition. Access is through the DS CLI on IBM i.
Run the following commands:
For the Production partition:
ADDASPCPYD ASPCPY(PWRHA_MM) ASPDEV(PWRHA_IASP) CRG(PWRHA_CRG) SITE(SITE1) STGHOST(test1 (xxxxxx) ('x.x.x.x')) LOCATION(*DEFAULT) LUN('IBM.2107-75TV181' (5801 5901) ())
For the Backup partition:
ADDASPCPYD ASPCPY(PWRHA_MMB) ASPDEV(PWRHA_IASP) CRG(PWRHA_CRG) SITE(SIT2) STGHOST(test1 (xxxxxx) ('x.x.x.x')) LOCATION(*DEFAULT) LUN('IBM.2107-75ACV21' (5801 5901) ())
Create each copy description on the relevant node; any copy description can be created on any node.
11. Start the PowerHA session for Metro Mirror, in which you specify the PowerHA copy descriptions that were created in step 10, by running the following command in the Production partition:
STRASPSSN SSN(PWRHA_MMS) TYPE(*METROMIR) ASPCPY((PWRHA_MM PWRHA_MMB))
12. In the Production partition, start the cluster resource group by running the following command:
STRCRG CLUSTER(PWRHA_CLU) CRG(PWRHA_CRG)
The setup for the solution with PowerHA and Metro Mirror is now ready.
For more information about setting up the PowerHA for i and about the prerequisites for this solution, see PowerHA SystemMirror for IBM i Cookbook, SG24-7994. For more information about the setup of Metro Mirror on DS8000, see Chapter 17, Metro Mirror interfaces and examples on page 177.
The Work with Cluster Resource Groups panel shows cluster resource group PWRHA_CRG with type *DEV and status Active; option 6 (Recovery domain) is selected.
Figure 40-1 Work with CRG - Recovery domain
The Work with Recovery Domain panel opens, where you can see the roles of the nodes. As shown in Figure 40-2, the Production partition is presently the primary node in the recovery domain and the Backup partition is the backup node. The panel shows cluster resource group PWRHA_CRG with consistent information in the cluster.
Figure 40-2 Work with Recovery Domain
Check the status of the primary disk pool in the Production partition by running the following command: WRKCFGSTS CFGTYPE(*DEV) CFGD(PWRHA_IASP)
As shown in Figure 40-3, the primary disk pool is in the Available status: the Work with Configuration Status panel on node PRODUCT shows description PWRHA_IASP with status AVAILABLE.
Check the direction of the Metro Mirror on the primary DS8000 by running the following DS CLI command: lspprc 5801 5901 5801 and 5901 are the volume IDs of the LUNs in the Production IASP. The DS CLI command and the output are shown in Figure 40-4. The LUNs of the Production primary disk pool are in the Full Duplex status.
dscli> lspprc 5801 5901 Date/Time: 12 June 2012 14:13:35 CEST IBM DSCLI Version: 7.6.20.570 DS: IBM.2107-75TV181 ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status ================================================================================================== 5801:5801 Full Duplex Metro Mirror 58 60 Disabled Invalid 5901:5901 Full Duplex Metro Mirror 59 60 Disabled Invalid dscli>
Check the status of the Metro Mirror session in PowerHA for i by running DSPASPSSN. The displayed Metro Mirror session is shown in Figure 40-5: the Display ASP Session panel on node PRODUCT shows session PWRHA_MMS with type *METROMIR.
Switchover
To switch over the independent disk pool to the Backup node, change the role of the cluster nodes in the recovery domain. To do so, run the following command at the Backup partition:
CHGCRGPRI CLUSTER(PWRHA_CLU) CRG(PWRHA_CRG)
This command automatically performs the following actions:
Varies off the IASP from the primary node.
Performs a Metro Mirror failover to the secondary DS8000 and a failback from the secondary to the primary DS8000 through DS CLI commands.
Changes the roles of the primary and Backup node in the cluster recovery domain.
Varies on the independent disk pool on the new primary node at the DR site.
The message on the Backup node that is shown in Figure 40-6 indicates that the switch successfully finished. The IBM i Main Menu on system BACKUP displays the message: Cluster Resource Services API QcstInitiateSwitchOver completed.
After the switchover, the Work with Recovery Domain panel again shows cluster resource group PWRHA_CRG with consistent information in the cluster, and the Work with Configuration Status panel on node BACKUP shows description PWRHA_IASP with status AVAILABLE.
dscli> lspprc 5801 5901 Date/Time: 12 June 2012 14:59:13 CEST IBM DSCLI Version: 7.6.20.570 DS: IBM.2107-75TV181 ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status ========================================================================================================= 5801:5801 Target Full Duplex Metro Mirror 58 unknown Disabled Invalid 5901:5901 Target Full Duplex Metro Mirror 59 unknown Disabled Invalid dscli>
Message: Cluster resource group PWRHA_CRG is failing over from node PRODUCT to node BACKUP. Cause: Cluster resource group PWRHA_CRG is attempting to fail over from node PRODUCT to node BACKUP.
Figure 40-10 CRG failing over
If the cluster is created with the failover wait time (FLVWAITTIM) = *NOMAX option, the failover waits until you answer the message that is shown in Figure 40-11. After you reply to the message with G, the failover proceeds. However, if the cluster is created with the value FLVWAITTIM = *NOWAIT, the failover proceeds without user intervention. The message is CPABB02, with possible replies G and C.
Cause: Cluster resource groups are attempting to fail over to node BACKUP. See previous messages for the cluster resource groups that are affected. Recovery: Type G to continue failing over all of the cluster resource groups, or type C to cancel all failovers to this node.
Figure 40-11 Reply to the message to proceed with the cluster failover
For more information about the failover options, see PowerHA SystemMirror for IBM i Cookbook, SG24-7994 and go to the IBM i Information Center at: http://publib.boulder.ibm.com/eserver/ibmi.html In our example, the cluster was created with the FLVWAITTIM = *NOMAX option, so we reply to the message in the Backup partition so that the failover proceeds. 2. The failover of Metro Mirror for the LUNs in the independent disk pools is performed. The status of the Metro Mirror target LUNs becomes Source suspended, as shown in Figure 40-12.
dscli> lspprc 5801 5901 Date/Time: 14 June 2012 13:26:11 CEST IBM DSCLI Version: 7.6.20.570 DS: IBM.2107-75ACV21 ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status ===================================================================================================== 5801:5801 Suspended Host Source Metro Mirror 58 60 Disabled Invalid 5901:5901 Suspended Host Source Metro Mirror 59 60 Disabled Invalid dscli>
The Metro Mirror source LUNs remain in the Full Duplex status, as shown in Figure 40-13.
dscli> lspprc 5801 5901 Date/Time: 14 June 2012 13:27:41 CEST IBM DSCLI Version: 7.6.20.570 DS: IBM.2107-75TV181 ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status ================================================================================================== 5801:5801 Full Duplex Metro Mirror 58 60 Disabled Invalid 5901:5901 Full Duplex Metro Mirror 59 60 Disabled Invalid dscli>
3. The independent disk pool is varied on at the Backup node.
4. The role of the cluster nodes in the recovery domain is changed so that the Backup node becomes the primary node and the Production partition becomes the Backup node.
5. After the first write to the Metro Mirror source LUNs, the status of the LUNs changes to suspended. Figure 40-14 shows the status of the source LUNs after the IPL of the Production IBM i that caused write operations to the disk pool. If you do not expect a write operation to the Metro Mirror source LUNs, you can suspend them manually before the switchback to the Production site.
dscli> lspprc 5801 5901 Date/Time: 14 June 2012 13:42:26 CEST IBM DSCLI Version: 7.6.20.570 DS: IBM.2107-75TV181 ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status ================================================================================================================== == 5801:5801 Suspended Internal Conditions Target Metro Mirror 58 60 Disabled Invalid 5901:5901 Suspended Internal Conditions Target Metro Mirror 59 60 Disabled Invalid dscli>
Figure 40-14 Status of MM source LUNs after the IPL of the IBM i Production node
Reattaching the session requires that you reply to message HAA2000 in the QSYSOPR message queue in the Production partition, as shown in Figure 40-15.
Cause: A request was made to reattach auxiliary storage pool (ASP) session PWRHA_MMS. This operation will start replication from cluster node BACKUP to cluster node PRODUCT. Recovery: Type G to continue the reattach, or type C to cancel the reattach.
3. Switch back to the Production node by changing the role of nodes in the cluster recovery domain. In the Production partition, run the following command: CHGCRGPRI CLUSTER(PWRHA_CLU) CRG(PWRHA_CRG) The IBM i cluster nodes are now in their original status: The Production node is primary and the Backup node is secondary. The Metro Mirror direction is from the LUNs in Production IASP to the LUNs in backup IASP.
40.2 Metro Mirror and FlashCopy on the remote site with PowerHA for i
This section describes the solution that combines a Metro Mirror of an independent disk pool and FlashCopy of the Metro Mirror target disk pool at the DR site. DS8000 systems are connected to IBM i through VIOS_NPIV, and the solution is managed by PowerHA SystemMirror for IBM i.
40.2.1 Environment
In our scenario, in addition to the environment that is described in 40.1.1, Environment on page 608, we add a third IBM i partition on a POWER server with its system disk pool on the primary DS8000. We name the new partition the Flash partition. The FlashCopy source LUNs are the Metro Mirror target LUNs of the independent disk pool on the remote DS8000. The FlashCopy target LUNs of the IASP are also on the remote DS8000. The FlashCopy target LUNs are connected to the Flash partition through VIOS_NPIV.
Starting the session creates the FlashCopy relationship in the DS8000 on the LUNs that are specified in the copy descriptions. The DS CLI command lsflash shows the created FlashCopy relationship, as shown in Figure 40-16.
dscli> lsflash 5801 5901 Date/Time: 15 June 2012 15:16:33 CEST IBM DSCLI Version: 7.6.20.570 DS: IBM.2107-75ACV21 ID SrcLSS SequenceNum Timeout ActiveCopy Recording Persistent Revertible SourceWriteEnabled =================================================================================================== 5801:6801 58 0 60 Disabled Disabled Disabled Disabled Enabled 5901:6901 59 0 60 Disabled Disabled Disabled Disabled Enabled dscli>
Figure 40-16 FlashCopy relationship after you start the PowerHA session
3. Resume activity in the Production IASP by running the following command in the Production partition:
CHGASPACT ASPDEV(PWRHA_IASP) OPTION(*RESUME)
4. Vary on the independent disk pool in the Flash partition by running the following command:
VRYCFG CFGOBJ(PWRHA_IASP) CFGTYPE(*DEV) STATUS(*ON)
The application data in the IASP in the Flash partition is now ready for saving to tape while the Production partition is working. After the data is saved to tape, proceed with the next step.
5. Vary off the IASP in the Flash partition by running the following command:
VRYCFG CFGOBJ(PWRHA_IASP) CFGTYPE(*DEV) STATUS(*OFF)
6. Detach the FlashCopy session by running the following command in the Flash partition:
CHGASPSSN SSN(PWRHA_FLCS) OPTION(*DETACH)
The PowerHA environment is now ready for the next FlashCopy.
Additional times
After you perform the FlashCopy the first time, for all other times, complete the following steps:
1. Quiesce the IASP data to disk in the Production partition by running the following command:
CHGASPACT ASPDEV(PWRHA_IASP) OPTION(*SUSPEND) SSPTIMO(30)
2. In the Flash partition, reattach the FlashCopy session by running the following command:
CHGASPSSN SSN(PWRHA_FLCS) OPTION(*REATTACH)
3. Resume the activity in the Production IASP by running the following command in the Production partition:
CHGASPACT ASPDEV(PWRHA_IASP) OPTION(*RESUME)
4. Vary on the independent disk pool in the Flash partition by running the following command:
VRYCFG CFGOBJ(PWRHA_IASP) CFGTYPE(*DEV) STATUS(*ON)
After the backup is finished, proceed with the next step.
5. Vary off the IASP in the Flash partition by running the following command:
VRYCFG CFGOBJ(PWRHA_IASP) CFGTYPE(*DEV) STATUS(*OFF)
6. Detach the FlashCopy session by running the following command in the Flash partition:
CHGASPSSN SSN(PWRHA_FLCS) OPTION(*DETACH)
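The repeat cycle above can be strung together in a small orchestration sketch. Here, run_cl() is a hypothetical stand-in for submitting a CL command to the correct partition (for example, over a remote command connection); in this sketch it only records the commands so the sequence can be inspected.

```python
# Sketch of one repeat FlashCopy backup cycle (run_cl is a hypothetical
# helper assumed for illustration; it is not a PowerHA or IBM i API).

issued = []

def run_cl(cmd: str):
    issued.append(cmd)  # a real helper would submit cmd to the IBM i partition

def flashcopy_backup_cycle():
    run_cl("CHGASPACT ASPDEV(PWRHA_IASP) OPTION(*SUSPEND) SSPTIMO(30)")  # Production
    run_cl("CHGASPSSN SSN(PWRHA_FLCS) OPTION(*REATTACH)")                # Flash
    run_cl("CHGASPACT ASPDEV(PWRHA_IASP) OPTION(*RESUME)")               # Production
    run_cl("VRYCFG CFGOBJ(PWRHA_IASP) CFGTYPE(*DEV) STATUS(*ON)")        # Flash
    # ... save the IASP in the Flash partition to tape here ...
    run_cl("VRYCFG CFGOBJ(PWRHA_IASP) CFGTYPE(*DEV) STATUS(*OFF)")       # Flash
    run_cl("CHGASPSSN SSN(PWRHA_FLCS) OPTION(*DETACH)")                  # Flash

flashcopy_backup_cycle()
assert issued[0].startswith("CHGASPACT") and "*SUSPEND" in issued[0]
assert issued[-1].endswith("OPTION(*DETACH)")
assert len(issued) == 6
```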
IBM System Storage DS8000 Copy Services for Open Systems
Figure 40-17 Copy descriptions in Metro Mirror and FlashCopy on the remote site
The Display ASP Session panel for the FlashCopy session shows session PWRHA_FLCS with type *FLASHCOPY, Persistent *NO, FlashCopy type *NOCOPY, 0 sectors copied, and 0 sectors remaining to be copied.
The status of the Metro Mirror and FlashCopy is shown in Figure 40-20 and Figure 40-21. Metro Mirror is in the direction from the Production to the Backup partition, and there is no FlashCopy relationship on the Metro Mirror at the secondary DS8000.
dscli> lspprc 5801 5901 Date/Time: 18 June 2012 12:17:01 CEST IBM DSCLI Version: 7.6.20.570 DS: IBM.2107-75TV181 ID State Reason Type SourceLSS Timeout (secs) Critical Mode First Pass Status ================================================================================================== 5801:5801 Full Duplex Metro Mirror 58 60 Disabled Invalid 5901:5901 Full Duplex Metro Mirror 59 60 Disabled Invalid dscli>
dscli> lsflash 5801 5901 Date/Time: 18 June 2012 12:20:08 CEST IBM DSCLI Version: 7.6.20.570 DS: IBM.2107-75ACV21 CMUC00234I lsflash: No Flash Copy found. dscli>
The FlashCopy session shows the detached status of the FlashCopy target node, as shown in Figure 40-23. The Display ASP Session panel on node BACKUP shows session PWRHA_FLCS with type *FLASHCOPY, Persistent *NO, FlashCopy type *NOCOPY, 0 sectors copied, and 0 sectors remaining to be copied.
After the switchback is complete, you can observe the status of the environment by displaying the Metro Mirror and FlashCopy sessions. As shown in Figure 40-24 and Figure 40-25, the Production partition becomes again the primary node in the recovery domain, and the FlashCopy session is still detached. The Display ASP Session panel on node PRODUCT shows session PWRHA_MMS with type *METROMIR.
The Display ASP Session panel for the FlashCopy session shows session PWRHA_FLCS with type *FLASHCOPY, Persistent *NO, FlashCopy type *NOCOPY, 0 sectors copied, and 0 sectors remaining to be copied.
Copy Descriptions
ASP device   Name        Role    State    Node
PWRHA_IASP   PWRHA_MMB   SOURCE  UNKNOWN  BACKUP
PWRHA_IASP   PWRHA_FLC   TARGET  UNKNOWN  *NONE
A failover to the Backup partition is automatically performed. After the failover, the Metro Mirror session shows the Backup partition as the primary node and the IASP is available on the Backup LPAR, as shown in Figure 40-27: the Display ASP Session panel on node BACKUP shows session PWRHA_MMS with type *METROMIR.
The FlashCopy session is still attached and the disk pool in the Flash partition is varied on, as shown in Figure 40-28 and Figure 40-29 on page 630. The Display ASP Session panel on node FLASH shows session PWRHA_FLCS with type *FLASHCOPY, Persistent *NO, FlashCopy type *NOCOPY, 275584 sectors copied, and 275368832 sectors remaining to be copied.
The Work with Configuration Status panel on node FLASH shows description PWRHA_IASP with status AVAILABLE.
40.3 Full system Global Mirror with Tivoli Storage Productivity Center for Replication
Tivoli Storage Productivity Center for Replication is described in Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685. Typically, you install Tivoli Storage Productivity Center for Replication on a Windows, Linux, or AIX workstation. IP connections are required from the Tivoli Storage Productivity Center for Replication workstation to the storage systems that you want to manage and monitor. After Tivoli Storage Productivity Center for Replication is installed and running, you can access it through its graphical user interface (GUI) in a web browser or through a command-line interface (CLI). This section describes how to plan the usage of Tivoli Storage Productivity Center for Replication with an IBM i host, explains how to access and use Tivoli Storage Productivity Center for Replication to set up Global Mirror for the IBM i full system, and explains how to switch over to the remote site during outages and then fail back to the local site.
40.3.1 Planning
To add LUNs to a Tivoli Storage Productivity Center for Replication session, define the LUNs on both the local and remote DS8000 by using the following guidelines:
Use the minimal number of LSSs.
Make sure that each relevant LSS contains only the volumes that are managed by Tivoli Storage Productivity Center for Replication.
At the remote site, define Global Copy targets and FlashCopy targets in different LSSs with the last two digits of the volume ID the same as the volumes on the local site.
If you are not able to define the LUNs while you adhere to these guidelines, it might be a good idea to prepare a CSV file that contains the LUN IDs and import it into the Tivoli Storage Productivity Center for Replication session, as described in Adding Copy Sets to the Tivoli Storage Productivity Center for Replication session on page 635.
After a successful login, the Tivoli Storage Productivity Center for Replication Health Overview window opens (partial view), as shown in Figure 40-30.
Source volumes terminology: In a Tivoli Storage Productivity Center for Replication session for Global Mirror, source volumes are referred to as Host1 (H1) volumes, Global Copy targets as Host2 (H2) volumes, and FlashCopy targets as Journal2 (J2) volumes. 4. The Properties window opens (Figure 40-32). Specify the name and description of the session. You can also change the GM Consistency Group interval time from the default of 0 to another value. Click Next.
5. Choose the Disk Systems for the volumes at the local and remote sites. You can choose from any Disk System that is registered in Tivoli Storage Productivity Center for Replication, regardless of the available Global Mirror links; the links and paths are checked by Tivoli Storage Productivity Center for Replication when you establish the Global Mirror.
An example of selecting the Disk System for H2 volumes (Global Copy target volumes within GM) is shown in Figure 40-33.
Adding Copy Sets to the Tivoli Storage Productivity Center for Replication session
Assume that both the local and remote DS8000 systems are now configured and attached to IBM i partitions. To add Copy Sets (that is, to add volumes) to a Tivoli Storage Productivity Center for Replication session, complete the following steps:
1. Open the Sessions window and click the relevant session to display its details. In the Session Details window, select Add Copy Sets from the drop-down menu (Figure 40-34). Alternatively, select the relevant session in the Sessions window and select Add Copy Sets from the drop-down menu in that window. Click Go.
2. In the next set of windows, specify the disk systems, LSSs, and LUNs to include in the Tivoli Storage Productivity Center for Replication session as H1, H2, and J2 volumes. If the LUNs are created according to the guidelines in 40.3.1, Planning on page 631, you can add three entire LSSs for the H1, H2, and J2 LUNs as one Copy Set. Otherwise, add a set of three LUNs (H1, H2, and J2) as one Copy Set. For efficiency, prepare a CSV file and import it into the session, as described in the following example.
In our example, we create one Copy Set that contains one H1 volume, one H2 volume, and one J2 volume, export it as a CSV file, add the other LUNs to the CSV file, and import it into the Tivoli Storage Productivity Center for Replication session. Complete the following steps: a. Specify one volume for each role (H1, H2, and J2) of the Copy Set. Specifying the LUNs for H2 is shown in Figure 40-35.
b. After you select LUNs, the Select Copy Sets window opens (Figure 40-36). You can click the new Copy Set to display it, or click Next. In the next window, confirm the addition of the Copy Set by clicking Next, and then click Finish in the subsequent window. After the new Copy Set is added, the wizard opens the Session Details window.
c. From the drop-down menu in the Session Details window, select Export Copy Sets (Figure 40-37). Click Go.
d. A message confirms the successful export and provides a link from which you can download the CSV file (Figure 40-38). Click the link and download the file.
The downloaded CSV file contains the volumes of the exported Copy Set. Open it with Microsoft Excel, enter the other volumes that you want to add to the Tivoli Storage Productivity Center for Replication session, and save the file. When you insert IBM i volumes, make sure that each LUN maps to a LUN of equal size and protection. For more information about the sizes and protection of IBM i LUNs, see IBM System Storage DS8000: Architecture and Implementation, SG24-8886.
The sample CSV file that is used in our example is shown in Figure 40-39.
e. To import the CSV file into Tivoli Storage Productivity Center for Replication, from the drop-down menu in the Session Details window, select Add Copy Sets and click Go. In the Add Copy Sets window, check Use a CSV file to import copysets, click Browse, and select the CSV file (Figure 40-40). Click Next in the following windows, and click Finish in the last window.
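The CSV preparation in steps c through e can be partly scripted: after one Copy Set is exported, the additional rows follow the same pattern. The following Python sketch generates extra rows under stated assumptions. The element syntax used here (storage system prefix, serial number, volume ID) and the H1,H2,J2 column layout are assumptions for illustration, as are the serial numbers and LSS values; copy the exact syntax from the CSV file that you exported in step d before importing.

```python
# Hypothetical sketch: generate extra copy-set rows for a CSV import into a
# Tivoli Storage Productivity Center for Replication session. The element
# syntax below is an ASSUMPTION - take the exact format from the CSV file
# that you exported, and substitute your own serial numbers and LSS IDs.

def make_copy_set_rows(h1_serial, h2_serial, volume_ids,
                       h1_lss="25", h2_lss="35", j2_lss="45"):
    """Build one H1,H2,J2 row per volume, keeping the last two digits of
    each volume ID identical across the sites (see 40.3.1, Planning)."""
    rows = []
    for vv in volume_ids:  # vv = last two digits of the volume ID
        h1 = f"DS8000:2107.{h1_serial}:VOL:{h1_lss}{vv}"
        h2 = f"DS8000:2107.{h2_serial}:VOL:{h2_lss}{vv}"
        j2 = f"DS8000:2107.{h2_serial}:VOL:{j2_lss}{vv}"
        rows.append(f"{h1},{h2},{j2}")
    return rows

# Hypothetical serial numbers; generates volumes x500 through x503.
rows = make_copy_set_rows("75ABC11", "75XYZ21", [f"{i:02X}" for i in range(4)])
for r in rows:
    print(r)
```

Appending the generated rows to the exported CSV file (below its original header) keeps the guideline from 40.3.1 intact, because every H1, H2, and J2 volume in a row shares the same last two digits.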
f. After the Copy Sets are imported, you can examine them by clicking one of the Role Pairs in the Session Details window (Figure 40-41). This action displays a list of all the volume pairs that are defined for the selected roles.
In our example, you can see all the H1 and H2 volume pairs and their status in Global Mirror, as illustrated in Figure 40-42. Because GM is not yet started, all the volumes are in the Defined status, not Preparing or Prepared.
In the next window that opens, observe the warning about the target LUNs being overwritten. If necessary, check the LUN IDs again and confirm the start of a Global Mirror session from H1 to H2. The status of the session changes to Preparing, and the picture indicates a data flow from H1 to H2 (Figure 40-44). Global Copy between H1 and H2 is running at this time, but it is not yet synchronized. Also, the FlashCopy from H2 to J2 and the Global Mirror session are not created.
2. After Global Copy is synchronized, Tivoli Storage Productivity Center for Replication creates the FlashCopy relationship from H2 to J2 and starts the Global Mirror session. As soon as this task is done, the Tivoli Storage Productivity Center for Replication session status changes from Preparing to Prepared (Figure 40-45).
Figure 40-45 Prepared Tivoli Storage Productivity Center for Replication session
In the Sessions window, click the Tivoli Storage Productivity Center for Replication session to display its details. In our example, the session is in the Normal status and Prepared state. The small triangle in the picture denotes the set of recoverable volumes (at this stage, local volumes H1 are recoverable). See Figure 40-46.
Figure 40-46 Tivoli Storage Productivity Center for Replication session - started GM
3. View the messages in the Tivoli Storage Productivity Center for Replication console to detect any possible issues during the Global Mirror start. To display a message, click Console in the My Work window.
2. In Tivoli Storage Productivity Center for Replication, in the Sessions window, click the relevant session to open the Session Details window. Select Suspend from the drop-down menu and click Go. In the warning window that opens, verify the LUNs to be suspended and confirm. Tivoli Storage Productivity Center for Replication then suspends the Global Mirror session. The Tivoli Storage Productivity Center for Replication session indicates a Severe status and Suspended state (Figure 40-47).
3. From the drop-down menu, select Recover (Figure 40-48). Click Go, and acknowledge the warning.
During the recovery, Tivoli Storage Productivity Center for Replication performs a failover to the GC target volumes (H2) and a Fast Reverse Restore from the FlashCopy target LUNs to the GC targets. The Tivoli Storage Productivity Center for Replication session is now in the Target Available status. You can perform an IPL of the remote partition that is connected to the H2 LUNs. The H2 LUNs are now recoverable, as indicated by the triangle to the right of H2.
4. Check that the LoadSource tag on the POWER HMC shows the IOP or FC adapter with the connected external LoadSource. Perform an IPL of the recovery IBM i partition by activating it at the HMC.
5. After the IPL is complete, the Recovery partition runs a clone of the Production system. The disk units in the Recovery partition are the GM target LUNs (Figure 40-49); the volume serial numbers of the disk units contain the volume IDs and the last three digits of the remote DS8000 image ID.
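The session states that these steps walk through (Prepared, Suspended, Target Available, Preparing) can be summarized as a small transition table. The sketch below is illustrative only: it models just the states and actions named in this section, not the complete Tivoli Storage Productivity Center for Replication state model.

```python
# Illustrative only: the session states and actions named in this section,
# NOT the complete Tivoli Storage Productivity Center for Replication
# state model.
TRANSITIONS = {
    ("Prepared", "Suspend"): "Suspended",
    ("Suspended", "Recover"): "Target Available",
    ("Target Available", "Start H2->H1"): "Preparing",
    ("Preparing", "synchronization completes"): "Prepared",
}

def walk(state, actions):
    """Apply a sequence of session actions, refusing undefined transitions."""
    for action in actions:
        key = (state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"action {action!r} is not valid in state {state!r}")
        state = TRANSITIONS[key]
    return state

# The unplanned-outage recovery flow described in this section:
print(walk("Prepared", ["Suspend", "Recover", "Start H2->H1"]))
```

Such a table can serve as a checklist in runbooks: an operator action that is not a defined transition for the current session state is flagged before it is issued.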
                    Display Disk Configuration Status

           Serial                        Resource
 ASP Unit  Number         Type  Model    Name     Status
  1                                               Mirrored
       1   50-C000461     2107  A85      DD019    Active
       1   50-C001461     2107  A85      DD020    Active
      14   50-C009461     2107  A05      DMP143   RAID-5/Active
      15   50-C012461     2107  A05      DMP127   RAID-5/Active
      16   50-C006461     2107  A05      DMP137   RAID-5/Active
      17   50-C015461     2107  A05      DMP129   RAID-5/Active
      19   50-C018461     2107  A05      DMP163   RAID-5/Active
      20   50-C010461     2107  A05      DMP133   RAID-5/Active
      21   50-C005461     2107  A05      DMP159   RAID-5/Active
      23   50-C00A461     2107  A05      DMP109   RAID-5/Active
      24   50-C01A461     2107  A05      DMP155   RAID-5/Active
      26   50-C007461     2107  A05      DMP111   RAID-5/Active
      27   50-C00C461     2107  A05      DMP125   RAID-5/Active
                                                  More...
Press Enter to continue.
F3=Exit   F5=Refresh   F9=Display disk unit details
F11=Disk configuration capacity   F12=Cancel
6. In Tivoli Storage Productivity Center for Replication, in the Session Details window, select Start H2->H1 from the drop-down menu and click Go. This action starts the Global Copy failback from the H2 LUNs to the H1 LUNs. The Tivoli Storage Productivity Center for Replication session is now in the Preparing state with a replication direction from H2 to H1 (Figure 40-50).
3. In the Session Details window, select Recover from the drop-down menu, click Go, and acknowledge the warning. The status of the session is now Normal, and the H1 volumes are recoverable. In Figure 40-51, the small triangle next to H1 shows that H1 is recoverable.
4. At the HMC, perform an IPL of the Production partition that is connected to the H1 volumes. After the Production system is running, it contains the updates that were made to the Recovery system during the outage.
5. In Tivoli Storage Productivity Center for Replication, in the Session Details window, select Start H1->H2 from the drop-down menu and click Go. In the warning window that opens, confirm that the direction of the replication and the disk systems are correct. Global Mirror resumes in the initial direction, from H1 to H2.
Part 10. Solutions
This part of the book describes several solutions for Open Systems environments to assist you in the usage, management, and automation of the DS8000 Copy Services:
- Multi-site replication scenarios
- IBM Tivoli Storage FlashCopy Manager
- Open HyperSwap for AIX with IBM Tivoli Storage Productivity Center
- PowerHA SystemMirror for AIX Enterprise Edition
- VMware Site Recovery Manager
- Geographically Dispersed Open Clusters (GDOC)
- IBM Tivoli Storage Productivity Center for Replication
Each solution is presented with an overview and a description of its benefits and most important features. References to product documentation and other sources of information are provided if you need more details.
Chapter 41.
Figure 41-1 Migration scenario: Metro Mirror from the A to the B volumes, with cascaded Global Copy from B to C (across DWDM links) and from C to D
The migration can be completed when all the Global Copy relationships finish their initial copy. Shut down all applications at the A volume location and wait until both cascaded Global Copy relationships have no out-of-sync tracks left. Now all four copies are identical and consistent, and the migration is done. To prepare for starting the applications at the C volume location, terminate the Global Copy between B and C and change the C to D relationship to synchronous mode (Metro Mirror). If your production environment runs at the D location, you must also reverse the C to D replication by using a Metro Mirror failover and failback.

Suspension versus termination: Instead of terminating the B to C relationships, you can suspend them and perform a Metro Mirror failover at C. Thus, you retain the relationships with change recording enabled, and you can later resynchronize either B to C or C to B without a full copy, if required.

Suspending I/O: To avoid inconsistent data, no I/O should be running at any of the locations before you finish the reversal of the Remote Copy relationships.
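The wait for both cascaded Global Copy relationships to drain can be scripted by polling the out-of-sync track counts of the pairs. The following Python sketch shows only the decision step, on a simplified colon-separated layout that is an assumption for illustration; it is not verbatim DS CLI lspprc output, so adapt the parsing to the actual columns that your DS CLI version reports.

```python
# Hedged sketch: check whether all cascaded Global Copy pairs have drained.
# The input layout (source:target:state:out-of-sync-tracks) is an
# ILLUSTRATIVE format, not verbatim DS CLI output - adapt the parsing to
# the real columns reported by your DS CLI version.

def parse_out_of_sync(report):
    """Map each volume pair to its remaining out-of-sync track count."""
    counts = {}
    for line in report.strip().splitlines():
        src, tgt, state, tracks = line.split(":")
        counts[f"{src}->{tgt}"] = int(tracks)
    return counts

def migration_complete(counts):
    """True when no pair has out-of-sync tracks left (see the text above)."""
    return all(t == 0 for t in counts.values())

sample = """2500:3500:Copy Pending:0
2501:3501:Copy Pending:12"""
counts = parse_out_of_sync(sample)
print(counts, migration_complete(counts))
```

In practice such a check would run in a loop with a sleep interval, and only when migration_complete returns True for both the B to C and C to D relationships would you proceed with the cutover.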
If you want to test, during the migration and without interrupting production, whether the data at the new site (the volume C and D location) is usable, complete the following steps (again, referring to the example in Figure 41-1 on page 650):
1. Issue a Freeze to the Metro Mirror relationship between A and B. This relationship is already synchronous, so the B volumes are consistent with the A volumes at any time.
2. Suspend the Metro Mirror relationship from A to B.
3. Unfreeze the A volumes and allow production I/O to continue.
Timing: Steps 2 and 3 must be completed within the application I/O timeout. Otherwise, the applications get I/O errors.
4. Wait until the number of out-of-sync tracks reaches zero for both Global Copy relationships, so that the data at the C and D volume location is consistent and is a snapshot of A at the time of the freeze operation.
5. Suspend the Global Copy relationship from B to C.
6. Resume the Metro Mirror relationship from A to B. First, re-establish the PPRC paths between A and B; they were set to the Failed state by the Freeze operation.
7. Check whether the migrated data is usable by making the volumes at either the C or D location read/write capable and starting a test application.
As shown in Figure 41-2, you establish Metro Mirror (MM) relations between both production data centers (local site). From the Metro Mirror secondary volumes, you run a Global Mirror (GM) relationship to replicate the production data asynchronously to the disaster recovery (remote) site. At the recovery site, you establish Global Copy between the Global Mirror secondary volumes and the second data center at the remote site (remote data center II).
Figure 41-2 Four-site replication by combining Metro/Global Mirror and Global Copy (A: MM primary; B: MM secondary and GM primary; C: GM secondary and GC primary; D: GC secondary; J: GM journal)
With this 4-site solution, you avoid the situation where you have no synchronous copy while you run production at the remote site. If there is a disaster at the production site, you convert the Global Copy relationship at the remote site into Metro Mirror. Thus, you can restart production with the same high availability and disaster recovery features that you had at the local site. To allow planned site switches between the production and the disaster recovery sites, you also set up a Global Mirror relationship (see the dotted arrow in Figure 41-2) between the remote site secondary volumes and the local site primary volumes. To do a planned site switch from the local to the remote site, complete the following steps:
1. Shut down the production systems and wait until all the updates transfer to all four data centers.
2. Terminate Global Mirror and delete the Global Copy relationships between the local and remote sites (volumes B and C).
3. Suspend Metro Mirror and resume Global Copy at the local site.
IBM System Storage DS8000 Copy Services for Open Systems
4. Establish Global Copy relationships with the nocopy option and start Global Mirror between the remote and local sites (the D and A volumes).
5. Convert Global Copy to Metro Mirror at the remote site.
6. Restart the production systems at the remote site.
In this scenario, the Global Mirror Incremental Resync function is useful. If there is an outage in Local data center II, you can re-establish Global Mirror between Local data center I and the remote site by using Incremental Resync, without copying all the data. For more information about Metro/Global Mirror Incremental Resync, see Chapter 33, Metro/Global Mirror incremental resynchronization on page 497. The distance between the local site and the remote site can be nearly unlimited because you use Global Mirror between the sites. For more information about Metro/Global Mirror, see Chapter 28, Metro/Global Mirror overview on page 437.
SCORE: You need an approved Storage Customer Opportunity Request (SCORE) for this solution. Make a request to IBM Support to obtain the best approach for your scenario. Your IBM Storage marketing representative can assist you with engaging the correct resources to develop the solution.
Figure 41-3 shows a cross-site mirroring solution at the production site and a Global Mirror relationship between the production and the disaster recovery site. For more information about Global Mirror, see Chapter 23, Global Mirror overview on page 283. For details about host-based cross-site mirroring, see the appropriate Logical Volume Manager documentation. In our example, we assume that the production site has two local data centers at a metro distance, which means you can use host-based cross-site mirroring between those data centers. In each of those production data centers, you need one half of your LVM mirrors. From each of these mirrors, you should establish a Global Mirror relationship to the remote site data centers, as shown in Figure 41-3.
Figure 41-3 Four-site replication by combining host-based mirroring and Global Mirror (LH 1/LH 2: local halves of the LVM mirror; RH 1/RH 2: remote Global Mirror secondaries; JH 1/JH 2: Global Mirror journals)
LVM mirroring: If you do not replicate both halves of the cross-site mirror, you could end up with unusable data at the disaster recovery site. There is no guarantee in a rolling disaster that one half of an LVM mirror is usable. For example, if you replicate only one half of an AIX LVM mirror and that half becomes stale, you cannot restart the AIX host from that mirror at the remote site.
The host-based cross-site mirroring of AIX LVM can cover the outage of one storage system without an application outage at the production site. So, this solution provides high availability at your local site.
Global Mirror or Metro Mirror provides a disaster recovery (DR) solution if there is a local disaster at both production data centers. With this 4-site replication solution, you overcome the situation where you lose your HA copy (one half of your LVM mirror) while you run production at the remote site, which might not be acceptable to you.
You can use either Metro Mirror or Global Mirror to replicate the production data to the remote site; which technique you use depends on your requirements. Both techniques can provide consistent data. Metro Mirror is synchronous replication and supports a distance of up to 300 km. Global Mirror is asynchronous replication, which means that the remote site is out of sync with the local site (the data currency, also known as the Recovery Point Objective (RPO), is greater than zero). But the significant advantage of Global Mirror is that it has no distance limitations, so your disaster recovery site can be far away from the production site. The size of the RPO at the remote site depends on the distance between the local site and the remote site, the available link bandwidth that is used for Global Mirror, the production workload, and the remote storage system. For more information about Global Mirror performance considerations, see Chapter 26, Global Mirror performance and scalability on page 341.
Here are the benefits of such a 4-site replication solution:
- Host-based cross-site mirroring provides HA at the production site.
- If there is a production site disaster, you can restart the production site with the same HA feature that you used at the local site, because you have both halves of the host-based mirror at the remote site.
- DR tests at the remote site can be done without impacting the HA functions at the production site.
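As a rough illustration of the bandwidth dependency of the RPO only (the actual RPO also depends on consistency-group intervals, distance, workload peaks, and the remote storage system), the time to drain a replication backlog can be estimated as the backlog divided by the surplus replication bandwidth. All numbers in the following sketch are assumptions for illustration.

```python
# Back-of-envelope sketch only: real Global Mirror RPO behavior depends on
# consistency-group intervals, distance, and workload peaks, not just on
# average bandwidth. All numbers here are assumptions for illustration.

def drain_time_seconds(backlog_gb, link_mb_per_s, write_mb_per_s):
    """Seconds to drain a replication backlog while writes continue.
    Returns None if the link cannot keep up with the write rate."""
    surplus = link_mb_per_s - write_mb_per_s
    if surplus <= 0:
        return None  # backlog grows: the RPO increases without bound
    return backlog_gb * 1024 / surplus

# Example: 50 GB behind, a 200 MB/s link, 120 MB/s of sustained writes
t = drain_time_seconds(50, 200, 120)
print(f"{t:.0f} s")  # 50 * 1024 / 80 = 640 s
```

The None branch captures the key sizing rule: if the sustained write rate exceeds the link bandwidth, the RPO never converges, regardless of distance.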
So, for organizations that must have local continuous availability to handle most expected events and remote disaster recovery for regulatory compliance or to handle large-scale risks, a combination of host-based cross-site mirroring and DS8000 storage-based mirroring fulfills their requirements.
Chapter 42.
Figure: Tivoli Storage FlashCopy Manager overview (application data snapshot backup and restore to local snapshot versions, with optional Tivoli Storage Manager backup; integration with SVC, XIV, DS8000, Storwize V7000, and DS3000/DS4000/DS5000 through Microsoft Volume Shadow Copy Service (VSS))
For an overview of Tivoli Storage FlashCopy Manager, go to the following website:
http://www.ibm.com/software/tivoli/products/storage-flashcopy-mgr/
Here are some highlights of Tivoli Storage FlashCopy Manager:
- Performs near-instant and application-aware snapshot backups, with a minimal performance impact, for IBM DB2, Oracle, SAP, Microsoft SQL Server, Microsoft Exchange, and VMware, and for other applications through custom scripting.
- Improves application availability and service levels through high-performance and near-instant restore capabilities that reduce downtime.
- Integrates with IBM System Storage DS8000, IBM SAN Volume Controller, IBM Storwize V7000, and IBM XIV System on IBM AIX, Linux, Solaris, HP-UX, VMware, and Microsoft Windows.
- Protects applications on IBM System Storage DS3000, DS4000, and DS5000 on Windows by using Microsoft Volume Shadow Copy Service (VSS).
- Satisfies advanced data protection and data reduction needs with optional integration with Tivoli Storage Manager.
- Supports Windows, AIX, Solaris, HP-UX, and Linux.
IBM Tivoli Storage FlashCopy Manager for UNIX and Linux uses the Copy Services capabilities of intelligent storage systems to create application-aware point-in-time copies (FlashCopy or snapshot) of the production data. IBM DB2 and Oracle databases are supported, as are SAP environments with IBM DB2 or Oracle, and other applications through custom scripting. For more information about Tivoli Storage FlashCopy Manager for UNIX and Linux, go to:
http://pic.dhe.ibm.com/infocenter/tsminfo/v6r3/topic/com.ibm.itsm.fcm.unx.doc/b_fcm_unx_guide.pdf
IBM Tivoli Storage FlashCopy Manager for Windows provides the tools and information that are needed to create and manage volume-level snapshots of Microsoft SQL Server and Microsoft Exchange server data. Snapshots are created while the applications remain online. In addition, you can use Tivoli Storage FlashCopy Manager for Windows to create snapshots for custom applications through scripting. Tivoli Storage FlashCopy Manager uses Microsoft Volume Shadow Copy Service (VSS) and IBM storage hardware FlashCopy technology to protect your business-critical data. For more information about Tivoli Storage FlashCopy Manager for Windows, go to:
http://pic.dhe.ibm.com/infocenter/tsminfo/v6r3/topic/com.ibm.itsm.fcm.win.doc/b_fcm_win_guide.pdf
IBM Tivoli Storage FlashCopy Manager for VMware can back up VMware environments from a Linux-based backup server. It integrates VMware vSphere APIs and IBM FlashCopy mechanisms. Optionally, Tivoli Storage FlashCopy Manager for VMware can be integrated with Tivoli Storage Manager for Virtual Environments to store VMware image backups. For more information about Tivoli Storage FlashCopy Manager for VMware, go to:
http://pic.dhe.ibm.com/infocenter/tsminfo/v6r3/topic/com.ibm.itsm.fcm.vm.doc/b_fvm_guide.pdf
For detailed technical information, go to the IBM Tivoli Storage Manager Version 6.3 Information Center at:
http://pic.dhe.ibm.com/infocenter/tsminfo/v6r3/index.jsp
In the Information Center, in the Content menu, select IBM Tivoli Storage FlashCopy Manager.
42.1.2 Cloning support for SAP databases with Tivoli Storage FlashCopy Manager
Tivoli Storage FlashCopy Manager for UNIX and Linux has supported the cloning of an SAP database since Version 2.2. In SAP terms, this function is called a Homogeneous System Copy; that is, the system copy runs the same database and operating system as the original environment. Again, Tivoli Storage FlashCopy Manager uses the FlashCopy or snapshot features of the IBM storage system to create a point-in-time copy of the SAP database. The SAP System Copy guidelines describe a number of additional actions to perform in the copied SAP system (for example, disabling Remote Function Call (RFC) destinations and disabling batch-job processing). IBM can provide a number of scripts to automate some of these tasks. However, these scripts are not part of the Tivoli Storage FlashCopy Manager software package and must be ordered separately. The FlashCopy technology is ideally suited for database cloning, especially for large and intensively used databases, because it is fast (a short time to recover and access the copy) and can be used in an ad hoc manner (the database stays online with no load on production).
Chapter 43. IBM Open HyperSwap for AIX with IBM Tivoli Storage Productivity Center for Replication
This chapter describes the Open HyperSwap solution for AIX environments and explains how to set it up. In addition, this chapter shows an example of how to recover from an unplanned HyperSwap.
43.1 Open HyperSwap for AIX with Tivoli Storage Productivity Center
The Tivoli Storage Productivity Center for Replication software helps you manage the advanced Copy Services provided by the IBM System Storage DS8000, IBM Storwize V7000, and IBM System Storage SAN Volume Controller. Tivoli Storage Productivity Center for Replication V4.2 introduced a feature called Open HyperSwap. Open HyperSwap (comparable to HyperSwap for System z) helps improve the continuous availability attributes of AIX hosts by managing a set of planned and unplanned disk system outages for Metro Mirror capable disk systems, as shown in Figure 43-1. It provides a business continuity solution that protects customers from storage failures with a minimal application impact. Open HyperSwap allows I/O to the primary volumes in a Metro Mirror relationship to be swapped to the secondary volumes without any human intervention. After the swap occurs, the application automatically writes to the secondary site without noticing the switch.
Figure 43-1 Open HyperSwap for AIX with Tivoli Storage Productivity Center for Replication (I/O path over FC, replication path over Metro Mirror, and command path over IP, between DS8000 #1 and DS8000 #2)
Open HyperSwap replication is a special replication method that is based on Metro Mirror, which is designed to automatically fail over I/O from the primary logical devices to the secondary logical devices if there is a primary disk storage system failure with minimal disruption to the application. Open HyperSwap replication applies to both planned and unplanned replication swaps. When a session has Open HyperSwap enabled, an I/O error on the primary site automatically causes the I/O to switch to the secondary site without any user interaction and only minimal application impact. In addition, while Open HyperSwap is enabled, the Metro Mirror session also supports disaster recovery. If a write is successful on the primary site but is unable to get replicated to the secondary site, Tivoli Storage Productivity Center for Replication suspends all replication for the session, thus ensuring that a consistent copy of the data exists on the secondary site. If the system fails, this data might not be the latest data, but the data is consistent, and you can manually switch host servers to the secondary site.
You can control Open HyperSwap from any system that is running Tivoli Storage Productivity Center for Replication (AIX, Windows, Linux, or z/OS). However, the volumes of the Open HyperSwap session must be attached to an AIX host, and an IP connection between the AIX host system and the Tivoli Storage Productivity Center for Replication server must be available.
The Open HyperSwap functionality requires a minimum Licensed Machine Code (LMC) level of 6.5.1.nnn for a DS8700 and 7.6.1.nnn for a DS8800. It supports stand-alone AIX hosts that have IBM AIX 5L V5.3 or AIX V6.1 installed. It does not support AIX 5L V5.2, AIX V7.1, clustering environments such as PowerHA SystemMirror for AIX (formerly known as IBM HACMP), or SAN boot volume groups.
Open HyperSwap for AIX requires the following items:
- Tivoli Storage Productivity Center for Replication V4.2, or Tivoli Storage Productivity Center for Replication V5.1 with the appropriate Tivoli Storage Productivity Center license
- SDDPCM V3.x with host attachment script V2.x
- IP connectivity between the Tivoli Storage Productivity Center for Replication server and the DS8000 and AIX hosts
For more information about setting up Open HyperSwap with Tivoli Storage Productivity Center for Replication, go to the IBM Tivoli Storage Productivity Center Information Center at the following link and, under Contents, click Managing replication:
http://publib.boulder.ibm.com/infocenter/tivihelp/v59r1/index.jsp
To enable support for Open HyperSwap on the AIX host, SDDPCM Version 3.x is required.
For more information about SDDPCM and Open HyperSwap, see the following resources:
- IBM System Storage Multipath Subsystem Device Driver User's Guide, GC52-1309
- SDDPCM 3.x Package for DS8000 Open HyperSwap (non-clustering environment), found at:
http://www.ibm.com/support/docview.wss?rs=540&context=ST52G7&dc=D430&uid=ssg1S4000201
- Host Attachment file set v2.x for SDDPCM 3.x on AIX, found at:
http://www.ibm.com/support/docview.wss?rs=540&context=ST52G7&dc=D400&q1=host+script&uid=ssg1S4000203
The Open HyperSwap functionality depends on SDDPCM Version 3.x and Version 2.x of the DS8000 host attachment file set (devices.fcp.disk.ibm.mpio.rte). SDDPCM V3.x currently supports only DS8000 storage devices with the Open HyperSwap feature. Updates of SDDPCM V3.x in support of Open HyperSwap include the following items:
- The addition of the Arbitration Engine (AE) daemon to the SDDPCM installation package, which is used for communication with Tivoli Storage Productivity Center for Replication
- Updated device support information
- Updates to the SDDPCM Object Data Manager (ODM) attribute settings to add support for the Open HyperSwap quiesce expire time of a device
- Information about dynamically enabling or disabling a path for Open HyperSwap devices
- The addition of the pcmpath query session command, which displays the session of the Open HyperSwap devices that are configured on the host
- Updates to the pcmpath query device information
- The addition of two more trace files for SDDPCM: AE.log and AE_bak.log
- Additional SDDPCM error log messages
Chapter 43. IBM Open HyperSwap for AIX with IBM Tivoli Storage Productivity Center for Replication
For more information, see the latest readme file of SDDPCM V3.x and the DS8000 host attachment file set V2.x, and the IBM System Storage Multipath Subsystem Device Driver User's Guide, GC52-1309. For SDDPCM V3.x and later releases, a UNIX application daemon, the AE server, is added to the SDDPCM path control module. The AE server daemon interfaces with the Tivoli Storage Productivity Center for Replication server and the SDDPCM kernel driver to provide the Open HyperSwap functionality. Tivoli Storage Productivity Center for Replication uses TCP/IP port 9930 for communication with the AIX host for Open HyperSwap.
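Before adding the host connection, the command path over TCP/IP port 9930 can be verified with a simple TCP probe from the Tivoli Storage Productivity Center for Replication server. A minimal sketch; the host name is a placeholder, and the port number is taken from the text above:

```python
# Minimal TCP reachability probe for the Open HyperSwap command path.
# Port 9930 is taken from the text above; "aixhost" is a placeholder for
# the AIX host where the AE daemon runs.
import socket

def port_reachable(host, port=9930, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    print(port_reachable("aixhost"))
```

A False result points at a firewall, a routing issue, or an AE daemon that is not running, before any HyperSwap configuration is attempted.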
Tivoli Storage Productivity Center for Replication tries to connect to the defined host, where the AE daemon must be running. Figure 43-3 shows a successful connection.
You can control the AE daemon by running the following commands, as shown in Example 43-1:
- lssrc -s AE: Check whether the AE daemon is running; the status is either Active or Inoperative.
- startAE: Manually start the AE daemon.
- stopAE: Manually stop the AE daemon.
Attention: Do not manually stop AE while the application is running and devices are configured. AE is a part of the Open HyperSwap functionality; if AE is not running, Open HyperSwap is not available. It is important to ensure that AE is running.
Example 43-1 Managing the AE daemon of SDDPCM V3.x for Open HyperSwap
# lssrc -s AE
Subsystem         Group            PID          Status
 AE                                3211342      active
# stopAE
0513-044 The AE Subsystem was requested to stop.
# lssrc -s AE
Subsystem         Group            PID          Status
 AE                                             inoperative
# startAE
AE started
# lssrc -s AE
Subsystem         Group            PID          Status
 AE                                3211342      active

Open HyperSwap depends on SDDPCM to distinguish the paths of the primary volume from the paths of the secondary volume on an Open HyperSwap copy set. With an Open HyperSwap device, I/O can be sent only to the primary volume, so when SDDPCM selects paths for I/O, it selects only paths that are connected to the primary volume. If there is no usable path on the primary volume, SDDPCM initiates the Open HyperSwap request to Tivoli Storage Productivity Center for Replication. After the swap, SDDPCM selects the secondary volume paths for I/O.
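In an automated health check, the lssrc -s AE output shown in Example 43-1 can be parsed to confirm that the daemon is active before relying on Open HyperSwap. A small sketch that handles output of the shape shown above:

```python
# Parse "lssrc -s AE" output of the shape shown in Example 43-1 and
# report whether the AE daemon is active.

def ae_is_active(lssrc_output):
    """True if the AE subsystem line reports status 'active'."""
    for line in lssrc_output.splitlines():
        fields = line.split()
        if fields and fields[0] == "AE":
            return fields[-1].lower() == "active"
    return False  # no AE line found at all

active_sample = """Subsystem         Group            PID          Status
 AE                                3211342      active"""
inactive_sample = """Subsystem         Group            PID          Status
 AE                                             inoperative"""
print(ae_is_active(active_sample), ae_is_active(inactive_sample))
```

Checking the last whitespace-separated field sidesteps the variable-width PID column, which is empty when the daemon is inoperative.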
Before you assign your volumes to the host system, you must define a Metro Mirror Failover/Failback session in Tivoli Storage Productivity Center for Replication with the volumes you plan to assign to the host. The session definition must have the enabled Open HyperSwap option, as shown in Figure 43-4.
After the session is defined and started, you can assign volumes from both storage systems to the host. Tivoli Storage Productivity Center for Replication automatically assigns the session to the host connection and manages it as a HyperSwap session, as shown in Figure 43-5.
The following examples show an active Open HyperSwap session on an AIX host with active I/O to the primary storage system. With an active Open HyperSwap session, each Metro Mirror pair, which consists of a primary and a secondary volume, is represented by only a single hdisk on the AIX host but with paths to both storage systems, as shown in Example 43-2.
Example 43-2 Open HyperSwap device representation on an AIX host
# errpt -j B3C9FBE3
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
B3C9FBE3   1011115410 I O aix11_hyperswa SESSION READY FOR HYPERSWAP
# lsdev -Cc disk
hdisk0 Available          Virtual SCSI Disk Drive
hdisk1 Available          Virtual SCSI Disk Drive
hdisk2 Available 30-T1-01 IBM MPIO FC 2107
# lspath -F "name:path_id:connection:parent:path_status:status" | sort
hdisk2:0:50050763030cc1a5,4025400000000000:fscsi0:Available:Enabled
hdisk2:1:500507630304c1a5,4025400000000000:fscsi0:Available:Enabled
hdisk2:2:500507630300c08f,4025400000000000:fscsi0:Available:Enabled
hdisk2:3:50050763030b408f,4025400000000000:fscsi0:Available:Enabled
hdisk2:4:50050763031cc1a5,4025400000000000:fscsi1:Available:Enabled
hdisk2:5:500507630314c1a5,4025400000000000:fscsi1:Available:Enabled
hdisk2:6:500507630319c08f,4025400000000000:fscsi1:Available:Enabled
hdisk2:7:500507630314c08f,4025400000000000:fscsi1:Available:Enabled

The different WWPNs displayed by the AIX lspath command help you quickly identify the DS8000 I/O ports (and thus the different DS8000 storage systems) and the DS8000 volume IDs for the AIX hdisks:
- hdisk2:0:50050763030cc1a5,4025400000000000 = volume 2500 on DS8000 #1, through DS8000 I/O port I0143 (WWPN 50050763030CC1A5) on WWNN 5005076303FFC1A5
- hdisk2:2:500507630300c08f,4025400000000000 = volume 2500 on DS8000 #2, through DS8000 I/O port I0003 (WWPN 500507630300C08F) on WWNN 5005076303FFC08F

The WWPNs of the DS8000 I/O ports can be taken from the output of the DS CLI lsioport command. The WWNN and Storage Image serial number of each DS8000 storage system can be taken from the DS CLI lssi command, as shown in Example 43-3 and Example 43-4.
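The WWPN and volume identification described above can be automated. The following Python sketch is a hypothetical helper, not a product utility; it assumes the DS8000 LUN encoding 0x40xx40yy...., in which xxyy is the 4-digit hexadecimal volume ID (consistent with 4025400000000000 mapping to volume 2500 in Example 43-2):

```python
def decode_connection(connection: str):
    """Split an AIX lspath 'connection' field (WWPN,LUN) into the DS8000
    I/O port WWPN and the DS8000 volume ID.

    Assumes the DS8000 SCSI LUN encoding 0x40xx40yy0000..., where xxyy
    is the 4-digit hexadecimal volume ID; for example, the LUN
    4025400000000000 corresponds to volume 2500.
    """
    wwpn, lun = connection.split(",")
    volume_id = (lun[2:4] + lun[6:8]).upper()
    return wwpn.upper(), volume_id
```

With the WWPN extracted this way, a script could then match it against the lsioport output of each DS8000 to tell which storage system a given path belongs to.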
Example 43-3 WWNN and WWPNs of primary DS8000 storage system for Open HyperSwap
dscli> lssi
Name      ID               Storage Unit     Model WWNN             State  ESSNet
=================================================================================
ATS_20780 IBM.2107-7520781 IBM.2107-7520780 951   5005076303FFC1A5 Online Enabled
dscli> lsioport
ID    WWPN             State  Type             topo     portgrp
===============================================================
I0143 50050763030CC1A5 Online Fibre Channel-SW SCSI-FCP 0
[...]
Example 43-4 WWNN and WWPNs of secondary DS8000 storage system for Open HyperSwap
dscli> lssi
Name ID               Storage Unit     Model WWNN             State  ESSNet
============================================================================
ATS1 IBM.2107-7503461 IBM.2107-7503460 951   5005076303FFC08F Online Enabled
dscli> lsioport
ID    WWPN             State  Type             topo     portgrp
===============================================================
I0003 500507630300C08F Online Fibre Channel-SW SCSI-FCP 0
[...]

On the AIX host, the status of the Open HyperSwap session can be checked by running pcmpath query session, while the status of the devices and paths can be checked by running pcmpath query device. For devices that have not been used yet, the output is slightly different. In Example 43-5, there is one hdisk in an Open HyperSwap session that is being replicated with Metro Mirror from the primary system with S/N 7520781 to a secondary system with S/N 7503461. The device has not been used yet, so no storage system is marked as active.
Example 43-5 Open HyperSwap session status check with pcmpath query session / device command
# pcmpath query session
Total Open Hyperswap Sessions : 1
SESSION NAME: aix11_hyperswap
SessId  Host_OS_State  Host_copysets  Disabled  Quies  Resum  SwRes
0       READY          1              0         0      0      0
# pcmpath query device
Total Dual Active and Active/Asymmetrc Devices : 1
DEV#: 2  DEVICE NAME: hdisk2  TYPE: 2107900  ALGORITHM: Load Balance
SESSION NAME: aix11_hyperswap
OS DIRECTION: H1->H2
==========================================================================
PRIMARY SERIAL: 75207812500
SECONDARY SERIAL: 75034612500
-----------------------------
Path#    Adapter/Path Name    State    Mode    Select    Errors
 0       fscsi0/path0         CLOSE    NORMAL  0         0
 1       fscsi0/path1         CLOSE    NORMAL  0         0
 2       fscsi0/path2         CLOSE    NORMAL  0         0
 3       fscsi0/path3         CLOSE    NORMAL  0         0
 4       fscsi1/path4         CLOSE    NORMAL  0         0
 5       fscsi1/path5         CLOSE    NORMAL  0         0
 6       fscsi1/path6         CLOSE    NORMAL  0         0
 7       fscsi1/path7         CLOSE    NORMAL  0         0

Example 43-6 shows the same device after it is activated and some I/O is issued. The asterisk (*) indicates the active volume, to which the current I/Os are being sent. At any time, only one of the two sets of four paths is used for I/O; here, the I/O of the AIX host is going to the primary storage system.
Example 43-6 Device activation that causes I/O
# varyonvg vg_hyperswap
# pcmpath query device
Total Dual Active and Active/Asymmetrc Devices : 1
DEV#: 2  DEVICE NAME: hdisk2  TYPE: 2107900  ALGORITHM: Load Balance
SESSION NAME: aix11_hyperswap
OS DIRECTION: H1->H2
==========================================================================
PRIMARY SERIAL: 75207812500 *
-----------------------------
Path#    Adapter/Path Name    State    Mode    Select    Errors
 0       fscsi0/path0         OPEN     NORMAL  28        0
 1       fscsi0/path1         OPEN     NORMAL  27        0
 2       fscsi1/path4         OPEN     NORMAL  20        0
 3       fscsi1/path5         OPEN     NORMAL  21        0
SECONDARY SERIAL: 75034612500
-----------------------------
Path#    Adapter/Path Name    State    Mode    Select    Errors
 4       fscsi0/path2         OPEN     NORMAL  0         0
 5       fscsi0/path3         OPEN     NORMAL  0         0
 6       fscsi1/path6         OPEN     NORMAL  0         0
 7       fscsi1/path7         OPEN     NORMAL  0         0

The output of the pcmpath query device command shows that the primary volume paths are being selected. Note that the primary serial is not always the current primary volume: the primary serial is the serial number of the volume on site one, and the secondary serial is the serial number of the volume on site two, as defined in the Tivoli Storage Productivity Center for Replication session. After a swap from site one to site two, SDDPCM selects paths of the devices on site two for I/O. Example 43-7 shows an I/O error on the currently active storage system that causes Open HyperSwap to switch over to the secondary site.
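The path-selection and swap-trigger behavior described above can be pictured with a simplified model. This is an illustration only, not SDDPCM's actual algorithm (which also load-balances across the eligible paths); it assumes a reduced representation of the pcmpath query device output:

```python
def select_io_paths(paths, active_serial):
    """Return the paths eligible for I/O: only those that belong to the
    currently active volume and are in the OPEN state.

    `paths` is a list of (path_name, owning_volume_serial, state)
    tuples, a simplified model of the `pcmpath query device` output.
    """
    return [name for name, serial, state in paths
            if serial == active_serial and state == "OPEN"]

def needs_swap(paths, active_serial):
    """True when no usable path to the active volume remains, which is
    the condition under which SDDPCM asks Tivoli Storage Productivity
    Center for Replication to perform the Open HyperSwap."""
    return not select_io_paths(paths, active_serial)

# Simplified path table (serials as in the examples in this chapter):
paths = [
    ("fscsi0/path0", "75207812500", "OPEN"),
    ("fscsi0/path2", "75034612500", "OPEN"),    # secondary: never selected
    ("fscsi1/path4", "75207812500", "FAILED"),
]
```

After a successful swap, the same selection runs with the secondary serial as the active one, which is why the secondary paths take over all I/O in Example 43-7.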
Example 43-7 Open HyperSwap switchover example
# errpt | head
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
2D73089A   1011120610 I O aix11_hyperswa SESSION SWAPRESUME COMPLETED
D48C5A11   1011120610 I O aix11_hyperswa SWAPRESUME MESSAGE RECEIVED
1520577F   1011120610 I O aix11_hyperswa QUIESCE MESSAGE RECEIVED
DE5B2562   1011120610 P H hdisk2         HYPERSWAP EVENT TRIGGERED
DE3B8540   1011120610 P H hdisk2         PATH HAS FAILED
DE3B8540   1011120610 P H hdisk2         PATH HAS FAILED
DE3B8540   1011120610 P H hdisk2         PATH HAS FAILED
# pcmpath query device
Total Dual Active and Active/Asymmetrc Devices : 1
DEV#: 2  DEVICE NAME: hdisk2  TYPE: 2107900  ALGORITHM: Load Balance
SESSION NAME: aix11_hyperswap
OS DIRECTION: H1<-H2
==========================================================================
PRIMARY SERIAL: 75207812500
-----------------------------
Path#    Adapter/Path Name    State    Mode    Select    Errors
 0       fscsi0/path0         FAILED   NORMAL  323       0
 1       fscsi0/path1         FAILED   NORMAL  360       0
 2       fscsi1/path4         FAILED   NORMAL  321       0
 3       fscsi1/path5         OPEN     NORMAL  316       0
SECONDARY SERIAL: 75034612500 *
-----------------------------
Path#    Adapter/Path Name    State    Mode    Select    Errors
 4       fscsi0/path2         OPEN     NORMAL  1168      0
 5       fscsi0/path3         OPEN     NORMAL  1194      0
 6       fscsi1/path6         OPEN     NORMAL  1195      0
 7       fscsi1/path7         OPEN     NORMAL  1194      0
You can also see that all paths to the primary volume except one are in the FAILED state; that one path is still shown as OPEN. The Metro Mirror pair is now failed over to the secondary site (as described in 15.4, Failover and failback on page 167) and the volumes are suspended, as shown in Example 43-8.
Example 43-8 Metro Mirror state is suspended
dscli> lspprc -l 2500
ID        State     Reason      Type         Out Of Sync Tracks
================================================================
2500:2500 Suspended Host Source Metro Mirror 14075

To fail back the Metro Mirror relationship, resolve the issue and then click Start H2->H1 in the Tivoli Storage Productivity Center for Replication GUI, as shown in Figure 43-6.
Figure 43-6 Fail back the Metro Mirror relationship of Open HyperSwap session
After you reconnect the volumes from the primary storage system to the host, the output of pcmpath query device must show the paths as OPEN (or CLOSE if the device is not active), but not as FAILED. To return production to the primary storage system, select HyperSwap in the drop-down menu of the Tivoli Storage Productivity Center for Replication GUI window that is shown in Figure 43-7.
The Tivoli Storage Productivity Center for Replication GUI reports a success message if the HyperSwap completes successfully. On your AIX hosts, you can verify that the HyperSwap was successful by checking the AIX error log, as shown in Example 43-9.
Example 43-9 Summary of events while you perform the recovery actions
# errpt
IDENTIFIER  T C DESCRIPTION
2D73089A    I O SESSION SWAPRESUME COMPLETED
D48C5A11    I O SWAPRESUME MESSAGE RECEIVED
1520577F    I O QUIESCE MESSAGE RECEIVED
B3C9FBE3    I O SESSION READY FOR HYPERSWAP
67150733    I O PATH HAS RECOVERED
67150733    I O PATH HAS RECOVERED
67150733    I O PATH HAS RECOVERED
EBFE3E1E    I O HYPERSWAP DEVICE ENABLED
Chapter 44. IBM PowerHA SystemMirror for IBM AIX Enterprise Edition
The acceptable distance between two remote sites for synchronous data replication strongly depends on the additional latency in I/O response time that can be tolerated by the application. It is typically less than 300 km.
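As a rule-of-thumb calculation (assuming roughly 5 microseconds of one-way propagation delay per kilometer of fiber, and ignoring switch and protocol overhead, which add more), the extra write latency that synchronous replication adds over distance can be estimated as follows:

```python
# Propagation delay in optical fiber is roughly 5 microseconds per km
# one way. Synchronous (Metro Mirror) replication adds at least one
# full round trip to every write I/O.
US_PER_KM_ONE_WAY = 5

def added_write_latency_ms(distance_km: float, round_trips: int = 1) -> float:
    """Estimated write latency (milliseconds) added by site distance."""
    return distance_km * US_PER_KM_ONE_WAY * 2 * round_trips / 1000.0
```

At 300 km, a single round trip per write already adds about 3 ms to every write, which illustrates why synchronous distances are normally kept below that range.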
Figure 44-1 shows a basic example of a 4-node PowerHA cluster that is attached to two DS8000 storage systems using Metro Mirror for data replication between remote sites.
Figure 44-1 IBM PowerHA SystemMirror for AIX Enterprise Edition using DS8000 Metro Mirror for data replication (the nodes at Site A and Site B are connected through the LAN and attached through Fibre Channel to the SAN and the two DS8000 systems)
For the latest product information, see the IBM PowerHA SystemMirror for AIX website at:
http://www.ibm.com/systems/power/software/availability/aix/

For technical information about PowerHA SystemMirror for AIX and the integration of DS8000 Metro Mirror and Global Mirror, see the following resources:
- PowerHA SystemMirror V6.1 for AIX Enterprise Edition: Storage Based High Availability and Disaster Recovery Guide, found at:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.hacmp.pprc/hacmp_pprc_pdf.pdf
- IBM AIX 7.1 / PowerHA SystemMirror Information Center, found at:
http://pic.dhe.ibm.com/infocenter/aix/v7r1/index.jsp
Chapter 44. IBM PowerHA SystemMirror for IBM AIX Enterprise Edition
Chapter 45.
Figure 45-1 shows an overview of the necessary components of a VMware SRM solution with a DS8000.
Figure 45-1 Components of a VMware SRM solution with DS8000: each site (Site 1, the production site, and Site 2, the recovery site) has a vCenter Server with its VCMS database, an SRM Server with its SRM database and the Storage Replication Adapter for DS8000, an ESX server, and a DS8000 system; the two DS8000 systems are linked by Metro Mirror, Global Mirror, or Metro/Global Mirror replication
More detailed information about concepts, installation, configuration, and usage of VMware SRM can be found at the following website:
http://www.vmware.com/support/pubs/srm_pubs.html

You can download the latest DS8000 Storage Replication Adapter (SRA) software from the following website:
ftp://ftp.software.ibm.com/storage/ds_open_api/VMWARE/DS8K_SRA/

Additional information about how to install and configure the SRA software is provided in the IBM DS8000 SRA Installation and Users Guide, which is included with the SRA software package. If the environment mandates the use of VMware SRM with IBM Tivoli Storage Productivity Center for Replication, the SRA should be configured for a pre-configured environment. For more information about this configuration, see the IBM DS8000 SRA Installation and Users Guide; for more information about Tivoli Storage Productivity Center for Replication, see Chapter 47, IBM Tivoli Storage Productivity Center for Replication on page 685.
Chapter 46.
This solution includes the following base set of services:
- GDOC consulting and planning: Assesses high availability and rapid recovery solution requirements. During the planning stage, the desired state is defined, along with the steps that are needed to achieve it. This stage is often accomplished through a GDOC Technology Consulting Workshop (TCW) engagement.
- GDOC conceptual, logical, and physical design: Designs the solution infrastructure to meet the business requirements and objectives. It also provides the technical definition for the physical infrastructure.
- GDOC non-production solution build and test: Takes the conceptual, logical, and physical design, including a prototype and proof of concept if applicable, and implements it in the customer's environment in a test capacity.
- GDOC solution roll-out and deployment: Takes the prototype and test implementations and builds an enterprise-wide roll-out strategy that enables you, together with IBM, to begin a production roll-out in your environment. IBM educates you so that you can continue this roll-out process independently.

You can find more information about GDOC service offerings at the following link:
http://www-935.ibm.com/services/us/en/it-services/implementation-services-for-geographically-dispersed-open-clusters.html

For more information, contact your local IBM Global Services marketing representative.
Chapter 47. IBM Tivoli Storage Productivity Center for Replication
For Tivoli Storage Productivity Center for Replication running on System z and z/OS, there is a separate main section in the Information Center called Tivoli Storage Productivity Center for Replication for System z, which includes installation and usage information. Several IBM Redbooks publications are available for Tivoli Storage Productivity Center for Replication. Some of them refer to previous releases of the product, but most of the information is still valid. Here are some examples:
- IBM TotalStorage Productivity Center for Replication Using DS8000, SG24-7596
- IBM Tivoli Storage Productivity Center for Replication for System z, SG24-7563
Tivoli Storage Productivity Center for Replication provides easy configuration and management for many different Copy Services scenarios. These scenarios range from basic FlashCopy for backup or test purposes to complete 3-site Metro/Global Mirror setups. You can use wizards, which automatically line up the candidate volumes, to set up the Copy Services relationships, or import configuration files that describe them. Tivoli Storage Productivity Center for Replication uses a session concept to define a Copy Services scenario. The scope and purpose of a scenario is defined by the session type. Here are the available session types for the DS8000 storage system:
- FlashCopy
- Metro Mirror failover/failback
- Metro Mirror single direction
- Metro Mirror failover/failback with a practice volume
- Global Mirror failover/failback
- Global Mirror single direction
- Global Mirror failover/failback with a practice volume
- Global Mirror in either direction with a 2-site practice volume
- Metro Global Mirror
- Metro Global Mirror with practice volume

Different session types are available for the other storage systems, depending on their Copy Services capabilities.

HyperSwap support: For certain IBM host platforms (AIX and z/OS), Tivoli Storage Productivity Center for Replication provides support for HyperSwap functionality. HyperSwap is a high availability feature that allows transparent failover of host I/O operations from Metro Mirror primary to secondary volumes. Tivoli Storage Productivity Center for Replication running on z/OS supports an additional session type called Basic HyperSwap. For more information, see Chapter 43, IBM Open HyperSwap for AIX with IBM Tivoli Storage Productivity Center for Replication on page 661.

You can manage all allowed recovery, failover, and failback scenarios with a few mouse clicks. Tivoli Storage Productivity Center for Replication supports your decision-making process by offering only those activities that make sense in a given situation.
It also requires extra confirmation for critical decisions that affect host data access or will cause data to be overwritten. The scenarios are illustrated with pictorials that visualize the effect of each action you take. There are virtually no limits to the size of the configuration Tivoli Storage Productivity Center for Replication can manage. You can have multiple scenarios (sessions) with thousands of relationships that are distributed over multiple storage systems. Licensing: Since Version 5.1, there is no differentiation between the Tivoli Storage Productivity Center for Replication license for 2-site or 3-site configurations. They are all covered by the new and simplified Tivoli Storage Productivity Center license. Tivoli Storage Productivity Center for Replication running on z/OS can still be purchased separately. There is only one license scope available that covers all scenarios. Tivoli Storage Productivity Center for Replication Basic Edition, which supports only the Basic HyperSwap session type, is still part of the z/OS operating system.
47.1.4 Tivoli Storage Productivity Center for Replication reliability, availability, and serviceability
This section describes some features of Tivoli Storage Productivity Center for Replication that are not related to the core functionality, but required for any application that performs important management functionality in an enterprise environment.
High availability
You can create a high availability environment by setting up a standby Tivoli Storage Productivity Center for Replication server. This is a second instance of Tivoli Storage Productivity Center for Replication that runs on a different physical system and is continuously synchronized with the primary (or active) Tivoli Storage Productivity Center for Replication server.

Standby server location: The standby server does not need to run on the same platform as the active one. For example, an active Tivoli Storage Productivity Center for Replication on z/OS can have a Tivoli Storage Productivity Center for Replication on Linux as the standby server, and vice versa.

The active server issues commands and processes events, while the standby server records the changes made on the active one. The standby server keeps its configuration identical to the active server's and can take over and run the environment without any loss of data. Tivoli Storage Productivity Center for Replication does not automatically fail over to the standby system if there is an incident; you must initiate the takeover manually. This behavior is reasonable for a disaster recovery solution because the decision about which management server is still operating correctly can be ambiguous and might require human judgment.

Tip: It is a preferred practice to run the active instance of Tivoli Storage Productivity Center for Replication at the remote site because you want to be able to recover the environment after an incident at the local or primary location.
Figure 47-1 An example Tivoli Storage Productivity Center for Replication Console entry with a child link
SNMP
Tivoli Storage Productivity Center for Replication can be configured to send SNMP alerts to registered SNMP managers. A Management Information Base (MIB) file is available to provide textual interpretation of the alerts.
User management
For authentication and authorization, Tivoli Storage Productivity Center for Replication uses users and groups that are defined in the user registry of the management server. This registry can be either local (managed by the operating system) or a Lightweight Directory Access Protocol (LDAP) server. Tivoli Storage Productivity Center for Replication cannot create, update, or delete users or groups in this user registry. To manage users or groups, you must use the appropriate tool that is associated with the user registry in which the users and groups are stored. In Tivoli Storage Productivity Center for Replication, you can assign users from the management server's user registry that need access to Tivoli Storage Productivity Center for Replication to one of the following roles:
- Administrator: Unrestricted access
- Operator: Can perform operations on specific sessions
- Monitor: View-only access

Attention: The Tivoli Storage Productivity Center for Replication user management configuration is not replicated to the standby server. This configuration is for the local server only and must be configured separately on each Tivoli Storage Productivity Center for Replication server.

IBM RACF integration: Tivoli Storage Productivity Center for Replication running on z/OS can integrate with IBM RACF for user management.
Figure 47-2 Tivoli Storage Productivity Center for Replication Metro / Global Mirror copy set
Important: Do not manipulate Copy Services relationships that are managed with Tivoli Storage Productivity Center for Replication with other tools, such as the DS CLI. Doing so might result in an inconsistent Tivoli Storage Productivity Center for Replication configuration. Also, you should not delete volumes that are still part of a Tivoli Storage Productivity Center for Replication copy set.
47.2.2 Session
A session is a logical construct that holds multiple copy sets, representing a group of volumes that depend on each other and are expected to contain data that is consistent across all affected volumes. Actions that are performed on a session, such as start, stop, and recover, are always run against all the copy sets of a session so that data consistency is ensured. There are predefined session types that cover all scenarios that are supported by Tivoli Storage Productivity Center for Replication. They range from simple FlashCopy to complex 3-site configurations. A session can contain an unlimited number of copy sets that can span multiple storage systems. A Tivoli Storage Productivity Center for Replication instance can maintain multiple active and inactive sessions at the same time.
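The session concept can be pictured with a minimal data model. This is an illustration only (the class and the copy-set dictionary keys are hypothetical; the role names follow the volume types used in this chapter), showing the key property that an action always applies to all copy sets of a session, never to an individual copy set:

```python
class Session:
    """Minimal model of a Tivoli Storage Productivity Center for
    Replication session: a named group of copy sets on which every
    action (start, suspend, recover) runs as a unit, so that the
    dependent volumes stay consistent with each other."""

    def __init__(self, name, session_type):
        self.name = name
        self.session_type = session_type
        self.copy_sets = []   # each copy set: dict of role -> volume

    def add_copy_set(self, copy_set):
        self.copy_sets.append(copy_set)

    def run(self, action):
        # The action is applied to every copy set in the session.
        return [(action, cs) for cs in self.copy_sets]

mm = Session("aix11_hyperswap", "Metro Mirror Failover/Failback")
mm.add_copy_set({"H1": "IBM.2107-7520781:2500", "H2": "IBM.2107-7503461:2500"})
```

The point of the model is the run method: there is deliberately no way to act on a single copy set, mirroring the consistency guarantee described above.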
Attention: Under some circumstances, Tivoli Storage Productivity Center for Replication allows a copy set (or volume within a copy set) to be part of more than one session. It issues a warning when it detects such a situation. This possibility must be treated with care.

Figure 47-3 illustrates the session concept with an example of two storage systems at the local site, two corresponding systems at the remote site, and two Metro Mirror sessions.
Figure 47-3 Tivoli Storage Productivity Center for Replication session concept (two Metro Mirror sessions, each grouping primary H1 volumes on the DS8000 systems at the local site that are connected through PPRC paths to their secondary H2 volumes on the DS8000 systems at the remote site)
The common extent of a session is a group of applications that depend on each other. All volumes that contain data of this group must belong to the same session to ensure consistent recovery data for the whole group. Important: The DS8000 Metro Mirror CG functionality works on an LSS pair. This functionality is based on the Freeze functionality, which fails all PPRC paths between the frozen LSS pair. Therefore, Metro Mirror volumes from the same LSS pair should not be part of more than one session with active Metro Mirror role-pairs in a copy set.
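Because the LSS is the first two hexadecimal digits of a DS8000 volume ID (volume 2500 is in LSS 25), a configuration can be screened for the constraint above with a small helper. The following Python sketch is a hypothetical check, assuming a simplified mapping of session names to (H1 volume, H2 volume) pairs:

```python
from collections import defaultdict

def lss_pair_conflicts(sessions):
    """Detect Metro Mirror LSS pairs that appear in more than one session.

    `sessions` maps a session name to a list of (h1_volume, h2_volume)
    pairs; a DS8000 volume ID is 4 hex digits and its first two digits
    are the LSS (for example, volume 2500 is in LSS 25). Returns the
    LSS pairs that are used by more than one session, which would make
    the Freeze-based consistency group handling unsafe.
    """
    pairs = defaultdict(set)
    for session, copy_sets in sessions.items():
        for h1, h2 in copy_sets:
            pairs[(h1[:2], h2[:2])].add(session)
    return {pair: names for pair, names in pairs.items() if len(names) > 1}
```

An empty result means no LSS pair is shared between sessions; a non-empty result lists the LSS pairs that violate the guideline.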
47.2.3 Location
You can group the storage systems in a Tivoli Storage Productivity Center for Replication configuration into locations. When you define a session, you can assign one of the existing locations to each site. Tivoli Storage Productivity Center for Replication displays these locations in the pictogram that illustrates the session state. The example that is shown in Figure 47-4 defines the locations Local, Intermediate, and Remote.
Figure 47-4 Tivoli Storage Productivity Center for Replication locations in a session
When you add a copy set to a session, Tivoli Storage Productivity Center for Replication allows you to choose volumes only from the storage systems that have the same location as the site you select the volumes for.

Location awareness: Tivoli Storage Productivity Center for Replication has no real location awareness. The location definitions have two purposes:
- They help you avoid mistakes during the creation of sessions and volume sets. Tivoli Storage Productivity Center for Replication does not audit existing copy sets for location mismatches.
- They provide a better visualization of a session and its current state.

Tivoli Storage Productivity Center for Replication does not perform automatic discovery of locations. You define a location when you add a storage system to the Tivoli Storage Productivity Center for Replication configuration. You can set or change the location that is associated with a storage system at any time. Tivoli Storage Productivity Center for Replication deletes a location when there is no longer a storage system associated with it.

Using locations: You do not have to use locations. Without a location association, Tivoli Storage Productivity Center for Replication simply names the sites Site 1, Site 2, and so on, and does not perform any storage system filtering.
Host volume
A host (H) volume is a volume that can be connected to a host and serve host I/O operations. In all recovery session types, there are several H volume types, one for each site, designated by a number. If a session is in its normal state (prepared, with everything running normally in Site 1; see 47.2.5, Actions on sessions on page 695), the H1 volumes are the ones that hold the production copy of the data and serve the production I/O. In a Metro Mirror session, the Metro Mirror secondaries in Site 2 are the H2 volumes. If there is an outage at Site 1, Tivoli Storage Productivity Center for Replication performs the Metro Mirror failover operations and the H2 volumes are used to serve the production I/O. There are also H3 volumes in 3-site session types such as Metro/Global Mirror, as shown in Figure 47-5. Tivoli Storage Productivity Center for Replication uses a small blue triangle to indicate which site, and therefore which set of H volumes, is used for production I/O, as shown in Figure 47-5.
Journal volume
A journal (J) volume is the FlashCopy target volume in a copy set that contains a Global Mirror relationship. It acts as a journal and is never accessed by a host directly. It is used to re-create a consistency group at the Global Mirror target if required. Depending on the session type, there can be more than one J volume.
Intermediate volume
Intermediate (I) volumes are used in certain Metro Mirror and Global Mirror sessions to provide an additional volume at the recovery site, so that you can test (practice) the recovery procedures on the host volumes of the DR site while replication continues to the intermediate volumes. Figure 47-6 shows the pictogram of the Metro Mirror failover/failback with Practice session as an example. In this session type, the I2 volumes are the Metro Mirror secondaries. When a test is scheduled, Tivoli Storage Productivity Center for Replication creates consistent Metro Mirror targets on the I2 volumes and starts a FlashCopy to the H2 volumes. After the FlashCopy is initiated, the Metro Mirror replication to the I2 volumes is resumed automatically. The FlashCopy targets are the H2 volumes, which can then be accessed by a host without affecting the replication to I2.

Another benefit of having intermediate volumes is that you have an additional set of consistent data at the recovery site in real DR situations. This provides redundancy in case the application recovery corrupts the data on the H2 volumes for any reason. In that case, Tivoli Storage Productivity Center for Replication can reflash the consistent data from the I2 to the H2 volumes to allow another application restart.
Figure 47-6 A Tivoli Storage Productivity Center for Replication session with an intermediate volume
Target volume
Target (T) volumes are used only in the basic FlashCopy session type. The T volume is the FlashCopy target, which can be used for test or backup purposes.

Important: A simple DS8000 FlashCopy session does not provide FlashCopy consistency. To ensure consistent target volumes, you must quiesce application I/O before you run the Flash command.
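The volume roles described in this section differ by session type. The following mapping is a non-exhaustive, illustrative sketch based only on the descriptions above (it is not an authoritative product list, and the role labels such as J2 and J3 are assumptions about which site holds the journal):

```python
# Illustrative mapping of a few DS8000 session types to the volume
# roles (H = host, I = intermediate, J = journal, T = target) they use.
SESSION_ROLES = {
    "FlashCopy": {"H1", "T"},
    "Metro Mirror Failover/Failback": {"H1", "H2"},
    "Metro Mirror Failover/Failback with Practice": {"H1", "I2", "H2"},
    "Global Mirror Failover/Failback": {"H1", "H2", "J2"},
    "Metro Global Mirror": {"H1", "H2", "H3", "J3"},
}

def roles_for(session_type):
    """Return the sorted volume roles that a session type uses."""
    return sorted(SESSION_ROLES[session_type])
```

Such a table makes it easy to see at a glance how many physical volumes per copy set a given session type consumes at each site.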
The available actions are shown in Figure 47-9. You can select one of the following options:
- RecoverH3: Re-creates consistent data and allows host access on the H3 volumes. It also turns on change recording on H3 so that the session can be resynchronized without performing a full copy.
- StartH1->H2->H3: Tries to re-establish the original prepared state of the session.
- Suspend: Suspends the whole session and allows recovery at the H2 volumes.
Figure 47-9 Recovery choices when the Global Mirror part of an MGM session is suspended
Incremental Resync: Tivoli Storage Productivity Center for Replication always uses Incremental Resync for DS8000 MGM sessions. This concept also requires Fibre Channel connectivity between H1 and H3 to cover these cases:
- H2 is lost and Incremental Resync from H1 to H3 is started.
- A recovery is performed to either H2 or H3 and the failed volumes are reintegrated into the MGM configuration.
Port pairing definition file: You can specify a file in CSV format, called portpairings.csv, that defines the port pairs that Tivoli Storage Productivity Center for Replication uses for all PPRC paths between two DS8000 systems. Tivoli Storage Productivity Center for Replication uses this definition to create a path whenever it needs one.

PPRC paths: When a freeze happens in a DS8000 remote replication relationship, all PPRC paths are set to the Failed state. They must be re-established to resume the replication. Without a port pairing definition file, Tivoli Storage Productivity Center for Replication automatically establishes the paths again, but with only one port pair. The same is true if, in a recovery situation, paths in the reverse direction are required but not yet defined. Using a port pairing definition file is the preferred method because Tivoli Storage Productivity Center for Replication then always uses all defined port pairs to establish the required PPRC paths. When you use a port pairing definition file, it must be maintained on each Tivoli Storage Productivity Center for Replication server, because this file is not replicated automatically.
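To illustrate why a port-pairing definition matters (multiple port pairs yield multiple PPRC paths instead of one), the following Python sketch parses a simplified, hypothetical line format. This is not the real portpairings.csv syntax, which is described in the Tivoli Storage Productivity Center for Replication documentation; the port IDs other than I0143 and I0003 are made up for the example:

```python
def parse_port_pairs(line: str):
    """Parse one line of a simplified, illustrative port-pairing format:
    <box1>:<box2>,<port1>:<port2>,...
    Returns the two storage system identifiers and the list of
    (local_port, remote_port) pairs to use for PPRC paths between them.
    """
    fields = line.strip().split(",")
    box1, box2 = fields[0].split(":")
    pairs = [tuple(field.split(":")) for field in fields[1:]]
    return box1, box2, pairs

# Hypothetical pairing between the two DS8000 systems of this chapter;
# I0213 and I0103 are invented port IDs for illustration.
line = "2107.7520781:2107.7503461,I0143:I0003,I0213:I0103"
```

With two pairs defined, a path re-establish after a freeze would use both port pairs rather than falling back to a single one.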
Connection features: For early versions of the DS8100 and DS8300 systems, there was a separate connection feature for Tivoli Storage Productivity Center for Replication, consisting of two Ethernet cards that were installed directly in the DS8000 servers. This feature is no longer required and was withdrawn from marketing. Today, all DS8000 systems connect to Tivoli Storage Productivity Center for Replication through the HMC.
However, the consistency of the recovery data that is contained in the Metro Mirror secondary volumes depends on the ability of Tivoli Storage Productivity Center for Replication to react immediately to any Metro Mirror incidents. The Tivoli Storage Productivity Center for Replication Metro Mirror heartbeat monitors the connection between the Tivoli Storage Productivity Center for Replication server and all DS8000 HMCs. It is active on both the Tivoli Storage Productivity Center for Replication server and the DS8000, and it immediately freezes and suspends all Metro Mirror relationships if a communication problem occurs, which ensures that the recovery data is consistent.

Heartbeat considerations: The Metro Mirror heartbeat is not mandatory. It can be switched on or off on the active Tivoli Storage Productivity Center for Replication server, and it is controlled only by the active Tivoli Storage Productivity Center for Replication server. When Metro Mirror relationships are frozen because of a heartbeat-lost event, there is no automatic unfreeze of the relationships, because the Tivoli Storage Productivity Center for Replication communication and coordination of the unfreeze might be broken. If so, there is an Extended Long Busy (ELB) condition (or SCSI Queue Full condition) on the PPRC primaries for the defined ELB timeout. The default ELB timeout is 60 seconds for FB volumes and 120 seconds for CKD volumes.
https://9.155.49.168:9559/CSM/

Note: The IP port number (the 4-digit number after the colon) can be different in your environment. The value that is given in Example 47-1 is the default.
After you connect to the GUI, you see the Tivoli Storage Productivity Center for Replication login window, as shown in Figure 47-10.
Figure 47-10 Tivoli Storage Productivity Center for Replication login window
Apart from the login fields, this window shows you the Tivoli Storage Productivity Center for Replication version information and the server name.
After a successful login, you see the main window, called Health Overview, as shown in Figure 47-11. It provides a status overview of the Tivoli Storage Productivity Center for Replication system and the configured Copy Services sessions.
Figure 47-11 Tivoli Storage Productivity Center for Replication health overview
By selecting the links on either the main view or the navigation side bar, you can switch to specific windows that you can use to view more detailed data or perform activities.

Tip: The GUI has automatic refresh capabilities, which means that all states are refreshed automatically without manual intervention. The default refresh interval is 30 seconds, but it can be lowered to 5 seconds in the Advanced Tools window for real-time monitoring of the environment. The latest refresh time stamp is visible at the upper right of each window.
You can run CSMCLI in the same fashion as the DS CLI for the DS products. It provides three different modes:
Single-shot mode
Interactive mode
Script mode
CSMCLI uses a standard configuration file, called repcli.properties, that is in the same directory as the csmcli command itself. The file is mandatory. It contains the IP address or host name and the CSMCLI communication port.

CSMCLI availability: The CSMCLI is an application that you must install on all systems on which you want to run CSMCLI commands. You can find the installation code on the Tivoli Storage Productivity Center for Replication server, along with the matching repcli.properties file. It cannot be downloaded from the Internet. CSMCLI is a platform-independent Java program. For more information, see 47.1.1, Sources of information on page 686.

Example 47-2 shows a single-shot command and its results.
Example 47-2 Single-shot CLI command
$./csmcli.sh lsdevice -devtype ds
Tivoli Storage Productivity Center for Replication Command Line Interface (CLI)
Copyright 2007, 2012 IBM Corporation
Version: 5.1 Build: l20120510-1010
Server: mcecebc.mainz.de.ibm.com Port: 5110
Authentication file: /home/itso/tpcr-cli/tpcrcli-auth.properties
Device ID Connection Type Device Type Local Server Connection
=========================================================================
DS8000:BOX:2107.TV181 HMC DS8000 Connected;Connected
...

Tip: As you can see in the output of the command, the CLI supports an authorization file. This file contains a user ID and an encrypted password. It allows you to log on to Tivoli Storage Productivity Center for Replication without providing user credentials every time. You add the password in clear text and CSMCLI encrypts it at the time of the first logon.

To run more than one command, start the CLI program by itself and run the commands at the interactive prompt, as shown in Example 47-3.
Example 47-3 Interactive CLI commands
$./csmcli.sh
Tivoli Storage Productivity Center for Replication Command Line Interface (CLI)
...
csmcli> lssess
Name Status State Copy Type
=============================================================
ITSO_MM_Test Inactive Defined Metro Mirror Failover/Failback
ITSO_HSIB_Test Inactive Defined Basic HyperSwap
...
csmcli>
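The configuration file that CSMCLI reads is a plain Java properties file. The following sketch assumes two keys, server and port (an assumption based on the Server and Port values that are reported in Example 47-2); verify the keys against the repcli.properties file that is shipped on your Tivoli Storage Productivity Center for Replication server:

```
server=mcecebc.mainz.de.ibm.com
port=5110
```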
The third mode is the script mode, which runs commands out of a file (Example 47-4).
Example 47-4 Script mode to run CLI commands
$ ./csmcli -script ~/rm/scripts/devreport

CSMCLI provides a built-in help function. If you run help without any options, CSMCLI prints a list of all the available commands. If you run help with the name of a command, you get detailed usage information. The help function is illustrated in Example 47-5.
Example 47-5 The help command
csmcli> help adddevice addhost addmc addstorsys chauth chdevice chhost chlocation chmc chsess chvol cmdsess cmdsnapgrp csmcli exit exportcsv exportgmdata hareconnect hatakeover help
importcsv lsauth lsavailports lscpset lscptypes lsdevice lshaservers lshost lslocation lslss lsmc lspair lsparameter lspath lspool lsrolepairs lsrolescpset lssess lssessactions lssessdetails
lssnapgrp lssnapgrpactions lssnapshots lssnmp lsstorcandidate lsvol mkauth mkbackup mkcpset mklogpkg mkpath mksess mksnmp quit refreshdevice rmactive rmassoc rmauth rmcpset rmdevice
rmhost rmmc rmpath rmsess rmsnmp rmstdby rmstorsys setasstdby setoutput setparameter setstdby showcpset showdevice showgmdetails showha showmc showsess ver whoami
csmcli> help lspair
lspair
Use the lspair command to list the copy pairs for a specified role pair or to list the copy pairs for a specified copy set.
...
#(complete description of the command removed)
Appendix A. Open Systems specifics
You should perform a file system synchronization before you create a point-in-time copy. The synchronization writes the contents of the file system buffers to disk.
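On most UNIX systems, this synchronization can be done with the standard sync command; a minimal sketch:

```shell
# Flush the contents of the file system buffers to disk before you create
# the point-in-time copy. This does not replace application-level quiescing.
sync
```

Note that sync only flushes file system buffers; applications such as databases still need their own quiesce or suspend step, as described in the next section.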
Database consistency
A simple way to provide consistency is to stop the database before you create the FlashCopy pairs. However, if you cannot stop a database for the FlashCopy, you must perform some pre- and post-processing actions to create a consistent copy:
Use database functions such as Oracle online backup mode or DB2 suspend I/O before you create a FlashCopy copy. After the FlashCopy process finishes, withdraw the backup mode or resume I/O. An I/O suspension is not required for Oracle if Oracle hot backup mode is enabled. Oracle handles the resulting inconsistencies during database recovery.
Optionally, perform a file system freeze operation before and a thaw operation after the FlashCopy. If the file system freeze is omitted, file system checks are required before you mount the file systems on the FlashCopy target volumes.
Perform a file system synchronization before you create a FlashCopy copy.
Use FlashCopy Consistency Groups if the file system log is allocated on multiple disks.
Create FlashCopy copies of the data files, switch the database log file, and then create FlashCopy copies of the database logs.
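The ordering of these pre- and post-processing actions can be sketched as follows. This is a minimal illustration only: suspend_database, take_flashcopy, and resume_database are hypothetical placeholders that you would replace with your database's suspend and resume calls (for example, DB2 suspend I/O or Oracle online backup mode) and with the DS CLI mkflash invocation.

```shell
#!/bin/sh
# Hypothetical placeholders -- replace with real database and DS CLI calls.
suspend_database() { echo "database writes suspended"; }
take_flashcopy()   { echo "FlashCopy established"; }
resume_database()  { echo "database writes resumed"; }

sync                # flush file system buffers before the copy
suspend_database    # e.g., DB2 suspend I/O or Oracle online backup mode
take_flashcopy      # e.g., DS CLI mkflash (use a Consistency Group if logs span disks)
resume_database     # withdraw backup mode / resume I/O after the FlashCopy completes
```

The essential point of the sequence is that the FlashCopy is established only while database writes are suspended, and that writes are resumed immediately afterward.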
AIX specifics
This section describes the steps that are needed to use volumes that are created by the DS8000 Copy Services on AIX hosts.
#chdev -l <hdisk#> -a pv=clear
#chdev -l <hdisk#> -a pv=yes
Therefore, it is necessary to redefine the volume group information about the FlashCopy target volumes by using special procedures or by running the recreatevg command. These actions alter the PVIDs and VGIDs in all the VGDAs of the FlashCopy target volumes so that there are no conflicts with existing PVIDs and VGIDs on existing volume groups that are on the source volumes. If you do not redefine the volume group information before you import the volume group, the importvg command fails.
4. Import the target volume group by running the following command:
#importvg -y <volume_group_name> <hdisk#>
5. Vary on the volume group (the importvg command should vary on the volume group) by running the following command:
#varyonvg <volume_group_name>
6. Verify the consistency of all file systems on the FlashCopy target volumes by running the following command:
#fsck -y <filesystem_name>
7. Mount all the target file systems by running the following command:
#mount <filesystem_name>
If the host is using SDD that works with vpath devices, complete the following steps:
1. The target volume (hdisk) is new to AIX, so run the Configuration Manager on all Fibre Channel adapters by running the following command:
#cfgmgr -l <fcs#>
2. Configure the SDD devices by running the following command:
#smitty datapath_cfgall
3. Discover which of the physical volumes is your FlashCopy target volume by running the following command:
#lsdev -C | grep 2107
4. Verify that the PVIDs are set on all hdisks that will belong to the new volume group. Check this information by running lspv. If they are not set, run the following command on each one to avoid the failure of the importvg command:
#chdev -l <hdisk#> -a pv=yes
5. Import the target volume group by running the following command:
#importvg -y <volume_group_name> <hdisk#>
6. Change the ODM definitions to work with the SDD devices that provide the load balancing and failover functions by running the following command:
#dpovgfix <volume_group_name>
7. Vary on the volume group (the importvg command should vary on the volume group) by running the following command:
#varyonvg <volume_group_name>
8. Verify the consistency of all the file systems on the FlashCopy target volumes by running the following command:
#fsck -y <filesystem_name>
9. Mount all the target file systems by running the following command:
#mount <filesystem_name>
The data is now available.
You can, for example, back up the data on the FlashCopy target volume to a tape device. This procedure can be run after the relationship between the FlashCopy source and target volume is established, even if data is still being copied in the background.
The disks that contain the target volumes might have been defined to an AIX system, for example, if you periodically create backups by using the same set of volumes. In this case, there are two possible scenarios:
If no volume group, file system, or logical volume structure changes are made, use procedure 1 (Procedure 1 on page 710) to access the FlashCopy target volumes from the target system.
If modifications to the structure of the volume group are made, such as changing the file system size or modifying logical volumes (LVs), use procedure 2 (Procedure 2 on page 710).
Procedure 1
For this procedure, complete the following steps:
1. Unmount all the source file systems by running the following command:
#umount <source_filesystem>
2. Unmount all the target file systems by running the following command:
#umount <target_filesystem>
3. Deactivate the target volume group by running the following command:
#varyoffvg <target_volume_group_name>
4. Establish the FlashCopy relationships.
5. Mount all the source file systems by running the following command:
#mount <source_filesystem>
6. Activate the target volume group by running the following command:
#varyonvg <target_volume_group_name>
7. Perform a file system consistency check on the target file systems by running the following command:
#fsck -y <target_file_system_name>
8. Mount all the target file systems by running the following command:
#mount <target_filesystem>
Procedure 2
For this procedure, complete the following steps:
1. Unmount all the target file systems by running the following command:
#umount <target_filesystem>
2. Deactivate the target volume group by running the following command:
#varyoffvg <target_volume_group_name>
3. Export the target volume group by running the following command:
#exportvg <target_volume_group_name>
4. Establish the FlashCopy relationships.
5. Import the target volume group by running the following command:
#importvg -y <target_volume_group_name> <hdisk#>
6. Perform a file system consistency check on the target file systems by running the following command:
#fsck -y <target_file_system_name>
7. Mount all the target file systems by running the following command: #mount <target_filesystem>
Accessing the FlashCopy target volume from the same AIX host
This section describes a method of accessing a FlashCopy target volume on a single AIX host while the source volume is still active on the same server. The procedure is intended to be used as a guide and may not cover all scenarios. If you are using the same host to work with source and target volumes, you must use the recreatevg command. The recreatevg command overcomes the problem of duplicated LVM data structures and identifiers that are caused by a disk duplication process such as FlashCopy. It is used to re-create an AIX Volume Group (VG) on a set of target volumes that are copied from a set of source volumes that belong to a specific VG. The command allocates new Physical Volume Identifiers (PVIDs) for the member disks and a new Volume Group Identifier (VGID) to the volume group. The command also provides options to rename the logical volumes with a prefix you specify, and options to rename labels to specify different mount points for file systems.
5. Create the target volume group and prefix all file system path names with /backup, and prefix all AIX logical volumes with bkup by running the following command: recreatevg -y FC_test_bkup_vg -L /backup -Y bkup hdisk3
You must specify the hdisk names of all the volumes that participate in the volume group. The output from lspv that is shown in Example A-3 illustrates the new volume group definition.
Example A-3 lspv output after re-creating the volume group
An extract from /etc/filesystems in Example A-4 shows how recreatevg generates a new file system stanza. The file system named /mnt/FC_test in the source volume group is renamed to /backup/mnt/FC_test in the target volume group. Also, the /backup/mnt/fc_test directory is created. The logical volume and JFS log logical volume are renamed. The remainder of the stanza is the same as the stanza for /mnt/fc_test.
Example A-4 Target file systems stanza in /etc/filesystems
6. Perform a file system consistency check for all target file systems:
#fsck -y /backup/mnt/fc_test
7. Mount the new file systems that belong to the target volume group to make them accessible.
If the secondary volumes were previously defined on the target AIX system as hdisks or vpaths, but the original volume group was removed from the primary volumes, the old volume group and disk definitions must be removed (by running exportvg and rmdev) from the target volumes and redefined (by running cfgmgr) before you run importvg again to get the new volume group definitions. If this task is not done first, importvg imports the volume group incorrectly: the volume group data structures (PVIDs and VGID) in the ODM will differ from the data structures in the VGDAs and disk volume super blocks, and the file systems will not be accessible.

If the secondary volumes that are configured on the target AIX server are in a Remote Mirror and Copy relationship and you do not have the Permit read access from target and Reset Reserve options enabled (and the volumes are not in the Full Duplex state), the hdisks might be configured to AIX again after you reboot the target server. You might see each Remote Mirror and Copy secondary volume twice on the target server. AIX knows that these physical volumes exist because of entries in the Configuration Database (ODM). However, when the configuration manager runs during reboot, it cannot read their PVIDs because, as Remote Mirror and Copy targets, they are locked by the DS8000. As a result, the original hdisks are placed in a Defined state, and new (phantom) hdisks or vpaths are configured and placed in an Available state. This unwanted condition must be remedied before the secondary volumes can be accessed: the phantom hdisks must be removed, and the real or original hdisks must be changed from the Defined state to the Available state.
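A hedged sketch of that remediation, using standard AIX device commands (the hdisk names are hypothetical; identify the phantom and original devices with lsdev first):

```
#lsdev -Cc disk
#rmdev -dl <phantom_hdisk#>
#mkdev -l <original_hdisk#>
```

Here, the phantom hdisk is assumed to be the new device in the Available state and the original hdisk is the one left in the Defined state; rmdev -dl deletes the phantom definition, and mkdev -l moves the original hdisk back to the Available state.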
The following example shows two systems, host1 and host2, where host1 has the primary volume hdisk6 and host2 has the secondary volume hdisk9. Both systems have their ODMs populated with the volume group itsovg from their respective Remote Mirror and Copy volumes and, before any modifications, both systems' ODMs have the same time stamp, as shown in Example A-5.
Example A-5 Original time stamp
[host1:root:/:] getlvodm -T itso_mm_vg
4fd37761181f8574
[host2:root:/:] getlvodm -T itso_mm_vg
4fd37761181f8574
Volumes hdisk6 and hdisk9 are in the Remote Mirror and Copy duplex state, and the volume group itso_mm_vg on host1 is updated with a new logical volume. The time stamp on the VGDA of the volumes is updated and so is the ODM on host1, but not on host2 (see Example A-6).
Example A-6 Update source time stamp
[host1:root/:] lqueryvg -p hdisk6 -Tt
Time Stamp: 4fd37d3a1df3146c
[host1:root:/:] getlvodm -T itso_mm_vg
4fd37d3a1df3146c
[host2:root:/:] lqueryvg -p hdisk9 -Tt
Time Stamp: 4fd37d3a1df3146c
[host2:root:/:] getlvodm -T itso_mm_vg
4fd37761181f8574
To update the ODM on the secondary server, suspend the Remote Mirror and Copy pair before you run importvg -L to avoid any conflicts from LVM actions that might occur on the primary server. Example A-7 shows the updated ODM entry on host2.
Example A-7 Update secondary server (host2) ODM
[host2:root:/:] importvg -L itso_mm_vg hdisk9
itso_mm_vg
[host2:root:/:] lqueryvg -p hdisk9 -Tt
Time Stamp: 4fd37d3a1df3146c
[host2:root:/:] getlvodm -T itso_mm_vg
4fd37d3a1df3146c
When the importvg -L command completes, you can resume the Remote Mirror and Copy pairs and copy only the out-of-sync tracks.
With Windows Server 2008R2, Windows Server 2008, and Windows Server 2003, this information is stored on the disk drive itself in a partition that is called the LDM database, which is kept on the last few tracks of the disk. Each volume has its own 128-bit Globally Unique Identifier (GUID) and belongs to a disk group. This concept is similar to the concept of Physical Volume Identifier (PVID) and Volume Group in AIX. As the LDM is stored on the physical drive itself, with Windows Server 2008R2, Windows Server 2008, and Windows Server 2003, it is possible to move disk drives between different computers.
Copy Services limitations with Windows Server 2008R2, 2008, and 2003
Having the drive information stored on the disk itself imposes some limitations when you use Copy Services on a Windows system:
The source and target volumes must be the same physical size. Normally, the target volume can be bigger than the source volume; with Windows, this is not the case, for two reasons:
The LDM database holds information that relates to the size of the volume. Because this information is copied from the source to the target, if the target volume is a different size from the source, the database information is incorrect, and the host system returns an exception.
The LDM database is stored at the end of the volume. The copy process is a track-by-track copy; unless the target is an identical size to the source, the database is not at the end of the target volume.
It is not possible to have the source and target FlashCopy volumes on the same Windows system if they were created as Windows dynamic volumes. The reason is that each dynamic volume must have its own 128-bit GUID. As its name implies, the GUID must be unique on one system. When you perform FlashCopy, the GUID is copied as well, which means that if you try to mount the source and target volumes on the same host system, you would have two volumes with the same GUID. This configuration is not allowed, and you cannot mount the target volume.
Tip: Disable the Fast-indexing option on the source disk; otherwise, operations to that volume are cached to speed up disk access, which means that data is not flushed from memory and the target disk might have copies of files or folders that were deleted from the source system.

When you perform subsequent Remote Mirror and Copy or FlashCopy copies to the target volume, it is not necessary to perform a reboot because the target volume is still known to the target system. However, to detect any changes to the contents of the target volume, remove the drive letter from the target volume before you do the FlashCopy. Then, after you do the FlashCopy, restore the drive letter so that the host it is mounted on can read and write to it.

There is a Windows utility, DiskPart, that enables you to script these operations so that FlashCopy can be carried out as part of an automated backup procedure. DiskPart can be found at the Microsoft download site by searching for the keyword DiskPart:
http://www.microsoft.com/downloads
A description of DiskPart commands can be found at the following website:
http://technet.microsoft.com/en-us/library/cc770877%28v=ws.10%29.aspx
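As a hedged sketch of that scripting approach (the volume number 3 and drive letter F are hypothetical values; list the volumes with the DiskPart list volume command first), a DiskPart script that removes the drive letter before the FlashCopy could look like the following, run as diskpart /s removeletter.txt:

```
rem removeletter.txt -- hypothetical volume number and drive letter
select volume 3
remove letter=F
```

A matching script that uses assign letter=F instead of remove letter=F restores the drive letter after the FlashCopy completes, so both steps can run unattended in a backup job.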
When you perform subsequent Remote Mirror and Copy/FlashCopy to the target volume, it is not necessary to perform a reboot because the target volume is still known to the target system. However, to detect any changes to the contents of the target volume, you should remove the drive letter from the target volume before you do the FlashCopy. Then, after you do the FlashCopy, you restore the drive letter in order for the host it is mounted on to be able to read/write to it. Again, you can use the Windows utility DiskPart, which enables you to script these operations so that FlashCopy can be carried out as part of an automated backup procedure.
Writers: A component of an application that stores persistent information about one or more volumes that participate in shadow copy synchronization. Writers are software that is included in applications and services to help provide consistent shadow copies. Writers serve two main purposes:
Responding to signals provided by VSS to interface with applications to prepare for shadow copy
Providing information about the application name, icons, files, and a strategy to restore the files
Writers prevent data inconsistencies.
Providers: A component that creates and maintains the shadow copies. IBM VSS Provider is the provider interface that interacts with the Microsoft Volume Shadow Copy Services and with the Common Interface Model Agent (CIM Agent) on the master console.
Figure A-1 shows the Microsoft VSS architecture and how the software provider and hardware provider interact through Volume Shadow Copy Services.
Figure A-1 Microsoft VSS architecture
For more information about this process, see the FlashCopy diagram that is shown in Figure A-2.
Figure A-2 VSS FlashCopy process with the DS8000 (VSS_FREE and VSS_RESERVE pools)
Additional information
For more information, see IBM System Storage DS Open Application Programming Interface Installation and Reference, GC35-0516.
For more information about Microsoft VSS, go to the following website:
http://technet.microsoft.com/en-us/library/ee923636%28d=printer%29.aspx
You can download the DS8000 VSS provider from the following FTP link:
ftp://ftp.software.ibm.com/storage/ds8000/updates/DS8K_Customer_Download_Files/Volume_Shadow_Copy_Service/
Figure A-3 Microsoft VDS architecture
The following minimum hardware is required for installing Microsoft VDS on a Windows Server 2008R2, Windows Server 2008, or Windows Server 2003 operating system:
For Virtual Disk Services:
A DS8000 storage unit
A Common Information Model (CIM) agent
Dynamic disks: Microsoft VDS can be employed to create dynamic disks, which can consist of either simple volumes or multi-partition volumes. Multi-partition volumes physically span more than a single disk but are logically considered a single volume. Dynamic disks can be spanned, striped (RAID 0), mirrored (RAID 1), or striped with parity (RAID 5). Microsoft VDS can be used to expand dynamic disks to make more space available to a volume.
The DS8000 interacts with the IBM VDS hardware provider to Microsoft VDS. The implementation is based on the DS CIM Agent and Microsoft VDS, using CIM technology to query storage system information and manage LUNs. Microsoft VDS together with the Microsoft Volume Shadow Copy Service forms a unified heterogeneous storage systems solution that provides the following functions:
Managing block storage virtualization
Discovery of new storage
Boot from SAN
Shadow copy creation that relates to the storage systems FlashCopy capability
Creation of consistent backups of open files and applications
Creation of shadow copies for shared folders, for backup, testing, and data mining purposes
IBM uses QLogic SANsurfer VDS Manager to interact with Microsoft VDS.
Microsoft VDS by itself as a single component does not provide a disaster recovery solution. Microsoft VDS is primarily designed as a SAN management tool that you can use to perform disk management functions, and it provides an effective way to manage multivendor storage systems through a single interface. With Microsoft VDS and VSS, the most distinct advantages that are coupled with IBM System Storage DS products are the Boot from SAN and shadow copy (Microsoft Volume Shadow Copy Service) creation features. In a disaster recovery solution scenario, if you have DS8000 systems that serve a Windows Server 2008R2 environment, you can integrate them with Microsoft VSS and Microsoft VDS.
This situation means that you can effortlessly manage your overall SAN environment by creating FlashCopy copies and offloading backup to backup servers without shutting down your production application, and creating and assigning logical units and managing the SAN storage environment.
Figure A-4 shows the overall Microsoft VDS and VSS architecture.
Figure A-4 Microsoft VDS and VSS architecture
Figure A-5 shows the Microsoft VDS and VSS architecture that is combined with FlashCopy.
Figure A-5 Microsoft VDS and VSS architecture combined with FlashCopy
Additional information
For more information, see IBM System Storage DS Open Application Programming Interface Installation and Reference, GC35-0516.
For more information about the Microsoft VDS, see the following website:
http://technet.microsoft.com/en-us/library/ee923636%28d=printer%29.aspx
#quiesce an application
insert the quiescing script here
#freeze the source
lockfs -w /source
#start FlashCopy relationships
insert the FlashCopy DS CLI script here using the option -wait on the mkflash command
lockfs -u /source
#and resume the application
insert the resuming script here
#check the target for consistency
fsck -y /dev/rdsk/cXtYdZsN
#if OK mount it
mount /dev/dsk/cXtYdZsN /target
The FlashCopy target volume (containing consistent data) can be mounted on the same host, as shown in Example A-8 on page 725, or it can be mounted on a different server that supports the file system type. For example, a server with backup software can mount the volume and move the content of the volume from disk to a tape device to meet disaster recovery objectives.
3. Run the FlashCopy commands. 4. Thaw the I/O to the source volume on Server A. 5. Mount the target volume on Server B.
#halt I/O on the source by unmounting the volume
umount /vol1
#execute FlashCopy commands here
#deport the source disk group
vxdg deport DG1
#offline the source disk
vxdisk offline c6t1d0s2
#now only the target disk is online
#import the disk group again
vxdg import DG1
#recover the copy
vxrecover -s Vol1
#re-mount the volume
mount /vol1
If you want to make both the source and target available to the machine at the same time, it is necessary to change the private region of the disk so that VERITAS Volume Manager allows the target to be accessed as a different disk. Here we explain how to simultaneously mount DS FlashCopy source and target volumes on the same host without exporting the source volumes when you use VERITAS Volume Manager. Check with the Symantec Corporation and IBM about the supportability of this method at the specific software and firmware levels for your environment before you use it.
Assume that the sources are constantly mounted by the Solaris host, that the FlashCopy is performed, and that the target volume will be mounted without unmounting the source or rebooting. After the target volumes are provisioned, you must rescan the Solaris system for new SCSI devices by running devfsadm.
The following procedure refers to these names:
mydg: The name of the disk group that is being created.
da_name: The disk name that is shown under the DISK column in the vxdisk list output on Solaris.
last_daname: The name by which the disk is known to VxVM, as shown under the DEVICE column in the vxdisk list output on Solaris.
To mount the targets on the same host, complete the following steps:
1. Determine which disks have a copy of the disk group configuration in their private region. Run the following command to list the log disks:
# vxdg list <disk group>
2. Determine the location of the private region (tag 15) on the disks (usually partition 3) by running the following command:
# prtvtoc /dev/rdsk/c#t#d#s2
Or run the following command to get the partition number for the private region:
# vxdisk list c#t#d#s2 | grep priv
3. Dump the private region by running the following command:
# /usr/lib/vxvm/diag.d/vxprivutil dumpconfig /dev/rdsk/c#t#d#s3 > dg.dump
4. Create a script to initialize the disk group by running the following command:
# cat dg.dump | vxprint -D - -d -F "vxdg -g <mydg> adddisk %name=%last_da_name" > dg.sh
5. Edit the dg.sh file and change the first line to:
# vxdg init <mydg> <daname>=<last_daname>
6. Make the dg.sh file runnable by running the following command:
# chmod 755 dg.sh
7. Create a file that can be used to rebuild the VM configuration by running the following command:
# cat dg.dump | vxprint -D - -hvpsm > dg.maker
8. Initialize the disk group by running dg.sh:
# ./dg.sh
9. If this script results in the error Disk is already in use by another system, the private region on each disk that will be added to the disk group must be initialized. This task can be done by running the following command:
# vxdisksetup -i <da_name>
10. Rebuild the VM configuration by running the following command:
# vxmake -g <mydg> -d dg.maker
11. Start the volumes by running the following command:
# vxvol -g <mydg> start <volume>
2. List all the known disk groups on the system by running the following command:
#vxdisk -o alldgs list
3. Import the Remote Mirror and Copy disk group information by running the following command:
#vxdg -C import <disk_group_name>
4. Check the status of volumes in all disk groups by running the following command:
#vxprint -Ath
5. Bring the disk group online by running one of the following commands:
#vxvol -g <disk_group_name> startall
#vxrecover -g <disk_group_name> -sb
6. Perform a consistency check on the file systems in the disk group by running the following command:
#fsck -V vxfs /dev/vx/dsk/<disk_group_name>/<volume_name>
7. Mount the file system for use by running the following command:
#mount -V vxfs /dev/vx/dsk/<disk_group_name>/<volume_name> /<mount_point>
When you finish with the Remote Mirror and Copy secondary volume, complete the following steps:
1. Unmount the file systems in the disk group by running the following command:
#umount /<mount_point>
2. Take the volumes in the disk group offline by running the following command:
#vxvol -g <disk_group_name> stopall
3. Export the disk group information from the system by running the following command:
#vxdg deport <disk_group_name>
Tip: If you run FlashCopy or Remote Mirror and Copy on only one half of a RAID 1 mirror, you must force the import of the disk group because not all of the disks are available. Therefore, you must run the following command:
vxdg -f import <disk_group>
However, be aware that this command might cause disk group inconsistencies.
Source preparation
At the source system, a few steps should be performed to ensure that the target copy is consistent. Complete the following steps:
1. Create a map file from your source volume group by running one of the following vgexport commands:
   # vgexport -m <map_file_name> -p /dev/<source_vg_name>
   # vgexport -m <map_file_name> -p -s /dev/<source_vg_name>

   Attention: Using the -p option provides a preview of the existing map file without exporting anything. You can use the -s option with vgexport, but when you copy data from production, the VGID is the same as on the production disks. The preferred approach is to avoid duplicating VGIDs in the VG map file unless the target system is a remote host, where the VGID can be the same as on the production system.
   Tip: The generated map files must be transferred to the target system.
2. At the source system, plan a strategy to quiesce your volumes, whether you use Consistency Groups or stop all the I/O on the source. For more information about FlashCopy characteristics, see Chapter 7, "FlashCopy overview" on page 45.

The target system, that is, the system that you mount or use to access the data that will be copied, must be prepared before you perform any copy procedures. To do so, complete the following steps:
1. Unmount all file systems that are currently mounted on the target system for the copy by running the following command:
   # umount </fs_destination_name>
   Attention: If this copy is the first one, the target volumes are not mounted and this step can be skipped.
2. Vary off the target volume groups by running the following command:
   # vgchange -a n /dev/<target_vg_name>
3. If the target volume group exists, remove it by running vgexport. The target volumes cannot be members of a volume group when the vgimport command is run.
   # vgexport -m /dev/null /dev/<target_vg_name>
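The source and target preparation can be summarized in one sequence. This is a sketch with placeholder names, and the map file must be transferred to the target host by whatever method you use.

```
#!/bin/sh
# Sketch: prepare source and target for FlashCopy (placeholder names).
# On the source system: generate the map file (-p previews, nothing is exported).
vgexport -m /tmp/srcvg.map -p -s /dev/srcvg

# Transfer /tmp/srcvg.map to the target system, then on the target:
umount /fs_destination                 # skip on the first copy - nothing mounted
vgchange -a n /dev/targetvg            # vary off the target volume group
vgexport -m /dev/null /dev/targetvg    # remove the old volume group definition
```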
FlashCopy execution
To run FlashCopy, complete the following steps:
1. Unmount all the file systems in the source volume group or quiesce the I/O before the FlashCopy execution.
   Consistency Groups: For more information about Consistency Groups with FlashCopy, see 8.2, "Consistency Group FlashCopy" on page 61.
2. Perform the FlashCopy with the appropriate options for your copy strategy.
3. Remount all the file systems in the source volume group if they were mounted, or release the system from the quiesce of its I/O.
4. If the new disks were not previously discovered on the target system, run ioscan -fnC disk and insf -e.
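In DS CLI terms, step 2 might look like the following sketch. The device ID and volume pairs are examples only, and you should verify the mkflash and unfreezeflash options (for example, -freeze for Consistency Group FlashCopy) against your DS CLI level.

```
# Hypothetical DS CLI sequence (device ID and volume ranges are examples only)
dscli> mkflash -dev IBM.2107-75ABTV1 -freeze 1000-1003:1100-1103
dscli> unfreezeflash -dev IBM.2107-75ABTV1 10
dscli> lsflash -dev IBM.2107-75ABTV1 1000-1003
```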
Post-copy steps
The backup procedure is completed. Now, mount the FlashCopy volumes. Before you continue, decide whether you are mounting the destination volumes on the same system or on another system.

If you mount the volumes on the same system, change the Volume Group ID on each DS volume in the FlashCopy target that belongs to the same volume group when FlashCopy is finished. The Volume Group ID for every volume in the FlashCopy target volume group must be modified on the same command line. If you do not do this task, there will be a mismatch of Volume Group IDs within the volume group, and a new copy must be generated. The only way to resolve this issue is to perform the FlashCopy again and reassign the Volume Group IDs by running the same command:
   # vgchgid -f </dev/rdsk/c#t#d#_1> ... </dev/rdsk/c#t#d#_n>
This step is not needed if another system is used to access the target devices.

Attention: Before you do this step, make sure that the c#t#d# matches the destination DS volume ID that was copied.

Tip: Use the SDD output to obtain the disk IDs:
   # datapath query device
With HP-UX 11.31, the output of scsimgr get_info -D /dev/rdisk/<diskN> provides the device serial number or the WWID.

The following steps should be completed whether you mount the destination volumes on the same system or on another system:
1. Create the volume group for the FlashCopy target by running the following commands:
   # mkdir /dev/<target_vg_name>
   # mknod /dev/<target_vg_name>/group c <lvm_major_no> <next_available_minor_no>
   Run lsdev -C lvm to determine the major device number for Logical Volume Manager objects. To determine the next available minor number, examine the minor number of the group file in each volume group directory by running ls -l.
2. Import the FlashCopy target volumes into the volume group by running vgimport:
   # vgimport -m <map_file_name> -v /dev/<target_vg_name> </dev/dsk/c#t#d#_1> ... </dev/dsk/c#t#d#_n>
   On another system where you ran vgexport with the -s option, you can also import the target volumes by using the -s option:
   # vgimport -m <map_file_name> -s /dev/<target_vg_name>
3. Activate the new volume group by running the following command:
   # vgchange -a y /dev/<target_vg_name>
4. Perform a full file system check on the logical volumes in the target volume group. This action is necessary to apply any changes in the JFS intent log to the file system and mark the file system as clean. Run the following command:
   # fsck -o full -y /dev/<target_vg_name>/<rlvol>
   If the logical volume contains a VxFS file system, run the following command instead:
   # fsck -F vxfs -o full -y /dev/<target_vg_name>/<rlvol>
5. If the logical volume contains a VxFS file system, mount the target logical volumes on the system by running the following command:
   # mount -F vxfs /dev/<target_vg_name> /<lvln_mount_point>

When access to the FlashCopy target volume is no longer required, unmount the file systems and vary off the volume group by running the following command:
   # vgchange -a n /dev/<target_vg_name>

If no changes are made to the source volume group before the subsequent FlashCopy, vary on the volume group and perform a full file system consistency check, as shown in steps 4 and 5.
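For the same-host case, the post-copy sequence can be sketched end to end as follows. All names, device files, and major/minor numbers are placeholders; in particular, verify the disk names against the SDD output before you run vgchgid.

```
#!/bin/sh
# Sketch: mount FlashCopy targets on the same HP-UX host (placeholder names).
# Give all target disks of the volume group a new VGID in one command.
vgchgid -f /dev/rdsk/c10t0d1 /dev/rdsk/c10t0d2

mkdir /dev/copyvg
mknod /dev/copyvg/group c 64 0x030000   # check lsdev -C lvm and ls -l first
vgimport -m /tmp/srcvg.map -v /dev/copyvg /dev/dsk/c10t0d1 /dev/dsk/c10t0d2
vgchange -a y /dev/copyvg

# A full fsck replays the JFS intent log and marks the file system clean.
fsck -F vxfs -o full -y /dev/copyvg/rlvol1
mount -F vxfs /dev/copyvg/lvol1 /copydata
```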
The VMs stored on this data store can then be opened on the ESX host. To assign the existing virtual disks to new VMs, in the Add Hardware Wizard window, select Use an existing virtual disk and choose the .vmdk file you want to use (Figure A-6).
If the LUNs that were copied with FlashCopy were assigned as RDMs, the target LUNs can be assigned to a VM by creating an RDM for this VM. In the Add Hardware Wizard window, select Raw Device Mapping and use the same parameters as on the source VM.

Shutting down the source VM: If you do not shut down the source VM, you might not be able to use the target LUNs because of reservations.
Figure A-7 shows an ESX host with two virtual machines, each one using a virtual disk. The ESX host has one VMFS data store that consists of two DS8000 LUNs, LUN 1 and LUN 2. To get a complete copy of the VMFS data store, you must copy both LUNs with FlashCopy. By using FlashCopy on VMFS LUNs, it is easy to create backups of whole VMs.
Figure A-8 Using FlashCopy within a VM - HDD1 is the source for the target HDD2
Figure A-9 Using FlashCopy between two different VMs - VM1's HDD1 is the source for HDD2 in VM2
In Figure A-10, FlashCopy is being used on two volumes. LUN 1 is used for a VMFS data store, while LUN 2 is assigned to VM2 as an RDM. These two LUNs are then copied with FlashCopy and attached to another ESX Server host. On ESX host 2, you now assign the VDisk that is stored on the VMFS partition on LUN 1' to VM3 and attach LUN 2' through RDM to VM4. By doing this task, you can create a copy of ESX host 1's virtual environment and use it on ESX host 2.

Note: If you use FlashCopy on VMFS volumes and assign them to the same ESX Server host, the server does not allow the target to be used because the VMFS volume identifiers are duplicated. To circumvent this situation, you can use the VMFS Volume Resignaturing feature of VMware ESX Server. For more information about resignaturing, see the VMware Knowledge Base, found at:
http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1011387
One such limitation is that the mkpprc -tgtread command is not supported. VMware cannot use VMFS-formatted volumes or raw System LUNs in virtual compatibility mode without writing to the disk. However, it might be possible to use raw System LUNs in physical compatibility mode. Check with IBM Support about the supportability of this procedure.

At a high level, here are the steps for creating a Remote Mirror and Copy:
1. Shut down the guest operating system on the target ESX Server.
2. Establish Remote Mirror and Copy from the source volumes to the target volumes.
   Important: You should run mkpprc -resetreserve when you establish the Remote Mirror and Copy. Otherwise, you might receive the following message on the target machine: "Cannot create partition table for disk vmhbax:x:x because geometry info is invalid. Please rescan." This command removes hardware protection on the target device and should be used with caution.
3. When the initial copy completes and the volumes are in Full Duplex mode, suspend or remove the Remote Mirror and Copy relationship.
4. Issue the Rescan command on the target ESX Server.
5. If they are not already assigned to the target virtual machine, assign the Remote Mirror and Copy volumes to the target virtual machine. Virtual disks on VMFS volumes should be assigned as existing volumes, while raw volumes should be assigned as RDMs by using the same parameters as on the source host.
6. Start the virtual machine and, if necessary, mount the target volumes.
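Steps 2 and 3 of this procedure might look like the following DS CLI sketch. Device IDs and volume ranges are hypothetical, and -resetreserve should be used with the caution noted above because it removes hardware protection on the target.

```
# Hypothetical DS CLI sequence for the VMware scenario (example IDs only)
dscli> mkpprc -dev IBM.2107-7520781 -remotedev IBM.2107-75ABTV1 \
       -type mmir -resetreserve 1200-1203:2200-2203
dscli> lspprc -dev IBM.2107-7520781 1200-1203       # wait for Full Duplex
dscli> pausepprc -dev IBM.2107-7520781 -remotedev IBM.2107-75ABTV1 1200-1203:2200-2203
```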
Figure A-11 shows a scenario similar to the one shown in Figure A-10 on page 739, but now the source and target volumes are on two different DS8000 servers. This setup can be used for disaster recovery solutions where the ESX host 2 is in the backup data center.
For an overview about an automated disaster recovery solution with VMware and DS8000, see Chapter 45, VMware Site Recovery Manager on page 677.
Appendix B.
SNMP notifications
This appendix describes SNMP traps that are sent out in a Remote Mirror and Copy environment. This appendix repeats some of the SNMP trap information that is available in IBM System Storage DS8000: Architecture and Implementation, SG24-8886.

This appendix covers the following topics:
- SNMP overview
- Physical connection events
- Remote Mirror and Copy events
SNMP overview
The DS8000 sends out SNMP traps when a state changes in a remote Copy Services environment. In total, 13 traps are implemented. The 1xx traps are sent out for a state change of a physical link connection. The 2xx traps are sent out for state changes in the logical Copy Services setup. The DS HMC can be set up to send SNMP traps to up to two defined IP addresses. IBM Tivoli Storage Productivity Center for Replication (see Chapter 47, "IBM Tivoli Storage Productivity Center for Replication" on page 685) listens to the SNMP traps of the DS8000. In addition, a network management program, such as IBM Tivoli NetView, can be used to catch and process the SNMP traps.
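Setting the trap destinations can be done on the DS HMC or through the DS CLI. The following sketch assumes the chsp command's -snmp and -snmpaddr parameters; check the DS CLI reference for the exact syntax at your code level.

```
# Hypothetical DS CLI sketch: enable SNMP and define up to two trap receivers
dscli> chsp -snmp on -snmpaddr 10.10.10.1,10.10.10.2
dscli> showsp        # verify the SNMP settings of the storage complex
```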
PPRC Links Degraded
UNIT: Mnf Type-Mod SerialNm LS
 PRI:  IBM 2107-922 75-20781 12
 SEC:  IBM 2107-9A2 75-ABTV1 24
Path: Type  PP   PLink  SP   SLink  RC
  1:  FIBRE 0143 XXXXXX 0010 XXXXXX 15
  2:  FIBRE 0213 XXXXXX 0140 XXXXXX OK

If all links are interrupted, a trap 101, as shown in Example B-2, is posted. This event indicates that no communication between the primary and the secondary system is possible.
Example: B-2 Trap 101 - Remote Mirror and Copy links are inoperable
PPRC Links Down
UNIT: Mnf Type-Mod SerialNm LS
 PRI:  IBM 2107-922 75-20781 10
 SEC:  IBM 2107-9A2 75-ABTV1 20
Path: Type  PP   PLink  SP   SLink  RC
  1:  FIBRE 0143 XXXXXX 0010 XXXXXX 17
  2:  FIBRE 0213 XXXXXX 0140 XXXXXX 17

When the DS8000 can communicate again by using any of the links, trap 102, as shown in Example B-3, is sent when one or more of the interrupted links are available again.
Example: B-3 Trap 102 - Remote Mirror and Copy links are operational
PPRC Links Up
UNIT: Mnf Type-Mod SerialNm LS
 PRI:  IBM 2107-9A2 75-ABTV1 21
 SEC:  IBM 2107-000 75-20781 11
Path: Type  PP   PLink  SP   SLink  RC
  1:  FIBRE 0010 XXXXXX 0143 XXXXXX OK
  2:  FIBRE 0140 XXXXXX 0213 XXXXXX OK
Table B-1 shows the Remote Mirror and Copy return codes.
Table B-1 Remote Mirror and Copy return codes

Return code  Description
02   Initialization failed. The ESCON link reject threshold was exceeded when attempting to send ELP or RID frames.
03   Timeout. No reason is available.
04   There are no resources available in the primary storage unit for establishing logical paths because the maximum number of logical paths is already established.
05   There are no resources available in the remote storage unit for establishing logical paths because the maximum number of logical paths is already established.
06   There is a remote storage unit sequence number, or logical subsystem number, mismatch.
07   There is a secondary LSS subsystem identifier (SSID) mismatch, or a failure of the I/O that collects the secondary information for validation.
08   The ESCON link is offline. This situation is caused by the lack of light detection that is coming from a host, peer, or switch.
09   The establish failed. It is tried again until the command succeeds or a remove paths command is run for the path. The attempt-to-establish state persists until the establish path operation succeeds or the remove Remote Mirror and Copy paths command is run for the path.
0A   The primary storage unit port or link cannot be converted to channel mode if a logical path is established on the port or link. The establish paths operation is not tried again within the storage unit.
10   Configuration error. The source of the error is one of the following items:
     - The specification of the SA ID does not match the installed ESCON adapters in the primary controller.
     - For ESCON paths, the remote storage unit destination address is zero and an ESCON Director (switch) is found in the path.
     - For ESCON paths, the remote storage unit destination address is not zero and an ESCON Director does not exist in the path, that is, the path is a direct connection.
14   The Fibre Channel path link is down.
15   The maximum number of Fibre Channel path retry operations was exceeded.
16   The Fibre Channel path secondary adapter is not Remote Mirror and Copy capable. This return code could be caused by one of the following conditions:
     - The secondary adapter is not configured correctly or does not have the current firmware installed.
     - The secondary adapter is already a target of 32 different logical subsystems (LSSs).
17   The secondary adapter Fibre Channel path is not available.
18   The maximum number of Fibre Channel path primary login attempts was exceeded.
19   The maximum number of Fibre Channel path secondary login attempts was exceeded.
1A   The primary Fibre Channel adapter is not configured correctly or does not have the correct firmware level installed.
1B   The Fibre Channel path was established but degraded because of a high failure rate.
1C   The Fibre Channel path was removed because of a high failure rate.
LSS-Pair Consistency Group PPRC-Pair Error
UNIT: Mnf Type-Mod SerialNm LS LD SR
 PRI:  IBM 2107-922 75-03461 56 84 08
 SEC:  IBM 2107-9A2 75-ABTV1 54 84

Trap 202, as shown in Example B-5, is sent if a remote copy pair goes into a suspend state. The trap contains the serial number (SerialNm) of the primary and secondary machine, the logical subsystem or LSS (LS), and the logical device (LD). To avoid SNMP trap flooding, the number of SNMP traps for the LSS is throttled. The complete suspended-pair information is represented in the summary. The last row of the trap represents the suspend state for all pairs in the reporting LSS. The suspended-pair information contains a hexadecimal string with a length of 64 characters. By converting this hex string into binary, each bit represents a single device. If the bit is 1, the device is suspended; otherwise, the device is still in Full Duplex mode.

Trap 200 triggers: This alert can also show up depending on your actions. For example, the alert is triggered if you manually suspend the replication.
Example: B-5 Trap 202 - primary Remote Mirror and Copy devices on the LSS were suspended because of an error
Primary PPRC Devices on LSS Suspended Due to Error UNIT: Mnf Type-Mod SerialNm LS LD SR PRI: IBM 2107-922 75-20781 11 00 03 SEC: IBM 2107-9A2 75-ABTV1 21 00 Start: 2005/11/14 09:48:05 CST PRI Dev Flags (1 bit/Dev, 1=Suspended): C000000000000000000000000000000000000000000000000000000000000000
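The 64-character device-flag string in Example B-5 can be decoded with standard tools. The following sketch (the helper name suspended_devs is ours, not part of any DS8000 tooling) expands each hex digit to 4 bits and prints the device numbers whose bit is 1; for the C000... string above, those are devices 00 and 01.

```shell
#!/bin/sh
# Decode a trap 202 "PRI Dev Flags" hex string: bit n set => device n suspended.
# suspended_devs is our own helper name, not a DS8000 utility.
suspended_devs() {
    dev=0
    echo "$1" | fold -w1 | while read nibble; do
        case "$nibble" in                 # expand one hex digit to 4 bits
            0) bits=0000 ;; 1) bits=0001 ;; 2) bits=0010 ;; 3) bits=0011 ;;
            4) bits=0100 ;; 5) bits=0101 ;; 6) bits=0110 ;; 7) bits=0111 ;;
            8) bits=1000 ;; 9) bits=1001 ;; A) bits=1010 ;; B) bits=1011 ;;
            C) bits=1100 ;; D) bits=1101 ;; E) bits=1110 ;; F) bits=1111 ;;
        esac
        for i in 1 2 3 4; do
            if [ "$(echo "$bits" | cut -c$i)" = "1" ]; then
                printf '%02X\n' "$dev"    # device number within the LSS
            fi
            dev=$((dev + 1))
        done
    done
}

flags="C000000000000000000000000000000000000000000000000000000000000000"
suspended_devs "$flags"    # devices 00 and 01 are suspended in Example B-5
```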
Asynchronous PPRC Initial Consistency Group Successfully Formed
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-922 75-20781
Session ID: 4002
Trap 211, shown in Example B-7, is sent if the Global Mirror setup goes into a severe error state where no attempts are made to form a consistency group.
Example: B-7 Trap 211 - Global Mirror session is in a fatal state
Asynchronous PPRC Session is in a Fatal State
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-922 75-20781
Session ID: 4002

Trap 212, shown in Example B-8, is sent when a consistency group cannot be created in a Global Mirror relationship. Here are some possible reasons:
- Volumes were taken out of a copy session.
- The Remote Copy link bandwidth might not be sufficient.
- The FC link between the primary and secondary system is not available.
Example: B-8 Trap 212 - Global Mirror consistency group failure - Retry will be attempted

Asynchronous PPRC Consistency Group Failure - Retry will be attempted
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-922 75-20781
Session ID: 4002
Trap 213, shown in Example B-9, is sent when a consistency group in a Global Mirror environment can be formed after a previous consistency group formation failure.
Example: B-9 Trap 213 - Global Mirror consistency group successful recovery
Asynchronous PPRC Consistency Group Successful Recovery
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-9A2 75-ABTV1
Session ID: 4002

Trap 214, shown in Example B-10, is sent if a Global Mirror session is terminated by running the DS CLI command rmgmir or the corresponding GUI function.
Example: B-10 Trap 214 - Global Mirror Master terminated
Asynchronous PPRC Master Terminated
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-922 75-20781
Session ID: 4002

Trap 215, shown in Example B-11, is sent if, in the Global Mirror environment, the master detects a failure to complete the FlashCopy commit. The trap is sent after a number of commit tries fail.
Example: B-11 Trap 215 - Global Mirror FlashCopy at remote site unsuccessful
Asynchronous PPRC FlashCopy at Remote Site Unsuccessful
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-9A2 75-ABTV1
Session ID: 4002
Trap 216, shown in Example B-12, is sent if a Global Mirror master cannot terminate the Global Copy relationship at one of its subordinates (slave). This situation might occur if the master is terminated by the rmgmir command but the Master cannot terminate the copy relationship on the subordinate. You might need to run rmgmir against the subordinate to prevent any interference with other Global Mirror sessions.
Example: B-12 Trap 216 - Global Mirror subordinate termination unsuccessful
Asynchronous PPRC Slave Termination Unsuccessful
UNIT: Mnf Type-Mod SerialNm
 Master: IBM 2107-922 75-20781
 Slave:  IBM 2107-921 75-03641
Session ID: 4002

Trap 217, shown in Example B-13, is sent if a Global Mirror environment was suspended by the DS CLI command pausegmir or the corresponding GUI function.
Example: B-13 Trap 217 - Global Mirror paused
Asynchronous PPRC Paused
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-9A2 75-ABTV1
Session ID: 4002

Trap 218, shown in Example B-14, is sent if a Global Mirror exceeds the allowed threshold for failed consistency group formation attempts.
Example: B-14 Trap 218 - Global Mirror number of consistency group failures exceed threshold
Global Mirror number of consistency group failures exceed threshold
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-9A2 75-ABTV1
Session ID: 4002

Trap 219, shown in Example B-15, is sent if a Global Mirror successfully forms a consistency group after one or more formation attempts previously failed.
Example: B-15 Trap 219 - Global Mirror first successful consistency group after previous failures
Global Mirror first successful consistency group after prior failures
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-9A2 75-ABTV1
Session ID: 4002

Trap 220, shown in Example B-16, is sent if a Global Mirror exceeds the allowed threshold of failed FlashCopy commit attempts.
Example: B-16 Trap 220 - Global Mirror number of FlashCopy commit failures exceed threshold
Global Mirror number of FlashCopy commit failures exceed threshold
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-9A2 75-ABTV1
Session ID: 4002
The system resource codes (SR) report the reason for a suspension:

03   The host system sent a command to the primary volume of a Remote Mirror and Copy volume pair to suspend copy operations. The host system might have specified either an immediate suspension or a suspension after the copy completed and the volume pair reached a Full Duplex state.
04   The host system sent a command to suspend the copy operations on the secondary volume. During the suspension, the primary volume of the volume pair can still accept updates, but updates are not copied to the secondary volume. The out-of-sync tracks that are created between the volume pair are recorded in the change recording feature of the primary volume.
05   Copy operations between the Remote Mirror and Copy volume pair were suspended by a primary storage unit secondary device status command. This system resource code can be returned only by the secondary volume.
06   Copy operations between the Remote Mirror and Copy volume pair were suspended because of internal conditions in the storage unit. This system resource code can be returned by the control unit of either the primary volume or the secondary volume.
07   Copy operations between the Remote Mirror and Copy volume pair were suspended when the remote storage unit notified the primary storage unit of a state change transition to simplex state. The specified volume pair between the storage units is no longer in a copy relationship.
08   Copy operations were suspended because the secondary volume was suspended as a result of internal conditions or errors. This system resource code can be returned only by the primary storage unit.
09   The Remote Mirror and Copy volume pair was suspended when the primary or remote storage unit was rebooted or when the power was restored. The paths to the remote storage unit might not be disabled if the primary storage unit was turned off. If the remote storage unit was turned off, the paths between the storage units are restored automatically, if possible. After the paths are restored, run mkpprc to resynchronize the specified volume pairs. Depending on the state of the volume pairs, you might have to run rmpprc to delete the volume pairs and run mkpprc again to reestablish them.
0A   The Remote Mirror and Copy pair was suspended because the host ran a command to freeze the Remote Mirror and Copy group. This system resource code can be returned only if a primary volume was queried.
Space-Efficient Repository or Over-provisioned Volume has reached a warning watermark
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-9A2 75-ABTV1
Session ID: 4002

Here are the conditions under which a trap 223 is sent:
- The extent status is not zero (the available space is below the threshold) when the first ESE volume is configured.
- The extent status changes state if ESE volumes are configured in the extent pool.

Example B-18 shows a generated event trap 223.
Example: B-18 Trap 223 - SNMP trap alert message 223
Extent Pool Capacity Threshold Reached
UNIT: Mnf Type-Mod SerialNm
 IBM 2107-922 75-03460
Extent Pool ID: P1
Limit: 95%
Threshold: 95%
Status: 0
Object ID (OID)             Description
1.3.6.1.4.1.2.6.208.0.3     The state of session X changed to Prepared.
1.3.6.1.4.1.2.6.208.0.4     The state of session X changed to Suspended.
1.3.6.1.4.1.2.6.208.0.5     The state of session X changed to Recovering.
1.3.6.1.4.1.2.6.208.0.6     The state of session X changed to Target Available.
1.3.6.1.4.1.2.6.208.0.19    The state of session X changed to Suspending.
1.3.6.1.4.1.2.6.208.0.20    The state of session X changed to SuspendedH2H3.
1.3.6.1.4.1.2.6.208.0.21    The state of session X changed to SuspendedH1H3.
1.3.6.1.4.1.2.6.208.0.22    The state of session X changed to Flashing.
1.3.6.1.4.1.2.6.208.0.23    The state of session X changed to Terminating.
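For scripting around a trap receiver, the table can be turned into a simple lookup. The helper name session_state and the idea of matching on the OID string are ours; the OID values are the ones from the table above.

```shell
#!/bin/sh
# Map a Copy Services state-change trap OID (from the table above) to the
# session state it reports. session_state is our own helper name.
session_state() {
    case "$1" in
        1.3.6.1.4.1.2.6.208.0.3)  echo "Prepared" ;;
        1.3.6.1.4.1.2.6.208.0.4)  echo "Suspended" ;;
        1.3.6.1.4.1.2.6.208.0.5)  echo "Recovering" ;;
        1.3.6.1.4.1.2.6.208.0.6)  echo "Target Available" ;;
        1.3.6.1.4.1.2.6.208.0.19) echo "Suspending" ;;
        1.3.6.1.4.1.2.6.208.0.20) echo "SuspendedH2H3" ;;
        1.3.6.1.4.1.2.6.208.0.21) echo "SuspendedH1H3" ;;
        1.3.6.1.4.1.2.6.208.0.22) echo "Flashing" ;;
        1.3.6.1.4.1.2.6.208.0.23) echo "Terminating" ;;
        *)                        echo "unknown" ;;
    esac
}

session_state 1.3.6.1.4.1.2.6.208.0.22   # prints: Flashing
```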
Important: For communication-failure traps, after an SNMP trap for a failure is sent, it is not resent unless communication is reestablished and failed again.
1.3.6.1.4.1.2.6.208.0.15
1.3.6.1.4.1.2.6.208.0.16
1.3.6.1.4.1.2.6.208.0.17
1.3.6.1.4.1.2.6.208.0.18
Trap 101 (MM/GM/MGM)
Error message: PPRC Links Down. This error can come first on any of the available systems that report the message.
Possible actions: This error message can also be reported after a trap 102. In this case, it means that the copy was suspended on the primary because of connectivity problems. Recheck the connectivity and resume the PPRC copy.

Trap 202 (MM/GM/MGM)
Source error message: Primary PPRC Devices on LSS Suspended Due to Error. Target error message: None.
Possible actions: The PPRC relationship was manually paused, or an error with the volume on the primary system caused the error. Check the volume status and connectivity, and resume operations after you correct the issues. If the volume status depends on the DDM status, call IBM for a complete health check of the system before you continue. This message can also come after you remove a PPRC relationship; in this case, no further action is needed.

Trap 218 (GM/MGM)
Target error message: None.
Possible actions: Check the connectivity and bandwidth capacities. Recheck your session configurations on GM/MGM. Also, this error can appear when a secondary site has issues on LUNs.

Trap 210 (GM/MGM)
Source error message: Global Mirror first successful Consistency Group after prior failures. Target error message: None.
Possible actions: There is no action needed for this trap. It means only that Global Mirror could successfully recover after a previous failure.

Trap 214
Possible actions: The session was manually terminated. Reestablish a new session between the primary/secondary or tertiary sites.

Trap 221 (GM/MGM)
Source error message: Space-Efficient Repository or Over-provisioned Volume has reached a warning watermark. Target error message: None.
Possible actions: Check your TSE or ESE repository sizes. Resume PPRC operations after you correct the problem.

Trap 223 (MM/GM/MGM)
Source error message: Extent Pool Capacity Threshold Reached.
Possible actions: Check your TSE or ESE repository sizes. Resume PPRC operations after you correct the problem.
Appendix C.
Resource Groups
There is a trend to consolidate multiple separate storage systems into fewer and larger ones. Company acquisitions, mergers, or outsourcing contracts create the need to accommodate different organizations (in this context, usually called tenants) on the same storage system. This multi-tenancy often mandates a clear separation of administrative access to certain resources and functions. This appendix describes how the IBM System Storage DS8000 family can handle multi-tenant Copy Services configurations by using a feature called Resource Groups. The appendix describes the ideas and concepts behind this feature and explains its usage with two simple examples. For more information about Resource Groups, see IBM System Storage DS8000 Copy Services Scope Management and Resource Groups, REDP-4758.

This appendix covers the following topics:
- Overview of Resource Groups
- Functional description
- Implementation examples
Important: The user definitions that were described apply to the users that are intended to manage the Copy Services relationships of the respective tenants. Their DS8000 user IDs must be restricted by placing them in the op_copy_services User Group. To create and maintain the Resource Group configuration and the associated user IDs, you must be logged on to the DS8000 storage system with a user ID that has administration scope (the admin User Group). In mainframe environments, Copy Services are sometimes managed inband (using TSO or ICKDSF) through the host interface (FICON connection) and a connection device. In this case, you do not have a user ID that you can use to define access to a Resource Group. Instead, it is controlled by adding the connection device as a resource to the Resource Group. Figure C-2 illustrates the relationships of the objects if Resource Groups are managed through connection devices.
Access methods: Both methods of access control can be mixed within one DS8000 storage system.
Functional description
This section describes the Resource Group functionality in more detail.
Figure C-3 Basic Resource Group attributes and the relationship to users and resources
Remote relationships
The Resource Groups feature is intended to manage the scope of Copy Services, which in many cases affect more than one storage system (with all Remote Copy functions). Therefore, Resource Group definitions and scopes are also observed on the target storage systems that are involved in the Copy Services operations.
To manage remote relations, use the Copy Services Global Resource Scope attribute (CS_GRS). A Copy Services relationship is allowed if the primary (local) Resource Group's RGL matches the secondary (remote) Resource Group's CS_GRS, and vice versa, as illustrated in Figure C-4.
Tip: In the example in Figure C-4, the RGL and CS_GRS are different. We set them this way to clarify how remote relationships work. They do not have to be different. Using an identical string for RGL and CS_GRS is allowed. The RG identifiers also do not have to be different. Tip: You can also use an RGL to CS_GRS relationship to allow Copy Services between resources of two different RGs within a single DS8000 storage system.
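The matching rule can be expressed as a small predicate. This is a simplified sketch of our own: it compares the scopes as exact strings, whereas real resource scopes can also be wildcard patterns (for example, *).

```shell
#!/bin/sh
# Simplified check of the remote-relationship rule: allowed when the local
# RGL matches the remote CS_GRS and the remote RGL matches the local CS_GRS.
cs_relationship_allowed() {
    local_rgl=$1; local_grs=$2; remote_rgl=$3; remote_grs=$4
    [ "$local_rgl" = "$remote_grs" ] && [ "$remote_rgl" = "$local_grs" ]
}

# The Figure C-4 values: local RGL=TenantA, CS_GRS=RemoteA;
# remote RGL=RemoteA, CS_GRS=TenantA.
if cs_relationship_allowed TenantA RemoteA RemoteA TenantA; then
    echo "allowed"
else
    echo "denied"
fi
```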
Default behavior
Resource Groups is not a licensed feature. It is available automatically when you have a DS8000 microcode level installed that supports Resource Groups. When the feature becomes available through a microcode update, all available and newly created resources are assigned to a Default Resource Group (RG0). All users are associated with Resource Group Scope * and therefore have access to all resources. A system administrator does not have to configure Resource Groups functions when upgrading to a microcode level that supports Resource Groups. By default, the behavior of the storage system is unchanged. There is no change to the behavior of commands and scripts.
Tip: Resources that you do not explicitly assign to a resource group belong to RG0 and therefore are not restricted by the Resource Groups feature. You might consider putting them into their own Resource Group, which no tenant has access to. Thus, you can avoid the situation where tenants, whether accidentally or on purpose, use resources that are not assigned to them. Deleting Resource Groups: Before you can delete a Resource Group, you must assign its resources to a remaining Resource Group. This Resource Group can also be RG0.
Special attributes
Resource Groups can have a number of additional attributes for special purposes. The following attributes are available: Pass-thru Global Resource Scope (P_GRS): Pass-thru mode allows a connection device to manage Copy Services relationships in LSSs other than its own. P_GRS is a resource scope that is used to validate whether a connection volume can pass through to a destination volume. GM Sessions Allowed: A bit mask of those Global Mirror session numbers (1 - 255) that can be assigned to an LSS associated with the RG. It allows the assignment of GM session numbers to tenants. GM Masters Allowed: A bit mask of those GM session numbers (1 - 255) that can be used to manage a Global Mirror Master through an LSS associated with the RG. It limits which Storage Facility Image a GM Master can run on. If you need more information about these special attributes, see IBM System Storage DS8000 Copy Services Scope Management and Resource Groups, REDP-4758.
Implementation examples
This section explains how you can implement Resource Groups using two simple examples. Fixed block versus CKD resources: We describe the examples using fixed block resources. They are not fixed block specific, though. You can apply them for CKD resources as well.
To define a new Resource Group, either click Action → Create or right-click anywhere in the list and select Create. In the dialog box that opens, define the basic attributes of the new Resource Group, as shown in Figure C-6.
For our simple example, this action is sufficient, but you could change the special attributes, including the Resource Group Identifier (RG), by clicking Advanced.
Appendix C. Resource Groups
You can now see the new Resource Group in the list, as shown in Figure C-7. The RG ID was automatically assigned because you did not specify it explicitly.

Resource Group Identifier: In the DS GUI, the Resource Group Identifier (RG) is shown in the ID column.
Select a URS from the ones that exist. Specify the group assignment as Copy Services Operator because Resource Groups affect only Copy Services operations. You could also assign a URS to an existing user through the Modify User window.
Figure C-9 shows the new user and its scope in the user list. Users with a User Group other than Copy Services Operator have the URS * and thus are not restricted by any Resource Group definitions.
Attention: There is another user that is called itso_rg_earth with a different URS. We use this user later in this example to show how Resource Groups limit access.
Assigning resources
Now we map a set of logical volumes to our Resource Group by setting the volumes' Resource Group attribute. To do so, click either Volumes → Fixed Block Volumes → Manage Volumes or Volumes → CKD LCUs and Volumes → Manage existing LCUs and Volumes. Select the volumes that you want to change, right-click, and select Properties. Figure C-10 shows the list of volumes and the opened Multiple Volumes Properties window that you can use to set the volumes' RG attribute all in one step.

Assigning the Resource Group: You could have also assigned the Resource Group attribute in a similar fashion to the LSSs or LCUs as a whole by using the Manage LSS or Manage LCU windows, respectively.
Figure C-10 Assign the Resource Group attribute to the logical volumes
Figure C-11 shows the list of modified volumes with their Resource Group assignments.
Viewing the results: By default, the Resource Group column is located farther to the right in the Volumes window. We changed its position for clarity.
As a counter example, log on with a different user ID (itso_rg_earth) that does not have the matching URS. As shown in Figure C-13, you cannot see the volumes that you want to use in the Create FlashCopy window.
When you try to establish the FlashCopy relationships with the same user ID using the DS CLI, you get the error messages shown in Example C-1.
Example: C-1 Refused FlashCopy establish because of a Resource Group conflict
dscli> mkflash 4a00-4a03:4c00-4c03
CMUN03176E mkflash: 4A00:4C00: The task cannot be initiated because a user resource scope policy violation has occurred ...
CMUN03176E mkflash: 4A01:4C01: The task cannot be initiated because a user resource scope policy violation has occurred ...
CMUN03176E mkflash: 4A02:4C02: The task cannot be initiated because a user resource scope policy violation has occurred ...
CMUN03176E mkflash: 4A03:4C03: The task cannot be initiated because a user resource scope policy violation has occurred ...
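When the DS CLI is driven from a script, output like Example C-1 can be scanned for the CMUN03176E message code to detect resource scope violations. The following Python sketch is a hypothetical helper (the parsing function is ours, not part of the DS CLI); the sample text is taken from Example C-1:

```python
import re

# The CMUN03176E code indicates a user resource scope policy violation.
def failed_pairs(dscli_output):
    """Return the source:target volume pairs refused with CMUN03176E."""
    pattern = re.compile(r"CMUN03176E \w+: ([0-9A-F]{4}):([0-9A-F]{4}):")
    return [(m.group(1), m.group(2)) for m in pattern.finditer(dscli_output)]

output = """\
CMUN03176E mkflash: 4A00:4C00: The task cannot be initiated because a user resource scope policy violation has occurred ...
CMUN03176E mkflash: 4A01:4C01: The task cannot be initiated because a user resource scope policy violation has occurred ...
"""
print(failed_pairs(output))  # [('4A00', '4C00'), ('4A01', '4C01')]
```

A wrapper script could use such a check to stop a larger automation sequence as soon as a scope violation is reported.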
Figure: Metro Mirror scenario used in the following example - user itso_rg_venus (URS venus); primary volumes 4A00 - 4A03 in RG3 (RGL venus, CS_GRS venus_sec); secondary Resource Group RG1 (RGL venus_sec, CS_GRS venus).
Local configuration
Start on the local site by creating the Resource Group and defining the CS_GRS, as shown in Example C-2.
Example: C-2 Create and modify a Resource Group on the primary system
dscli> mkresgrp -label venus -name "Resources_Client_Venus"
Date/Time: June 6, 2012 11:13:59 AM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75TV181
CMUC00432I mkresgrp: Resource group RG3 successfully created.
dscli> manageresgrp -ctrl copyglobal -action set -scope venus_sec rg3
Date/Time: June 6, 2012 11:17:39 AM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75TV181
CMUC00436I manageresgrp: Resource group rg3 successfully modified
Note: With the mkresgrp command, you can specify only the basic RG attributes, such as the label, RG, and name. All other attributes are set by running manageresgrp for an existing RG.

Create the user ID with the URS needed to access the Resource Group, as shown in Example C-3.
Example: C-3 Create a user with the required URS
dscli> mkuser -group op_copy_services -scope venus -pw test123 itso_rg_venus
Date/Time: June 6, 2012 11:32:46 AM CEST IBM DSCLI Version: 7.6.30.157 DS:
CMUC00133I mkuser: User itso_rg_venus successfully created.
Assign the volumes that you want to use as Metro Mirror primaries to the Resource Group you created, as shown in Example C-4.
Example: C-4 Assign primary volumes to the Resource Group
dscli> chfbvol -resgrp rg3 4a00-4a03
Date/Time: June 6, 2012 11:41:17 AM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75TV181
CMUC00026I chfbvol: FB volume 4A00 successfully modified.
CMUC00026I chfbvol: FB volume 4A01 successfully modified.
CMUC00026I chfbvol: FB volume 4A02 successfully modified.
CMUC00026I chfbvol: FB volume 4A03 successfully modified.
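The range notation 4a00-4a03 accepted by the DS CLI expands to consecutive hexadecimal volume IDs. The expansion can be sketched in Python as follows (an illustrative helper for scripting around the CLI, not a DS CLI function):

```python
def expand_volume_range(spec):
    """Expand a DS CLI style volume range such as '4a00-4a03'.

    Volume IDs are four hexadecimal digits; a spec without '-' is a
    single volume. Illustrative only.
    """
    if "-" not in spec:
        return [spec.upper()]
    first, last = spec.split("-")
    return [f"{v:04X}" for v in range(int(first, 16), int(last, 16) + 1)]

print(expand_volume_range("4a00-4a03"))  # ['4A00', '4A01', '4A02', '4A03']
```

Such a helper is handy when a script needs to act on each volume of a range individually, for example, to verify the RG attribute of every volume after a chfbvol run.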
Run some DS CLI commands to check whether you did the configuration correctly, as shown in Example C-5.
Example: C-5 Check the local configuration
dscli> showresgrp rg3
Date/Time: June 6, 2012 1:05:09 PM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75TV181
ID                  RG3
Name                Resources_Client_Venus
State               Normal
Label               venus
CS_Global_RS        venus_sec
Passthru_Global_RS  PUBLIC
GM_Masters_Allowed  01-FF
GM_Sessions_Allowed 01-FF
dscli> showuser itso_rg_venus
Date/Time: June 6, 2012 11:45:20 AM CEST IBM DSCLI Version: 7.6.30.157 DS:
Name         itso_rg_venus
Group        op_copy_services
State        active
FailedLogin  0
DaysToExpire 9999
Scope        venus
dscli> lsfbvol -resgrp rg3
Date/Time: June 6, 2012 11:45:38 AM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75TV181
Name             ID   accstate datastate configstate deviceMTM datatype extpool cap (2^30B) cap (10^9B) cap (blocks)
====================================================================================================================
RGvenus_pri_4A00 4A00 Online Normal Normal 2107-900 FB 512 P8 10.0 20971520
RGvenus_pri_4A01 4A01 Online Normal Normal 2107-900 FB 512 P8 10.0 20971520
RGvenus_pri_4A02 4A02 Online Normal Normal 2107-900 FB 512 P8 10.0 20971520
RGvenus_pri_4A03 4A03 Online Normal Normal 2107-900 FB 512 P8 10.0 20971520
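If you run such verification steps regularly, the key/value output of showresgrp can be consumed by a script. The following Python sketch is hypothetical (the parser is ours, and it assumes a two-column key/value layout like the one showresgrp produces; real output can vary by DS CLI version):

```python
def parse_showresgrp(output):
    """Parse two-column key/value output (as from showresgrp) into a dict.

    Illustrative only; not a DS CLI feature.
    """
    result = {}
    for line in output.splitlines():
        parts = line.split(None, 1)     # split on first whitespace run
        if len(parts) == 2:
            result[parts[0]] = parts[1].strip()
    return result

sample = """\
ID                  RG3
Name                Resources_Client_Venus
State               Normal
Label               venus
CS_Global_RS        venus_sec
"""
info = parse_showresgrp(sample)
print(info["Label"], info["CS_Global_RS"])  # venus venus_sec
```

A monitoring script could compare the parsed Label and CS_Global_RS values against the intended configuration and flag any drift.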
Remote configuration
Now, move to the remote site and configure the remote storage system. Again, the first step is to create a Resource Group with a matching RG and CS_GRS, as shown in Example C-6.
Example: C-6 Create and modify the Resource Group on the secondary system
dscli> mkresgrp -label venus_sec -name Targets_for_Client_Venus
Date/Time: June 6, 2012 12:45:25 PM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75ACV21
CMUC00432I mkresgrp: Resource group RG1 successfully created.
dscli> manageresgrp -ctrl copyglobal -action set -scope venus rg1
Date/Time: June 6, 2012 12:46:22 PM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75ACV21
CMUC00436I manageresgrp: Resource group rg1 successfully modified.
A user scope definition is not necessary because all user interaction (creating the Metro Mirror relationships) is done on the primary storage system only.

Important: In a real-world configuration, you most likely need a user with a matching URS on the remote system. If there is a disaster or a test, it must be possible to modify the Copy Services relationships from there.

In the next step, illustrated in Example C-7, assign the secondary volumes to the new Resource Group.
Example: C-7 Assign secondary volumes to the Resource Group
dscli> chfbvol -resgrp RG1 4a00-4a03
Date/Time: June 6, 2012 12:57:15 PM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75ACV21
CMUC00026I chfbvol: FB volume 4A00 successfully modified.
CMUC00026I chfbvol: FB volume 4A01 successfully modified.
CMUC00026I chfbvol: FB volume 4A02 successfully modified.
CMUC00026I chfbvol: FB volume 4A03 successfully modified.
Example C-8 shows the commands you can use to check whether the configuration is correct.
Example: C-8 Check the secondary configuration
dscli> showresgrp rg1
Date/Time: June 6, 2012 1:02:13 PM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75ACV21
ID                  RG1
Name                Targets_for_Client_Venus
State               Normal
Label               venus_sec
CS_Global_RS        venus
Passthru_Global_RS  PUBLIC
GM_Masters_Allowed  01-FF
GM_Sessions_Allowed 01-FF
dscli> lsfbvol -resgrp rg1
Date/Time: June 6, 2012 1:02:22 PM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75ACV21
Name             ID   accstate datastate configstate deviceMTM datatype extpool cap (2^30B) cap (10^9B) cap (blocks)
====================================================================================================================
RGvenus_sec_4A00 4A00 Online Normal Normal 2107-900 FB 512 P4 10.0 20971520
RGvenus_sec_4A01 4A01 Online Normal Normal 2107-900 FB 512 P4 10.0 20971520
RGvenus_sec_4A02 4A02 Online Normal Normal 2107-900 FB 512 P4 10.0 20971520
RGvenus_sec_4A03 4A03 Online Normal Normal 2107-900 FB 512 P4 10.0 20971520
Using strings: In this example, we chose different strings for RGL and CS_GRS to illustrate how the systems resolve the relationships between the local and remote Resource Groups. As stated in Remote relationships on page 758, you can use the same string for both labels.
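The cross-relationship between the labels and scopes on the two systems can be pictured as a simple matching rule: the CS_GRS of each side must match the Resource Group label of the other. The following Python sketch is a deliberate simplification of the real scope matching (which supports richer patterns); here '*' simply stands for "match all", and the function names are ours:

```python
# Simplified sketch of the Resource Group label matching used in this
# Metro Mirror example: each side's Copy Services Global Resource Scope
# (CS_GRS) must match the other side's Resource Group label (RGL).

def scope_matches(scope, label):
    return scope == "*" or scope == label

def mirror_allowed(local_rg, remote_rg):
    """Check both directions of the CS_GRS <-> label relationship."""
    return (scope_matches(local_rg["cs_grs"], remote_rg["label"])
            and scope_matches(remote_rg["cs_grs"], local_rg["label"]))

rg3 = {"label": "venus", "cs_grs": "venus_sec"}    # local RG (Example C-2)
rg1 = {"label": "venus_sec", "cs_grs": "venus"}    # remote RG (Example C-6)
print(mirror_allowed(rg3, rg1))  # True
```

With a non-matching scope, such as a Resource Group labeled earth, the same check fails, which mirrors the refused mkpprc attempt shown in Example C-10.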
In Example C-10, you perform the counter test, trying to establish the same Metro Mirror relationships, but logged on to the primary system using a user ID with the wrong URS.
Example: C-10 Test the Resource Group configuration with the wrong user ID
dscli> whoami
Date/Time: June 6, 2012 1:38:23 PM CEST IBM DSCLI Version: 7.6.30.157 DS:
Name          Group            Policy        Scope
==================================================
itso_rg_earth op_copy_services initialPolicy earth
dscli> mkpprc -remotedev IBM.2107-75ACV21 -type mmir 4a00-4a03:4a00-4a03
Date/Time: June 6, 2012 1:38:39 PM CEST IBM DSCLI Version: 7.6.30.157 DS: IBM.2107-75TV181
CMUN03176E mkpprc: 4A00:4A00: The task cannot be initiated because a user resource scope policy ...
CMUN03176E mkpprc: 4A01:4A01: The task cannot be initiated because a user resource scope policy ...
CMUN03176E mkpprc: 4A02:4A02: The task cannot be initiated because a user resource scope policy ...
CMUN03176E mkpprc: 4A03:4A03: The task cannot be initiated because a user resource scope policy ...
For clarity: The full wording of the error messages in Example C-10 is CMUN03176E mkpprc: 4A00:4A00: The task cannot be initiated because a user resource scope policy violation has occurred on the destination logical volume.
Related publications
The publications that are listed in this section are considered suitable for a more detailed discussion of the topics that are covered in this Redbooks publication.
IBM Redbooks
For information about ordering these publications, see "How to get IBM Redbooks" on page 770. Some of the documents that are referenced here might be available in softcopy only.

- DS8800 Performance Monitoring and Tuning, SG24-8013
- IBM System Storage Business Continuity: Part 1 Planning Guide, SG24-6547
- IBM System Storage DS8000: Architecture and Implementation, SG24-8886
- IBM System Storage DS8000 Copy Services for IBM System z, SG24-6787
- IBM System Storage DS8000 Host Attachment and Interoperability, SG24-8887
- IBM System Storage Solutions Handbook, SG24-5250
- IBM Tivoli Storage Productivity Center V4.2 Release Guide, SG24-7894
Other publications
These publications are also relevant as further information sources. Some of the documents that are referenced here might be available in softcopy only.

- IBM System Storage DS8000: Host Systems Attachment Guide, GC27-2298
- IBM System Storage DS8000: Introduction and Planning Guide, GC27-2297
- IBM System Storage DS Command-Line Interface User's Guide, GC53-1127
- IBM System Storage DS Open Application Programming Interface Reference, GC35-0516
- IBM System Storage Multipath Subsystem Device Driver User's Guide, GC27-2122
Online resources
These websites and URLs are also relevant as further information sources:

- Documentation for the DS8000:
  http://www.ibm.com/systems/storage/disk/ds8000/index.html
- IBM Announcement letters (for example, search for "Rel 6.3"):
  http://www.ibm.com/common/ssi/index.wss
- IBM Disk Storage Feature Activation (DSFA):
  http://www.ibm.com/storage/dsfa
- IBM SAN information:
  http://www.ibm.com/systems/storage/san
- IBM System Storage Interoperation Center (SSIC):
  http://www.ibm.com/systems/support/storage/config/ssic/index.jsp
- IBM Techdocs Library - The IBM Technical Sales Library:
  http://www.ibm.com/support/techdocs/atsmastr.nsf/Web/Techdocs
- PSP information:
  http://www.ibm.com/servers/resourcelink/svc03100.nsf?OpenDatabase
Back cover
In today's highly competitive and real-time environment, the ability to manage all IT operations on a continuous basis makes the creation of copies and backups of data a core requirement for any IT deployment. Furthermore, it is necessary to provide proactive, efficient disaster recovery strategies that can ensure continuous data availability for business operations. The Copy Services functions available with the IBM System Storage DS8000 are part of these strategies.

This IBM Redbooks publication helps you plan, install, configure, and manage the Copy Services functions of the DS8000 when they are used in Open Systems and IBM i environments. It provides the details necessary to implement and control each of the Copy Services functions, and numerous examples illustrate how to use the various interfaces with each of the Copy Services. This book also covers the 3-site Metro/Global Mirror with Incremental Resync feature and introduces the IBM Tivoli Storage Productivity Center for Replication solution.

This book should be read with IBM System Storage DS8000: Architecture and Implementation, SG24-8886. There is also a companion book, IBM System Storage DS8000 Copy Services for IBM System z, SG24-6787, which supports the configuration of the Copy Services functions in z/OS environments.