
DATA CENTER NETWORK CONNECTIVITY WITH IBM SERVERS

Network infrastructure scenario designs and configurations


by Meiji Wang, Mohini Singh Dukes, George Rainovic, Jitender Miglani and Vijay Kamisetty

Juniper Networks Validated Solutions



Preface  v
Chapter 1: Introduction  7
Chapter 2: Design Considerations  19
Chapter 3: Implementation Overview  35
Chapter 4: Connecting IBM Servers in the Data Center Network  45
Chapter 5: Configuring Spanning Tree Protocols  67
Chapter 6: Supporting Multicast Traffic  83
Chapter 7: Understanding Network CoS and Latency  105
Chapter 8: Configuring High Availability  119
Appendix A: Configuring TCP/IP Networking in Servers  144
Appendix B: LAG Test Results  150
Appendix C: Acronyms  154
Appendix D: References  158


© 2010 by Juniper Networks, Inc. All rights reserved. Juniper Networks, the Juniper Networks logo, Junos, NetScreen, and ScreenOS are registered trademarks of Juniper Networks, Inc. in the United States and other countries. Junos-e is a trademark of Juniper Networks, Inc. All other trademarks, service marks, registered trademarks, or registered service marks are the property of their respective owners.

Juniper Networks assumes no responsibility for any inaccuracies in this document. Juniper Networks reserves the right to change, modify, transfer, or otherwise revise this publication without notice.

Products made or sold by Juniper Networks or components thereof might be covered by one or more of the following patents that are owned by or licensed to Juniper Networks: U.S. Patent Nos. 5,473,599, 5,905,725, 5,909,440, 6,192,051, 6,333,650, 6,359,479, 6,406,312, 6,429,706, 6,459,579, 6,493,347, 6,538,518, 6,538,899, 6,552,918, 6,567,902, 6,578,186, and 6,590,785.

Printed in the USA by Vervante Corporation.

Version History: v1, June 2010.

Key Contributors
Chandra Shekhar Pandey is a Juniper Networks Director of Solutions Engineering. He is responsible for service provider, enterprise and OEM partner solutions engineering and validation. Chandra has more than 18 years of networking experience designing ASICs, architecting systems and designing solutions to address customers' challenges in the service provider, MSO and enterprise markets. He holds a bachelor's degree in Electronics Engineering from K.N.I.T., Sultanpur, India and an MBA in High Tech and Finance from Northeastern University, Boston, MA.

Louise Apichell is a Juniper Networks Senior Technical Writing Specialist in the Solutions Marketing Group. She assisted as a content developer, chief editor and project manager in organizing, writing and editing this book. Louise specializes in writing and editing all types of technical collateral, such as white papers, application notes, implementation guides, reference architectures and solution briefs.

Ravinder Singh is a Juniper Networks Director of Solution Architecture and Technical Marketing in the Solutions Marketing Group. He is responsible for creating technical knowledge bases and has significant experience working with sales engineers and channels to support Juniper's cloud and data center solutions for enterprises, service providers and key OEM alliances. Prior to this role, Ravinder was responsible for Enterprise Solutions Architecture and Engineering, where his team delivered several enterprise solutions, including Adaptive Threat Management, Distributed Enterprise and Juniper Simplified Data Center. Ravinder holds bachelor's and master's degrees in Electronics and a master's of business degree in IT Management and Marketing.

Mike Barker is a Juniper Networks Technical Marketing Director, Solutions Engineering and Architectures. In this role, he focuses on developing architectures and validating multi-product solutions that create business value for enterprise and service provider customers. Prior to this role, Mike served in various consulting and systems engineering roles for federal, enterprise and service provider markets at Juniper Networks, Acorn Packet Solutions and Arbor Networks. Earlier in his career, Mike held network engineering positions at Cable & Wireless, Stanford Telecom and the USAF. Mr. Barker holds a Bachelor of Science degree in Business Management from Mount Olive College and an MBA from Mount St. Mary's University.

Karen Joice is a Juniper Networks Marketing Specialist who provided the technical illustrations for this book. Karen has been a graphic artist and marketing professional for more than 15 years, specializing in technical illustrations, Flash, and Web design, with expertise in print production.

You can purchase a printed copy of this book, or download a free PDF version, at juniper.net/books.

About the Authors


Meiji Wang is a Juniper Networks Solutions Architect for data center applications and cloud computing. He specializes in application development, data center infrastructure optimization, cloud computing, Software as a Service (SaaS), and data center networking. He has authored three books, focusing on databases, e-business web usage and, most recently, data center network design: the Redbook IBM j-type Data Center Networking Introduction, written in partnership with the IBM team.

Mohini Singh Dukes is a Juniper Networks Staff Solutions Design Engineer in the Solutions Engineering Group. She designs, implements and validates a wide range of solutions spanning mobile, Carrier Ethernet, data center interconnectivity and security, and business and residential services. Specializing in mobile networking solutions, including backhaul, packet backbone and security, she has authored a number of white papers, application notes, and implementation and design guides based on solution validation efforts. She has also published a series of blogs on energy-efficient networking.

George Rainovic is a Juniper Networks Solutions Staff Engineer. He specializes in designing technical solutions for data center networking and video CDN, and in testing IBM j-type Ethernet switches and routers. George has more than 15 years of networking and IT experience designing, deploying and supporting networks for network service providers and business enterprise customers. He holds a bachelor's degree in Electrical Engineering from the University of Novi Sad, Serbia.

Jitender K. Miglani is a Juniper Networks Solutions Engineer for data center intra- and inter-connectivity solutions. As part of Juniper's OEM relationship with IBM, Jitender assists in qualifying Juniper's EX, MX and SRX Series platforms with IBM Open System platforms (Power P5/P6, BladeCenter and x3500). Jitender has development and engineering experience in various voice and data networking products, and with small, medium and large enterprise and carrier-grade customers. Jitender holds a bachelor's degree in Computer Science from the Regional Engineering College, Kurukshetra, India.

Vijay K. Kamisetty is a Juniper Networks Solutions Engineer. He specializes in technical solutions for IPTV/multiplay, HD video conferencing, mobile backhaul, application-level security in the data center, development of managed services, and validation of adaptive clock recovery. He assists in qualifying Juniper EX and MX platforms with IBM Power P5 and x3500 platforms. He holds a bachelor's degree in Computer Science from JNTU Hyderabad, India.


Authors Acknowledgments
The authors would like to take this opportunity to thank Patrick Ames, whose direction and guidance were indispensable. To Nathan Alger, Lionel Ruggeri, and Zach Gibbs, who provided valuable technical feedback several times during the development of this booklet: your assistance was greatly appreciated. Thanks also to Cathy Gadecki for helping in the formative stages of the booklet. There are certainly others who helped in many different ways, and we thank you all.

And Special Thanks to our Reviewers...


Juniper Networks: Marc Bernstein, Venkata Achanta, Charles Goldberg, Scott Sneddon, John Bartlomiejczyk, Allen Kluender, Fraser Street, Robert Yee, Niraj Brahmbhatt, Paul Parker-Johnson, Travis O'Hare, Scott Robohn, Ting Zou, Krishnan Manjeri

IBM: Rakesh Sharma, Casimer DeCusatis

Preface
ENTERPRISES DEPEND MORE THAN EVER BEFORE on the efficiency of their data center infrastructure and the performance of their business applications to improve employee productivity, reduce operational costs and increase revenue. To achieve these objectives, virtualization, simplification and consolidation are three of the most crucial initiatives for the enterprise. These initiatives demand not only high-performance server and network technologies, but also smooth integration between the two to achieve optimal performance. Hence, successful integration of servers with a simplified networking infrastructure is pivotal.

This guide provides enterprise architects, sales engineers, IT developers, system administrators and other technical professionals with guidance on how to design and implement a high-performance data center using Juniper Networks infrastructure and IBM Open Systems. With a step-by-step approach, readers can gain a thorough understanding of design considerations, recommended designs, technical details and sample configurations that exemplify simplified data center network design. This approach is based on testing performed with Juniper Networks devices and IBM servers in Juniper Networks solution labs.

The IBM Open System Servers solution, including IBM Power Systems, System x and BladeCenter systems, comprises the foundation for a dynamic infrastructure. IBM server platforms help consolidate applications and servers and virtualize system resources while improving overall performance, availability and energy efficiency, providing a more flexible, dynamic IT infrastructure.

Juniper Networks offers a unique, best-in-class data center infrastructure solution based on open standards. It optimizes performance and enables consolidation, which in turn increases network scalability and resilience, simplifies operations, and streamlines management while lowering overall Total Cost of Ownership (TCO). The solution also automates network infrastructure management, making existing infrastructure easily adaptable and flexible, especially for third-party application deployment.

Key topics discussed in this book focus on the following routing and switching solutions in Juniper's simplified two-tier data center network architecture with IBM open systems:
• Best practices for integrating Juniper Networks EX and MX Series switches and routers with IBM Open Systems.
• Configuration details for spanning tree protocols such as Spanning Tree Protocol (STP), Multiple Spanning Tree Protocol (MSTP), Rapid Spanning Tree Protocol (RSTP) and VLAN Spanning Tree Protocol (VSTP), and deployment scenarios such as RSTP/MSTP and VSTP/Per-VLAN Spanning Tree (PVST) with Juniper EX and MX Series switches and routers connecting to IBM BladeCenter.
• Details for Layer 2 and Layer 3 multicast scenarios using Protocol Independent Multicast (PIM) and Internet Group Management Protocol (IGMP) snooping. Scenarios include video streaming clients running on IBM servers, with PIM implemented on the network access and core/aggregation tiers along with IGMP snooping at the access layer.
• Low-latency network design and techniques such as Class of Service (CoS) for improving data center network performance.
• Methods for increasing data center resiliency and high availability, with configuration details for protocols and features such as Virtual Router Redundancy Protocol (VRRP), Redundant Trunk Group (RTG), link aggregation groups (LAG), Routing Engine redundancy, Virtual Chassis, Nonstop Bridging (NSB), Nonstop Routing (NSR), Graceful Restart (GR) and In-Service Software Upgrade (ISSU).

Juniper Networks realizes that the scope of data center network design encompasses many facets, for example servers, storage and security. Therefore, to narrow the scope of this book, we have focused on network connectivity implementation details based on Juniper EX and MX Series switches and routers and IBM Open Systems. However, as new relevant technologies and best practices evolve, we will continue to revise this book to include additional topics. Please send us your feedback, along with any new or relevant ideas that you would like to see in future revisions of this book, or in other Validated Solutions books, at solutions-engineering@juniper.net.

Chapter 1
Introduction

Trends  8
Challenges  8
IBM and Juniper Networks Data Center Solution  9
IBM and Juniper Networks  16

TODAY'S DATA CENTER ARCHITECTS and designers do not have the luxury of simply adding more and more devices to solve networking's constant and continuous demands, such as higher bandwidth requirements, increased speed, rack space, tighter security, storage, interoperability among many types of devices and applications, and more and more diverse and remote users.

This chapter discusses in detail the data center trends and challenges now facing network designers. Juniper Networks and IBM directly address these trends and challenges with a data center solution that improves data center efficiency by simplifying the network infrastructure, by reducing recurring maintenance and software costs, and by streamlining daily management and maintenance tasks.


Trends
Although there are several types of data centers supporting a wide range of applications, such as financial services, web portals, content providers, and IT back-office operations, they all share certain trends:

More Data Than Ever Before
Since the dawn of the computer age, many companies have struggled to store their electronic records. That struggle can be greater than ever today, as regulatory requirements can force some companies to save even more records than before. The growth of the Internet may compound the problem; as businesses move online, they need to store enormous amounts of data such as customer account information and order histories. The total capacity of shipped storage systems is soaring by more than 50 percent a year, according to market researcher IDC. The only thing growing faster than the volume of data itself is the amount of data that must be transferred between data centers and users. Numerous large enterprises are consolidating their geographically distributed data centers into mega data centers to take advantage of cost benefits and economies of scale, increased reliability, and the latest virtualization technologies. According to research conducted by Nemertes, more than 50 percent of companies consolidated their dispersed data centers into fewer but larger data centers in the last 12 months, with even more planning to consolidate in the upcoming 12 months.

Server Growth
Servers are continuing to grow at a high annual rate of 11 percent, while storage is growing at an even higher rate of 22 percent, both of which place tremendous strain on the data center's power and cooling capacity. According to Gartner, OS and application instability is increasing server sprawl, with utilization rates of only 20 percent, leading to increased adoption of server virtualization technologies.

Evolution of Cloud Services
Cloud computing is a style of computing in which dynamically scalable and often virtualized resources are provided as a service over the Internet. Large enterprises are adopting cloud-computing methodology in their mega data centers. Smaller businesses that cannot afford to keep up with the cost and complexity of maintaining their privately owned data centers may look to outsource those functions to cloud-hosting providers.

Challenges
Today's major data center challenges include scale and virtualization, complexity and cost, interconnectivity for business continuity, and security.

Scale and Virtualization
With the evolution of mega data centers and cloud-computing architectures, tremendous strain is being placed on current network architectures. Scaling networking and security functions can quickly become a limiting factor to the success of growing data centers as they strive to meet stringent performance and high-availability requirements.
However, simply adding more equipment does not always satisfy the appetite of hungry mega data centers. If the network and security architecture does not enable application workload mobility and quick responses to variable capacity requirements to support multi-tenancy within servers (as required in a cloud environment), then the full value of data center virtualization cannot be realized.

Complexity and Cost
Many data centers have become overly complex, inefficient and costly. Networking architectures have stagnated for over a decade, resulting in network device sprawl and increasingly chaotic network infrastructures designed largely to work around low-performance and low-density devices. The ensuing capital expenses, rack space, power consumption and management overhead all add to the overall cost, not to mention the environmental impact. Unfortunately, instead of containing costs and reallocating the savings into enhancing and accelerating business practices, the IT budget all too often is misappropriated into sustaining already unwieldy and rapidly growing data center operations. Emerging applications that use Service Oriented Architecture (SOA) and Web services are increasingly computational and network intensive; the network, however, is not efficient. Gartner (2007) asserts that 50 percent of the Ethernet switch ports within the data center are used for switch interconnectivity.

Interconnectivity for Business Continuity
As data centers expand, they can easily outgrow a single location. When this occurs, enterprises may have to open new centers and transparently interconnect these locations so they can interoperate and appear as one large data center. Enterprises with geographically distributed data centers may want to virtually consolidate them into a single, logical data center in order to take advantage of the latest technology.

Security
The shared infrastructure in the data center or cloud should support multiple customers, each with multiple hosted applications; provide complete, granular and virtualized security that is easy to configure and understand; and support all major operating systems on a plethora of mobile and desktop devices. In addition, a shared infrastructure should integrate seamlessly with existing identity systems, check host posture before allowing access to the cloud, and make all of this accessible to thousands of users, while protecting against sophisticated application attacks, Distributed Denial of Service (DDoS) attacks and hackers.

Today, a data center infrastructure solution requires a dynamic infrastructure, a high-performance network and a comprehensive network management system.

IBM and Juniper Networks Data Center Solution


The IBM server solution, including IBM Power Systems, System x and BladeCenter systems, comprises the foundation for a dynamic infrastructure.


IBM Power Systems
The IBM Power Systems family of servers includes proven server platforms that help consolidate applications and servers and virtualize system resources while improving overall performance, availability and energy efficiency, providing a more flexible, dynamic IT infrastructure. A Power server can run up to 254 independent virtual servers, each with its own processor, memory and I/O resources, within a single physical Power server. Processor resources can be assigned at a granularity of 1/100th of a core.

IBM System x
The IBM System x3850 X5 server is the fifth generation of the Enterprise X-Architecture, delivering innovation with enhanced reliability and availability features to enable optimal performance for databases, enterprise applications and virtualized environments. According to a recent IBM Redbooks paper, a single IBM System x3850 X5 host server can support up to 384 virtual machines. For details, refer to High Density Virtualization Using the IBM System x3850 X5 at www.redbooks.ibm.com/technotes/tips0770.pdf.

IBM BladeCenter
The BladeCenter is built on IBM X-Architecture to run multiple business-critical applications with simplification, cost reduction and improved productivity. Compared to first-generation Xeon-based blade servers, IBM BladeCenter HS22 blade servers can help improve the economics of your data center with:
• Up to 11 times faster performance
• Up to 90 percent reduction in energy costs alone
• Up to 95 percent IT footprint reduction
• Up to 65 percent less in connectivity costs
• Up to 84 percent fewer cables
For detailed benefits concerning the IBM BladeCenter, refer to www-03.ibm.com/systems/migratetoibm/systems/bladecenter/.

Juniper Networks Products for a High-Performance Network Infrastructure Solution


Juniper Networks data center infrastructure solutions provide operational simplicity, agility and efficiency, simplifying the network with the following key technologies:
• Virtual Chassis technology, combined with wire-rate 10-Gigabit Ethernet performance in the Juniper Networks EX Series Ethernet Switches, reduces the number of networking devices and interconnections. This effectively eliminates the need for an aggregation tier, contributing to a significant reduction in capital equipment and network operational costs, improved application performance, and faster time to deploy new servers and applications.
• Dynamic Services Architecture in the Juniper Networks SRX Series Services Gateways consolidates security appliances with distinct functions into a highly integrated, multifunction platform, resulting in simpler network designs, improved application performance, and a reduction of space, power and cooling requirements.
• Network virtualization with MPLS in the Juniper Networks MX Series 3D Universal Edge Routers and the Juniper Networks M Series Multiservice Edge Routers enables network segmentation across data centers and to remote offices for applications and departments, without the need to build separate or overlay networks.
• The Juniper Networks Junos operating system runs across the network infrastructure, providing one operating system, enhanced through a single release train and developed upon a common modular architecture, giving enterprises a 1-1-1 advantage.
• J-Care Technical Services provide automated incident management and proactive analysis assistance through the Advanced Insight Solutions technology resident in Junos OS.

MX Series 3D Universal Edge Routers
The Juniper Networks MX Series 3D Universal Edge Routers are a family of high-performance Ethernet routers with powerful switching features designed for enterprise and service provider networks. The MX Series provides unmatched flexibility and reliability to support advanced services and applications, and addresses a wide range of deployments, architectures, port densities and interfaces. High-performance enterprise networks typically deploy MX Series routers in high-density Ethernet LAN and data center aggregation, and in the data center core.

The MX Series provides carrier-grade reliability, density, performance, capacity and scale for enterprise networks with mission-critical applications. High availability features such as nonstop routing (NSR), fast reroute, and unified in-service software upgrade (ISSU) ensure that the network is always up and running. The MX Series delivers significant operational efficiencies enabled by Junos OS, and supports a collapsed architecture requiring less power, cooling and space. The MX Series also provides open APIs for easily customized applications and services, and it enables enterprise networks to profit from the tremendous growth of Ethernet transport with the confidence that the platforms they install now will have the performance and service flexibility to meet their evolving requirements.

The MX Series 3D Universal Edge Routers include the MX80 and MX80-48T, MX240, MX480 and MX960. Their common key features include:
• 256K multicast groups
• 1M MAC addresses and IPv4 routes
• 6K L3VPN and 4K VPLS instances
• Broadband services router
• IPsec
• Session border controller
• Video quality monitoring

As a member of the MX Series, the MX960 is a high-density Layer 2 and Layer 3 Ethernet platform with up to 2.6 Tbps of switching and routing capacity, and it offers the industry's first 16-port 10GbE card. It is optimized for emerging Ethernet network architectures and services that require high availability, advanced QoS, and the performance and scalability to support mission-critical networks.
The MX960 platform is ideal where SCB and Routing Engine redundancy are required. All major components are field-replaceable, increasing system serviceability and reliability and decreasing mean time to repair (MTTR). Enterprise customers typically deploy the MX960 or MX480 in the data center core.

NOTE: We deployed the MX480 in this handbook. However, the configurations and discussions pertaining to the MX480 also apply to the entire MX product line.

EX Series Ethernet Switches
Among the EX Series Ethernet Switches, the EX4200 Ethernet switches with Virtual Chassis technology and the EX8200 modular chassis switches are commonly deployed in the enterprise data center. We used the EX4200 and EX8200 for most of our deployment scenarios.

EX4200 Ethernet Switches with Virtual Chassis Technology
The EX4200 line of Ethernet switches with Virtual Chassis technology combines the high availability and carrier-class reliability of modular systems with the economics and flexibility of stackable platforms, delivering a high-performance, scalable solution for data center, campus, and branch office environments. The EX4200 Ethernet switches with Virtual Chassis technology have the following major features:
• Deliver the high availability, performance and manageability of chassis-based switches in a compact, power-efficient form factor.
• Offer the same connectivity, Power over Ethernet (PoE) and Junos OS options as the EX3200 switches, with an additional 24-port fiber-based platform for Gigabit aggregation deployments.
• Enable up to 10 EX4200 switches to be interconnected with Virtual Chassis technology as a single logical device supporting up to 480 ports.
• Provide redundant, hot-swappable, load-sharing power supplies that reduce mean time to repair (MTTR), while Graceful Routing Engine Switchover (GRES) ensures hitless forwarding in the unlikely event of a switch failure.
• Run the same modular, fault-tolerant Junos OS as other EX Series switches and all Juniper routers.

EX8200 Modular Chassis Switches
The EX8200 modular chassis switches have the following major features:
• High-performance 8-slot (EX8208) and 16-slot (EX8216) switches support data center and campus LAN core and aggregation layer deployments.
• Scalable switch fabric delivers up to 320 Gbps per slot.
• 48-port 10/100/1000BASE-T line cards support up to 384 (EX8208) or 768 (EX8216) GbE ports per chassis.
• 48-port 100BASE-FX/1000BASE-X line cards support up to 384 (EX8208) or 768 (EX8216) GbE ports per chassis.
• 8-port 10GBASE-X line cards with SFP+ interfaces deliver up to 64 (EX8208) or 128 (EX8216) 10GbE ports per chassis.
• Carrier-class architecture includes redundant internal Routing Engines, switch fabrics, power and cooling, all ensuring uninterrupted forwarding and maximum availability.
• Run the same modular, fault-tolerant Junos OS as other EX Series switches and all Juniper routers.

Juniper Networks' high-performance data center network architecture reduces cost and complexity by requiring fewer tiers of switching, consolidating security services, and providing a common operating system and one extensible model for network management. As shown in Figure 1.1, Junos OS runs many data center network switching, routing and security platforms, including the Juniper Networks EX Series, MX Series and SRX Series, as well as the IBM j-type data center network products for which Juniper Networks is the original equipment manufacturer (OEM) of the EX and MX Series. For details concerning product mapping between IBM and Juniper Networks products, see Table 1.1 at the end of this chapter or visit IBM and Junos in the Data Center: A Partnership Made for Now, at https://simplifymydatacenter.com/ibm.
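To give a concrete sense of how the Virtual Chassis behavior described above is expressed in Junos OS, the following minimal sketch preprovisions a three-member EX4200 Virtual Chassis. It is an illustrative example only, not a configuration from this book's test topology, and the member serial numbers are placeholders that would be replaced with the values of the actual switches.

    set virtual-chassis preprovisioned
    set virtual-chassis member 0 role routing-engine serial-number AB0123456789
    set virtual-chassis member 1 role routing-engine serial-number AB0123456790
    set virtual-chassis member 2 role line-card serial-number AB0123456791

Once the members are cabled together over their Virtual Chassis ports, the operational command show virtual-chassis status confirms each member's role, and the whole stack is then configured and monitored through a single management address, which is what allows up to 10 switches to behave as one logical device.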

Figure 1.1   Junos Operating System Runs on the Entire Data Center Network: Security, Routers, and Switching Platforms

Figure 1.2   Data Center and Cloud Architecture (Enterprise-Owned Data Center Location 1)

Figure 1.2   Data Center and Cloud Architecture, continued (Enterprise-Owned Data Center Location 2)


IBM and Juniper Networks Data Center and Cloud Architecture


As shown in Figure 1.2 (presented in two parts), the sample data center and cloud architecture deploys IBM servers, IBM software and Juniper Networks data center network products. Juniper is the OEM for the IBM j-type e-series switches and m-series routers (EX Series and MX Series). For details concerning product mapping between IBM and Juniper Networks products, see Table 1.1.

IBM Tivoli and Juniper Networks Junos Space for a Comprehensive Network Management Solution
Managing the data center network often requires many tools from different vendors, because the typical network infrastructure is a complex, meshed deployment that combines different network topologies and often includes devices from multiple vendors and network technologies. IBM Tivoli products and Juniper Networks Junos Space together can manage data center networks effectively and comprehensively. The tools include:
• IBM Systems Director
• IBM Tivoli Netcool/OMNIbus
• IBM Tivoli Provisioning Manager
• Junos Space Network Application Platform
• Juniper Networks Junos Space Ethernet Activator
• Juniper Networks Junos Space Security Designer
• Juniper Networks Junos Space Route Insight Manager
• Juniper Networks Junos Space Service Now

MORE: For the latest IBM and Juniper Networks data center solution, visit http://www.juniper.net/us/en/company/partners/global/ibm/#dynamic.

IBM and Juniper Networks


The collaboration between IBM and Juniper Networks began more than a decade ago. In November of 1997, IBM provided custom Application-Specific Integrated Circuits (ASICs) for Juniper Networks' new class of Internet backbone devices as part of a strategic technology relationship between the two companies. Since 2007, the two companies have been working together on joint technology solutions, standards development, network management and managed security services. IBM specifically included Juniper Networks switching, routing and security products in its data center network portfolio, with IBM playing an invaluable role as systems integrator.

Most recently, the two companies jointly collaborated on a global technology demonstration highlighting how enterprises can seamlessly extend their private data center clouds. The demonstration between Silicon Valley and Shanghai showed a use case in which customers could take advantage of remote servers in a secure public cloud to ensure that high-priority applications are given preference over lower-priority ones when computing resources become constrained.
IBM and Juniper are installing these advanced networking capabilities into IBM's nine worldwide Cloud Labs for customer engagements. Once installed, these capabilities will allow IBM and Juniper to seamlessly move client-computing workloads between private and publicly managed cloud environments, enabling customers to deliver reliably on service-level agreements (SLAs).

In July of 2009, Juniper and IBM continued to broaden their strategic relationship by entering into an OEM agreement that enables IBM to provide Juniper's Ethernet networking products and support within IBM's data center portfolio. The addition of Juniper's products to IBM's data center networking portfolio provides customers with a best-in-class networking solution and accelerates the shared vision of both companies for advancing the economics of networking and the data center by reducing costs, improving services and managing risk.

IBM j-type Data Center Products and Juniper Networks Products Cross Reference
The IBM j-type e-series Ethernet switches and m-series Ethernet routers use Juniper Networks technology. Table 1.1 shows the mapping of IBM switches, routers and appliances to their corresponding Juniper Networks models. For further product information, please visit IBM and Junos in the Data Center: A Partnership Made For Now, at https://simplifymydatacenter.com/ibm.

Table 1.1   Mapping of IBM j-type Data Center Network Products to Juniper Networks Products

IBM Machine Type and Model   IBM Description                                 Juniper Networks Model
4273-E48                     IBM j-type e-series Ethernet Switch J48E        EX4200
4274-E08                     IBM j-type e-series Ethernet Switch J08E        EX8208
4274-E16                     IBM j-type e-series Ethernet Switch J16E        EX8216
4274-M02                     IBM j-type m-series Ethernet Router J02M        MX240
4274-M06                     IBM j-type m-series Ethernet Router J06M        MX480
4274-M11                     IBM j-type m-series Ethernet Router J11M        MX960
4274-S34                     IBM j-type s-series Ethernet Appliance J34S     SRX3400
4274-S36                     IBM j-type s-series Ethernet Appliance J36S     SRX3600
4274-S56                     IBM j-type s-series Ethernet Appliance J56S     SRX5600
4274-S58                     IBM j-type s-series Ethernet Appliance J58S     SRX5800


Chapter 2
Design Considerations

Network Reference Architecture  20
Design Considerations  24
Two-Tier Network Deployment  30

THIS CHAPTER FOCUSES ON Juniper Networks' data center network reference architecture. It presents technical considerations for designing a modern-day data center network that must support consolidated and centralized server and storage infrastructure, as well as enterprise applications.


Network Reference Architecture


The data center network is realigning itself to meet new global demands by providing better efficiency, higher performance and new capabilities. Today's data center network can:
• Maximize efficiency gains from technologies such as server virtualization.
• Provide required components with improved capabilities: security, performance acceleration, high-density and resilient switching, and high-performance routing.
• Use virtualization capabilities such as MPLS and virtual private LAN service (VPLS) to enable a flexible, high-performance backbone network between data centers.

These evolving networking demands call for a new network reference architecture that can sustain application performance, meet the demands of customer growth, reinforce security compliance, reduce operational costs and adopt innovative technologies. As shown in Figure 2.1, Juniper Networks' data center network reference architecture consists of the following four tiers:
• Edge Services Tier - provides all WAN services at the edge of the data center network and connects to WAN services in other locations, including other data centers, campuses, headquarters, branches, carrier service providers, managed service providers and even cloud service providers.
• Core Network Tier - acts as the data center network backbone, which interconnects the other tiers within the data center and can connect to the core network tier in other data centers.
• Network Services Tier - provides centralized network security and application services, including firewall, Intrusion Detection and Prevention (IDP) and server load balancing.
• Applications and Data Services Tier - connects mainly servers and storage in the LAN environment and acts as an uplink to the core network tier.

The subsequent sections in this chapter explain each network tier in detail.


Figure 2.1   Juniper Networks Data Center Network Reference Architecture


Edge Services Tier


The edge services tier is responsible for all connectivity and network-level security aspects of connecting the data center to the outside world, including other data centers, campuses, headquarters, branches, carrier service providers, managed service providers and even cloud service providers. Typically, routers and firewall/VPN devices reside in this tier. The data center is likely to connect to various leased lines connecting to partners and branch offices, and to the Internet. When connecting all of these networks, it is important to plan for the following (a configuration sketch follows this list):
• Internet routing isolation, for example separating the exterior routing protocols from the interior routing protocols.
• Network Address Translation (NAT) to convert private IP addresses to public, Internet-routable IP addresses.
• IPsec VPN tunnel termination for partner, branch and employee connections.
• Border security to enforce stateful firewall policies and content inspection.
• Quality of Service (QoS).
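As a rough illustration of two of these edge functions, zone-based border security and source NAT, the following Junos sketch shows how an SRX Series gateway might translate private data center addresses to the address of its Internet-facing interface. The zone names, interfaces and address range are hypothetical placeholders rather than values from this book's test setup.

    set security zones security-zone trust interfaces ge-0/0/1.0
    set security zones security-zone untrust interfaces ge-0/0/0.0
    set security nat source rule-set internet-access from zone trust
    set security nat source rule-set internet-access to zone untrust
    set security nat source rule-set internet-access rule nat-private match source-address 10.0.0.0/8
    set security nat source rule-set internet-access rule nat-private then source-nat interface

Note that NAT rules alone do not forward traffic; an explicit security policy from the trust zone to the untrust zone is still required, which is part of the stateful border security listed above.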

Core Network Tier


The core network tier acts as the backbone of the data center network; it interconnects the other tiers within the data center and can connect to the core network tier in other data centers as well. It connects the network services tier and aggregates uplink connections from the applications and data services tier (a link aggregation sketch follows the list below). This tier consolidates the functionality of the core and aggregation tiers of a traditional three-tier network architecture, thereby significantly reducing the number of devices. Combining the traditional core and aggregation tiers into a single consolidated core provides other benefits, such as:
• Significant power savings
• Reduced facilities and system footprint
• Simplified device management
• Tighter security control
• A reduced number of system failure points
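Uplink aggregation into this consolidated core is commonly built with link aggregation groups (LAG) and LACP, techniques revisited later in this book (Appendix B presents LAG test results). The sketch below, using hypothetical member interfaces, bundles two ports on an EX Series access switch into a single logical uplink toward the core.

    set chassis aggregated-devices ethernet device-count 1
    set interfaces ge-0/0/0 ether-options 802.3ad ae0
    set interfaces ge-0/0/1 ether-options 802.3ad ae0
    set interfaces ae0 aggregated-ether-options lacp active
    set interfaces ae0 unit 0 family ethernet-switching port-mode trunk

Because the two physical links appear as one logical interface (ae0), the failure of a single member reduces bandwidth but does not take the uplink down, which supports the reduced-failure-points benefit listed above.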

Network Services Tier


The network services tier provides centralized network security and application services, including firewall, IDP, server load balancing, SSL offload, HTTP caching, TCP multiplexing, and global server load balancing (GSLB). This tier typically connects directly to the core network tier, resulting in low latency and high throughput. It is responsible for handling service policies for any network, server and/or application. Because network service is centralized, it must serve all servers and applications within the data center; it should apply a network-specific policy to a particular network, or apply an application-specific policy to the set of servers associated with particular applications.
For example, a security service such as TCP SYN checking and sequence number checking must apply to any server that is exposed to public networks. The network services tier requires (see the sketch following this list):
• High-performance devices, for example high-performance firewalls, to process traffic associated with large numbers of endpoints such as networks, servers and applications.
• Virtualization capabilities, such as virtual instances, to secure many simultaneous logical services.
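As an illustrative sketch of the kind of SYN-checking service mentioned above, the following Junos lines enable a TCP SYN flood screen on an SRX Series gateway and attach it to the Internet-facing zone. The screen name, zone name and threshold values are hypothetical examples chosen for the sketch, not recommendations from this book.

    set security screen ids-option untrust-screen tcp syn-flood alarm-threshold 1024
    set security screen ids-option untrust-screen tcp syn-flood attack-threshold 200
    set security screen ids-option untrust-screen tcp syn-flood timeout 20
    set security zones security-zone untrust screen untrust-screen

Because the screen is applied at the zone level, every server reachable through that zone inherits the protection, which matches the centralized service model described for this tier.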

Applications and Data Services Tier


The applications and data services tier (also known as the access tier) is primarily responsible for connecting servers and storage in the LAN environment and acts as an uplink to the core network tier. It includes the access tier in the data network and the storage network. This tier supports interoperability with server connections and high-throughput network interconnections. When the number of servers increases, the network topology remains agile and can scale seamlessly. Based on different business objectives and IT requirements, the applications and data services tier can include many networks (a VLAN sketch follows this list):
• External applications networks, which can comprise multiple external networks that serve separate network segments. These typically include applications such as the public Web, public mail transfer agent (MTA), Domain Name System (DNS) services, remote access, and potential file services that are available through unfiltered access.
• Internal applications networks, which can comprise multiple internal networks serving different levels of internal access from campus or branch locations. These networks typically connect internal applications such as finance and human resources systems. Also residing in the internal network are partner applications and any specific applications that are exposed to partners, such as inventory systems and manufacturing information.
• Infrastructure services networks, which provide secure infrastructure network connections between servers and their supporting infrastructure services, such as Lightweight Directory Access Protocol (LDAP), databases, file sharing, content management and middleware servers. Out-of-band management is also a part of this network.
• Storage networks, which provide remote storage to servers using different standards, such as Fibre Channel, InfiniBand or Internet Small Computer System Interface (iSCSI). Many mission-critical application servers typically use a Host Bus Adapter (HBA) to connect to a remote storage system, ensuring fast access to data. However, large numbers of servers use iSCSI to access remote storage systems over the TCP/IP network for simplicity and cost efficiency.
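On the EX Series access switches in this tier, the separation between these networks is typically expressed as VLANs. The following minimal sketch, using hypothetical VLAN names, IDs and ports rather than the lab values used elsewhere in this book, places one server port in an internal applications VLAN and carries both VLANs on an uplink trunk toward the core.

    set vlans internal-apps vlan-id 110
    set vlans infra-services vlan-id 120
    set interfaces ge-0/0/10 unit 0 family ethernet-switching port-mode access
    set interfaces ge-0/0/10 unit 0 family ethernet-switching vlan members internal-apps
    set interfaces ge-0/0/47 unit 0 family ethernet-switching port-mode trunk
    set interfaces ge-0/0/47 unit 0 family ethernet-switching vlan members [ internal-apps infra-services ]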


Design Considerations
The following key design considerations are critical attributes for designing today's data center network architecture:
• High availability and disaster recovery
• Security
• Simplicity
• Performance
• Innovation

NOTE: The design considerations discussed in this handbook are not necessarily specific to Juniper Networks solutions and can be applied universally to any data center network design, regardless of vendor selection.

High Availability and Disaster Recovery


From the perspective of a data center network designer, high availability and disaster recovery are key requirements and must be considered not only in light of what is happening within the data center, but also across multiple data centers. Network high availability should be deployed by using a combination of link redundancy (both external and internal connectivity) and critical device redundancy to ensure network operations and business continuity. In addition, site redundancy (multiple data centers) is critical to meeting disaster recovery and regulatory compliance objectives.

Moreover, devices and systems deployed within the confines of the data center should support component-level high availability, such as redundant power supplies, fans and Routing Engines. Another important consideration is the software/firmware running on these devices, which should be based on a modular architecture with capabilities such as ISSU in the MX Series to prevent software failures and upgrade events from impacting the entire device. Software upgrades should only impact a particular module, thereby ensuring system availability.
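As a hedged sketch of what device-level redundancy looks like in Junos OS, the following lines enable graceful Routing Engine switchover, nonstop active routing and nonstop bridging on a platform with dual Routing Engines, such as the MX Series routers used in this book. It is illustrative only; Chapter 8 covers the validated high availability configurations in detail.

    set chassis redundancy graceful-switchover
    set system commit synchronize
    set routing-options nonstop-routing
    set protocols layer2-control nonstop-bridging

A unified ISSU is then performed from operational mode with request system software in-service-upgrade <package>, which upgrades the backup Routing Engine first so that forwarding continues through the switchover.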

Security
The critical resources in any enterprise location are typically the applications themselves, along with the servers and supporting systems such as storage and databases. Financial, human resources and manufacturing applications, with their supporting data, typically represent a company's most critical assets and, if compromised, can create a potential disaster for even the most stable enterprise. The core network security layers must protect these business-critical resources from unauthorized user access and attacks, including application-level attacks.

The security design must employ layers of protection from the network edge through the core to the various endpoints, an approach known as defense in depth. A layered security solution protects critical network resources that reside on the network; if one layer fails, the next layer stops the attack and/or limits the damage that can occur. This level of security allows IT departments to apply the appropriate level of resource protection to the various network entry points based upon their different security, performance and management requirements.

Layers of security that should be deployed at the data center include the following (a policy sketch follows this list):
• DoS protection at the edge.
• Firewalls to tightly control who and what gets in and out of the network.
• VPNs to provide secure remote access.
• Intrusion Prevention System (IPS) solutions to prevent a more generic set of application-layer attacks.

Further, application-layer firewalls and gateways also play a key role in protecting specific application traffic such as XML. For further details, refer to the recommended best practices described in the National Institute of Standards and Technology (NIST) Guide to General Server Security at http://csrc.nist.gov/publications/nistpubs/800-123/SP800-123.pdf.

Policy-based networking is a powerful concept that enables devices in the network to be managed efficiently, especially within virtualized configurations, and can provide granular levels of network access control. The policy and control capabilities should allow organizations to centralize policy management while offering distributed enforcement at the same time. The network policy and control solution should provide appropriate levels of access control, policy creation, and management of networks and services, ensuring secure and reliable networks for all applications. In addition, the data center network infrastructure should integrate easily into a customer's existing management frameworks and third-party tools, such as Tivoli, and provide best-in-class centralized management, monitoring and reporting services for network services and the infrastructure.
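For the firewall layer, the sketch below shows the general shape of a zone-based security policy on an SRX Series gateway, permitting only HTTP from the Internet to a group of web servers. The zone names, address-book entry and subnet are hypothetical examples rather than values taken from this book's validated configurations.

    set security zones security-zone dmz address-book address web-servers 172.16.10.0/24
    set security policies from-zone untrust to-zone dmz policy allow-web match source-address any
    set security policies from-zone untrust to-zone dmz policy allow-web match destination-address web-servers
    set security policies from-zone untrust to-zone dmz policy allow-web match application junos-http
    set security policies from-zone untrust to-zone dmz policy allow-web then permit

Because SRX policies deny traffic that does not match an explicit permit rule between zones, this single policy also implements the "tightly control who and what gets in" principle from the list above.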

Simplicity
Simplicity can be achieved by adopting new architectural designs, new technologies and modern network operating systems. The two-tier network architecture is a design that allows network administrators to simplify the data center infrastructure. Traditionally, data center networks were constructed using a three-tier design approach, resulting in access, aggregation and core layers. A large number of devices must be deployed, configured and managed within each of these tiers, increasing cost and complexity. This is primarily because of scalability requirements, performance limitations and key feature deficiencies in traditional switches and routers.

Juniper Networks products support a data center network design that requires fewer devices, interconnections and network tiers. The design also enables the following key benefits:
• Reduced latency due to fewer device hops
• Simplified device management
• Significant power, cooling and space savings
• Fewer system failure points


Figure 2.2 shows data center network design trends, from a traditional data center network, to a network consisting of a virtualized access tier and collapsed aggregation and core tiers, to a network with improved network virtualization on the WAN.
Figure 2.2   Data Center Network Design Trends (three panels: Traditional Data Center Network Design; Virtualized Access, Consolidated Core/Aggregation; Integrated WAN Interface with MPLS-Enabled Core/Aggregation)

Converged I/O is a new technology that simplifies the data center infrastructure by supporting flexible storage and data access on the same network interfaces on the server side, and by consolidating storage area networks (SANs) and LANs into a single logical infrastructure. This simplification and consolidation make it possible to dynamically allocate any resource, including routing, switching, security services, storage systems, appliances and servers, without compromising performance.

Keeping in mind that network devices are complex, designing an efficient hardware platform is not, by itself, sufficient to achieve an effective, cost-efficient and operationally tenable product. Software in the control plane plays a critical role in the development of features and in ensuring device usability. Because Junos is a proven, modular network operating system that runs across different platforms, implementing Junos is one of the best approaches to simplifying the daily operations of the data center network.

In a recent study titled The Total Economic Impact of Juniper Networks Junos Network Operating System, Forrester Consulting reported a 41 percent reduction in overall network operational costs based on dollar savings across specific task categories, including planned events, reduction in the frequency and duration of unplanned network events, the sum of planned and unplanned events, the time needed to resolve unplanned network events, and the task of adding infrastructure.
As the foundation of any high-performance network, Junos exhibits the following key attributes, as illustrated in Figure 2.3:
• One operating system, with a single source base and a single, consistent feature implementation.
• One software release train, extended through a highly disciplined and firmly scheduled development process.
• One common modular software architecture that stretches across many different Junos hardware platforms, including the MX Series, EX Series and SRX Series.

Figure 2.3   Junos: A 1-1-1 Advantage

Performance
To address performance requirements related to server virtualization, centralization and data center consolidation, the data center network should boost the performance of all application traffic, whether local or remote. The data center should offer LAN-like user experience levels for all enterprise users irrespective of their physical location. To accomplish this, the data center network should optimize applications, servers, storage and network performance.


WAN optimization techniques that include data compression, TCP and application protocol acceleration, bandwidth allocation, and traffic prioritization improve performance network traffic. In addition, these techniques can be applied to data replication, and to backup and restoration between data centers and remote sites, including disaster recovery sites. Within the data center, Application Front Ends (AFEs) and load balancing solutions boost the performance of both client-server and Web-based applications, as well as speeding Web page downloads. In addition, designers must consider offloading CPU-intensive functions, such as TCP connection processing and HTTP compression, from backend applications and Web servers. Beyond application acceleration, critical infrastructure components such as routers, switches, firewalls, remote access platforms and other security devices should be built on non-blocking modular architecture, so that they have the performance characteristics necessary to handle the higher volumes of mixed traffic types associated with centralization and consolidation. Designers also should account for remote users. Juniper Networks innovative silicon chipset and the virtualization technologies deliver a unique high performance data center solution. Junos Trio represents Junipers fourth generation of purpose-built silicon and is the industrys first network instruction set a new silicon architecture unlike traditional ASICs and network processing units (NPUs). The new architecture leverages customized network instructions that are designed into silicon to maximize performance and functionality, while working closely with Junos software to ensure programmability of network resources. The new Junos One family thus combines the performance benefits of ASICs and the flexibility of network processors to break the standard trade-offs between the two. Built in 65-nanometer technology, Junos Trio includes four chips with a total of 1.5 billion transistors and 320 simultaneous processes, yielding total router throughput up to 2.6 terabits per second and up to 2.3 million subscribers per rack far exceeding the performance and scale possible through off-the-shelf silicon. Junos Trio includes advanced forwarding, queuing, scheduling, synchronization and end-to-end resiliency features, helping customers provide service-level guarantees for voice, video and data delivery. Junos Trio also incorporates significant power efficiency features to enable more environmentally conscious data center and service provider networks. Junos Trio chipset with revolutionary 3D Scaling technology enables networks to scale dynamically for more bandwidth, subscribers and services all at the same time without compromise. Junos Trio also yields breakthroughs for delivering rich business, residential and mobile services at massive scale all while using half as much power per gigabit. The new chipset includes more than 30 patent-pending innovations in silicon architecture, packet processing, QoS and energy efficiency. The Juniper Networks data center network architecture employs a mix of virtualization technologies such as Virtual Chassis technology with VLANs and MPLS-based advanced traffic engineering, VPN enhanced security, QoS, VPLS, and other virtualization services. These virtualization technologies address many of the challenges introduced by server, storage and application virtualization. 
For example, Virtual Chassis supports low-latency server live migration from server to server in completely different racks within a data center, and from server to server between data centers in a flat Layer 2 network, when these data centers are within reasonably close proximity. Virtual Chassis with MPLS allows the Layer 2 domain to extend across data centers to support live migration from server to server when data centers are distributed over significant distances.

Juniper Networks virtualization technologies support the low latency, throughput, QoS and high availability required by server and storage virtualization. MPLS-based virtualization addresses these requirements with advanced traffic engineering to provide bandwidth guarantees, label switching and intelligent path selection for optimized low latency, and fast reroute for extremely high availability across the WAN. MPLS-based VPNs enhance security with QoS to efficiently meet application and user performance needs. These virtualization technologies serve to improve efficiencies and performance with greater agility while simplifying operations. For example, acquisitions and new networks can be folded quickly into the existing MPLS-based infrastructure without reconfiguring the network to avoid IP address conflicts. This approach creates a highly flexible and efficient data center WAN.

Innovation
Innovation, for example green initiatives, influences data center design. A green data center is a repository for the storage, management and dissemination of data in which the mechanical, lighting, electrical and computer systems provide maximum energy efficiency with minimum environmental impact. As older data center facilities are upgraded and newer data centers are built, it is important to ensure that the data center network infrastructure is highly energy and space efficient.

Network designers should consider power, space and cooling requirements for all network components, and they should compare different architectures and systems so that they can ascertain the environmental and cost impacts across the entire data center. In some environments, it might be more efficient to implement high-end, highly scalable systems that can replace a large number of smaller components, thereby promoting energy and space efficiency. Green initiatives that track resource usage, carbon emissions and efficient utilization of resources, such as power and cooling, are important factors when designing a data center. Among the many energy-efficient Juniper devices, the MX960 is presented in Table 2.1 to demonstrate its effect on reductions in energy consumption and footprint within the data center.

Table 2.1 Juniper Networks MX960 Power Efficiency Analysis (Juniper Networks core, MX960 2x chassis)

- Line-rate 10 GigE (ports): 96
- Throughput per chassis (Mpps): 720
- Output current (Amps): 187.84
- Output power (Watts): 9020.00
- Heat dissipation (BTU/Hr): 36074.33
- Chassis required (rack space): 2 chassis
- Rack space (racks): 2/3rds of a single rack


Two-Tier Network Deployment


In this handbook, we deploy a two-tier network architecture in a data center network, as shown in Figure 2.4. These two tiers consist of the core network and access network tiers, and they are associated with the data services and applications tier and the core network tier that define Juniper Networks' data center reference architecture. For further details concerning Juniper's data center reference architecture, refer to the Enterprise Data Center Network Reference Architecture: Using a High Performance Network Backbone to Meet the Requirements of the Modern Enterprise Data Center at www.juniper.net/us/en/local/pdf/referencearchitectures/8030001-en.pdf.

NOTE For a detailed discussion of two-tier network deployment, see Chapter 3: Implementation Overview. The two-tier network architecture defined in this handbook does not include a storage network.

Figure 2.4 Sample Two-Tier Data Center Network Deployment
(Diagram: the core tier, consisting of MX480 routers and EX8200 switches, connects to the edge services tier; the access tier consists of EX4200 Virtual Chassis groups; IBM blade servers and IBM PowerVM systems attach to the access tier through NICs/HEA (Host Ethernet Adapter), virtual switches, VIOS and LPARs, with MM1/MM2 management modules.)


Core Network Tier


The core network tier commonly uses Juniper Networks MX Series Ethernet Services Routers or the Juniper Networks EX8200 line of Ethernet switches, such as the MX960, MX480, EX8216 and EX8208. Deciding on a particular device depends on various factors, including functional requirements in the core network tier, budgetary constraints or phased deployment considerations. The following represent several customer scenarios:

- Extend the Layer 2 broadcasting domain across a geographically dispersed data center so that all the servers associated with the Layer 2 domain appear on the same Ethernet LAN. The enterprise can then leverage many existing provisioning and data migration tools to manage worldwide-distributed servers effectively. The MX960 and MX480 are ideal devices for building an MPLS backbone in the enterprise core network tier and for leveraging VPLS to create an extended Layer 2 broadcasting domain between data centers.

- In the core network tier, also known as the consolidated core layer, two MX Series routers connect to two SRX Series platforms, which host many virtual security services that can be configured into independent security zones. The MX Series routers connect to top-of-rack Juniper Networks EX Series Ethernet Switches in the access layer, which in turn aggregate the servers in the data center.

- Consolidate a traditional three-tier network infrastructure to support traffic-intensive applications and multi-tier business applications to lower latency, support data and video, and integrate security. The MX960 and SRX5800 are ideal products to provide a consolidated solution, as illustrated in Figure 2.5.
Figure 2.5 Integrating Security Service with Core Network Tier
(Diagram: M Series routers at the WAN edge connect over trunk, IP VPN and server VLANs to a consolidated core layer of two MX960 routers. VRFs on the MX960s, such as VRF #1 and VRF #2, are mapped to routing instances and security zones (firewall, IPS, NAT) on an SRX5800; adjacency is established between the VRFs on the core, and traffic between networks runs through the SRX Series by default or is filtered on the MX Series. EX4200 Virtual Chassis in the access layer carry department VLANs such as HR, Finance and Guest.)


Two MX960 routers are shown to indicate high availability between these devices, providing end-to-end network virtualization for applications by mapping Virtual Routing and Forwarding (VRF) instances in the MX Series to security zones in the SRX Series. In Figure 2.5, for example, VRF #1 is mapped to security zones Firewall #1, NAT #1 and IPS #1, and VRF #2 is mapped to Firewall #2 and NAT #2. For details concerning network virtualization on the MX Series, refer to the Juniper Networks white paper Extending the Virtualization Advantage with Network Virtualization: Virtualization Techniques in Juniper Networks MX Series 3D Universal Edge Routers at www.juniper.net/us/en/local/pdf/whitepapers/2000342-en.pdf.
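As a rough illustration of the core-router side of such a mapping, a VRF like VRF #1 might be defined on the MX Series with a sketch similar to the following; the interface name, route distinguisher and route target values are placeholders, and the corresponding security zone and policy configuration on the SRX Series is not shown here.

    set routing-instances VRF-1 instance-type vrf
    set routing-instances VRF-1 interface ge-1/0/0.100
    set routing-instances VRF-1 route-distinguisher 65000:1
    set routing-instances VRF-1 vrf-target target:65000:1
    set routing-instances VRF-1 vrf-table-label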

Access Tier
We typically deploy the EX4200 Ethernet Switch with Virtual Chassis technology as a top-of-rack virtual chassis in the access tier. The EX4200, together with server virtualization technology, supports high availability and high maintainability, two key requirements for mission-critical, online applications.
Figure 2.6 Deploying PowerVM Using Dual VIOS and Dual Top-of-Rack Virtual Chassis
(Diagram: Power 570 servers in racks 1 through 7, each running a primary and a secondary VIOS, connect through server uplinks (LAG plus backup) to two EX4200 top-of-rack virtual chassis, TOR Virtual Chassis 1 and TOR Virtual Chassis 2, which are interconnected by a LAG uplink.)

As illustrated in Figure 2.6:

- The Power 570 servers are deployed with dual Virtual I/O Servers (VIOS): the primary VIOS runs in active mode while the secondary VIOS runs in standby mode. The primary VIOS connects to one top-of-rack virtual chassis while the secondary one connects to another top-of-rack virtual chassis.


- The typical bandwidth between the PowerVM's VIOS and the top-of-rack virtual chassis switch is 4 Gbps, realized as 4 x 1 Gbps ports in the NIC combined in a LAG. The bandwidth can scale up to 8 Gbps by aggregating eight ports in a LAG interface (a switch-side configuration sketch follows this list).

- The two Hardware Management Consoles (HMCs) connect to two different top-of-rack virtual chassis, for example HMC 1 and HMC 2.

Besides preventing a single point of failure (SPOF), this approach also provides a highly available maintenance architecture for the network: when a VIOS or virtual chassis instance requires maintenance, operators can upgrade the standby VIOS or virtual chassis while the environment runs business as usual, then switch the environment to the upgraded version without disrupting application service.

For connecting a larger number of servers, it is straightforward to duplicate the top-of-rack virtual chassis deployment at the access layer. Figure 2.7 shows a top-of-rack virtual chassis with seven EX4200s connected to a group of 56 Power 570 systems. To connect an additional 56 Power 570 systems, an additional top-of-rack virtual chassis is deployed at the access layer. As a result, the access layer can connect a large number of Power 570 systems.

After addressing all the connectivity issues, we must not lose sight of the importance of performance in the other network layers and of network security, because we are operating the data center network as one secured network.
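The switch side of such a server-facing LAG might look like the following minimal sketch on the EX4200 virtual chassis. The member interfaces, the aggregated interface name (ae0) and the VLAN name are placeholders, and the lacp statement assumes the VIOS aggregation uses IEEE 802.3ad LACP; if static link aggregation is configured on the server side, omit that statement.

    set chassis aggregated-devices ethernet device-count 2
    set interfaces ge-0/0/0 ether-options 802.3ad ae0
    set interfaces ge-0/0/1 ether-options 802.3ad ae0
    set interfaces ge-1/0/0 ether-options 802.3ad ae0
    set interfaces ge-1/0/1 ether-options 802.3ad ae0
    set interfaces ae0 aggregated-ether-options lacp active
    set interfaces ae0 unit 0 family ethernet-switching port-mode access
    set interfaces ae0 unit 0 family ethernet-switching vlan members server-vlan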

Figure 2.7 Top-of-Rack Virtual Chassis with Seven EX4200s Connected to Power 570 Systems
(Diagram: a core layer of two EX8200 switches connects to an access layer of EX4200 top-of-rack virtual chassis; each virtual chassis serves a server layer of 56 IBM Power 570 systems hosting 4480 client partitions.)


The EX4200 top-of-rack virtual chassis supports different types of physical connections. The EX4200 provides 48 1000BASE-T ports and two ports for 10 Gbps XFP transceivers through its XFP uplink module. The XFP port can uplink other network devices or it can connect to the IBM Power Systems, based on user requirements. Table 2.2 lists three typical 10 Gbps connections used in a Power System and the XFP uplink module required for each EX4200 connection.

MORE For further details concerning IBM PowerVM and EX4200 top-of-rack virtual chassis scalability, refer to Implementing IBM PowerVM Virtual Machines on Juniper Networks Data Center Networks at www.juniper.net/us/en/local/pdf/implementation-guides/8010049-en.pdf.

Table 2.2 Physical Connectivity Between IBM Power 570 and EX4200 XFP Uplink Module

- 10 Gbps Ethernet LR PCI-Express Adapter: XFP uplink module with XFP LR 10 Gbps optical transceiver module; SMF cable.
- 10 Gbps Ethernet LR PCI-X 2.0 DDR Adapter: XFP uplink module with XFP LR 10 Gbps optical transceiver module; SMF cable.
- Logical Host Ethernet Adapter (lp-hea): XFP uplink module with XFP SR 10 Gbps optical transceiver module; SMF cable.


Chapter 3
Implementation Overview

Implementation.Network.Topology.Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36 Server.and.Network.Connections.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 Spanning.Tree.Protocols. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Multicast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 Performance.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 High.Availability. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 Implementation.Scenarios. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

THIS CHAPTER SERVES AS a reference to the later chapters in this handbook by presenting an overview of the next generation intra-data center network implementation scenarios. The implementation scenarios summarized in this chapter address the requirements previously discussed in Chapter 2. The network topology of this reference data center is covered specifically as a part of this chapter.

Chapters 4 through 8 focus on the technical aspects of the implementation that primarily include server connectivity, STP, multicast, performance, and high availability.


Implementation Network Topology Overview


This chapter presents the implementation of a two-tier data center network topology. This topology is common to all scenarios described in the later chapters. Please note that the setup diagrams for each individual scenario can be different despite common overall network topology. As shown in Figure 3.1, the implementation network topology is a two-tier data center network architecture.
Figure 3.1 Implementation Network Topology Overview
(Diagram: a multicast streaming source feeds the WAN edge and a core/aggregation tier of MX480 and EX8200 devices running PIM, LAG and VRRP. The access tier runs VRRP/RTG/STP and consists of EX4200 switches: Virtual Chassis 1 connected via LAG to ToR1, Virtual Chassis 2 connected via LAG to ToR2, and individual EX4200 switches connected via STP to ToR3. Servers in VLANs A, B and C act as multicast receivers (IGMP hosts). Legend: VC - EX4200 Virtual Chassis; RTG - Redundant Trunk Group; LAG - Link Aggregation Group; VRRP - Virtual Router Redundancy Protocol; STP - Spanning Tree Protocol; 10 GE and 1 GE links.)


NOTE Each individual implementation can differ based on network design and requirements.

The topology described here consists of the following tiers and servers:

- Core/aggregation tier consisting of EX8200s or MX480s.

- Access tier comprised of EX4200s. These access switches can be deployed either individually or configured to form a virtual chassis. Either of these options can be implemented as top-of-rack switches to meet different Ethernet port density requirements. Pertaining to the topology under discussion:
  - Three EX4200 switches form a virtual chassis (VC1), functioning as a top-of-rack switch (ToR1).
  - Two EX4200 switches form a virtual chassis (VC2), functioning as a top-of-rack switch (ToR2).
  - EX4200-1, EX4200-2 and EX4200-3 are three individual access switches, functioning as top-of-rack switches (ToR3).

- Servers, where the IBM BladeCenter, IBM x3500 and IBM PowerVM reside, for all scenarios presented. For ease of configuration, one server type is used for each scenario. Servers are segmented into different VLANs, for example VLAN A, B and C, as shown in Figure 3.1.

The physical network topology consists of the following connections:

- The servers connect to the access tier through multiple 1GbE links with Link Aggregation (LAG) to prevent a single point of failure (SPOF) in the physical link and improve bandwidth.

- The access switches connect to the core layer with multiple 10GbE links.

- At the core tier, the MX480s and EX8200s interconnect to each other using redundant 10GbE links. These devices connect to the WAN edge tier, which interconnects the different data centers and connects to external networks.

NOTE Choosing different connection configurations is based on network design and requirements. Redundant physical links are extremely important for achieving network high availability.

Server and Network Connections


Chapter 4 discusses the IBM System p, PowerVM, and Juniper Networks MX and EX Series network configurations. Typically, these network configurations are required for any implementation scenario. For the IBM System p and PowerVM, we discuss the production networks and management networks. We also discuss key PowerVM server virtualization concepts, including the Shared Ethernet Adapter (SEA) and Virtual I/O Server (VIOS). For the Juniper Networks MX and EX Series, we discuss the Junos operating system, which runs on both the MX and EX Series platforms. In addition, we discuss the jumbo frame Maximum Transmission Unit (MTU) setting.


Spanning Tree Protocols


Spanning Tree Protocol is enabled on the connections between the access switches and the servers, and on the connections between the access and core/aggregation devices. For server-to-access switch connections, STP is configured on the switch side so that the links to the servers are designated as edge ports; no other bridges attach to edge ports. Administrators can configure RSTP, MSTP or VSTP between the access and aggregation/core devices.

NOTE Both the MX Series and EX Series devices support all spanning tree protocols.

Spanning tree protocols such as RSTP, MSTP and VSTP prevent loops in Layer 2-based access and aggregation layers. MSTP and VSTP are enhancements over RSTP. MSTP is useful when it is necessary to divide a Layer 2 network into multiple logical spanning tree instances. For example, it is possible to have two MSTP instances that are mutually exclusive from each other while maintaining a single broadcast domain. Thus, MSTP provides better control throughout the network by dividing it into smaller regions. MSTP is preferred when different devices must fulfill the role of the root bridge, so that the root bridge role is spread across multiple devices. The tradeoff for implementing MSTP is increased administrative overhead and network complexity. A higher number of root devices increases the latency during root bridge election.

NOTE When using MSTP, it is important to distribute the root bridge functionality across an optimal number of devices without increasing the latency during root bridge election.

VSTP is comparable to Cisco's PVST+ protocol. VSTP is implemented when spanning tree is enabled across multiple VLANs. However, VSTP is not scalable and cannot be used for a large number of VLANs. See Chapter 5 for a detailed discussion of the STP protocols.
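As a minimal illustration of the switch-side settings described above, the following RSTP sketch marks assumed server-facing ports on an EX4200 as edge ports and treats an assumed 10GbE uplink as a point-to-point link; the interface names are placeholders, and Chapter 5 presents the full RSTP, MSTP and VSTP configurations.

    set protocols rstp interface ge-0/0/10.0 edge
    set protocols rstp interface ge-0/0/11.0 edge
    set protocols rstp interface xe-0/1/0.0 mode point-to-point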

Multicast
The multicast protocol optimizes the delivery of video streaming and improves network infrastructure and overall efficiency. In Chapter 6, we present multicast implementation scenarios, including Protocol Independent Multicast (PIM) and IGMP snooping. In these scenarios, the video streaming client runs on IBM servers. PIM is implemented on the core/aggregation tiers, while IGMP snooping is implemented on the access tier.
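A minimal sketch of this split might look as follows, assuming PIM sparse mode with a static rendezvous point on the core/aggregation devices and IGMP snooping on an access-tier VLAN; the RP address and VLAN name are placeholders, and Chapter 6 presents the complete scenarios.

    On the MX480/EX8200 (core/aggregation):
    set protocols pim rp local address 10.255.0.1
    set protocols pim interface all mode sparse

    On the EX4200 (access):
    set protocols igmp-snooping vlan vlan-a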

Performance
In Chapter 7, two methods for improving data center network performance are covered in detail:

- Using CoS to manage traffic.
- Considering latency characteristics when designing networks using Juniper Networks data center network products.


Using CoS to Manage Traffic


Configuring CoS on the different devices within the data center enables SLAs for voice, video and other critical services. Traffic can be prioritized using different forwarding classes. Prioritization between streams assigned to a particular forwarding class can be achieved using a combination of Behavior Aggregate (BA) and Multifield (MF) classifiers.
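The following sketch illustrates combining a BA classifier and an MF classifier on a routed interface such as the MX480; the forwarding-class name, DSCP code point, interface names and UDP port are assumptions for illustration only, and Chapter 7 details the validated CoS configuration.

    set class-of-service forwarding-classes class video queue-num 4
    set class-of-service classifiers dscp dc-ba forwarding-class video loss-priority low code-points af41
    set class-of-service interfaces ge-0/0/10 unit 0 classifiers dscp dc-ba
    set firewall family inet filter mf-classify term video from protocol udp
    set firewall family inet filter mf-classify term video from destination-port 5004
    set firewall family inet filter mf-classify term video then forwarding-class video
    set firewall family inet filter mf-classify term video then accept
    set firewall family inet filter mf-classify term other then accept
    set interfaces ge-0/0/11 unit 0 family inet filter input mf-classify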

Latency
The evolution of Web services and SOA has been critical to the integration of applications that use standard protocols such as HTTP. This tight integration of applications with Web services has generated an increase of almost 30 to 75 percent in east-west traffic (server-to-server traffic) within the data center. As a result, latency between servers must be reduced. Reduced latency can be achieved by:

- Consolidating the number of devices, and thus the tiers, within the data center.

- Extending the consolidation between tiers using techniques such as Virtual Chassis. With Virtual Chassis, multiple access layer switches can be grouped logically to form one single switch. This reduces latency to a few microseconds because traffic from the server does not need to be forwarded through multiple devices to the aggregation layer.

In the latency implementation scenario, we primarily focus on how to configure the MX480 for measuring Layer 2 and Layer 3 latency.

High Availability
High availability provides continuous service availability by implementing redundancy, stateful recovery from a failure, and proactive fault prediction, and it minimizes failure recovery time. Junos OS provides several high availability features to improve the user experience and to reduce network downtime and maintenance. For example, features such as Virtual Chassis (supported on the EX4200), Nonstop Routing/Bridging (NSR/NSB, both supported on the MX Series), GRES, GR and Routing Engine redundancy can help increase availability at the device level. The Virtual Router Redundancy Protocol (VRRP), Redundant Trunk Group (RTG) and LAG features control the flow of traffic over chosen devices and links. The ISSU feature on the MX Series reduces network downtime for a Junos OS software upgrade. For further details concerning the variety of high availability features, see Chapter 8: Configuring High Availability.

Each high availability feature can address certain technical challenges but may not address all the challenges that today's customers experience. To meet network design requirements, customers can implement one or many high availability features. In the following sections, we discuss high availability features by comparing their characteristics and limitations within the following groups:

- GRES, GR versus NSR/NSB
- Routing Engine Switchover
- Virtual Chassis
- VRRP


Comparing GRES and GR to NSR/NSB

Table 3.1 provides an overview of the GRES, GR and NSR/NSB high availability features available in Junos.

Table 3.1 High Availability Features in Junos OS

GRES
  Functions: Provides uninterrupted traffic forwarding. Maintains kernel state between the Routing Engines and the Packet Forwarding Engine.
  Implementation considerations: Incapable of providing router redundancy by itself. Works with GR protocol extensions.

GR (protocol extensions)
  Functions: Allows a failure of a neighboring router not to disrupt adjacencies or traffic forwarding for a certain time interval. Enables adjoining peers to recognize an RE switchover as a transitional event, which prevents them from starting the process of reconverging network paths. Neighbors are required to support graceful restart.
  Implementation considerations: Network churn and processing are not proportional to the effective change. Network topology changes can interfere with graceful restart. GR can cause blackholing if the RE failure occurs due to a different cause.

NSR/NSB
  Functions: RE switchover is transparent to network peers; no peer participation is required. No drop in adjacencies or sessions. Minimal impact on convergence. Allows switchover to occur at any point, even when routing convergence is in progress.
  Implementation considerations: Unsupported protocols must be refreshed using the normal recovery mechanisms inherent in each protocol.

Nonstop active routing/bridging and graceful restart are two different mechanisms for maintaining high availability when a router restarts. A router undergoing a graceful restart relies on its neighbors to restore its routing protocol information. Graceful restart requires a restart process where the neighbors have to exit a wait interval and start providing routing information to the restarting router. NSR/NSB does not require a route restart. Both primary and backup Routing Engines exchange updates with neighbors. Routing information exchange continues seamlessly with the neighbors when the primary Routing Engine fails because the backup takes over. NOTE NSR cannot be enabled when the router is configured for graceful restart.
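A minimal sketch of enabling GRES together with NSR and NSB on an MX Series router with dual Routing Engines might look as follows; graceful restart is intentionally omitted, per the note above, and commit synchronization keeps both Routing Engines consistent.

    set chassis redundancy graceful-switchover
    set routing-options nonstop-routing
    set protocols layer2-control nonstop-bridging
    set system commit synchronize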


Routing Engine Switchover


Because Routing Engine switchover works well with other high availability features, including graceful restart and NSR, many implementation options are possible. Table 3.2 summarizes the feature behavior and process flow of these options. The dual (redundant) Routing Engines option means that the high availability switchover features are disabled; we use this option as a baseline against which to compare the options with high availability enabled, such as the graceful Routing Engine switchover enabled option.

Table 3.2 Routing Engine Switchover Implementation Options Summary

Dual Routing Engines only (no high availability features enabled)
  Feature behavior: Routing convergence takes place and traffic resumes when the switchover to the new primary Routing Engine is complete.
  Process flow: All physical interfaces are taken offline. Packet Forwarding Engines restart. The backup Routing Engine restarts the routing protocol process (rpd). The new primary Routing Engine discovers all hardware and interfaces. The switchover takes several minutes, and all of the router's adjacencies are aware of the physical (interface alarms) and routing (topology) change.

Graceful Routing Engine switchover enabled
  Feature behavior: Interface and kernel information is preserved during the switchover. The switchover is faster because the Packet Forwarding Engines are not restarted.
  Process flow: The new primary Routing Engine restarts the routing protocol process (rpd). All adjacencies are aware of the router's change in state.

Graceful Routing Engine switchover and nonstop active routing enabled
  Feature behavior: Traffic is not interrupted during the switchover. Interface, kernel and routing protocol information is preserved.
  Process flow: Unsupported protocols must be refreshed using the normal recovery mechanisms inherent in each protocol.

Graceful Routing Engine switchover and graceful restart enabled
  Feature behavior: Traffic is not interrupted during the switchover. Interface and kernel information is preserved. Graceful restart protocol extensions quickly collect and restore routing information from the neighboring routers.
  Process flow: Neighbors are required to support graceful restart, and a wait interval is required. The routing protocol process (rpd) restarts. For certain protocols, a significant change in the network can cause graceful restart to stop.


Virtual Chassis
Between 2 and 10 EX4200 switches can be connected and configured to form a single virtual chassis that acts as a single logical device to the rest of the network. A virtual chassis typically is deployed in the access tier. It provides high availability to the connections between the servers and access switches. The servers can be connected to different member switches of the virtual chassis to prevent SPOF.
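For illustration, a preprovisioned virtual chassis could be defined with a sketch like the following, which pins Routing Engine and line-card roles to specific members; the member serial numbers shown are placeholders.

    set virtual-chassis preprovisioned
    set virtual-chassis member 0 serial-number ABC0123456
    set virtual-chassis member 0 role routing-engine
    set virtual-chassis member 1 serial-number ABC0123457
    set virtual-chassis member 1 role routing-engine
    set virtual-chassis member 2 serial-number ABC0123458
    set virtual-chassis member 2 role line-card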

Virtual Router Redundancy Protocol


The Virtual Router Redundancy Protocol (VRRP), described in IETF RFC 3768, is a redundancy protocol that increases the availability of a default gateway in a static routing environment. VRRP enables hosts on a LAN to use redundant routers on that LAN without requiring more than the static configuration of a single default route on the hosts. The VRRP routers share the IP address corresponding to the default route configured on the hosts. At any time, one of the VRRP routers is the master (active) and the others are backups. If the master fails, one of the backup routers becomes the new master, thus always providing a virtual default router and allowing traffic on the LAN to be routed without relying on a single router. Junos OS provides two tracking capabilities to enhance VRRP operations:

- Track the logical interfaces and switch to a VRRP backup router.

- Track the reachability of the primary router; an automatic failover to the backup occurs if the route to the given primary no longer exists in the routing table.
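A minimal sketch of a VRRP group with both tracking options on an MX480 or EX8200 might look as follows; the interface names, addresses, group number and priority costs are placeholders.

    set interfaces ge-1/0/0 unit 0 family inet address 10.10.10.2/24 vrrp-group 10 virtual-address 10.10.10.1
    set interfaces ge-1/0/0 unit 0 family inet address 10.10.10.2/24 vrrp-group 10 priority 200
    set interfaces ge-1/0/0 unit 0 family inet address 10.10.10.2/24 vrrp-group 10 preempt
    set interfaces ge-1/0/0 unit 0 family inet address 10.10.10.2/24 vrrp-group 10 track interface ge-1/0/1 priority-cost 30
    set interfaces ge-1/0/0 unit 0 family inet address 10.10.10.2/24 vrrp-group 10 track route 0.0.0.0/0 routing-instance default priority-cost 30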

Implementation Scenarios
Table 3.3 summarizes the implementation scenarios presented in this handbook. It provides mapping between each scenario, network tier, and devices. Using this table as a reference, you can map the corresponding chapter to each particular implementation scenario.


Table 3.3 Implementation Scenarios Summary

- Spanning Tree (MSTP/RSTP/VSTP): Chapter 5; deployment: Access-Aggregation/Core, Aggregation-Aggregation, Aggregation-Core; device support: EX4200, EX8200, MX Series.
- PIM: Chapter 6; deployment: Access, Aggregation/Core; device support: EX4200, EX8200, MX Series.
- IGMP snooping: Chapter 6; deployment: Access; device support: EX4200, EX8200, MX Series.
- CoS: Chapter 7; deployment: Access, Aggregation/Core; device support: EX4200, EX8200, MX Series.
- Virtual Chassis: Chapter 8; deployment: Access; device support: EX4200.
- VRRP: Chapter 8; deployment: Aggregation/Core; device support: EX4200, EX8200, MX Series.
- ISSU: Chapter 8; deployment: Aggregation/Core; device support: MX Series only.
- RTG: Chapter 8; deployment: Access-Aggregation; device support: EX Series only.
- Routing Engine Redundancy: Chapter 8; deployment: Aggregation/Core; device support: MX Series, EX8200.
- Non-Stop Routing: Chapter 8; deployment: Aggregation/Core; device support: MX Series only.
- GR: Chapter 8; deployment: Access, Aggregation/Core; device support: EX4200, EX8200, MX Series.
- RTG: Chapter 8; deployment: Access-Aggregation; device support: EX Series only.
- LAG: Chapter 8; deployment: Access-Server, Access-Aggregation/Core; device support: EX4200, EX8200, MX Series.


Table 3.4 functions as a reference aid to help our customers thoroughly understand how Juniper Networks products and features, which are available in Junos 9.6, can be implemented into their networks. This table summarizes the implementation scenarios and their supported products, which are defined in detail later in this guide.

Table 3.4 Mapping of Implementation Scenarios to Juniper Networks Supported Products

Implementation Scenarios          EX4200   EX8200   MX480
High Availability
  NSR/NSB                         -        -        Yes
  GRES + GR                       Yes      Yes      Yes
  Virtual Chassis                 Yes      -        -
  VRRP                            Yes      Yes      Yes
  RTG                             Yes      Yes      -
  LAG                             Yes      Yes      Yes
  ISSU                            -        -        Yes
  Routing Engine Redundancy       -        Yes      Yes
Spanning Tree Protocol
  STP/RSTP, MSTP, VSTP            Yes      Yes      Yes
Performance
  CoS                             Yes      Yes      Yes
Multicast
  PIM                             Yes      Yes      Yes
  IGMP                            Yes      Yes      Yes


Chapter 4
Connecting IBM Servers in the Data Center Network
IBM.System.p.and.PowerVM.Production.Networks. . . . . . . . . . . . . . . . . . . . . . . . . . 46 IBM.System.p.and.PowerVM.Management.Networks . . . . . . . . . . . . . . . . . . . . . . . 47 Configuring.IBM.System.p.Servers.and.PowerVM.. . . . . . . . . . . . . . . . . . . . . . . . . . . 48 IBM.PowerVM.Network.Deployment.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 Junos.Operating.System.Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 Configuring.Network.Devices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

THIS CHAPTER DISCUSSES the IBM System p and PowerVM network configuration and the Juniper Networks MX and EX Series network configuration. The IBM System p server is based on POWER processors, such as POWER5, POWER6 and the recently announced POWER7. In addition to the System p server, IBM offers PowerVM, which is a new brand for the system virtualization powered by POWER processors, and which includes elements such as Micro-Partitioning, logical partitioning (LPAR), Virtual I/O Server (VIOS) and the hypervisor. Both System p servers and PowerVM typically are deployed in the data center and support mission critical applications.


IBM System p and PowerVM Production Networks


As illustrated in Figure 4.1, the POWER Hypervisor is the foundation of virtual machine implementation in the IBM PowerVM system. Combined with features designed into IBM's POWER processors, the POWER Hypervisor enables dedicated-processor partitions, Micro-Partitioning, virtual processors, an IEEE VLAN compatible virtual switch, virtual Ethernet adapters, virtual SCSI adapters, and virtual consoles within the individual server. The POWER Hypervisor is a firmware layer positioned between the hosted operating systems and the server hardware. It is automatically installed and activated, regardless of system configuration. The POWER Hypervisor does not require specific or dedicated processor resources assigned to it.
Figure 4.1 IBM Power Systems Virtualization Overview
(Diagram: on an IBM Power 570 system, client partitions (virtual machines) run applications on their own operating systems and connect through virtual Ethernet adapters and hypervisor VLANs, for example VLAN 100 and VLAN 200, to the Virtual I/O Server (VIOS), whose Shared Ethernet Adapters bridge to physical ports facing an EX4200 switch; standalone servers connect to the same switch directly.)

The VIOS, also called the Hosting Partition, is a special-purpose LPAR in the server, which provides virtual I/O resources to client partitions. The VIOS owns the resources, such as physical network interfaces and storage connections. The network or storage resources, reachable through the VIOS, can be shared by client partitions running on the machine, enabling administrators to minimize the number of physical servers deployed in their network.

In PowerVM, client partitions can communicate among each other on the same server without requiring access to physical Ethernet adapters. Physical Ethernet adapters are required to allow communication between applications running in the client partitions and external networks. A Shared Ethernet Adapter (SEA) in VIOS bridges the physical Ethernet adapters from the server to the virtual Ethernet adapters functioning within the server.


Because the SEA functions at Layer 2, the original MAC address and VLAN tags of the frames associated with the client partitions (virtual machines) are visible to other systems in the network. For further details, refer to IBM's white paper Virtual Networking on AIX 5L at www.ibm.com/servers/aix/whitepapers/aix_vn.pdf. In PowerVM, the physical Network Interface Card (NIC) typically is allocated to the VIOS for improved utilization, whereas in the IBM System p, the physical NIC is exclusively allocated to an LPAR.

IBM System p and PowerVM Management Networks


The System p servers and PowerVM in the data center can be managed by the HMC, which is a set of applications running on a dedicated IBM X Series server that provides a CLI-based and Web-based server management interface. The HMC typically connects a monitor, keyboard and mouse for local access. However, the management network, which connects the HMC and its managed servers, is critical for remote access, an essential operational task in today's data center.
Figure 4.2 IBM Power Systems Management Networks Overview
(Diagram: a client reaches the HMC's httpd, sshd and X-Window services over the out-of-band network management network; the HMC's second NIC and dhcpd serve the HMC private management network, which connects through the FSP ports of servers SRV 1 and SRV 2, each hosting a VIOS and LPAR 1 through LPAR 3.)

As illustrated in Figure 4.2, IBM Power Systems management requires two networks:

- Out-of-band management network.
- HMC private management network.


The out-of-band management network connects the HMC and client networks so that a client's request for access can be routed to the HMC. The HMC private management network is dedicated to communication between the HMC and its managed servers. The network uses a selected range of non-routable IP addresses, and a Dynamic Host Configuration Protocol (DHCP) server is available in the HMC for IP allocation. Each p server connects to the private management network through its Flexible Service Processor (FSP) ports. Through the HMC private management network, the HMC manages servers in the following steps:

1. Connect the p server's FSP port to the HMC private management network so that the HMC and the server are in the same broadcast domain, and the HMC runs the DHCP server (dhcpd).

2. Power on the server. The server's FSP runs the DHCP client and requests a new IP address; the FSP gets the IP address, which is allocated by the HMC.

3. The HMC communicates with the server and updates its managed server list with this new server.

4. The HMC performs operations on the server, for example powering the server on and off, creating LPARs, setting shared adapters (Host Ethernet and Host Channel) and configuring virtual resources.

Configuring IBM System p Servers and PowerVM


In this section, we discuss IBM's System p server and PowerVM network configuration, including NIC, virtual Ethernet adapter and virtual Ethernet switch configuration, the SEA in VIOS, and the Host Ethernet Adapter.

Network Interface Card


As illustrated in Figure 4.3, the NIC can be allocated exclusively to a LPAR through HMC. In LPAR, the system administrators will further configure NIC operation parameters, such as auto-negotiation, speed, duplex, flow control and support for jumbo frames.
Figure 4.3 IBM Power Systems Management Networks Overview
(Diagram: within server SRV 1, a physical NIC is allocated exclusively to an LPAR while the VIOS owns its own NIC; the FSP connects the server to the HMC.)

To allocate (or remove) the NIC on the LPAR, perform the following steps:

1. Select the LPAR.
2. Select Configuration >> Manage Profiles.
3. Select the profile that you want to change.
4. Select the I/O tab.
5. Select the NIC (physical I/O resource).
6. Click Add to add the NIC (or Remove to remove the NIC).
7. Click OK to save changes, then click Close.


NOTE

The NIC can be allocated to multiple profiles. Because the NIC allocation is exclusive during the profile runtime, only one profile activates and uses this NIC. If the NIC is already used by one active LPAR, and you attempt to activate another LPAR that requires the same NIC adapter, the activation process will be aborted. Adding or removing the NIC requires re-activating the LPAR profile.

Configuring Virtual Ethernet Adapter and Virtual Ethernet Switch


As illustrated in Figure 4.4, the POWER Hypervisor implements an IEEE 802.1Q VLAN-style virtual Ethernet switch. Similar to a physical Ethernet switch, it provides virtual ports that support IEEE 802.1Q VLAN-tagged or untagged Ethernet frames. Similar to a physical Ethernet adapter on a physical server, the virtual Ethernet adapter on the partition provides network connectivity to the virtual Ethernet switch. When you create a virtual Ethernet adapter on the partition from the HMC, the corresponding virtual port is created on the virtual Ethernet switch; there is no need to explicitly attach a virtual Ethernet adapter to a virtual port.

The virtual Ethernet adapters and the virtual Ethernet switch form a virtual network among the client partitions so that partitions running on the same physical server can communicate with each other. The VIOS is required for client partitions to further access the physical network outside of the physical server. As shown in Figure 4.4, three LPARs and the VIOS connect to two virtual Ethernet switches through virtual Ethernet adapters. The VIOS also connects to the physical NIC so that LPAR2 and LPAR3 can communicate with each other, and LPAR1, LPAR2 and the VIOS can communicate with each other and further access the external physical network through the physical NIC.
Figure 4.4 Configuring Virtual Ethernet Switches and Virtual Ethernet Adapters
(Diagram: within server SRV 1, LPAR 1, LPAR 2 and the VIOS connect to virtual switch 1, while LPAR 2 and LPAR 3 connect to virtual switch 2; the VIOS also owns the physical NIC, and the FSP connects to the HMC.)

This section provides steps for the following:

- Creating a virtual Ethernet switch
- Removing a virtual Ethernet switch
- Creating a virtual Ethernet adapter
- Removing a virtual Ethernet adapter
- Changing virtual Ethernet adapter properties


To Create a Virtual Ethernet Switch

1. Select the server (Systems Management >> Servers >> select server).
2. Select Configuration >> Virtual Resources >> Virtual Network Management.
3. Select Action >> Create VSwitch.
4. Enter a name for the VSwitch, then select OK.
5. Click Close to close the dialog.

To Remove a Virtual Ethernet Switch

1. Select the server (Systems Management >> Servers >> select server).
2. Select Configuration >> Virtual Resources >> Virtual Network Management.
3. Select Action >> Remove VSwitch.
4. Click Close to close the dialog.

To Create a Virtual Ethernet Adapter

1. Select the server (Systems Management >> Servers >> select server).
2. Select the LPAR.
3. Select Configuration >> Manage Profiles.
4. Select the profile that you want to change.
5. Select the Virtual Adapters tab.
6. Select Actions >> Create >> Ethernet Adapter (see Figure 4.5).

Figure 4.5 Virtual Ethernet Adapter Properties Window

7. In the Virtual Ethernet Adapter Properties window (as shown in Figure 4.5), enter the following:
   a. Adapter ID (a default value displays).
   b. VSwitch, the virtual Ethernet switch that this adapter connects to.
   c. VLAN ID, the VLAN ID for untagged frames; the VSwitch will add/remove the VLAN header.
   d. Select the checkbox This adapter is required for partition activation.
   e. Select the checkbox IEEE 802.1q compatible adapter to control whether VLAN-tagged frames are allowed on this adapter.
   f. Use Add, Remove, New VLAN ID and Additional VLANs to add or remove the VLAN IDs that are allowed for VLAN-tagged frames.
   g. Select the checkbox Access external network only on LPARs used for bridging traffic from the virtual Ethernet switch to some other NIC. Typically this should be kept unchecked for regular LPARs and checked for the VIOS.
   h. Click OK to save changes made in the profile and then select Close.

To Remove a Virtual Ethernet Adapter

1. Select the server (Systems Management >> Servers >> select server).
2. Select the LPAR.
3. Select Configuration >> Manage Profiles.
4. Select the profile that you want to change.
5. Select the Virtual Adapters tab.
6. Select the Ethernet adapter that you want to remove.
7. Select Actions >> Delete.
8. Click OK to save changes made in the profile and then select Close.

To Change a Virtual Ethernet Adapter's Properties

1. Select the server (Systems Management >> Servers >> select server).
2. Select the LPAR.
3. Select Configuration >> Manage Profiles.
4. Select the profile that you want to change.
5. Select the Virtual Adapters tab.
6. Select the Ethernet adapter that you want to change.
7. Select Actions >> Edit.
8. Enter the required information in the fields, as illustrated in Figure 4.5.
9. Click OK to save changes made in the profile and then select Close.


Shared Ethernet Adapter in VIOS


The SEA is a software-implemented Ethernet bridge that connects a virtual Ethernet network to an external Ethernet network. With this connection, the SEA becomes a logical device in VIOS that typically connects two other devices: the virtual Ethernet adapter on VIOS, which connects to the virtual Ethernet switch, and the physical NIC, which connects to the external Ethernet network.

NOTE Make sure that the Access external network option is checked when the virtual Ethernet adapter is created on VIOS.

To create a SEA on VIOS, use the following command syntax:
mkvdev -sea <target_device> -vadapter <virtual_ethernet_adapters> -default <DefaultVirtualEthernetAdapter> -defaultid <SEADefaultPVID>

Table 4.1 lists and defines the parameters associated with this command.

Table 4.1 mkvdev Command Parameters and Descriptions

- target_device: the physical port that connects to the external network, on a NIC exclusively allocated to the VIOS LPAR or on an LHEA.
- virtual_ethernet_adapters: one or more virtual Ethernet adapters that the SEA will bridge to target_device (typically only one adapter).
- DefaultVirtualEthernetAdapter: the default virtual Ethernet adapter that will handle untagged frames (typically the same as the previous parameter).
- SEADefaultPVID: the VID for the default virtual Ethernet adapter (typically has the value of 1).
The following sample command creates a SEA, as shown in Figure 4.6:


mkvdev -sea ent1 -vadapter ent2 -default ent2 -defaultid 1
Figure 4.6 Creating a Shared Ethernet Adapter in VIOS
(Diagram: within server SRV 1, the SEA (ent3) in the VIOS bridges ent1, the Ethernet interface to the NIC assigned to the VIOS LPAR, and ent2, the Ethernet interface to virtual switch 1 shared with LPAR 1 and LPAR 2; the FSP connects to the HMC.)


Host Ethernet Adapter


The HEA, also called the Integrated Virtual Ethernet adapter, is an integrated high-speed Ethernet adapter with hardware-assisted virtualization, which is a standard feature on every POWER6 processor-based server. The HEA provides a physical high-speed connection (10G) to the external network and provides a logical port. Figure 4.7 shows the LHEAs for LPARs.
Figure 4.7 Host Ethernet Adapter Overview
(Diagram: within server SRV 1, the HEA provides external ports 1 and 2; the VIOS and LPAR 1 through LPAR 3 connect to the HEA through logical ports, and the FSP connects to the HMC.)

Because the HEA creates a virtual network for the client partitions and bridges the virtual network to the physical network, it replaces the need for both the virtual Ethernet and the Shared Ethernet Adapter. In addition, the HEA enhances performance and improves utilization for Ethernet because it eliminates the need to move packets (using virtual Ethernet) between partitions and then through a SEA to the physical Ethernet interface. For detailed information, refer to IBM's Redpaper Integrated Virtual Ethernet Adapter Technical Overview and Introduction at www.redbooks.ibm.com/abstracts/redp4340.html.

The HEA is configured through the HMC. The following list includes some HEA configuration rules:

- An LPAR uses only one logical port to connect to the HEA.
- The HEA consists of one or two groups of logical ports. Each group of logical ports has 16 logical ports (16 or 32 total for the HEA).
- Each group of logical ports can have one or two external ports assigned to it (predefined).
- A logical port group consists of one or two Ethernet switch partitions, one for each external port. An LPAR can have only one logical port connected to an Ethernet switch partition, which means that only one logical port can connect to the external port.
- MCS increases bandwidth between the LPAR and the NIC. MCS reduces the number of logical ports; for MCS=2 the number of logical ports is 16/2=8. For MCS to take effect, a server restart is required.
- Only one logical port in a port group can be set in promiscuous mode.

In this section, we discuss the following HEA configurations:

- Configuring a HEA physical port
- Adding a LHEA logical port
- Removing a LHEA logical port


Configuring a HEA Physical Port


To configure the HEA physical port, perform the following steps (refer to Figure 4.8 as a reference):

1. Select the server (Systems Management >> Servers >> select server).
2. Select Hardware Information >> Adapters >> Host Ethernet.
3. Select the adapter (port).
4. Click the Configure button.
5. Enter parameters for the following fields: Speed, Duplex, Maximum receiving packet size (jumbo frames), Pending Port Group Multi-Core Scaling value, Flow control, Promiscuous LPAR.
6. Click OK to save your changes.

Figure 4.8

HEA Physical Port Configuration Overview Window

Adding a LHEA Logical Port


To add a LHEA, perform the following steps:

1. Select the server (Systems Management >> Servers >> select server).
2. Select the LPAR.
3. Select Configuration >> Manage Profiles.
4. Select the profile that you want to change.
5. Select the Logical Host Ethernet Adapters (LHEA) tab.
6. Select the external port that the LHEA connects to.
7. Click Configure.
8. Enter the parameters for the following fields: Logical port, select one port from 1-16; if MCS is greater than 1, some logical ports will be identified as Not Available.
9. Select the checkbox Allow all VLAN IDs. Otherwise, enter the actual VLAN ID in the VLAN to add field, as shown in Figure 4.9.
10. Click OK.


Figure 4.9

HEA Physical Port Configuration Overview

Removing a LHEA Logical Port


To remove the LHEA, perform the following steps:

1. Select the server (Systems Management >> Servers >> select server).
2. Select the LPAR.
3. Select Configuration >> Manage Profiles.
4. Select the profile that you want to change.
5. Select the Logical Host Ethernet Adapters (LHEA) tab.
6. Select the external port that the LHEA connects to.
7. Click the Reset button.
8. Click OK to close the window.
9. Click OK to save changes and close the window.


IBM PowerVM Network Deployment


In this section, we discuss a typical IBM PowerVM network deployment. As illustrated in Figure 4.10, two IBM System p6 servers are deployed in a data center and three networks are required:

- HMC private network (192.168.128.0/17).
- Out-of-band management network (172.28.113.0/24).
- Production network (11.11.1.0/24); typically, testing traffic is sent to interfaces on the production network.
Figure 4.10 IBM Power Series Servers, LPARs, and Network Connections
(Diagram: the HMC, running dhcpd, a Web application, sshd and an X-Window server with a local keyboard and monitor, connects through Ethernet switches to the private network 192.168.128.0/17 and the management network 172.28.113.0/24. A p6 server hosts VIOS, RHEL, SUSE, AIX 5.3 and AIX 6.1 LPARs behind its FSP and Host Ethernet Adapter (HEA); a p5 server hosts VIOS, RHEL and SUSE LPARs. NICs under test connect the servers to the device under test (DUT) on the production network 11.11.1.0/24, and a management workstation (Web client, Telnet/SSH client) attaches to the management network.)


The HMC runs on a Linux server with two network interfaces: one connects to a private network for all managed p5/p6 systems (the on-board Ethernet adapter on the servers, controlled by the FSP process); the other connects to a management network. In the management network, the management workstation accesses the HMC Web interface through a Web browser. There are two ways to set up communication with an LPAR (logical partition):

- Using a console window through the HMC.
- Using Telnet/SSH over the management network.

Each LPAR has one dedicated Ethernet interface for connecting to the management network, using the first physical port on the HEA (IVE) shared among LPARs. Each LPAR must connect to the virtual Ethernet switch using a virtual Ethernet adapter. You create a virtual Ethernet switch and a virtual Ethernet adapter using the HMC. Virtual Ethernet adapters for VIOS LPARs must have the Access external network option enabled.

The VIOS LPAR, which runs a special version of AIX, performs the bridging between the virtual Ethernet switch (implemented in the hypervisor) and the external port. For bridging frames between the physical adapter on the NIC and the virtual Ethernet adapter connected to the virtual Ethernet switch, another logical device (the SEA) is created in VIOS. As illustrated in Figure 4.11, the typical network deployment with the access switch and an LPAR (virtual machine) is as follows (a sample switch-port configuration follows Figure 4.11):

- The access switch connects to the physical NIC, which is assigned to ent1 in VIOS.
- The ent3 (SEA) bridges ent1 (the physical NIC) and ent2 (the virtual Ethernet adapter).
- The ent2 (virtual Ethernet adapter) is created and dedicated to the LPAR, which runs Red Hat Enterprise Linux.
- The ent3 also supports multiple VLANs. Each VLAN associates with one logical Ethernet adapter, for example ent4.
Figure 4.11 Detailed Network Deployment with SEA
(Diagram: the Ethernet switch connects to the NIC (ent1) assigned to the VIOS LPAR; the SEA (ent3) bridges ent1 to ent2, the Ethernet interface to virtual switch 1 in the hypervisor, which serves the RHEL LPAR; ent4 is the logical Ethernet adapter for one VLAN.)
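On the Juniper side, the access switch port facing the SEA-bridged NIC can be configured as a VLAN trunk so that tagged frames from the client partitions are carried intact. The following is a minimal sketch in which the interface name and the VLAN names and IDs are placeholders.

    set vlans vlan100 vlan-id 100
    set vlans vlan200 vlan-id 200
    set interfaces ge-0/0/5 unit 0 family ethernet-switching port-mode trunk
    set interfaces ge-0/0/5 unit 0 family ethernet-switching vlan members [ vlan100 vlan200 ]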


Junos Operating System Overview


As shown in Figure 4.12, the Junos OS includes two components: Routing Engine and the Packet Forwarding Engine. These two components provide a separation of control plane functions such as routing updates and system management from packet data forwarding. Hence, products from Juniper Networks can deliver superior performance and highly reliable Internet operation.
Figure 4.12 Junos OS Architecture
(Diagram: the Routing Engine runs Junos software, including the routing protocol process with its routing tables, the interface process, the chassis process, SNMP, the command-line interface (CLI) and the kernel with its forwarding table; the Packet Forwarding Engine runs an embedded microkernel with its own interface and chassis processes, a forwarding table and distributed ASICs.)

Routing Engines
The Routing Engine runs the Junos operating system, which includes the FreeBSD kernel and the software processes. The primary operator processes include the device control process (dcd), routing protocol process (rpd), chassis process (chassisd), management process (mgd), traffic sampling process (sampled), automatic protection switching process (apsd), simple network management protocol process (snmpd) and system logging process (syslogd). The Routing Engine installs directly into the control panel and interacts with the Packet Forwarding Engine.


Packet Forwarding Engine


The Packet Forwarding Engine is designed to perform Layer 2 and Layer 3 switching, route lookups and rapid forwarding of packets. The Packet Forwarding Engine includes the backplane (or midplane), Flexible PIC Concentrator (FPC), Physical Interface Cards (PICs) and the control board (switching/forwarding) and a CPU that runs the microkernel. The microkernel is a simple, cooperative, multitasking, real-time operating system designed and built by Juniper Networks. The microkernel, which has many features, comprises fully independent software processes, each with its own chunk of memory. These applications communicate with one another. The hardware in the router prevents one process from affecting another. A snapshot is taken wherever the failure occurred so that engineers can analyze the core dump and resolve the problem. The Switch Control Board (SCB) powers on and off cards, controls clocking, resets, boots, and then monitors and controls systems functions, including the fan speed, board power status, PDM status and control, and the system front panel.

Interaction Between Routing Engine and Packet Forwarding Engine


The kernel on the Routing Engine communicates with the Packet Forwarding Engine and synchronizes a copy of the forwarding table on the Packet Forwarding Engine with that on the Routing Engine. Figure 4.12 shows the interaction between the Routing Engine and the Packet Forwarding Engine with respect to forwarding activity. The Routing Engine builds a master forwarding table based on its routing table. The kernel on the Routing Engine communicates with the Packet Forwarding Engine and provides it with the forwarding table. From this point on, the Packet Forwarding Engine performs traffic forwarding. The Routing Engine itself is never involved in the forwarding of packets. The ASICs in the forwarding path only identify and send the Routing Engine any exception packets or routing control packets for processing.

There are security mechanisms in place that prevent the Routing Engine (and control traffic) from being attacked or overwhelmed by these packets. Packets sent to the control plane from the forwarding plane are rate limited to protect the router from DoS attacks. The control traffic is protected from excess exception packets using multiple queues that provide a clean separation between the two. The packets are prioritized by the packet-handling interface, which sends them to the correct queues for appropriate handling.

Redundant functional components in network devices prevent SPOF and increase high availability and reliability. Juniper Networks devices typically are configured with a single Routing Engine and Packet Forwarding Engine. To achieve high availability and reliability, the user has two options:

- Create redundant Routing Engines and a single Packet Forwarding Engine, or
- Create redundant Routing Engines and redundant Packet Forwarding Engines.


Junos Processes
Junos processes run on the Routing Engine and maintain the routing tables, manage the routing protocols used on the router, control the router interfaces, control some chassis components, and act as the interface for system management and user access to the router. Major processes are discussed in detail later in this section. A Junos process is a UNIX process that runs nonstop in the background while the machine is running. All of the processes operate through the command-line interface (CLI). Each process is a piece of the software and has a specific function or area to manage. The processes run in separate and protected address spaces. The following sections briefly cover two major Junos processes: the routing protocol process (rpd) and the management process (mgd).

Routing Protocol Process


The routing protocol process (rpd) provides the routing protocol intelligence to the router, controlling the forwarding of packets. Sitting in the user space (versus the kernel) of the Routing Engine, rpd is the mechanism by which the Routing Engine learns routing information and constructs the routing table, which stores route information. This process starts all configured routing protocols and handles all routing messages. It maintains one or more routing tables, which consolidate the routing information learned from all routing protocols. From this routing information, the rpd process determines the active routes to network destinations and installs these routes into the Routing Engine's forwarding table. Finally, the rpd process implements the routing policy, which enables an operator to control the routing information that is transferred between the routing protocols and the routing table. Using a routing policy, operators can filter and limit the transfer of information as well as set properties associated with specific routes.

NOTE    rpd handles both unicast and multicast routing protocols, for traffic that travels to a single destination and to many destinations, respectively.
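As an illustration of how routing policy controls the information exchanged between the protocols and the routing table, the following is a minimal sketch (the policy name and protocol choices are examples only, not part of this book's validated configuration) that exports static routes into OSPF and rejects everything else:

policy-options {
    policy-statement export-statics {
        term static-routes {
            from protocol static;
            then accept;
        }
        term reject-the-rest {
            then reject;
        }
    }
}
protocols {
    ospf {
        export export-statics;
    }
}

rpd evaluates the policy terms in order, so the final reject term simply makes the policy's default behavior explicit.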

Management Process
Several databases connect to the management process (mgd). At initialization time, the configuration schema database is built by merging the packages /usr/lib/dd/libjkernel-dd.so, /usr/lib/dd/libjroute-dd.so and /usr/lib/dd/libjdocs-dd into /var/db/schema.db, which defines the user interface (UI). The configuration database itself is /var/db/juniper.db. mgd works closely with the CLI, allowing the CLI to communicate with all the other processes, and it knows which process is required to execute a given command (user input). When the user enters a command, the CLI communicates with mgd over a UNIX domain socket using Junoscript, an XML-based remote procedure call (RPC) protocol. mgd is connected to all the processes, and each process has a UNIX domain management socket.


If the command is legal, the socket opens and mgd sends the command to the appropriate process. For example, the chassis process (chassisd) implements the actions for the command show chassis hardware. The process sends its response to mgd in XML form and mgd relays the response back to the CLI. Mgd plays an important part in the commit check phase. When you edit a configuration on the router, you must commit the change for it to take effect. Before the change actually is made, mgd subjects the candidate configuration to a check phase. The management process writes the new configuration into the config db (juniper.db).
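Both the XML RPC exchange and the commit check phase can be observed from the CLI. The following is an illustrative sequence (not tied to any device in this book's test bed): the display xml rpc pipe shows the Junoscript RPC that the CLI hands to mgd for a command, commit check asks mgd to validate the candidate configuration without activating it, and commit activates it and writes it into juniper.db.

user@host> show chassis hardware | display xml rpc
user@host# commit check
user@host# commit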

Junos Operating System Network Management


The Junos operating system network management features work in conjunction with an operations support system (OSS) to manage the devices within the network. The Junos OS can assist in performing the following management tasks:
- Fault management: monitors devices and detects and fixes faults.
- Configuration management.
- Accounting management: collects statistics for accounting purposes.
- Performance management: monitors and adjusts device performance.
- Security management: controls device access and authenticates users.

The following interfaces (APIs) are typically used to manage and monitor Juniper Networks network devices:
- CLI
- J-Web
- SNMP
- NETCONF

In addition, Junos supports other management interfaces to meet various requirements from enterprises and carriers, including J-Flow, sFlow, Ethernet OAM and TWAMP.

MORE    For detailed configuration information concerning the network management interfaces, please refer to the Junos Software Network Management Configuration Guide Release 10.0 at http://www.juniper.net/techpubs/en_US/junos10.0/information-products/topic-collections/config-guide-network-mgm/frameset.html.
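For instance, a minimal sketch (the community string and settings are illustrative only) that enables read-only SNMP polling and NETCONF-over-SSH access for an OSS might look like the following; production deployments would normally add SNMP views, trap groups and authentication appropriate to the environment:

snmp {
    community public {
        authorization read-only;
    }
}
system {
    services {
        netconf {
            ssh;
        }
    }
}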

Configuring Network Devices


Table 4.2 lists and describes the ways by which IBM servers can connect to Juniper switches and routers in the data center.

Table 4.2    Methods for Connecting IBM Servers to Juniper Switches and Routers

Connection type: The network device acts as a Layer 2 switch.
Description: To the IBM servers, the network device appears as a Layer 2 switch. The network device interfaces and the IBM servers' NICs are in the same Layer 2 broadcast domain. Because the network device interfaces are not configured with Layer 3 IP addresses, they do not provide routing functionality.

Connection type: The network device acts as a switch with a Layer 3 address.
Description: To the IBM servers, the network device appears as a Layer 2 switch. The network device interfaces and the IBM servers' NICs are in the same Layer 2 broadcast domain. The network device interfaces are configured with Layer 3 IP addresses so that they can route traffic to other connected networks.

Connection type: The network device acts as a router.
Description: To the IBM servers, the network device appears as a Layer 3 router with a single Ethernet interface and IP address. The network device does not provide Layer 2 switching functionality.

In the next section, several different but typical methods for configuring the MX Series routers and EX Series switches are presented.

Configuring MX Series 3D Universal Edge Routers


In an MX Series configuration, one physical interface can have multiple logical interfaces; each logical interface is defined as a unit under the physical interface, followed by the logical interface ID number. Use the encapsulation and vlan-tagging statements to configure the mapping of Ethernet traffic to logical interfaces.

Configuring Layer 2 Switching

As illustrated in the following code, two Ethernet ports are in the same broadcast domain: the ge-5/1/5 interface is configured with untagged VLAN, while the ge-5/1/7 interface is configured with tagged VLAN. Ethernet interfaces in MX Series routers can support one or many VLANs. Each Ethernet VLAN is mapped into one logical interface. If logical interfaces are used to separate traffic to different VLANs, we recommend using the same numbers for the logical interface (unit) and VLAN ID. For instance, the logical interface and the VLAN ID in the following sample use the same number (100):

interfaces ge-5/1/5 {
    unit 0 {
        family bridge;
    }
}
interfaces ge-5/1/7 {
    vlan-tagging;
    encapsulation flexible-ethernet-services;
    unit 100 {
        encapsulation vlan-bridge;
        family bridge;
    }
}
bridge-domains {
    Data01 {
        domain-type bridge;
        vlan-id 100;
        interface ge-5/1/5.0;
        interface ge-5/1/7.100;
    }
}

Configuring Layer 2 Switching and Layer 3 Interface


As illustrated in the following code, two Ethernet ports are in the same broadcast domain: the ge-5/1/5 interface is configured with untagged VLAN, while the ge-5/1/7 interface is configured with tagged VLAN. In addition, IRB on the MX Series provides simultaneous support for Layer 2 bridging and Layer 3 routing on the same interface, such as irb.100, so that local packets can be routed to another routed interface or to another bridging domain that has a Layer 3 protocol configured.

interfaces ge-5/1/5 {
    unit 0 {
        family bridge;
    }
}
interfaces ge-5/1/7 {
    vlan-tagging;
    encapsulation flexible-ethernet-services;
    unit 100 {
        encapsulation vlan-bridge;
        family bridge;
    }
}
interfaces irb {
    unit 100 {
        family inet {
            address 11.11.1.1/24;
        }
    }
}
bridge-domains {
    Data01 {
        domain-type bridge;
        vlan-id 100;
        interface ge-5/1/5.0;
        interface ge-5/1/7.100;
        routing-interface irb.100;
    }
}

Configuring Layer 3 Routing

As illustrated in the following code, one Ethernet interface (ge-5/0/0) is configured with a tagged VLAN and an IP address:

interfaces ge-5/0/0 {
    description P6-1;
    vlan-tagging;
    unit 30 {
        description Data01;
        vlan-id 30;
        family inet {
            address 11.11.1.1/24;
        }
    }
}
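After committing either of these MX Series examples, the bridging and routing state can be checked with a few operational commands. This is an illustrative verification sketch rather than output captured from the test bed:

user@mx> show bridge domain
user@mx> show bridge mac-table
user@mx> show interfaces irb terse
user@mx> show route protocol direct

show bridge domain confirms that VLAN 100 and its member interfaces are present, while the irb and route commands confirm that the Layer 3 side of the configuration is up.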


Configuring EX Series 4200 and 8200 Ethernet Switches


In a typical EX Series configuration, one physical interface can have multiple logical interfaces; a logical interface is defined as a unit under the physical interface, followed by a logical interface ID number. However, for Ethernet switching between ports in the EX Series, the interface configuration must include family ethernet-switching under unit 0. Define Layer 2 broadcast (bridge) domains under the vlans stanza. Interface membership in VLANs can be defined using one of the following two methods (both are illustrated in the sketch following this paragraph):
- Under vlan x interface (the preferred method).
- Under interface y unit 0 family ethernet-switching vlan members.
If the Ethernet port carries only untagged frames for one VLAN, the port mode should be defined as access (the default). If the Ethernet port carries tagged frames, the port mode must be defined as trunk (the case with two or more VLANs on one port).
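The following is a minimal sketch of the two membership methods expressed as set commands; the interface ge-0/0/5 here is an illustrative placeholder rather than a port from the validated test bed:

// Method 1 (preferred): assign the interface under the VLAN definition
set vlans Data01 vlan-id 100
set vlans Data01 interface ge-0/0/5.0
// Method 2: list the VLAN under the interface
set interfaces ge-0/0/5 unit 0 family ethernet-switching vlan members Data01

Both methods produce the same Ethernet switching behavior; choosing one consistently simply keeps the configuration easier to audit.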

Configuring Layer 2 Switching


As illustrated in the following code, two Ethernet ports are in the same broadcast domain: the ge-5/1/5 interface is configured with untagged VLAN, while the ge-5/1/7 interface is configured with tagged VLAN. The Ethernet interfaces in EX Series switches can support one or many VLANs. Each VLAN is mapped into one logical interface. If logical interfaces are used to separate traffic to different VLANs, we recommend using the same numbers for the logical interface (unit) and VLAN ID. For example, the logical interface and the VLAN ID in the following sample use the same number (100):

interfaces ge-5/1/5 {
    unit 0 {
        family ethernet-switching;
    }
}
interfaces ge-5/1/7 {
    unit 0 {
        family ethernet-switching {
            port-mode trunk;
        }
    }
}
vlans {
    Data01 {
        vlan-id 100;
        interface {
            ge-5/1/5.0;
            ge-5/1/7.0;
        }
    }
}


Configuring Layer 2 Switching and Layer 3 Interface


As illustrated in the following code, two Ethernet ports are in the same broadcast domain: the ge-5/1/5 interface is configured with untagged VLAN, while the ge-5/1/7 interface is configured with tagged VLAN. In addition, EX Series Ethernet Switches support routed interfaces called Routed VLAN Interfaces (RVIs). RVIs are needed to route traffic from one VLAN to another. As opposed to IRB, which routes between bridge domains, an RVI routes between VLANs. In the following code, the RVI with IP address 11.11.1.1/24 is associated with the VLAN 100 logical interface.

interfaces ge-5/1/5 {
    unit 0 {
        family ethernet-switching;
    }
}
interfaces ge-5/1/7 {
    unit 0 {
        family ethernet-switching {
            port-mode trunk;
        }
    }
}
interfaces vlan {
    unit 100 {
        family inet {
            address 11.11.1.1/24;
        }
    }
}
vlans {
    Data01 {
        vlan-id 100;
        interface {
            ge-5/1/5.0;
            ge-5/1/7.0;
        }
        l3-interface vlan.100;
    }
}
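A quick way to confirm that the RVI example above behaves as intended is to check the VLAN, the MAC table and the routed interface; the following operational commands are an illustrative sketch rather than captured test output:

user@ex> show vlans Data01
user@ex> show ethernet-switching table vlan Data01
user@ex> show interfaces vlan.100 terse

The first two commands confirm Layer 2 membership and learned MAC addresses, and the third confirms that the Layer 3 interface associated with VLAN 100 is up with its address.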

Configuring Layer 3 Routing


As illustrated in the following code, the Ethernet interface (ge-5/0/0) is configured with tagged VLAN and IP address:
interfaces ge-5/0/0 {
    description P6-1;
    vlan-tagging;
    unit 30 {
        description Data01;
        vlan-id 30;
        family inet {
            address 11.11.1.1/24;
        }
    }
}


MX and EX Series Ethernet Interface Setting


In general, the default values of the Ethernet interface settings are as follows:
- Auto-negotiation for the speed setting.
- Automatic for the link-mode setting.
- Flow-control for the flow-control setting.

Because these default settings on the MX and EX Series work well in many use cases, we recommend using them as a starting point and optimizing individual settings only when necessary. The Ethernet interface configuration stanzas on the MX and EX Series are different: on the MX Series, the interface settings are changed under the interface x gigether-options stanza; on the EX Series, they are changed under the interface x ether-options stanza. Under these configuration stanzas, the following settings are available:
- Link speed can be set to 10m, 100m, 1g or auto-negotiation.
- Link-mode can be set to automatic, full-duplex or half-duplex.
- Flow-control can be set to flow-control or no-flow-control.

NOTE    When one device is set to auto-negotiate link mode while the other device is set to full-duplex link mode, the connection between the two devices will not work properly due to the limitations of the IEEE 802.3 standard. We highly recommend using the auto-negotiate link setting for Gigabit Ethernet.

NOTE    The MX Series does not support half-duplex operation on 10/100/1000BASE-T interfaces.
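For cases where the defaults do need to be overridden, the following is a minimal sketch of the two stanzas (the port numbers are illustrative, and the exact options available vary by platform and Junos release):

// EX Series: per-port settings live under ether-options
set interfaces ge-0/0/5 ether-options link-mode full-duplex
set interfaces ge-0/0/5 ether-options no-flow-control
// MX Series: the equivalent stanza is gigether-options
set interfaces ge-5/1/5 gigether-options no-auto-negotiation
set interfaces ge-5/1/5 gigether-options no-flow-control

If auto-negotiation is disabled or a fixed link mode is forced on one end, make the same change on the directly connected device, for the reason given in the note above.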

MX and EX Series Support for Jumbo Frames (MTU)


The EX and MX Series can support frame sizes on Ethernet interfaces of up to 9216 octets. In a Junos configuration, this parameter is called the Maximum Transmission Unit (MTU). In Junos, the MTU includes Ethernet overhead such as the source address, destination address, and VLAN tag; however, it does not include the preamble or the frame check sequence (FCS). The default Ethernet frame size in Junos is 1514 octets, while the default frame size on other vendors' devices can be 1500 octets. It is important to understand that all devices in one broadcast domain must have the same jumbo frame MTU size. Otherwise, devices that do not support jumbo frames could silently discard some frames, creating intermittent network problems, such as failures between routers to establish OSPF neighbor adjacency. The EX and MX Series devices have different types of interfaces, such as physical and irb interfaces. Because the MTU is associated with each interface type, the MTU configuration syntax differs, as listed in Table 4.3.

Table 4.3    MTU Configuration Syntax

Juniper Networks Device        Interface Type       Command
MX Series routers              Physical interface   set interfaces <interface-name> mtu <mtu>
MX Series routers              IRB interface        set interfaces irb mtu <mtu>
EX Series Ethernet switches    Physical interface   set interfaces <interface-name> mtu <mtu>
EX Series Ethernet switches    VLAN interface       set interfaces vlan mtu <mtu>
EX Series Ethernet switches    VLAN unit            set interfaces vlan unit 100 family inet mtu <mtu>
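As a concrete illustration (the values and port numbers are examples only), enabling 9216-octet jumbo frames end to end might look like the following; remember to apply a matching MTU on the attached IBM server NICs and on every other device in the broadcast domain:

// MX Series: physical interface carrying the bridge domain, plus its IRB
set interfaces ge-5/1/7 mtu 9216
set interfaces irb mtu 9216
// EX Series: physical interface plus the routed VLAN interface
set interfaces ge-0/0/7 mtu 9216
set interfaces vlan mtu 9216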


Chapter 5
Configuring Spanning Tree Protocols

Spanning Tree Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Configuring RSTP/MSTP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Configuring VSTP/PVST+/Rapid-PVST+ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

THIS CHAPTER FOCUSES on the different spanning tree protocols (STP, RSTP, MSTP, and VSTP) that are used in Layer 2 networks to prevent loops. Typically, STP is supported only on legacy equipment and has been replaced with RSTP and other variants of Spanning Tree. Support for RSTP is mandatory on all devices that are capable of spanning tree functionality. When interoperating with legacy switches, an RSTP-capable switch automatically reverts to STP. We discuss STP in this chapter to provide a background on spanning tree functionality.


Spanning Tree Protocols


The STP protocol works on the concept of a switch elected as a root bridge that connects in a mesh to other non-root switches. The active path of least cost is selected between each of the non-root bridges and the root bridge. In addition, a redundant path is identified and used when a failure occurs. All of these bridges exchange Bridge Protocol Data Units (BPDUs) that contain the bridge IDs and the cost to reach the root port. The root bridge is elected based on priority: the switch assigned the lowest priority is elected as the root. The ports on a switch that are closest (in cost) to the root bridge become the Root Ports (RP).

NOTE    There can be only one RP on a switch. A root bridge cannot have an RP.

The ports that have the least cost to the root bridge on a network segment are known as the Designated Ports (DP). Ports that are not selected as RP or DP are considered to be Blocked. An optimized active path based on bridge/port priority and cost is chosen to forward data in the network. The BPDUs that provide the information on the optimal path are referred to as superior BPDUs, while those that provide sub-optimal metrics are referred to as inferior BPDUs. BPDUs mainly consist of the following fields, which are used as the basis for determining the optimal forwarding topology:
- Root Identifier: a representation of the switch's current snapshot of the network, assuming it is the root bridge.
- Root path cost: the link speed of the port on which the BPDU is received.
- Bridge Identifier: the identity used by the switch to send BPDUs.
- Port Identifier: the identity of the port from which the BPDU originated.

Convergence of a spanning tree based network consists of a three-step process:
1. Root Bridge Election.
2. Root Port Election (on non-root switches).
3. Designated Port Election (on each network segment).

Figure 5.1 shows three switches: one root and two non-root bridges. The ports on the root bridge are the designated ports (DP). The ports with the least cost to the root bridge are the root ports (RP). All other interfaces running STP on the non-root bridges are alternate ports (ALT).

Rapid Spanning Tree Protocol


Rapid Spanning Tree Protocol (RSTP) is a later and enhanced version of STP that provides faster convergence times. The faster times are possible because RSTP uses protocol handshake messages unlike STP, which uses fixed timeouts. When compared to STP, RSTP provides enhanced performance by:

- Generating and transmitting BPDUs from all nodes at the configured Hello interval, irrespective of whether they receive any BPDUs from the RP. This allows the nodes to monitor any loss of Hello messages and thus detect link failures more quickly than STP.
- Expediting changes in topology by directly transitioning the port (either an edge port or a port connected to a point-to-point link) from a blocked to a forwarding state.
- Providing a distributed model where all bridges in the network actively participate in network connectivity.
Figure 5.1    STP Network
[Figure: a root bridge with designated ports (DP) connected to two non-root bridges; each non-root bridge has a root port (RP), and its remaining port is an alternate port (ALT). Legend: RP = Root Port, DP = Designated Port, ALT = Alternate Port.]

New interface types defined in RSTP are:
- Point to Point
- Edge
- Shared or Non-edge

Point to Point
A point-to-point (P2P) interface provides a direct connection between two switches. Usually, a full-duplex interface is set automatically to be P2P.

Edge
The edge interface is another enhancement in RSTP that helps reduce convergence time when compared to STP. Ports connected to servers (there are no bridges attached) are typically defined as edge ports. Any changes that are made to the status of the edge port do not result in changes to the forwarding network topology and thus are ignored by RSTP.

Shared or Non-edge
A shared or non-edge interface is an interface that is half-duplex or has more than two bridges on the same LAN.

When compared to STP, RSTP introduces the concepts of port state, port role and interface type. The state and role of an RSTP-based port are independent. A port can send or receive BPDUs or data based on its current state. The role of a port depends on its position in the network and can be determined by performing a BPDU comparison during convergence. Table 5.1 shows the mapping between RSTP port states and roles.

Table 5.1    Mapping between RSTP Port States and Roles

RSTP State     RSTP Role
Forwarding     Root
Forwarding     Designated
Discard        Alternate
Discard        Backup
Discard        Disabled

The Alternate role in RSTP is analogous to the Blocked port in STP. Defining an edge port allows a port to transition into a forwarding state, eliminating the 30-second delay that occurs with STP.
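On Juniper platforms, enabling RSTP and marking server-facing ports as edge ports requires only a few statements. The following is a minimal sketch in the same style as the configuration snippets later in this chapter (the interface name and priority are illustrative, not taken from the validated test bed):

set protocols rstp bridge-priority 8k
set protocols rstp interface ge-0/0/7.0 edge
set protocols rstp interface ge-0/0/7.0 no-root-port

The edge statement lets the port go straight to forwarding, and no-root-port (root protection) blocks the port if a superior BPDU arrives on it, which keeps an access port from ever becoming the path to the root.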

Multiple Spanning Tree Protocol


Multiple Spanning Tree Protocol (MSTP) is an enhancement to RSTP. MSTP supports the logical division of a Layer 2 network, or even a single switch into regions. A region here refers to a network, single VLAN or multiple VLANs. With MSTP, separate spanning tree groups or instances can be configured for each network, VLAN or group of VLANs. There can be Multiple Spanning Tree Instances (MSTI) for each region. MSTP can thus control the spanning tree topology within each region. On the other hand, Common Instance Spanning Tree (CIST) is a separate instance that is common across all regions. It controls the topology between the different regions. Each MSTI has a spanning tree associated with it. RSTP based spanning tree tables are maintained per MSTI. Using CIST to distribute this information over the common instance minimizes the exchange of spanning tree related packets and thus network traffic between regions. MSTP is compatible with STP and makes use of RSTP for convergence algorithms.

Figure 5.2    MSTP Example
[Figure: three MST instances exchange BPDUs internally to each instance: MSTI-A (VLAN 501), MSTI-B (VLANs 990 and 991) and MSTI-C (VLANs 100, 200 and 300); the CIST carries BPDUs between the instances.]

Figure 5.2 shows the three MSTIs: A, B, and C. Each of these instances consists of one or more VLANs. BPDUs specific to a particular instance are exchanged within that MSTI. The CIST handles all BPDU information that is required to maintain the topology across the regions; it is the instance that is common to all regions. With MSTP, bridge priorities and related configuration can be applied on a per-instance basis. Thus, the root bridge of one instance is not necessarily the root bridge of a different instance.
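A minimal sketch of how such instances are defined on a Junos switch is shown below (the region name, VLAN numbers and priorities are illustrative only; the validated configurations used in this chapter appear later in the Configuration Snippets sections):

set protocols mstp configuration-name REGION-1
set protocols mstp revision-level 1
set protocols mstp msti 1 vlan 100
set protocols mstp msti 1 bridge-priority 8k
set protocols mstp msti 2 vlan 200
set protocols mstp msti 2 bridge-priority 16k

Switches must agree on the configuration name, revision level and VLAN-to-MSTI mapping to be treated as part of the same MSTP region.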

VLAN Spanning Tree Protocol


In the case of VLAN Spanning Tree Protocol (VSTP), each VLAN has a spanning tree associated with it. The problem with this approach is mainly one of scalability: the processing resources consumed increase proportionally with the number of VLANs. When configuring VSTP, the bridge priorities and the rest of the spanning tree configuration can be applied on a per-VLAN basis.

NOTE    When configuring VSTP, please pay close attention to the following:
- When using virtual switches, VSTP cannot be configured on virtual switch bridge domains that contain ports with either VLAN ranges or mappings.
- VSTP can only be enabled for a VLAN ID that is associated with a bridge domain or VPLS routing instance. All logical interfaces assigned to the VLAN must have the same VLAN ID.
- VSTP is compatible with the Cisco PVST implementation.

Table 5.2 lists the support existing on different platforms for the different spanning tree protocols.

Table 5.2    Overview of Spanning Tree Protocols and Platforms

Protocol                               IBM BladeCenter (Cisco ESM)                                                EX4200    EX8200    MX Series
STP                                    Configuration not supported; works with MSTP/PVST (backwards compatible)  STP       STP       Configuration not supported; works with RSTP (backwards compatible)
RSTP                                   Configuration not supported; works with MSTP/PVST (backwards compatible)  RSTP      RSTP      RSTP
MSTP                                   MSTP                                                                       MSTP      MSTP      MSTP
PVST+ (Cisco) / VSTP (Juniper)         PVST+                                                                      VSTP      VSTP      VSTP
Rapid-PVST+ (Cisco) / VSTP (Juniper)   Rapid-PVST+                                                                VSTP      VSTP      VSTP

Configuring RSTP/MSTP
Figure 5.3 shows a sample MSTP network that can be used to configure and verify RSTP/MSTP functionality. The devices in this network connect in a full mesh. The switches and the IBM BladeCenter connect in a mesh and are assigned these priorities:
- EX4200-A: 0K (the lowest bridge priority number)
- MX480: 8K
- EX8200: 16K
- IBM BladeCenter (Cisco ESM): 16K
- EX4200-B: 32K

We configure EX4200-A as the root bridge. Two MSTP instances, MSTI-1 and MSTI-2, correspond to VLANs 1122 and 71, respectively. Either one, or both, of these VLANs are configured on the links between the switches in this spanning tree network. Table 5.3 shows the association between the links, VLANs and MSTI instances.

Table 5.3    Association between Links, VLANs and MSTI Instances

Links between Switches            VLAN ID      MSTI Instance
EX4200-B to EX8200                1122         MSTI-1
EX4200-B to IBM BladeCenter       1122         MSTI-1
EX4200-B to MX480                 1122, 71     MSTI-1, MSTI-2
EX4200-A to IBM BladeCenter       71           MSTI-2
EX4200-A to MX480                 1122, 71     MSTI-1, MSTI-2
EX4200-A to EX8200                1122         MSTI-1
MX480 to IBM BladeCenter          1122, 71     MSTI-1, MSTI-2
Figure 5.3    Spanning Tree MSTP/RSTP
[Figure: full-mesh topology of EX4200-A (172.28.113.175, priority 0K), EX4200-B (172.28.113.180, priority 32K), EX8200 (priority 16K), MX480 (priority 8K) and the IBM BladeCenter (Cisco ESM, priority 16K), carrying VLANs 71 and 1122 on the inter-switch links. Server connections are simulated to each DUT via the eth 1 interface on blades 6 through 10 (11.22.1.6 to 11.22.1.10) through the BladeCenter Pass-Through Module; the internal eth 0 interface of each blade connects via trunk ports 17 through 20.]

Another instance, MSTI-0 (which constitutes the CIST), is created by default to exchange the overall spanning tree information for all MSTIs between the switches. The blade servers connect to each of the switches as hosts/servers. The switch ports on the different switches that connect to these BladeCenter servers are defined as edge ports and are assigned IP addresses. The selection of the root bridge is controlled by explicit configuration; for example, a bridge can be prevented from being elected as a root bridge by enabling root protection.


Configuration Snippets
The following code pertains to the EX4200-A (RSTP/MSTP):
// Enable RSTP by assigning bridge priorities. // Set priorities on interfaces to calculate the least cost path. // Enable root protection so that the interface is blocked for the RSTP instance that receives superior BPDUs. Also, define the port to be an edge port. rstp { bridge-priority 4k; interface ge-0/0/0.0 { priority 240; } interface ge-0/0/7.0 { priority 240; edge; no-root-port; } interface ge-0/0/9.0 { priority 240; edge; no-root-port; } interface ge-0/0/20.0 { priority 240; } interface ge-0/0/21.0 { priority 240; } } chandra@EX-175-CSR# show protocols mstp configuration-name MSTP; bridge-priority 8k; interface ge-0/0/0.0 { priority 240; } // Enable RSTP by assigning bridge priorities // Set priorities on interfaces. // Enable root protection so that the interface is blocked when it receives BPDUs. An operator can configure a bridge not to be elected as a root bridge by enabling root protection. Root protection increases user control over the placement of the root bridge in the network. Also, define the port to be an edge port. // Define MST-1, provide a bridge priority for the instance. Associate a VLAN with the instance. // Define MSTI-2, provide a bridge priority for the instance. Associate a VLAN and interface with the instance. interface ge-0/0/7.0 { priority 240; edge; no-root-port; } interface ge-0/0/9.0 { priority 240; edge; no-root-port; } interface ge-0/0/20.0 {


priority 224; } interface ge-0/0/21.0 { priority 192; } interface ge-0/0/23.0 { priority 224; } msti 1 { bridge-priority 8k; vlan 1122; } msti 2 { bridge-priority 8k; vlan 71; interface ge-0/0/23.0 { priority 224; } }

The following code snippet pertains to the EX4200-B:


// Enable RSTP by assigning bridge priorities // Set priorities on interfaces. // Enable root protection so that the interface is blocked when it receives BPDUs. Also, define the port to be an edge port. rstp { bridge-priority 4k; interface ge-0/0/10.0 { priority 240; } interface ge-0/0/12.0 { priority 240; } interface ge-0/0/14.0 { priority 240; } interface ge-0/0/15.0 { priority 240; edge; no-root-port; } } // Assign bridge priorities. // Set priorities on interfaces. // Enable root protection so that the interface is blocked when it receives BPDUs. Also, define the port to be an edge port. chandra@SPLAB-EX-180> show configuration protocols mstp configuration-name MSTP; bridge-priority 0; interface ge-0/0/10.0 { priority 240; } interface ge-0/0/11.0 { priority 240; } interface ge-0/0/12.0 { priority 224; }


interface ge-0/0/13.0 { priority 224; } interface ge-0/0/14.0 { priority 192; } interface ge-0/0/15.0 { priority 240; edge; no-root-port; } // Define MSTI-1, provide a bridge priority for the instance. Associate a VLAN with the instance. msti 1 { bridge-priority 0; vlan 1122; } // Define MSTI-2, provide a bridge priority for the instance. Associate a VLAN and interface with the instance. msti 2 { bridge-priority 0; vlan 71; interface ge-0/0/13.0 { priority 224; } }

The following code snippet pertains to the MX480:


rstp { bridge-priority 40k; interface ge-5/1/1 { priority 240; } interface ge-5/1/2 { priority 240; } interface ge-5/2/2 { priority 240; } interface ge-5/3/3 { priority 240; } interface ge-5/3/4 { priority 240; edge; no-root-port; }

} chandra@HE-RE-0-MX480# show protocols mstp bridge-priority 8k; interface ge-5/1/1 { priority 224; } interface ge-5/1/2 { priority 192; } interface ge-5/2/2 { priority 192; }


interface ge-5/3/3 { priority 224; } interface ge-5/3/4 { priority 240; edge; no-root-port; } msti 1 { bridge-priority 4k; vlan 1122; } msti 2 { bridge-priority 4k; vlan 71; interface ge-5/1/1 { priority 224; } }

Verification
Based on the sample network, administrators can verify the RSTP/MSTP configuration by issuing show commands to confirm that two MSTI instances and the MSTI-0 common instance are present on each switch. The following CLI sample shows these three MSTI instances and the VLANs associated with each of them:
chandra@SPLAB-EX-180> show spanning-tree mstp configuration MSTP information Context identifier : 0 Region name : MSTP Revision : 0 Configuration digest : 0xeef3ba72b1e4404425b44520425d3d9e MSTI Member VLANs 0 0-70,72-1121,1123-4094 1 1122 2 71

Each of these instances should have a RP (ROOT), BP (ALT) and DP (DESG) of its own:
chandra@SPLAB-EX-180> show spanning-tree interface Spanning tree interface parameters for instance 0 Interface Port ID Designated Designated Port State Role port ID bridgeID Cost ge-0/0/10.0 240:523 240:513 0.0019e2544040 20000 FWD ROOT ge-0/0/11.0 240:524 240:524 32768.0019e2544ec0 20000 FWD DESG ge-0/0/12.0 224:525 224:525 32768.0019e2544ec0 20000 FWD DESG ge-0/0/13.0 224:526 224:526 32768.0019e2544ec0 20000 FWD DESG ge-0/0/14.0 192:527 192:213 8192.001db5a167d1 20000 BLK ALT ge-0/0/15.0 240:528 240:528 32768.0019e2544ec0 20000 FWD DESG ge-0/0/36.0 128:549 128:549 32768.0019e2544ec0 20000 FWD DESG ge-0/0/46.0 128:559 128:559 32768.0019e2544ec0 20000 FWD DESG Spanning tree interface parameters for instance 1 Interface Port ID Designated Designated Port State Role port ID bridge ID Cost ge-0/0/10.0 128:523 128:513 1.0019e2544040 20000 FWD ROOT


ge-0/0/12.0 128:525 128:525 32769.0019e2544ec0 20000 FWD DESG ge-0/0/14.0 128:527 192:213 4097.001db5a167d1 20000 BLK ALT ge-0/0/15.0 128:528 128:528 32769.0019e2544ec0 20000 FWD DESG Spanning tree interface parameters for instance 2 Interface Port ID Designated Designated Port State Role port ID bridge ID Cost ge-0/0/10.0 128:523 128:513 2.0019e2544040 20000 FWD ROOT ge-0/0/13.0 224:526 224:526 16386.0019e2544ec0 20000 FWD DESG ge-0/0/14.0 128:527 192:213 4098.001db5a167d1 20000 BLK ALT

The following CLI output shows the MSTI-0 information on the Root Bridge. All ports are in the forwarding state.
chandra@EX-175-CSR> show spanning-tree interface Spanning tree interface parameters for instance 0 Interface Port ID Designated Designated Port State Role port ID bridge ID Cost ge-0/0/0.0 240:513 240:513 12288.0019e2544040 20000 FWD DESG ge-0/0/7.0 240:520 240:520 12288.0019e2544040 20000 FWD DESG ge-0/0/9.0 240:522 240:522 12288.0019e2544040 20000 FWD DESG ge-0/0/20.0 240:533 240:533 12288.0019e2544040 20000 FWD DESG ge-0/0/21.0 240:534 240:534 12288.0019e2544040 20000 FWD DESG ge-0/0/24.0 128:537 128:537 12288.0019e2544040 20000 FWD DESG ge-0/0/25.0 128:538 128:538 12288.0019e2544040 20000 FWD DESG

1. Check that only the information from instance MSTI-0 (but not MSTI-1 and MSTI-2) is available on all switches.
2. Confirm that there is only one direct path to any other interface within each MSTI instance on a switch. All other redundant paths should be designated as Blocked. Use the show spanning-tree interface command for this purpose.
3. Verify that a change in priority on any MSTI instance on a switch is propagated through the entire mesh using the show spanning-tree interface command.

Configuring VSTP/PVST+/Rapid-PVST+
Figure 5.4 depicts a sample network consisting of a mesh of EX8200/4200 and MX480 devices with the Cisco ESM switch. PVST+ and VSTP must be enabled on the Cisco and Juniper devices, respectively, for interoperability. Two VLANs, 1122 and 71, are created on all devices; VSTP is enabled for both of these VLANs.

Figure 5.4    Spanning Tree VSTP/(PVST+, Rapid-PVST+)
[Figure: the same full mesh of MX480, EX8200, EX4200-A (172.28.113.175) and EX4200-B with the IBM BladeCenter (Cisco ESM), carrying VLANs 71 and 1122 on the inter-switch links; per-VLAN bridge priorities are labeled bc_ext and bc_int on each device. Server connections are simulated to each DUT via the eth 1 interface on blade slots 6 through 10 (11.22.1.6 to 11.22.1.10) through the BladeCenter's Pass-Through Module; the internal eth 0 interface of each blade connects via trunk ports 17 through 19.]

Table 5.4 lists the bridge priorities for each of the VLANs.

Table 5.4    VSTP Bridge Priorities

VLAN ID    Device      Bridge Priority
71         EX4200-A    8K
71         EX4200-B    4K
71         EX8200      12K
71         MX480       16K
1122       EX4200-A    16K
1122       EX4200-B    32K
1122       EX8200      24K
1122       MX480       16K

Verification
Based on the sample setup shown in Figure 5.4, verify interoperability of the VSTP configuration with Cisco PVST+ by performing the following steps.
1. Verify that each of the switches with VSTP/PVST+ enabled has two spanning trees corresponding to the two VLANs. Each VLAN has its own RP (ROOT), BP (ALT) and DP (DESG). Use the show spanning-tree interface command:
chandra@SPLAB-EX-180> show spanning-tree interface Spanning tree interface parameters for VLAN 1122 Interface Port ID Designated Designated Port State Role port ID bridge ID Cost ge-0/0/10.0 128:523 128:513 17506.0019e2544040 20000 FWD ROOT ge-0/0/12.0 224:525 224:525 33890.0019e2544ec0 20000 FWD DESG ge-0/0/14.0 240:527 240:213 17506.001db5a167d0 20000 BLK ALT ge-0/0/15.0 240:528 240:528 33890.0019e2544ec0 20000 FWD DESG Spanning tree interface parameters for VLAN 71 Interface Port ID Designated Designated Port State Role port ID bridge ID Cost ge-0/0/10.0 128:523 128:523 4167.0019e2544ec0 20000 FWD DESG ge-0/0/13.0 224:526 224:526 4167.0019e2544ec0 20000 FWD DESG ge-0/0/14.0 240:527 240:527 4167.0019e2544ec0 20000 FWD DESG

2. Confirm that there is only one direct active path per VLAN instance to any other non-root bridge. All redundant paths should be identified as Blocked. Use the output of the show spanning-tree interface command for this purpose. Rebooting the root bridge should cause the device with the next lowest priority to step up as the root for that particular VLAN, and this information must be updated in the VLAN table on all devices.
3. Verify that the original root bridge becomes the primary (active) root again after the reboot. This information should be updated on all devices in the mesh.

NOTE    Any change in bridge priorities on any of the VSTP VLANs must be propagated through the mesh.


Configuration Snippets
The following code pertains to the EX4200-A.
chandra@EX-175-CSR> show configuration protocols vstp // Define a VLAN bc-external, assign bridge and interface priorities. // Enable root protection so that the interface is blocked when it receives BPDUs. Also, define the port to be an edge port. vlan bc-external { bridge-priority 16k; interface ge-0/0/7.0 { priority 240; edge; no-root-port; } interface ge-0/0/20.0 { priority 224; } interface ge-0/0/21.0 { priority 240; } } // Define a VLAN bc-internal, assign bridge and interface priorities. // Enable root protection so that interface is blocked when it receives BPDUs. Also, define the port to be an edge port. vlan bc-internal { bridge-priority 8k; interface ge-0/0/9.0 { priority 240; edge; no-root-port; } interface ge-0/0/21.0 { priority 240; } interface ge-0/0/23.0 { priority 224; } }

The following code pertains to the MX480.


// Define VLAN71, assign bridge and interface priorities. // Define VLAN1122, assign bridge and interface priorities. chandra@HE-RE-0-MX480> show configuration protocols vstp vlan 71 { bridge-priority 16k; interface ge-5/1/1 { priority 240; } interface ge-5/1/2 { priority 240; } interface ge-5/2/2 { priority 240; } interface ge-5/3/3 { priority 240; } }


vlan 1122 { bridge-priority 16k; interface ge-5/1/1 { priority 240; } interface ge-5/1/2 { priority 240; } interface ge-5/2/2 { priority 240; } interface ge-5/3/3 { priority 240; } interface ge-5/3/4 { priority 240; } }


Chapter 6
Supporting Multicast Traffic

Internet Group Management Protocol Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Configuring Protocol Independent Multicast . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
IGMP Snooping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Configuring IGMP Snooping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

IPv4 SENDS IP DATAGRAMS to a single destination or to a group of interested receivers by using three fundamental types of addresses:
- Unicast: sends a packet to a single destination.
- Broadcast: sends a datagram to an entire subnetwork.
- Multicast: sends a datagram to a set of hosts that can be on different sub-networks and can be configured as members of a multicast group.


Internet Group Management Protocol Overview


A multicast datagram is delivered to destination group members with the same best-effort reliability as a standard unicast IP datagram. This means that multicast datagrams are not guaranteed to reach all members of a group or to arrive in the same order in which they were transmitted. The only difference between a multicast IP packet and a unicast IP packet is the presence of a group address in the IP header destination address field.

NOTE    According to RFC 3171, IP addresses 224.0.0.0 through 239.255.255.255 are designated as multicast addresses in IPv4.

Individual hosts can join or leave a multicast group at any time. There are no restrictions on the physical location or on the number of members in a multicast group. A host can be a member of more than one multicast group at any time and does not have to belong to a group to send packets to members of a group. Routers use a group membership protocol to learn about the presence of group members on directly attached subnetworks. When a host joins a multicast group, it transmits a group membership protocol message to the group, and sets its IP process and network interface card to receive frames addressed to the multicast group.

Junos software supports IP multicast routing with many protocols, such as:
- Internet Group Management Protocol (IGMP), versions 1, 2 and 3.
- Multicast Listener Discovery (MLD), versions 1 and 2.
- Distance Vector Multicast Routing Protocol (DVMRP).
- Protocol Independent Multicast (PIM).
- Multicast Source Discovery Protocol (MSDP).
- Session Announcement Protocol (SAP) and Session Description Protocol (SDP).

MORE    For details concerning the IP multicast feature and how to configure it using Junos OS v10.0, please refer to the IP Multicast Operational Mode Commands Guide at https://www.kr.juniper.net/techpubs/en_US/junos10.0/information-products/topic-collections/swcmdref-protocols/chap-ip-multicast-op-mode-cmds.html#chap-ip-multicast-op-mode-cmds.

Implementing an IP multicast network requires a number of building blocks. Figure 6.1 shows a typical end-to-end video streaming service with IP multicasting. Both the client computer and adjacent network switches use IGMP to connect the client to a local multicast router. Between the local and remote multicast routers, we used Protocol Independent Multicast (PIM) to direct multicast traffic from the video server to many multicast clients. The Internet Group Management Protocol (IGMP) manages the membership of hosts and routers in multicast groups. IP hosts use IGMP to report their multicast group memberships to any adjacent multicast routers. For each of their attached physical networks, multicast routers use IGMP to learn which groups have members.
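Before looking at the individual protocols, it may help to see how little configuration is needed to turn on IGMP itself. The following is a minimal sketch (the interface name is an example, and the commands are not taken from this book's validated test bed): IGMP is enabled on a downstream interface, and two operational commands then list the IGMP-enabled interfaces and the groups that have been joined.

user@host# set protocols igmp interface ge-0/0/44.0 version 2
user@host> show igmp interface
user@host> show igmp group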

Figure 6.1    IP Multicasting Network Deployment
[Figure: a video server feeds a multicast router, which uses PIM across the network toward a local multicast router; a Layer 2 switch with IGMP snooping connects the video client and a laptop on the LAN, which use IGMP to join groups and receive the UDP/RTP multicast traffic.]

IGMP manages the membership of hosts and routers in multicast groups. IP hosts use IGMP to report their multicast group memberships to any neighboring multicast routers. In addition, IGMP is used as the transport for several related multicast protocols, such as DVMRP and PIMv1. IGMP has three versions that are supported by hosts and routers:
- IGMPv1: the original protocol defined in RFC 1112. An explicit join message is sent to the router, but a timeout is used to determine when hosts leave a group.
- IGMPv2: defined in RFC 2236. Among other features, IGMPv2 adds an explicit leave message to the join message so that routers can easily determine when a group has no listeners.
- IGMPv3: defined in RFC 3376. IGMPv3 supports the ability to specify which sources can send to a multicast group. This type of multicast group is called a source-specific multicast (SSM) group, and its multicast address range is 232/8. IGMPv3 is also backwards compatible with IGMPv1 and IGMPv2.

For SSM mode, we can configure the multicast source address so that the source can send the traffic to the multicast group. In this example, we create group 225.1.1.1 and accept IP address 10.0.0.2 as the only source.
user@host# set protocols igmp interface fe-0/1/2 static group 225.1.1.1 source 10.0.0.2
user@host# set protocols igmp interface fe-0/1/2 static group 225.1.1.1 source 10.0.0.2 source-count 3
user@host# set protocols igmp interface fe-0/1/2 static group 225.1.1.1 source 10.0.0.2 source-count 3 source-increment 0.0.0.2
user@host# set protocols igmp interface fe-0/1/2 static group 225.1.1.1 exclude source 10.0.0.2

NOTE

The SSM configuration requires that the IGMP version on the interface be set to IGMPv3.


IGMP Static Group Membership


We can create IGMP static group membership for multicast forwarding without a receiver host. The following are some of the examples with various options used while creating static groups:
user@host# set protocols igmp interface fe-0/1/2 static group 225.1.1.1
user@host# set protocols igmp interface fe-0/1/2 static group 225.1.1.1 group-count 3
user@host# set protocols igmp interface fe-0/1/2 static group 225.1.1.1 group-count 3 group-increment 0.0.0.2

When we enable IGMP static group membership, data is forwarded to an interface without that interface receiving membership reports from downstream hosts.

NOTE    When we configure static IGMP group entries on point-to-point links that connect routers to a rendezvous point (RP), the static IGMP group entries do not generate join messages toward the RP.

Various Multicast Routing Protocols


Multicast routing protocols enable a collection of multicast routers to build (join) distribution trees when a host on a directly attached subnet, typically a LAN, wants to receive traffic from a certain multicast group. There are five multicast routing protocols that can be used to achieve this: DVMRP, Multicast Open Shortest Path First (MOSPF), CBT (Core Based Tree), PIM sparse mode, and PIM dense mode. Table 6.1 summarizes the differences among the five multicast routing protocols.

Table 6.1    Multicast Routing Protocols Summary

Multicast Routing Protocol    Dense Mode    Sparse Mode    Implicit Join    Explicit Join    (S,G) SBT    (*,G) Shared Tree
DVMRP                         Yes           No             Yes              No               Yes          No
MOSPF                         Yes           No             No               Yes              Yes          No
PIM dense mode                Yes           No             Yes              No               Yes          No
PIM sparse mode               No            Yes            No               Yes              Yes          Yes
CBT                           No            Yes            No               Yes              No           Yes

Because PIM Sparse Mode and PIM Dense Mode are the most widely deployed techniques, they were used in this reference design.

Protocol Independent Multicast


The predominant multicast routing protocol used on the Internet today is Protocol Independent Multicast (PIM). PIM has two versions, v1 and v2. The main difference between PIMv1 and PIMv2 is the packet format. PIMv1 messages use Internet Group Management Protocol (IGMP) packets, whereas PIMv2 has its own IP protocol number (103) and packet structure. In addition, it is important to select the appropriate mode. Although PIM provides four modes: sparse mode, dense mode, sparse-dense mode, and source-specific mode, users typically use one of two basic modes: sparse mode or dense mode.


PIM dense mode requires only a multicast source and a series of multicast-enabled routers running PIM dense mode to allow receivers to obtain multicast content. Dense mode ensures that the traffic reaches its prescribed destination by periodically flooding the network with multicast traffic, and it relies on prune messages to ensure that subnets where all receivers are uninterested in a particular multicast group stop receiving its packets.

PIM sparse mode requires establishing special routers called rendezvous points (RPs) in the network core. The RP is the point where upstream join messages from interested receivers meet downstream traffic from the source of the multicast group content. A network can have many RPs, but PIM sparse mode allows only one RP to be active for any multicast group.

A multicast router typically has two kinds of IGMP interfaces: upstream IGMP interfaces and downstream IGMP interfaces. We must configure PIM on the upstream IGMP interfaces to enable multicast routing and to perform reverse path forwarding for multicast data packets, which populates the multicast forwarding table for the upstream interfaces. In the case of PIM sparse mode, to distribute IGMP group memberships into the multicast routing domain, only one pseudo PIM interface is required to represent all IGMP downstream (IGMP-only) interfaces on the router. Therefore, PIM is generally not required on all IGMP downstream interfaces, which reduces the amount of router resources, such as memory, that are consumed.

IGMP and Nonstop Active Routing

NSR configurations include passive support with IGMP in association with PIM. The primary Routing Engine uses IGMP to determine its PIM multicast state, and this IGMP-derived information is replicated on the backup Routing Engine. IGMP on the new primary Routing Engine (after failover) relearns the state information quickly through the IGMP operation. In the interim, the new primary Routing Engine retains the IGMP-derived PIM state as received by the replication process from the original primary Routing Engine. This state information times out unless refreshed by IGMP on the new primary Routing Engine. Additional IGMP configuration is not required.

Filtering Unwanted IGMP Reports at the IGMP Interface Level

The group-policy statement enables the router to filter unwanted IGMP reports at the interface level. When this statement is enabled on a router running IGMP version 2 (IGMPv2) or version 3 (IGMPv3), after the router receives an IGMP report, the router compares the group against the specified group policy and performs the action configured in that policy. For example, the router rejects the report if the policy doesn't match the defined address or network. To enable IGMP report filtering for an interface, include the group-policy statement as follows:
protocols {
    igmp {
        interface ge-1/1/1.0 {
            group-policy reject_policy;
        }
    }
}


policy-options {
    // IGMPv2 policy
    policy-statement reject_policy {
        from {
            router-filter 192.1.1.1/32 exact;
        }
        then reject;
    }
    // IGMPv3 policy
    policy-statement reject_policy {
        from {
            router-filter 192.1.1.1/32 exact;
            source-address-filter 10.1.0.0/16 orlonger;
        }
        then reject;
    }
}

IGMP Configuration Command Hierarchy

To configure the Internet Group Management Protocol (IGMP), include the following igmp statement:
igmp {
    accounting;                                  // Accounting purposes
    interface interface-name {
        disable;
        (accounting | no-accounting);            // Individual interface-specific accounting
        group-policy [ policy-names ];
        immediate-leave;                         // See Note 1 at the end of this code snippet.
        oif-map map-name;
        promiscuous-mode;                        // See Note 2 at the end of this code snippet.
        ssm-map ssm-map-name;
        static {
            group multicast-group-address {
                exclude;
                group-count number;
                group-increment increment;
                source ip-address {
                    source-count number;
                    source-increment increment;
                }
            }
        }
        version version;                         // See Note 3 at the end of this code snippet.
    }
    query-interval seconds;
    query-last-member-interval seconds;          // Default 1 second
    query-response-interval seconds;             // Default 10 seconds
    robust-count number;                         // See Note 4 at the end of this code snippet.
    traceoptions {                               // Tracing purposes
        file filename <files number> <size size> <world-readable | no-world-readable>;
        flag flag <flag-modifier> <disable>;     // Flag can be: leave (IGMPv2 only) | mtrace | packets | query | report
    }
}


NOTE 1

Use this statement only on IGMP version 2 (IGMPv2) interfaces to which one IGMP host is connected. If more than one IGMP host is connected to a LAN through the same interface, and one host sends a leave group message, the router removes all hosts on the interface from the multicast group. The router loses contact with the hosts that must remain in the multicast group until they send join requests in response to the routers next general group membership query.

NOTE 2    By default, IGMP interfaces accept IGMP messages only from the same subnetwork. The promiscuous-mode statement enables the router to accept IGMP messages from different sub-networks.

NOTE 3    By default, the router runs IGMPv2. If a source address is specified in a multicast group that is configured statically, the IGMP version must be set to IGMPv3. Otherwise, the source will be ignored and only the group will be added; the join will be treated as an IGMPv2 group join. When we reconfigure the router from IGMPv1 to IGMPv2, the router will continue to use IGMPv1 for up to 6 minutes and will then use IGMPv2.

NOTE 4    The robustness variable provides fine-tuning to allow for expected packet loss on a subnetwork. The value of the robustness variable is used in calculating the following IGMP message intervals:
- Group member interval = (robustness variable x query-interval) + (1 x query-response-interval)
- Other querier present interval = (robustness variable x query-interval) + (0.5 x query-response-interval)
- Last-member query count = robustness variable
By default, the robustness variable is set to 2. Increase this value if you expect a subnetwork to lose packets.
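As a worked example of how these intervals interact with the robustness variable: assuming the standard IGMP defaults of a 125-second query interval and a 10-second query response interval, the default robust-count of 2 gives a group member interval of (2 x 125) + (1 x 10) = 260 seconds, while raising the robust-count to 3 on a lossy subnetwork extends it to (3 x 125) + (1 x 10) = 385 seconds. The statement itself is a single line at the [edit protocols igmp] level (the value 3 here is illustrative):

set protocols igmp robust-count 3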

Configuring Protocol Independent Multicast


This section focuses on configuring PIM sparse mode on MX Series Ethernet routers and EX Series Ethernet switches with various routing protocols, based on the following scenarios:
- Scenario 1: configure PIM on the MX480 and EX4200 with OSPF.
- Scenario 2: configure PIM on the EX8200 and EX4200 with RIP.
In each scenario, we used an IGMP server as the source of the multicast streams and used the VideoLAN (VLC) media player as the IGMP client, which makes a request to join the multicast group.

Scenario 1: Configuring PIM on the MX480 and EX4200 with OSPF


As illustrated in Figure 6.2, the MX480 and EX4200 are the multicast routers, which interoperate with OSPF routing protocol. PIM is configured on both routers and is configured only on upstream interfaces to enable multicast routing. The multicast client runs on the IBM Blade Server, which connects to the access switch, for example the EX4200.

Figure 6.2    Configuring PIM on MX480 and EX4200 with OSPF
[Figure: the IGMP streaming source connects to the MX480 (lo0.0 8.8.8.8) on ge-5/2/6; the MX480 and the EX4200 (lo0.0 6.6.6.6) run PIM with OSPF between ge-5/2/5 and ge-0/0/44; the IGMP multicast client connects to the EX4200 on ge-0/0/2 in VLAN 1119 through the BNT pass-through module (eth1).]

Configuring the MX480


chandra@HE-RE-1-MX480# show ge-5/2/5 unit 0 { family inet { address 22.11.5.5/24; } } {master}[edit interfaces] chandra@HE-RE-1-MX480# show lo0 unit 0 { family inet { address 8.8.8.8/32; } } chandra@HE-RE-1-MX480# show protocols igmp interface all { promiscuous-mode; } interface ge-5/2/6.0 { static { group 239.168.1.1 { group-count 10; source 10.10.10.254; } }


} interface ge-5/2/5.0 { static { group 239.168.1.4; } } {master}[edit] chandra@HE-RE-1-MX480# show protocols pim rp { local { address 8.8.8.8; } } interface all { mode sparse; } interface fxp0.0 { disable; } chandra@HE-RE-1-MX480# show protocols ospf area 0.0.0.0 { interface ge-5/2/5.0; interface lo0.0 { passive; } interface fxp0.0 { disable; } } chandra@HE-RE-1-MX480# show routing-options router-id 8.8.8.8;

Configuring the EX4200


chandra@EX-175-CSR# show interfaces ge-0/0/2 unit 0 { family ethernet-switching; } chandra@EX-175-CSR# show interfaces ge-0/0/44 unit 0 { family inet { address 22.11.5.44/24; } } chandra@EX-175-CSR# show interfaces vlan unit 1119 { family inet { address 10.10.9.100/24; } } chandra@EX-175-CSR# show protocols igmp interface me0.0 { disable; } interface vlan.1119 { immediate-leave; } interface ge-0/0/6.1119; interface all; chandra@EX-175-CSR# show protocols pim


rp { static { address 8.8.8.8; } } interface vlan.1119; interface me0.0 { disable; } interface all { mode sparse; } chandra@EX-175-CSR# show interfaces lo0 unit 0 { family inet { address 6.6.6.6/32; } } chandra@EX-175-CSR# show protocols ospf area 0.0.0.0 { interface ge-0/0/44.0; interface lo0.0 { passive; } interface me0.0 { disable; } } chandra@EX-175-CSR# show routing-options router-id 6.6.6.6;

Validating the MX480 Configuration


chandra@HE-RE-1-MX480> show route |grep PIM 224.0.0.2/32 *[PIM/0] 06:21:14 MultiRecv 224.0.0.13/32 *[PIM/0] 06:21:14 MultiRecv 239.168.1.1,10.10.10.254/32*[PIM/105] 01:28:54 Multicast (IPv4) 239.168.1.2,10.10.10.254/32*[PIM/105] 01:23:33 Multicast (IPv4) 239.168.1.3,10.10.10.254/32*[PIM/105] 01:23:33 Multicast (IPv4) 239.168.1.4,10.10.10.254/32*[PIM/105] 01:23:33 Multicast (IPv4) chandra@HE-RE-1-MX480> show pim neighbors Instance: PIM.master B = Bidirectional Capable, G = Generation Identifier, H = Hello Option Holdtime, L = Hello Option LAN Prune Delay, P = Hello Option DR Priority Interface IP V Mode Option Uptime Neighbor addr ge-5/2/5.0 4 2 HPLG 01:14:14 22.11.5.44 chandra@HE-RE-1-MX480> show pim join Instance: PIM.master Family: INET R = Rendezvous Point Tree, S = Sparse, W = Wildcard Group: 239.168.1.1


Source: * RP: 8.8.8.8 Flags: sparse,rptree,wildcard Upstream interface: Local Group: 239.168.1.1 Source: 10.10.10.254 Flags: sparse,spt Upstream interface: ge-5/2/6.0 Group: 239.168.1.2 Source: * RP: 8.8.8.8 Flags: sparse,rptree,wildcard Upstream interface: Local Group: 239.168.1.2 Source: 10.10.10.254 Flags: sparse,spt Upstream interface: ge-5/2/6.0

chandra@HE-RE-1-MX480> show pim source Instance: PIM.master Family: INET Source 8.8.8.8 Prefix 8.8.8.8/32 Upstream interface Local Upstream neighbor Local Source 10.10.10.254 Prefix 10.10.10.0/24 Upstream interface ge-5/2/6.0 Upstream neighbor 10.10.10.2 Source 10.10.10.254 Prefix 10.10.10.0/24 Upstream interface ge-5/2/6.0 Upstream neighbor Direct

Validating the EX4200 Configuration


chandra@EX-175-CSR# run show pim join
Instance: PIM.master Family: INET
R = Rendezvous Point Tree, S = Sparse, W = Wildcard
Group: 239.168.1.1
    Source: *
    RP: 8.8.8.8
    Flags: sparse,rptree,wildcard
    Upstream interface: ge-0/0/44.0
Group: 239.168.1.1
    Source: 10.10.10.254
    Flags: sparse,spt
    Upstream interface: ge-0/0/44.0

chandra@EX-175-CSR# run show pim neighbors
Instance: PIM.master
B = Bidirectional Capable, G = Generation Identifier,


H = Hello Option Holdtime, L = Hello Option LAN Prune Delay,
P = Hello Option DR Priority
Interface     IP  V  Mode  Option   Uptime    Neighbor addr
ge-0/0/44.0    4  2        HPLG     01:06:07  22.11.5.5

chandra@EX-175-CSR# run show pim source
Instance: PIM.master Family: INET
Source 8.8.8.8
    Prefix 8.8.8.8/32
    Upstream interface ge-0/0/44.0
    Upstream neighbor 22.11.5.5
Source 10.10.10.254
    Prefix 10.10.10.0/24
    Upstream interface ge-0/0/44.0
    Upstream neighbor 22.11.5.5

Scenario 2: Configuring PIM on the EX8200 and EX4200 with RIP


As illustrated in Figure 6.3, the EX8200 and EX4200 are the multicast routers, with RIP enabled. PIM is configured on both routers, on the upstream interfaces only, to enable multicast routing. The multicast client runs on the IBM PowerVM, which connects to the EX4200 access switch.

Figure 6.3    Configuring PIM on EX8200 and EX4200 with RIP

Configuring the EX4200


chandra@EX-175-CSR# show interfaces ge-0/0/2
unit 0 {
    family ethernet-switching;
}
chandra@EX-175-CSR# show interfaces ge-0/0/17
unit 0 {
    family inet {
        address 22.11.2.17/24;
    }
}
chandra@EX-175-CSR# show interfaces vlan
unit 2211 {
    family inet {
        address 10.10.9.200/24;
    }
}
chandra@EX-175-CSR# show protocols igmp
interface me0.0 {
    disable;
}
interface vlan.2211 {



    immediate-leave;
}
interface ge-0/0/2.2211;
interface all;
chandra@EX-175-CSR# show protocols pim
rp {
    static {
        address 9.9.9.9;
    }
}
interface vlan.2211;
interface me0.0 {
    disable;
}
interface all {
    mode sparse;
}
chandra@EX-175-CSR# show interfaces lo0
unit 0 {
    family inet {
        address 6.6.6.6/32;
    }
}


chandra@EX-175-CSR# show protocols rip
send broadcast;
receive both;
group jweb-rip {
    export jweb-policy-rip-direct;
    neighbor ge-0/0/2.0;
    neighbor lo0.0;
    neighbor vlan.2211;
}
chandra@EX-175-CSR# show policy-options
policy-statement jweb-policy-rip-direct {
    term 1 {
        from {
            protocol [ direct rip ];
            interface [ ge-0/0/2.0 ge-0/0/17.0 ];
        }
        then accept;
    }
    term 2 {
        then accept;
    }
}

Configuring the EX8200


chandra@SPLAB-8200-1-re0# show protocols rip
send broadcast;
receive both;
group jweb-rip {
    export jweb-policy-rip-direct;
    neighbor ge-1/0/26.0;
    neighbor lo0.0;
}
chandra@SPLAB-8200-1-re0# show policy-options
policy-statement jweb-policy-rip-direct {
    term 1 {
        from {
            protocol [ direct rip ];
            interface [ ge-1/0/26.0 ];
        }
        then accept;
    }
    term 2 {
        then accept;
    }
}

IGMP Snooping
An access switch usually learns unicast MAC addresses by checking the source address field of the frames it receives. However, a multicast MAC address can never be the source address for a packet. As a result, the switch floods multicast traffic on the VLAN, consuming significant amounts of bandwidth.


IGMP snooping regulates multicast traffic on a VLAN to avoid flooding. When IGMP snooping is enabled, the switch intercepts IGMP packets and uses the content of the packets to build a multicast cache table. The cache table is a database of multicast groups and their corresponding member ports and is used to regulate multicast traffic on the VLAN. When the switch receives multicast packets, it uses the cache table to selectively forward the packets only to the ports that are members of the destination multicast group.

As illustrated in Figure 6.4, the access switch EX4200 connects four hosts and segments their data traffic with two VLANs: host1 and host2 belong to VLAN1, and host3 and host4 belong to VLAN2. Hosts on the same VLAN can make different choices about whether to subscribe to a multicast group. For instance, host1 has subscribed to multicast group 1, while host2 is not interested in multicast group 1 traffic; host3 has subscribed to multicast group 2, while host4 is not interested in multicast group 2 traffic. The EX4200 IGMP snooping feature accommodates this so that host1 receives multicast group 1 traffic and host2 does not, and host3 receives multicast group 2 traffic and host4 does not.

Figure 6.4    IGMP Traffic Flow with IGMP Snooping Enabled

Hosts can join multicast groups in two ways:

By sending an unsolicited IGMP join message to a multicast router that specifies the IP multicast group that the host is attempting to join.
By sending an IGMP join message in response to a general query from a multicast router.

A multicast router continues to forward multicast traffic to a VLAN if at least one host on that VLAN responds to the periodic general IGMP queries. To leave a multicast group, a host can either stop responding to the periodic general IGMP queries, which results in a silent leave, or send a group-specific IGMPv2 leave message.


IGMP Snooping in EX Series Ethernet Switches


In the EX Series Ethernet switches, IGMP snooping works with both Layer 2 interfaces and routed VLAN interfaces (RVIs) to regulate multicast traffic in a switched network. Switches use Layer 2 interfaces to send traffic to hosts that are part of the same broadcast domain and use an RVI to route traffic from one broadcast domain to another. When an EX Series switch receives a multicast packet, the Packet Forwarding Engines in the switch perform an IP multicast lookup on the packet to determine how to forward it to the local ports. From the results of the IP multicast lookup, each Packet Forwarding Engine extracts a list of Layer 3 interfaces (which can include VLAN interfaces) that have ports local to that Packet Forwarding Engine. If an RVI is part of this list, the switch provides a bridge multicast group ID for each RVI to the Packet Forwarding Engine.

Figure 6.5 shows how multicast traffic is forwarded on a multilayer switch. The multicast traffic arrives through the xe-0/1/0.0 interface. A multicast group is formed by the Layer 3 interfaces ge-0/0/2.0, vlan.0 and vlan.1. The ge-2/0/0.0 interface is a common trunk interface that belongs to both vlan.0 and vlan.1. The letter R next to an interface name in Figure 6.5 indicates that a multicast receiver host is associated with that interface.
Figure 6.5    IGMP Traffic Flow with Routed VLAN Interfaces


IGMP Snooping Configuration Command


The IGMP snooping feature is available on both the MX Series Ethernet routers and the EX Series Ethernet switches; however, the configuration hierarchy differs between the two platforms. On the EX Series switches, IGMP snooping is configured at the [edit protocols] hierarchy level in the Junos CLI, and the detailed configuration stanza is as follows:
igmp-snooping {
    vlan (vlan-id | vlan-number) {
        disable {
            interface interface-name;
        }
        immediate-leave;
        interface interface-name {
            multicast-router-interface;
            static {
                group ip-address;
            }
        }
        query-interval seconds;
        query-last-member-interval seconds;
        query-response-interval seconds;
        robust-count number;
    }
}

NOTE

By default, IGMP snooping is not enabled. Statements configured at the VLAN level apply only to that particular VLAN.

On the MX Series Ethernet routers, a Layer 2 broadcast domain is configured as a bridge domain, and IGMP snooping is configured under the protocols stanza of the [edit bridge-domains] hierarchy, as the scenarios that follow show. Additional snooping behavior is controlled with the multicast-snooping-options stanza:
multicast-snooping-options {
    flood-groups [ ip-addresses ];
    forwarding-cache {
        threshold suppress value <reuse value>;
    }
    graceful-restart <restart-duration seconds>;
    ignore-stp-topology-change;
}


Configuring IGMP Snooping


This section focuses on configuring IGMP snooping on the MX Series Ethernet routers and EX Series Ethernet switches with various IGMP client platforms in the following scenarios:

Scenario 1: MX480, EX Series and IBM Blade Center
Scenario 2: MX480 and IBM x3500 Server

In each scenario, we used an IGMP server as the source of multicast streams and used the VideoLAN (VLC) media player as the IGMP client, which requests to join the multicast group.

Scenario 1: MX480, EX Series and IBM Blade Center


As illustrated in Figure 6.6, the IGMP multicast source generates the IGMP Group 2 flow: from the MX480 to the EX8200, and then on to the IGMP client, which runs on the IBM Blade Center. Two interfaces on the MX480 (ge-5/2/3 and ge-5/2/6) are configured for Layer 2 switching by using a bridge domain associated with VLAN 1117. The ge-5/2/6 interface is configured as the multicast-router interface and connects to the multicast source; interface ge-5/2/3 is a Layer 2 interface with a static membership in multicast group 239.168.1.3. This configuration allows the interface to receive the multicast packets and forward them to their destination.
Figure 6.6    MX480, EX8200, EX4200 and IBM Blade Center IGMP Traffic Flow with IGMP Snooping


Configuring the MX480


chandra@HE-RE-1-MX480> show configuration bridge-domains 1117
domain-type bridge;
vlan-id 1117;
interface ge-5/2/3.0;
interface ge-5/2/6.0;
protocols {
    igmp-snooping {
        interface ge-5/2/3.0 {
            static {
                group 239.168.1.3;
            }
        }
        interface ge-5/2/6.0 {
            multicast-router-interface;
        }
    }
}

Configuring the EX4200


{master:0}
chandra@EX-175-CSR> show configuration protocols igmp-snooping
vlan IGMP {
    interface ge-0/0/2.0 {
        static {
            group 239.168.1.1;
        }
    }
    interface ge-0/0/17.0 {
        static {
            group 239.168.1.1;
        }
        multicast-router-interface;
    }
}
chandra@EX-175-CSR> show configuration vlans 2211
vlan-id 2211;
interface {
    ge-0/0/2.0;
    ge-0/0/17.0;
}

Configuring the EX8200


chandra@SPLAB-8200-1-re0> show configuration protocols igmp-snooping
vlan 1117 {
    interface ge-1/0/18.0 {
        static {
            group 239.168.1.3;
        }
        multicast-router-interface;
    }
    interface ge-1/0/22.0 {
        static {
            group 239.168.1.3;
        }
    }
}


Validating IGMP Snooping


laka-bay1#show ip igmp snooping group
Vlan    Group          Version   Port List
--------------------------------------------
2211    239.168.1.1    v2        Gi0/17

laka-bay1#show ip igmp snooping mrouter
Vlan    ports
--------
2211    Gi0/19(dynamic)

laka-bay1#show ip igmp snooping querier
Vlan    IP Address    IGMP Version    Port
----------------------------------------------
2211    11.22.3.24    v2              Gi0/19

chandra@HE-RE-1-MX480> show igmp snooping statistics
Bridge: bc-igmp
IGMP Message type        Received   Sent   Rx errors
. . .
Membership Query                0      9           0
V1 Membership Report            0      0           0
DVMRP                           0      0           0
. . .
Group Leave                     1      4           0
. . .
V3 Membership Report           43     56           0
. . .

chandra@HE-RE-1-MX480> show igmp snooping membership detail
Instance: default-switch
Bridge-Domain: bc-igmp
Learning-Domain: default
Interface: ge-5/2/6.0
Interface: ge-5/2/5.0
    Group: 239.168.1.2
        Group mode: Exclude
        Source: 0.0.0.0
        Last reported by: 10.10.10.1
        Group timeout: 76 Type: Dynamic

chandra@EX-175-CSR> show igmp-snooping membership
VLAN: IGMP
    239.168.1.2 *
        Interfaces: ge-0/0/2.0, ge-0/0/44.0

chandra@EX-175-CSR> show igmp-snooping membership detail
VLAN: IGMP Tag: 2211 (Index: 10)
    Router interfaces:
        ge-0/0/44.0 static Uptime: 00:31:59
    Group: 239.168.1.2
        Receiver count: 1, Flags: <V2-hosts Static>
        ge-0/0/2.0 Uptime: 00:39:34
        ge-0/0/44.0 Uptime: 00:39:34

chandra@EX-175-CSR> show igmp-snooping statistics
Bad length: 0 Bad checksum: 0 Invalid interface: 0


Not local: 0 Receive unknown: 0 Timed out: 2
IGMP Type    Received   Transmitted   Receive Errors
Queries:          156            12                0
Reports:          121           121                0
Leaves:             2             2                0
Other:              0             0                0

Scenario 2: MX480 and IBM x3500 Server


In this scenario, the IGMP group traffic flow is generated from the IGMP source and sent to the MX480; it then continues to the client, which runs on the IBM x3500 Series platform. As shown in Figure 6.7, two interfaces on the MX480 (ge-5/2/4 and ge-5/2/6) are configured for Layer 2 switching by using a bridge domain associated with VLAN 1118. The ge-5/2/6 interface, which is configured as the multicast-router interface, connects to the multicast source; interface ge-5/2/4 is a Layer 2 interface with a static membership in multicast group 239.168.1.4 and is set up to receive and forward the multicast packets to their respective servers.
Figure 6.7    MX480 and IBM x3500 IGMP Traffic Flow with IGMP Snooping

chandra@HE-RE-1-MX480> show configuration bridge-domains 1118
domain-type bridge;
vlan-id 1118;
interface ge-5/2/4.0;
interface ge-5/2/6.0;
protocols {
    igmp-snooping {
        interface ge-5/2/6.0 {
            multicast-router-interface;
        }
        interface ge-5/2/4.0 {
            static {
                group 239.168.1.4;
            }
        }
    }
}


Chapter 7
Understanding Network CoS and Latency
Class of Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
Configuring CoS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
Latency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

AN APPLICATION'S PERFORMANCE directly relies on network performance. Network performance typically refers to bandwidth because bandwidth is the primary measure of computer network speed and represents the overall capacity of a connection. Greater capacity typically generates improved performance. However, network bandwidth is not the only factor that contributes to network performance.

The performance of an application relies on different network characteristics. Some real-time applications such as voice and video are extremely sensitive to latency, jitter, and packet loss, while some non-real-time applications, such as web applications (HTTP), email, File Transfer Protocol (FTP), and Telnet, do not require any specific reliability from the network, and a best-effort policy works well in transmitting these traffic types.


In today's converged networks, including data/voice and data/storage converged networks, and in cloud-ready data centers with server virtualization, different types of applications are transmitted over the same network. To ensure application performance for all of these types, additional provisions are required within the network to minimize latency and packet loss. This chapter covers two techniques for improving data center network performance:

Using class of service (CoS) to manage packet loss.
Considering latency characteristics when designing networks using Juniper Networks data center network products.

Class of Service
Typically, when a network experiences congestion and delay, some packets are dropped. As an aid in controlling which packets are dropped, Junos CoS allows an administrator to divide traffic into classes and to offer different levels of throughput and packet loss when congestion and delay occur, so that packet loss follows rules configured on the system rather than occurring indiscriminately. In designing CoS applications, we must consider service needs, and we must thoroughly plan and design the CoS configuration to ensure consistency across all routers in a CoS domain. We must also consider all the routers and other networking equipment in the CoS domain to ensure interoperability among different types of equipment. Before proceeding further with implementing CoS in Junos, we should understand the CoS components and packet flow through the CoS process.

Junos CoS Process


Figure 7.1 shows a typical CoS process: the general flow of a packet as it passes through CoS in a router on which QoS is implemented.

Figure 7.1    CoS Processing Model (ingress: classification and policing; egress: policing, queuing and shaping, packet dropping, rewrite)


The following is a list of the key steps in the QoS process, together with the corresponding configuration hierarchies. A minimal configuration skeleton tying these hierarchies together follows the list.

1. Classifying: This step examines packet markings (for example, EXP bits, IEEE 802.1p bits, or DSCP bits) to separate incoming traffic. One or more classifiers must be assigned to a physical or logical interface to separate the traffic flows; the classifier assignment is made at the [edit class-of-service interfaces] hierarchy level in the Junos CLI. The classifier statement itself defines how to assign a packet to a forwarding class with a loss priority and is configured at the [edit class-of-service classifiers] hierarchy level. For details concerning packet loss priority and forwarding class, see Defining Loss Priorities and Defining Forwarding Classes on page 109 of this handbook. Each forwarding class is in turn assigned to a queue at the [edit class-of-service forwarding-classes] hierarchy level.

2. Policing: This step meters traffic. It changes the forwarding class and loss priority if a traffic flow exceeds its pre-defined service level.

3. Scheduling: This step manages all attributes of queuing, such as transmission rate, buffer depth, priority, and Random Early Detection (RED) profile. A scheduler map is assigned to the physical or logical interface at the [edit class-of-service interfaces] hierarchy level. The scheduler statement defines how traffic is treated in the output queue (for example, the transmit rate, buffer size, priority, and drop profile) and is configured at the [edit class-of-service schedulers] hierarchy level. Finally, the scheduler-maps statement assigns a scheduler to each forwarding class and is configured at the [edit class-of-service scheduler-maps] hierarchy level.

4. Packet dropping: This step uses drop profiles to avoid TCP synchronization and to protect high-priority traffic from being dropped. A drop profile defines how aggressively to drop packets that are using a particular scheduler and is configured at the [edit class-of-service drop-profiles] hierarchy level.

5. Rewrite marking: This step rewrites the packet CoS fields (for example, EXP or DSCP bits) according to the forwarding class and loss priority of the packet. The rewrite rule takes effect as the packet leaves a logical interface that has a rewrite rule applied. The configuration is at the [edit class-of-service rewrite-rules] hierarchy level.
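To tie these steps together, here is a minimal configuration sketch (not one of the validated configurations used in this handbook) showing how the hierarchy levels relate. The names MY-CLASSIFIER, VOICE, MY-DROP, VOICE-SCHED, MY-SCHED-MAP and MY-REWRITE, the interface ge-0/0/0, and all numeric values are placeholders, and platform specifics vary (for example, the EX4200 supports a narrower set of scheduler and drop-profile options, as the EX4200 configuration later in this chapter shows). Policing (step 2) is configured separately with policers under the [edit firewall] hierarchy.

[edit class-of-service]
classifiers {
    dscp MY-CLASSIFIER {    //step 1: map DSCP code points to a forwarding class and loss priority
        forwarding-class VOICE {
            loss-priority low code-points ef;
        }
    }
}
forwarding-classes {
    queue 3 VOICE;    //step 1: bind the forwarding class to a queue
}
drop-profiles {
    MY-DROP {    //step 4: RED profile referenced by a scheduler
        interpolate {
            fill-level [ 50 100 ];
            drop-probability [ 0 100 ];
        }
    }
}
schedulers {
    VOICE-SCHED {    //step 3: queue servicing parameters
        transmit-rate percent 30;
        buffer-size percent 30;
        drop-profile-map loss-priority high protocol any drop-profile MY-DROP;
    }
}
scheduler-maps {
    MY-SCHED-MAP {    //step 3: map the forwarding class to its scheduler
        forwarding-class VOICE scheduler VOICE-SCHED;
    }
}
rewrite-rules {
    dscp MY-REWRITE {    //step 5: re-mark packets on egress
        forwarding-class VOICE {
            loss-priority low code-point ef;
        }
    }
}
interfaces {
    ge-0/0/0 {    //steps 1, 3 and 5: apply the classifier, scheduler map and rewrite rule to an interface
        scheduler-map MY-SCHED-MAP;
        unit 0 {
            classifiers {
                dscp MY-CLASSIFIER;
            }
            rewrite-rules {
                dscp MY-REWRITE;
            }
        }
    }
}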


JUNOS CoS Implementation Best Practices


Best practices include the following:

Selecting the appropriate classifier.
Using code-point aliases.
Defining loss priorities.
Defining forwarding classes.
Defining comprehensive schedulers.
Defining policers for traffic classes.

Selecting the Appropriate Classifier

Selecting the appropriate classifier is key in distinguishing traffic. Table 7.1 compares the packet classifiers supported on the Juniper Networks MX Series and EX Series.

Table 7.1    Packet Classifiers Comparison Between MX Series and EX Series

Packet Classifier   MX960 & MX480 Series   EX8200 & EX4200 Series   Function
dscp                Yes                    Yes                      Handles incoming IPv4 packets.
dscp-ipv6           Yes                    -                        Handles incoming IPv6 packets.
exp                 Yes                    -                        Handles MPLS packets using Layer 2 headers.
ieee-802.1          Yes                    Yes                      Handles Layer 2 CoS.
ieee-802.1ad        Yes                    -                        Handles the IEEE 802.1ad (DEI) classifier.
inet-precedence     Yes                    Yes                      Handles incoming IPv4 packets; IP precedence mapping requires only the upper three bits of the DSCP field.

Using Code-Point Aliases

Using code-point aliases, an operator assigns a name to a pattern of code-point bits and can then use this name instead of the bit pattern when configuring other CoS components, such as classifiers, drop-profile maps, and rewrite rules. For example:

ieee-802.1 {
    be 000;
    af12 101;
    af11 100;
    be1 001;
    ef 010;
}

Defining Loss Priorities

Loss priority affects the scheduling of a packet without affecting the packet's relative ordering. An administrator can use the packet loss priority (PLP) bit as part of a congestion control strategy and can use the loss priority setting to identify packets that have experienced congestion. Typically, an administrator marks packets exceeding a specified service level with a high loss priority and sets the loss priority by configuring a classifier or a policer. The loss priority is used later in the work flow to select one of the drop profiles used by random early detection (RED).


Defining Forwarding Classes

The forwarding class affects the forwarding, scheduling, and marking policies applied to packets as they move through a router. Table 7.2 summarizes the mapping between queues and forwarding classes for both the MX and EX Series.

Table 7.2    Forwarding Classes for the MX480, EX4200 and EX8200 Series

Forwarding Class         MX Series Queue    EX Series Queue
Voice (EF)               Q3                 Q5
Video (AF)               Q2                 Q4
Data (BE)                Q0                 Q0
Network Control (NC)                        Q7

The forwarding class, plus the loss priority, defines the per-hop behavior. If the use case requires associating forwarding classes with next hops, the forwarding-policy options are available only on the MX Series.

Defining Comprehensive Schedulers

An individual router interface has multiple queues assigned to store packets. The router determines which queue to service based on a particular method of scheduling. This process often involves a determination of which type of packet should be transmitted before another type of packet. Junos schedulers allow an administrator to define the priority, bandwidth, delay buffer size, rate control status, and RED drop profiles to be applied to a particular queue for packet transmission.

Defining Policers for Traffic Classes

Policers allow an administrator to limit traffic of a certain class to a specified bandwidth and burst size. Packets exceeding the policer limits can be discarded or can be assigned to a different forwarding class, a different loss priority, or both. Junos defines policers with firewall filters that can be associated with input or output interfaces; a sketch follows Table 7.3. Table 7.3 summarizes the configuration statements available at the [edit class-of-service] hierarchy level; the MX480 supports nearly all of them, while the EX8200 and EX4200 support smaller subsets.


Table 7.3    CoS Configuration Statements

Field                       Description
classifiers                 Classify incoming packets based on code point value
code-point-aliases          Mapping of code point aliases to bit strings
drop-profiles               Random Early Drop (RED) data point map
fabric                      Define CoS parameters of switch fabric
forwarding-classes          One or more mappings of forwarding class to queue number
forwarding-policy           Class-of-service forwarding policy
fragmentation-maps          Mapping of forwarding class to fragmentation options
host-outbound-traffic       Classify and mark host traffic to forwarding engine
interfaces                  Apply class-of-service options to interfaces
multidestination            Multicast class of service
restricted-queues           Map forwarding classes to restricted queues
rewrite-rules               Write code point value of outgoing packets
routing-instances           Apply CoS options to routing instances with VRF table label
scheduler-maps              Mapping of forwarding classes to packet schedulers
schedulers                  Packet schedulers
traffic-control-profiles    Traffic shaping and scheduling profiles
translation-table           Translation table
tri-color                   Enable tricolor marking
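As a concrete illustration of the policer approach described under Defining Policers for Traffic Classes, the following minimal sketch (not part of the validated test configurations) defines a policer and applies it through a firewall filter on an ingress interface. The names STREAM-POLICER and LIMIT-STREAMING, the bandwidth and burst values, and the interface ge-0/0/10 are placeholders; the BACKGROUND forwarding class is assumed to be defined as in the EX4200 snippet later in this chapter.

firewall {
    policer STREAM-POLICER {    //limit a traffic class to a bandwidth and burst size
        if-exceeding {
            bandwidth-limit 100m;
            burst-size-limit 625k;
        }
        then {
            forwarding-class BACKGROUND;    //re-mark excess traffic instead of discarding it
            loss-priority high;
        }
    }
    family inet {
        filter LIMIT-STREAMING {
            term police-stream {
                from {
                    dscp af11;    //match the streaming code point
                }
                then {
                    policer STREAM-POLICER;
                    accept;
                }
            }
            term default {
                then accept;
            }
        }
    }
}
interfaces {
    ge-0/0/10 {
        unit 0 {
            family inet {
                filter {
                    input LIMIT-STREAMING;    //apply the policer through the input filter
                }
            }
        }
    }
}

Packets that exceed the configured rate are re-marked into the BACKGROUND class with high loss priority rather than dropped; replacing those two actions with discard would drop the excess traffic instead.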

Configuring CoS
In this section, we demonstrate a sample configuration scenario for configuring CoS on the EX4200. Two blade servers connect to two different interfaces to simulate production traffic by issuing a ping command; the test device (N2X) generates significant network traffic, classified as background traffic, through the EX4200 to one of the blade servers. This background traffic contends with the production traffic, causing packet loss in the production traffic. Because the EX4200 is central to network traffic aggregation in this scenario, it is reasonable to apply a CoS packet loss policy on the EX4200 to ensure that no packet loss occurs in the production traffic. NOTE The configuration scenario and snippet are also applicable to the MX Series Ethernet Routers.

Configuration Description
As illustrated in Figure 7.2, the EX4200 is the DUT, which interconnects the IBM blade servers and the Agilent N2X traffic generator.
Figure 7.2    EX4200 CoS Validation Scenario (N2X background traffic into ge-0/0/24 and ge-0/0/25; IBM BladeCenter blades at 11.22.1.7 on ge-0/0/7 and 11.22.1.9 on ge-0/0/9)

The test includes the following steps:

1. The N2X generates network traffic as background traffic onto the EX4200 through two ingress GigE ports (ge-0/0/24 and ge-0/0/25).
2. The EX4200 forwards the background traffic to a single egress GigE port (ge-0/0/9).
3. At the same time, the blade server uses the ping command to generate production traffic onto the EX4200 through a different interface (ge-0/0/7).
4. The EX4200 also forwards the production traffic to the same egress port (ge-0/0/9).

From a packet loss policy perspective, the production traffic is low loss priority, while the background traffic is high.


To verify the status of packets on the ingress and egress ports, we use the show interfaces queue <ge-0/x/y> command to confirm that only high loss priority packets from the BACKGROUND forwarding class are being tail dropped.

NOTE

The configuration used in this setup was sufficient to confirm basic CoS functionality. Other detailed configuration options are available and can be enabled as needed. Refer to the CoS Command Hierarchy Levels in the Junos Software CLI User Guide at www.juniper.net/techpubs/software/junos/junos95/swref-hierarchy/hierarchy-summary-configuration-statement-classof-service.html#hierarchy-summary-configuration-statement-class-of-service.

The following steps summarize the setup configuration process.

1. Configure the setup as illustrated in Figure 7.2, using the CoS configuration snippet that follows.
2. Create simple flows on the N2X to send from each N2X port toward ge-0/0/9.
3. Send the traffic at 50% of line rate from each port to 11.22.1.9 (in the absence of two ports, one port can send 100% of line rate).
4. Configure the DUT to perform CoS-based processing: ingress traffic from source 11.22.1.7 arriving on interface ge-0/0/7 is classified into the high forwarding class with a low probability of being dropped, while traffic arriving on interfaces ge-0/0/24 and ge-0/0/25 is classified with a high probability of being dropped.
5. Start the ping from 11.22.1.7 to 11.22.1.9.
6. Tune the line-rate parameter of the N2X traffic arriving at ge-0/0/9.
7. Observe the egress and ingress interface statistics to confirm that the ping traffic is tagged with the higher forwarding class and is not dropped, while traffic from ports ge-0/0/24 and ge-0/0/25 is dropped on ingress.

CoS Configuration Snippet


chandra@EX> show configuration class-of-service
classifiers {
    ieee-802.1 DOTP-CLASSIFIER {    //define the type of classifier
        forwarding-class CONVERSATIONAL {    //assign expedited forwarding to the CONVERSATIONAL forwarding class
            loss-priority low code-points ef;
        }
        forwarding-class INTERACTIVE {
            loss-priority low code-points af12;
        }
        forwarding-class STREAMING {
            loss-priority low code-points af11;
        }
        forwarding-class BACKGROUND {
            loss-priority high code-points be;
        }
    }
}


code-point-aliases {
    ieee-802.1 {    //associate the code point aliases
        be 000;
        af12 101;
        af11 100;
        be1 001;
        ef 010;
    }
}
forwarding-classes {    //assign the four queues to the forwarding classes
    queue 0 BACKGROUND;
    queue 3 CONVERSATIONAL;
    queue 2 INTERACTIVE;
    queue 1 STREAMING;
}
interfaces {
    ge-0/0/9 {    //associate the scheduler map, rewrite rules and classifier with the interface
        scheduler-map SCHED-MAP;
        unit 0 {
            classifiers {
                ieee-802.1 DOTP-CLASSIFIER;
            }
            rewrite-rules {
                ieee-802.1 DOTP-RW;
            }
        }
    }
}
rewrite-rules {    //define the rewrite rules for each forwarding class and set the code points used in each case
    ieee-802.1 DOTP-RW {
        forwarding-class CONVERSATIONAL {
            loss-priority low code-point ef;
        }
        forwarding-class INTERACTIVE {
            loss-priority low code-point af12;
        }
        forwarding-class STREAMING {
            loss-priority low code-point af11;
        }
        forwarding-class BACKGROUND {
            loss-priority high code-point be;
        }
    }
}
scheduler-maps {    //define the scheduler map entries for each forwarding class
    SCHED-MAP {
        forwarding-class BACKGROUND scheduler BACK-SCHED;
        forwarding-class CONVERSATIONAL scheduler CONV-SCHED;
        forwarding-class INTERACTIVE scheduler INTERACT-SCHED;
        forwarding-class STREAMING scheduler STREAMING-SCHED;
    }
}
schedulers {    //specify the scheduler properties for each forwarding class; the priorities assigned here define how the scheduler handles the traffic
    CONV-SCHED {
        transmit-rate remainder;
        buffer-size percent 80;
        priority strict-high;
    }


    INTERACT-SCHED;
    STREAMING-SCHED {
        transmit-rate percent 20;
    }
    BACK-SCHED {
        transmit-rate remainder;
        priority low;
    }
}
chandra@EX> show configuration firewall
family ethernet-switching {    //configure a multifield classifier for better granularity; the CONVERSATIONAL class gets higher priority than BACKGROUND
    filter HIGH {
        term 1 {
            from {
                source-address {
                    11.22.1.7/32;
                }
            }
            then {
                accept;
                forwarding-class CONVERSATIONAL;
                loss-priority low;
            }
        }
        term 2 {
            then {
                accept;
                count all;
            }
        }
    }
    filter LOW {
        term 1 {
            from {
                source-address {
                    11.22.1.100/32;
                    11.22.1.101/32;
                }
            }
            then {
                accept;
                forwarding-class BACKGROUND;
                loss-priority high;
            }
        }
        term 2 {
            then {
                accept;
                count all;
            }
        }
    }
}
chandra@EX> show configuration interfaces ge-0/0/24
unit 0 {
    family ethernet-switching {    //assign the firewall filter to the interface
        port-mode access;
        filter {
            input LOW;
            output LOW;
        }
    }
}
chandra@EX> show configuration interfaces ge-0/0/25
unit 0 {


    family ethernet-switching {
        port-mode access;
        filter {
            input LOW;
            output LOW;
        }
    }
}
chandra@EX> show configuration interfaces ge-0/0/7
unit 0 {
    family ethernet-switching {
        port-mode access;
        filter {
            input HIGH;
            output HIGH;
        }
    }
}
chandra@EX> show configuration interfaces ge-0/0/9
unit 0 {
    family ethernet-switching {
        port-mode access;
    }
}

Latency
Network latency is critical to business. Today, competitiveness in the global financial markets is measured in microseconds. High-performance computing and financial trading demand an ultra-low-latency network infrastructure, and voice and video traffic is time-sensitive and typically requires low latency. Because network latency in a TCP/IP network can be measured on different layers, such as Layer 2 and Layer 3, and for different types of traffic, such as unicast or multicast, it often refers to one of the following: Layer 2 unicast, Layer 3 unicast, Layer 2 multicast or Layer 3 multicast. Latency is often measured at various frame sizes (64, 128, 256, 512, 1024, 1280 and 1518 bytes for Ethernet). The simulated traffic throughput is a critical factor in the accuracy of test results. For a 1 Gbps full-duplex interface, the transmitting (TX) and receiving (RX) throughput of the simulated traffic should approach 1 Gbps, and the RX/TX throughput ratio must be at least 99%. Measuring network latency often requires sophisticated test appliances, such as those from Agilent (N2X), Spirent Communications, and IXIA. NetworkWorld validated Juniper Networks EX4200 performance, including Layer 2 unicast, Layer 3 unicast, Layer 2 multicast and Layer 3 multicast latency. For detailed test results, please refer to www.networkworld.com/reviews/2008/071408-test-juniper-switch.html.


In this section, we discuss the concept of measuring device latency and demonstrate the sample configuration for measuring Layer 2 and Layer 3 unicast latency on the MX480.

Measuring Latency
The IETF standard RFC 2544 defines performance test criteria for measuring the latency of a DUT. As shown in Figure 7.3, the ideal way to test DUT latency is to use a tester with both transmitting and receiving ports. The tester connects to the DUT with two connections: the transmitting port of the tester connects to the receiving port of the DUT, and the sending port of the DUT connects to the receiving port of the tester. The same setup also applies to measuring the latency of multiple DUTs, as shown in Figure 7.3.
Figure 7.3    Measuring Latency

Figure 7.4 illustrates two latency test scenarios. We measured the latency of the MX480 in one scenario; we measured the end-to-end latency of the MX480 and Cisco's ESM in the other. We used Agilent's N2X, with a transmitting port (ge-2/3/1) and a receiving port (ge-3/4/4), as the tester.
Figure 7.4    Latency Setup


In the first test scenario, the N2X and MX480 connections, represented by the dashed line, are made from the sending port (ge-2/3/1) of the N2X to the receiving port (ge-5/3/5) of the MX480 and from the sending port (ge-5/3/7) of the MX480 back to the receiving port (ge-3/4/4) of the tester.

In the second test scenario, the connection among the N2X, MX480 and Cisco's ESM (represented by the solid line in Figure 7.4) occurs in the following order:

Connection from the sending port of the N2X to the receiving port of the MX480
Connection from the sending port of the MX480 to the receiving port (Port 18) of Cisco's ESM
Connection from the sending port (Port 20) of Cisco's ESM to the receiving port of the N2X

Configuration for Measuring Layer 2 Latency


To measure Layer 2 latency, all participating ports on the DUTs must be configured with the same VLAN, that is, the same Layer 2 broadcast domain. Here is a sample Layer 2 configuration on the MX480:

ge-5/3/5 {    //define a VLAN-tagged interface with Ethernet bridge encapsulation
    vlan-tagging;
    encapsulation ethernet-bridge;
    unit 1122 {    //define a logical unit, vlan-id and vlan-bridge encapsulation
        encapsulation vlan-bridge;
        vlan-id 1122;
    }
}
bc-ext {    //define a bridge domain and assign the VLAN ID and interfaces
    domain-type bridge;
    vlan-id 1122;
    interface ge-5/3/5.1122;
    interface ge-5/3/7.1122;
}


Configuration for Measuring Layer 3 Latency


To measure Layer 3 latency, the participating ports on the DUTs are configured as Layer 3 (family inet) interfaces; in this setup each tester connection uses a different IP subnet, and the MX480 routes between them.

Configuring the MX480

ge-5/3/5 {
    unit 0 {
        family inet {
            address 11.22.1.1/24;
        }
    }
}
ge-5/3/7 {
    unit 0 {
        family inet {
            address 11.22.2.1/24;
        }
    }
}


Chapter 8
Configuring High Availability

Routing Engine Redundancy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 Graceful Routing Engine Switchover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 Virtual Chassis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 Nonstop Active Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 Nonstop Bridging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 Graceful Restart . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126 In-Service Software Upgrade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 Virtual Router Redundancy Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 Link Aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135 Redundant Trunk Group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

IMPLEMENTING HIGH AVAILABILITY (HA) is critical when designing a network. Operators can implement high availability using one or more of the several methods described in Chapter 3: Implementation Overview.


This chapter covers the following software-based high availability features that operators can enable in the data center:

Routing Engine Redundancy
Graceful Routing Engine Switchover (GRES)
Virtual Chassis
Nonstop Active Routing (NSR)
Nonstop Bridging (NSB)
Graceful Restart (GR)
In-Service Software Upgrade (ISSU)
Virtual Router Redundancy Protocol (VRRP)
Link Aggregation (LAG)
Redundant Trunk Group (RTG)

Enabling one or a combination of these features increases the reliability of the network. This chapter first introduces Junos OS based features such as Routing Engine redundancy, GRES, GR, NSR, NSB and ISSU that are critical to implementing high availability in the data center. Reliability features such as VRRP, RTG and LAG are implemented over these key high availability elements.

Routing Engine Redundancy


Routing Engine redundancy is available when two physical Routing Engines reside in the same device. One Routing Engine functions as the primary, while the other serves as a backup. When the primary Routing Engine fails, the backup Routing Engine automatically becomes the primary, increasing the availability of the device. (Within the scope of this handbook, Routing Engine redundancy is available only on the MX Series and EX8200 platforms.)

Any one of the following failures can trigger a switchover from the primary to the backup Routing Engine:

Hardware failure: a hard disk error or a loss of power on the primary Routing Engine.
Software failure: a kernel crash or a CPU lock. These failures cause a loss of keepalives from the primary to the backup Routing Engine.
Software process failure: specific software processes that fail at least four times within the span of 30 seconds on the primary Routing Engine.

NOTE
To revert to the original primary after failure recovery, operators must perform a manual switchover.

Configuration Hierarchy for Routing Engine Redundancy

The following redundancy statements define the Routing Engine roles and failover mechanism and are available at the [edit chassis] hierarchy:


redundancy {
    graceful-switchover;
    keepalive-time seconds;
    routing-engine slot-number (master | backup | disabled);
}

1. Configure automatic failover from the primary to the backup Routing Engine, without interruption to packet forwarding, at the [edit chassis redundancy] hierarchy. The triggers are either detection of a hard disk error or a loss of keepalives from the primary Routing Engine:

[edit chassis redundancy]
failover on-disk-failure;
failover on-loss-of-keepalives;

2. Specify the threshold time interval for loss of keepalives after which the backup Routing Engine takes over from the primary Routing Engine. By default, when graceful Routing Engine switchover is not configured, the failover occurs after 300 seconds:

[edit chassis redundancy]
keepalive-time seconds;

3. Configure automatic switchover to the backup Routing Engine following a software process failure by including the failover other-routing-engine statement at the [edit system processes process-name] hierarchy level:
[edit system processes]
<process-name> failover other-routing-engine;

4. The Routing Engine mastership can be manually switched using the following CLI commands:
request chassis routing-engine master acquire (on the backup Routing Engine)
request chassis routing-engine master release (on the primary Routing Engine)
request chassis routing-engine master switch (on either the primary or backup Routing Engine)

Graceful Routing Engine Switchover


Junos OS provides a separation between the control plane and the forwarding plane. Graceful Routing Engine switchover leverages this separation to provide a switchover between the Routing Engines without disrupting traffic flow. Configuring graceful Routing Engine switchover on a router enables the interface information and kernel state to be synchronized on both Routing Engines, so that the same routing and forwarding states are preserved on both. Any state changes occurring on the primary Routing Engine are replicated in the kernel of the backup Routing Engine. Although graceful Routing Engine switchover synchronizes the kernel state, it does not preserve the control plane.


It is important to note that graceful Routing Engine switchover offers only Routing Engine redundancy, not router-level redundancy. Traffic continues to flow through the router during the short interval of the Routing Engine switchover; however, traffic is dropped as soon as any of the routing protocol timers expire and the neighbor relationship with the upstream router ends. To avoid this situation, operators must apply graceful Routing Engine switchover in conjunction with Graceful Restart (GR) protocol extensions.

NOTE
Although graceful Routing Engine switchover is available on many other platforms, within the scope of this handbook it is available only on the MX Series and EX8200 platforms.

Figure 8.1 shows a primary and backup Routing Engine exchanging keepalive messages.

Figure 8.1    Primary and Backup Routing Engines

For details concerning GR, see the Graceful Restart section on page 126.

Configuring Graceful Routing Engine Switchover


1. Graceful Routing Engine switchover is configured under the [edit chassis redundancy] hierarchy:

[edit chassis redundancy]
graceful-switchover;

2. The operational show system switchover command can be used to check the graceful Routing Engine switchover status on the backup Routing Engine:
{backup}
chandra@HE-RE-1-MX480-194> show system switchover
Graceful switchover: On
Configuration database: Ready
Kernel database: Ready
Peer state: Steady State

Virtual Chassis
Routing Engines are built into the EX Series chassis. In this case, Routing Engine redundancy can be achieved by connecting and configuring two (or up to ten) EX switches as a part of a virtual chassis. This virtual chassis operates as a single network entity and consists of designated primary and backup switches. Routing Engines on each of these two switches then become the master and backup Routing Engines of the virtual chassis, respectively. The rest of the switches of


the virtual chassis assume the role of line cards. The master Routing Engine on the primary switch manages all the other member switches of the virtual chassis and has full control of the configuration and processes. It receives and transmits routing information, builds and maintains routing tables, and communicates with the interfaces and the forwarding components of the member switches. The backup switch acts as the backup Routing Engine of the virtual chassis and takes over as the master when the primary Routing Engine fails. The virtual chassis uses GRES and NSR to recover from control plane failures. Operators can physically connect the individual chassis using either virtual chassis extension cables or 10G/1G Ethernet links.

Using graceful Routing Engine switchover on a virtual chassis enables the interface and kernel states to be synchronized between the primary and backup Routing Engines. This allows the switchover between the primary and backup Routing Engine to occur with minimal disruption to traffic. The graceful Routing Engine switchover behavior on the virtual chassis is similar to the description in the Graceful Routing Engine Switchover section on page 121. When graceful Routing Engine switchover is not enabled, the line card switches of the virtual chassis initialize to the boot-up state before connecting to the backup that takes over as the master when a Routing Engine failover occurs. Enabling graceful Routing Engine switchover eliminates the need for the line card switches to re-initialize their state. Instead, they resynchronize their state with the new master Routing Engine, ensuring minimal disruption to traffic.

Some of the resiliency features of a virtual chassis include the following:

A software upgrade either succeeds on all of the switches belonging to the virtual chassis or fails on all of them.
Virtual chassis fast failover, a hardware mechanism that automatically reroutes traffic and reduces traffic loss when a link failure occurs.
Virtual chassis split and merge, which causes the virtual chassis configuration to split into two separate virtual chassis when member switches fail or are removed.

Figure 8.2 shows a virtual chassis that consists of three EX4200 switches: EX-6, EX-7 and EX-8. A virtual chassis cable connects the switches to each other, ensuring that the failure of one link does not cause a virtual chassis split.
Figure 8.2    Virtual Chassis Example Consisting of Three EX4200s (EX-6 as line card, EX-7 as backup, EX-8 as primary)


Virtual Chassis Configuration Snippet


// Define members of a virtual chassis.
virtual-chassis {
    member 1 {
        mastership-priority 130;
    }
    member 2 {
        mastership-priority 130;
    }
}
// Define a management interface and address for the virtual chassis.
interfaces {
    vme {
        unit 0 {
            family inet {
                address 172.28.113.236/24;
            }
        }
    }
}

The show virtual-chassis CLI command displays the status of a virtual chassis, showing the master, the backup switch, and the line cards. Here, three EX4200 switches are connected and configured to form a virtual chassis. Each switch has a member ID and sees the other two switches as its neighbors when the virtual chassis is fully functional. The master and backup switches are assigned the same priority (130) to ensure non-revertive behavior after the master recovers.
show virtual-chassis
Virtual Chassis ID: 555c.afba.0405
                                             Mastership           Neighbor List
Member ID  Status  Serial No     Model         priority  Role     ID  Interface
0 (FPC 0)  Prsnt   BQ0208376936  ex4200-48p         128  Linecard  1  vcp-0
                                                                   2  vcp-1
1 (FPC 1)  Prsnt   BQ0208376979  ex4200-48p         130  Backup    2  vcp-0
                                                                   0  vcp-1
2 (FPC 2)  Prsnt   BQ0208376919  ex4200-48p         130  Master*   0  vcp-0
                                                                   1  vcp-1
Member ID for next new member: 0 (FPC 0)

Use the following operational CLI commands to define the 10G/1G Ethernet ports that are used only for virtual chassis inter-member connectivity:

request virtual-chassis vc-port set pic-slot 1 port 0
request virtual-chassis vc-port set pic-slot 1 port 1

Nonstop Active Routing


Nonstop Active Routing (NSR) preserves kernel and interface information in a manner similar to graceful Routing Engine switchover. However, NSR goes a step further and saves the routing protocol information on the backup Routing Engine. It also preserves the protocol connection information in the kernel. Any switchover between the Routing Engines is dynamic, is transparent to the peers, and occurs without any disruption to protocol peering. For these reasons, NSR is beneficial in cases where the peer routers do not support graceful restart protocol extensions.


Juniper Networks recommends enabling NSR in conjunction with graceful Routing Engine switchover because this maintains the forwarding plane information during the switchover. State information for a protocol that is not supported by NSR is maintained only on the primary Routing Engine and must be refreshed using the normal recovery mechanism inherent to the protocol. Automatic route distinguishers for multicast can be enabled simultaneously with NSR. It is not necessary to start the primary and backup Routing Engines at the same time; activating a backup Routing Engine at any time automatically synchronizes it with the primary Routing Engine. For further details, refer to the Junos High Availability Guide for the latest Junos software version at www.juniper.net/techpubs/en_US/junos10.1/information-products/topic-collections/swconfig-high-availability/noframes-collapsedTOC.html.

Configuring Nonstop Active Routing

1. Enable graceful Routing Engine switchover under the chassis stanza.
[edit chassis redundancy]
graceful-switchover;

2. Enable nonstop active routing under the routing-options stanza.


[edit routing-options]
nonstop-routing;

3. When operators enable NSR, they must synchronize configuration changes on both Routing Engines.
[edit system]
commit synchronize;

4. A switchover to the backup Routing Engine must occur when the routing protocol process (rpd) fails three times consecutively, in rapid intervals. For this to occur, the following statement must be included.
[edit system processes]
routing failover other-routing-engine;

5. Operators must add the following command to achieve synchronization between the Routing Engines after configuration changes.
[edit system]
commit synchronize

6. Operators can use the following operational command to verify if NSR is enabled and active.
show task replication


Nonstop Bridging
Nonstop Bridging (NSB) enables a switchover between the primary and backup Routing Engines without losing Layer 2 Control Protocol (L2CP) information. NSB is similar to NSR in that it preserves interface and kernel information. The difference is that NSB saves the Layer 2 control information by running the Layer 2 Control Protocol process (l2cpd) on the backup Routing Engine. For NSB to function, operators must enable graceful Routing Engine switchover.

The following Layer 2 control protocols support NSB:

Spanning Tree Protocol (STP)
Rapid STP (RSTP)
Multiple STP (MSTP)

Configuring Nonstop Bridging

1. Enable graceful Routing Engine switchover under the chassis stanza.
[edit chassis redundancy]
graceful-switchover;

Explicitly enable NSB:

[edit protocols layer2-control]
nonstop-bridging;

2. Ensure synchronization between the Routing Engines whenever a configuration change is made.


[edit system]
commit synchronize

NOTE

It is not necessary to start the primary and backup Routing Engines at the same time. Bringing a backup Routing Engine online at any time automatically synchronizes it with the primary Routing Engine when NSB is enabled.

Graceful Restart
A service disruption forces the routing protocols on a router to recalculate peering relationships, protocol-specific information, and routing databases. Disruptions due to an unprotected restart of a router can cause route flapping, longer protocol reconvergence times or forwarding delays, ultimately resulting in dropped packets. Graceful Restart (GR), which acts as an extension to the routing protocols, alleviates this situation. A router with GR extensions can act either in a restarting role or in a helper role. These extensions provide the neighboring routers with the status of a router when a failure occurs. When a failure occurs on a router, the GR extensions signal the neighboring routers that a restart is occurring. This prevents the neighbors from sending out network updates to the router for the duration of the graceful restart wait interval. A router with GR enabled must negotiate GR support with its neighbors at the start of a routing session. The primary advantages of GR are uninterrupted packet forwarding and temporary suppression of all routing protocol updates.


NOTE
A helper router undergoing a Routing Engine switchover drops any GR wait state that it may be in and propagates the adjacency's state change to the network.

MORE
GR support is available for routing/MPLS-related protocols and Layer 2 or Layer 3 VPNs. See Table B.3 in Appendix B of this handbook for a list of GR protocols supported on the MX and EX Series platforms.

Configuring Graceful Restart

1. Enable GR either globally or at specific protocol levels. When configuring at the global level, operators must use the routing-options hierarchy. The restart duration specifies the length of the GR period.

NOTE
The GR helper mode is enabled by default even though GR may not be enabled. If necessary, the GR helper mode can be disabled on a per-protocol basis. If GR is enabled globally, it can be disabled for individual protocols only if required.

[edit routing-options]
graceful-restart restart-duration seconds;

2. GR can be enabled for static routes under the routing-options hierarchy:

[edit routing-options]
graceful-restart
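As an illustration of combining the global and per-protocol levels, the following sketch (not taken from the validated configurations) enables GR globally while disabling helper mode for OSPF only; the restart-duration value of 120 seconds is just an example:

[edit]
routing-options {
    graceful-restart {
        restart-duration 120;    // length of the graceful restart period, in seconds (example value)
    }
}
protocols {
    ospf {
        graceful-restart {
            helper-disable;      // this router will not act as a GR helper for OSPF neighbors
        }
    }
}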

In-Service Software Upgrade


In-service software upgrade (ISSU) facilitates software upgrades of Juniper devices in environments with a high concentration of users and business-critical applications. Operators can use ISSU to upgrade the software from one Junos release to another without any disruption to the control plane. Any disruption to traffic during the upgrade is minimal. ISSU runs only on platforms that support dual Routing Engines and requires that graceful Routing Engine switchover and NSR be enabled. Graceful Routing Engine switchover is required because the switch from the primary to the backup Routing Engine must happen without any packet forwarding loss. NSR together with graceful Routing Engine switchover maintains routing protocol and control information during the switchover between the Routing Engines.

NOTE
As with regular upgrades, Telnet sessions, SNMP, and CLI access can be interrupted briefly while an ISSU is being performed. If BFD is enabled, the detection and transmission session timers increase temporarily during the ISSU activity. The timers revert to their original values once the ISSU activity is complete.

When attempting to perform an ISSU, the following conditions must be met:

The primary and backup Routing Engines must be running the same software version.
The status of the PICs cannot be changed during the ISSU process. For example, the PICs cannot be brought online or offline.
The network must be in a steady, stable state.


An ISSU can be performed in one of the following ways:

Upgrading and rebooting both Routing Engines automatically. Both Routing Engines are upgraded to the newer version of software and then rebooted automatically.
Upgrading both Routing Engines and then manually rebooting the new backup Routing Engine. The original backup Routing Engine is rebooted first after the upgrade and becomes the new primary Routing Engine. Following this, the original primary Routing Engine must be rebooted manually for the new software to take effect; it then becomes the backup Routing Engine.
Upgrading and rebooting only one Routing Engine. In this case, the original backup Routing Engine is upgraded and rebooted and becomes the new primary Routing Engine. The former primary Routing Engine must be upgraded and rebooted manually.

MORE
For more details on performing an ISSU using the methods listed above, see Appendix A of this handbook.

Verifying Conditions and Tasks Prior to ISSU Operation

1. Verify that the primary and backup Routing Engines are running the same software version using the show version invoke-on all-routing-engines CLI command:
{master}
chandra@MX480-131-0> show version invoke-on all-routing-engines
re0:
--------------------------------------------------------------------------
Hostname: MX480-131-0
Model: mx480
JUNOS Base OS boot [10.0R1.8]
JUNOS Base OS Software Suite [10.0R1.8]
JUNOS Kernel Software Suite [10.0R1.8]
JUNOS Crypto Software Suite [10.0R1.8]
JUNOS Packet Forwarding Engine Support (M/T Common) [10.0R1.8]
JUNOS Packet Forwarding Engine Support (MX Common) [10.0R1.8]
JUNOS Online Documentation [10.0R1.8]
JUNOS Voice Services Container package [10.0R1.8]
JUNOS Border Gateway Function package [10.0R1.8]
JUNOS Services AACL Container package [10.0R1.8]
JUNOS Services LL-PDF Container package [10.0R1.8]
JUNOS Services Stateful Firewall [10.0R1.8]
JUNOS AppId Services [10.0R1.8]
JUNOS IDP Services [10.0R1.8]
JUNOS Routing Software Suite [10.0R1.8]

re1:
--------------------------------------------------------------------------
Hostname: MX480-131-1
Model: mx480
JUNOS Base OS boot [10.0R1.8]
JUNOS Base OS Software Suite [10.0R1.8]
JUNOS Kernel Software Suite [10.0R1.8]
JUNOS Crypto Software Suite [10.0R1.8]


JUNOS Packet Forwarding Engine Support (M/T Common) [10.0R1.8]
JUNOS Packet Forwarding Engine Support (MX Common) [10.0R1.8]
JUNOS Online Documentation [10.0R1.8]
JUNOS Voice Services Container package [10.0R1.8]
JUNOS Border Gateway Function package [10.0R1.8]
JUNOS Services AACL Container package [10.0R1.8]
JUNOS Services LL-PDF Container package [10.0R1.8]
JUNOS Services Stateful Firewall [10.0R1.8]
JUNOS AppId Services [10.0R1.8]
JUNOS IDP Services [10.0R1.8]
JUNOS Routing Software Suite [10.0R1.8]

2. Verify that graceful Routing Engine switchover and NSR are enabled using the show system switchover and show task replication commands.

3. BFD timer negotiation can be disabled explicitly during the ISSU activity at the [edit protocols bfd] hierarchy level:
[edit protocols bfd] no-issu-timer-negotiation;

4. Perform a software backup on each Routing Engine using the request system snapshot CLI command:
{master}
chandra@MX480-131-0> request system snapshot
Verifying compatibility of destination media partitions...
Running newfs (899MB) on hard-disk media / partition (ad2s1a)...
Running newfs (99MB) on hard-disk media /config partition (ad2s1e)...
Copying /dev/ad0s1a to /dev/ad2s1a .. (this may take a few minutes)
Copying /dev/ad0s1e to /dev/ad2s1e .. (this may take a few minutes)
The following filesystems were archived: / /config

Verifying a Unified ISSU

Execute the show chassis in-service-upgrade command on the primary Routing Engine to verify the status of FPCs and their corresponding PICs after the most recent ISSU activity.
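As a convenience, the verification commands referenced in this section can be grouped into a short checklist. The following sketch assumes the MX480 host names used earlier in this chapter and omits the command output, which varies by Junos release:

{backup}
chandra@MX480-131-1> show system switchover          (run on the backup Routing Engine to confirm GRES)
{master}
chandra@MX480-131-0> show task replication           (confirms the NSR replication state)
{master}
chandra@MX480-131-0> show chassis in-service-upgrade (FPC/PIC status after the most recent ISSU)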


Virtual Router Redundancy Protocol


Virtual Router Redundancy Protocol (VRRP) is a protocol that runs on routing devices connected to the same broadcast domain. VRRP configuration assigns these devices to a group. The grouping eliminates the possibility of a single point of failure and thus provides high availability of network connectivity to the hosts on the broadcast domain.

Routers participating in VRRP share a virtual IP address and a virtual MAC address. The shared virtual IP address corresponds to the default route configured on the hosts. For example, hosts on a broadcast domain can use a single default route to reach multiple redundant routers belonging to the VRRP group on that broadcast domain. One of the routers is dynamically elected as the primary of the group and is active at any given time; all the other participating routing devices perform a backup role. Operators can assign priorities to devices manually, forcing them to act as primary and backup devices.

The VRRP primary sends multicast advertisements to the backup devices at regular intervals (the default interval is 1 second). When the backup devices do not receive an advertisement for a configured period, the device with the next highest priority becomes the new primary. This occurs dynamically, enabling an automatic transition with minimal traffic loss. This VRRP action removes the dependence on a single routing platform for connectivity, which would otherwise be a single point of failure. In addition, the change between the primary and backup roles occurs with minimal VRRP messaging and no intervention on the host side.

Figure 8.3 shows a set of hosts connected to three EX switches, EX4200-0, EX8200-1 and EX8200-2, on the same broadcast domain. EX4200-0 is configured as a Layer 2 switch only, without any routing functionality. EX8200-1 and EX8200-2 are configured with their respective IP addresses on the broadcast domain and are configured as VRRP members with a virtual address of 172.1.1.10/16. EX8200-1 is set to be the primary, while EX8200-2 is the backup. The default gateway on each of the hosts is set to the virtual address. Traffic from the hosts is sent to hosts on other networks through EX8200-1 because it is the primary. When the hosts lose connectivity to EX8200-1, due to either a node or a link failure, EX8200-2 becomes the primary and the hosts start sending traffic through EX8200-2. This is possible because the hosts forward traffic to the gateway that owns virtual IP address 172.1.1.10, and IP packets are encapsulated in Ethernet frames destined to a virtual MAC address.

Junos provides a solution that prevents re-learning of ARP information on the backup router when the primary router fails. This solution increases performance when large numbers of hosts exist on the LAN.


Figure 8.3	VRRP

MORE	For VRRP configuration details, refer to the Junos High Availability Guide at www.juniper.net/techpubs/software/junos/junos90/swconfig-highavailability/high-availability-overview.html.

VRRP Configuration Diagram

Figure 8.4 shows a sample VRRP network scenario. In this scenario, two EX4200 devices (EX4200-A and EX4200-B) are configured as part of a VRRP group.

NOTE	Although this VRRP sample scenario uses EX4200 devices, it is possible to configure other combinations of VRRP groups consisting of devices such as:

- EX8200 and EX4200
- EX8200 and MX480
- MX480 and MX480
- EX8200 and EX8200

Figure 8.4 shows devices EX8200-A and EX8200-B, and MX480-A and MX480-B, to illustrate the choices of different platforms when configuring VRRP in the network.


Figure 8.4	VRRP Test Network

(The VRRP configuration options illustrated in Figure 8.4 are MX480-A with MX480-B, MX480-A with EX4200-B, MX480-A with EX8200-B, EX8200-A with EX8200-B, and EX8200-A with EX4200-B.)

The virtual address assigned to the EX4200 group discussed here is 11.22.1.1. The two devices and the IBM blade servers physically connect to the same broadcast domain. EX4200-A is elected as the primary, so the path from the servers to EX4200-A through the Cisco ESM is the preferred primary path. The link between the Cisco ESM and EX4200-B is the backup path.

NOTE	Cisco's ESM included in the IBM Blade Center is a Layer 2 switch that does not support VRRP, but it serves as an access layer switch connected to routers that use VRRP. Other switch modules for the IBM Blade Center support Layer 3 functionality but are outside the scope of this book.


Configuring VRRP

To configure VRRP on the sample network, perform the following steps:

1. Create two trunk ports on Cisco's ESM. Assign an internal eth0 port on Blade[x] to the same network as VRRP, for example 11.22.1.x.

2. Add a router with a Layer 3 address that is reachable from the 11.22.1.x network on the BladeCenter. In this case, the MX480 acts as a Layer 3 router that connects to both EX4200-A and EX4200-B through the 11.22.2.x and 11.22.3.x networks, respectively.

3. This Layer 3 MX480 router also terminates the 11.22.5.x network via interface ge-5/3/5 with family inet address 11.22.5.1.

4. Verify that this address is reachable from the blade server by configuring the default gateway to be either 11.22.1.11 (ge-0/0/11) or 11.22.1.31 (ge-0/0/31).

5. Configure VRRP between the two interfaces ge-0/0/11 (EX4200-A) and ge-0/0/31 (EX4200-B). The virtual address (known as the vrrp-id) is 11.22.1.1, with ge-0/0/11 on EX4200-A set to have a higher priority.

Verify operation on the sample network by performing the following steps:

1. Reconfigure the default route on 11.22.1.60 (the blade server) to 11.22.1.1 (the VRRP virtual router address).

2. Confirm that 11.22.5.1 is reachable from 11.22.1.60 and vice versa. Perform a traceroute to ensure that the next hop is 11.22.1.11 on EX4200-A.

3. Either lower the priority on EX4200-A or administratively disable the interface ge-0/0/11 to simulate an outage of EX4200-A.

4. Confirm that pings from 11.22.1.60 to 11.22.5.1 still work but use the backup path through EX4200-B.

5. Perform a traceroute to confirm that the backup path is being used.

NOTE	The traceroute command can be used for confirmation in both directions, to and from the BladeCenter.

VRRP Configuration Snippet

The following VRRP configuration snippet shows the minimum configuration required on the EX Series to enable a VRRP group.
// Configure the interface ge-0/0/31 on EX4200-B with an IP address of 11.22.1.31/24 on logical unit 0.
// Define a VRRP group with a virtual IP of 11.22.1.1 and a priority of 243.
show configuration interfaces ge-0/0/31
unit 0 {
    family inet {
        address 11.22.1.31/24 {
            vrrp-group 1 {
                virtual-address 11.22.1.1;
                priority 243;
                preempt {
                    hold-time 0;
                }
                accept-data;
            }
        }
    }
}


// Interface ge-0/0/36 to the MX480 with an IP of 11.22.2.36/24
show configuration interfaces ge-0/0/36
unit 0 {
    family inet {
        address 11.22.2.36/24;
    }
}

// Configure the interface ge-0/0/11 on EX4200-A with an IP address of 11.22.1.11/24 on logical unit 0.
// Define a VRRP group with a virtual IP of 11.22.1.1 and a priority of 240.
show configuration interfaces ge-0/0/11
unit 0 {
    family inet {
        address 11.22.1.11/24 {
            vrrp-group 1 {
                virtual-address 11.22.1.1;
                priority 240;
                preempt {
                    hold-time 0;
                }
                accept-data;
            }
        }
    }
}
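To confirm which device currently holds the primary role for the group, the show vrrp operational commands can be used on either EX4200. The following invocation is a sketch only; the prompt is assumed and the output is omitted because it varies by release:

user@EX4200-A> show vrrp summary
user@EX4200-A> show vrrp detail

The summary output lists group 1 with its state (master or backup) and the virtual address 11.22.1.1, while the detailed form adds the priority, timers, and tracking information.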

VRRP Configuration Hierarchy for IPv4

The following shows the VRRP statements that can be included at the interface hierarchy level.
[edit interfaces interface-name unit <unit-number> family inet address address]
vrrp-group group-id {
    (accept-data | no-accept-data);
    advertise-interval seconds;
    authentication-key key;
    authentication-type authentication;
    fast-interval milliseconds;
    (preempt | no-preempt) {
        hold-time seconds;
    }
    priority number;
    track {
        interface interface-name {
            priority-cost priority;
            bandwidth-threshold bits-per-second {
                priority-cost priority;
            }
        }
        priority-hold-time seconds;
        route prefix routing-instance instance-name {
            priority-cost priority;
        }
    }
    virtual-address [ addresses ];
}


Configuring VRRP for IPv6 (MX Series Platform Only)

As mentioned earlier, operators can configure VRRP for IPv6 on the MX Series platform. To configure VRRP for IPv6, include the following statements at this hierarchy level:
[edit interfaces interface-name unit <unit-number> family inet6 address address]
vrrp-inet6-group group-id {
    (accept-data | no-accept-data);
    fast-interval milliseconds;
    inet6-advertise-interval seconds;
    (preempt | no-preempt) {
        hold-time seconds;
    }
    priority number;
    track {
        interface interface-name {
            priority-cost priority;
            bandwidth-threshold bits-per-second {
                priority-cost priority;
            }
        }
        priority-hold-time seconds;
        route prefix routing-instance instance-name {
            priority-cost priority;
        }
    }
    virtual-inet6-address [ addresses ];
    virtual-link-local-address ipv6-address;
}
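The following minimal sketch illustrates that hierarchy on an MX interface. The interface name, the 2001:db8 documentation-prefix addresses and the virtual link-local address are assumptions chosen for the example only; they are not part of the test network described in this chapter.

[edit interfaces ge-5/3/9 unit 0 family inet6 address 2001:db8:11:22::9/64]
vrrp-inet6-group 1 {
    priority 240;
    accept-data;
    virtual-inet6-address 2001:db8:11:22::1;
    virtual-link-local-address fe80::11:22:0:1;
}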

Link Aggregation
Link Aggregation (LAG) is a feature that aggregates two or more physical Ethernet links into one logical link to obtain higher bandwidth and to provide redundancy. LAG provides high link availability and capacity, which results in improved performance and availability. Traffic is balanced across all links that are members of an aggregated bundle. The failure of a member link does not cause traffic disruption; because there are multiple member links, traffic continues over the remaining active links.

LAG is an IEEE 802.3ad standard that can be used in conjunction with the Link Aggregation Control Protocol (LACP). Using LACP, multiple physical ports can be bundled together to form a logical channel. Enabling LACP on two peers that participate in a LAG group enables them to exchange LACP packets and negotiate the automatic bundling of links.

NOTE	LAG can be enabled on interfaces spread across multiple chassis; this is known as multichassis LAG (MC-LAG). This means that the member links of a bundle can be configured between multiple chassis instead of only two chassis. Currently, MC-LAG support exists only on the MX Series platforms.


Some points to note with respect to LAG:

- Ethernet links between two points support LAG.
- A maximum of 16 Ethernet interfaces can be included within a LAG on the MX Series platforms. The LAG can consist of interfaces that reside on different Flexible PIC Concentrator (FPC) cards in the same MX chassis; however, these interface links must be of the same type.
- The EX Series platforms support a maximum of 8 Ethernet interfaces in a LAG. In the case of an EX4200-based virtual chassis, the interfaces that belong to a LAG can be on different switch members of the virtual chassis.

Link Aggregation Configuration Diagram

Figure 8.5 shows a sample link aggregation and load balancing setup. In this configuration, LAG is enabled on the interfaces between the MX480 and Cisco's ESM switch on the IBM Blade Center, thus bundling the physical connections into one logical link.
Figure 8.5	LAG and Load Balancing Setup

NOTE	The EX8200 or any of the MX Series devices can be used instead of the MX480 shown in Figure 8.5.

Link Aggregation Configuration Hierarchy

This section describes the steps involved in configuring and verifying LAG on the test network. A physical interface can be associated with an aggregated Ethernet interface on the EX and MX Series platforms. Enable the aggregated link as follows:

1. At the [edit chassis] hierarchy level, configure the maximum number of aggregated devices available on the system:
aggregated-devices {
    ethernet {
        device-count X;
    }
}


NOTE	Here X refers to the number of aggregated interfaces (0-127).

2. At the [edit interfaces interface-name] hierarchy level, include the 802.3ad statement:
[edit interfaces interface-name (fastether-options | gigether-options)]
802.3ad aeX;

3. A statement defining aeX must also be included at the [edit interfaces] hierarchy level.

4. Some of the physical properties that specifically apply to aggregated Ethernet interfaces can also be configured:
chandra@HE-Routing Engine-1-MX480> show configuration interfaces aeX
aggregated-ether-options {
    minimum-links 1;
    link-speed 1g;
    lacp {
        active;
        periodic fast;
    }
}
unit 0 {
    family bridge {
        interface-mode trunk;
        vlan-id-list 1122;
    }
}

An aggregated Ethernet interface can be deleted from the configuration by issuing the delete interfaces aeX command at the [edit] hierarchy level in configuration mode.
[edit]
user@host# delete interfaces aeX

NOTE	When an aggregated Ethernet interface is deleted from the configuration, Junos removes the configuration statements related to aeX and sets this interface to the DOWN state. However, the aggregated Ethernet interface is not deleted until the chassis aggregated-devices ethernet device-count configuration statement is deleted.

Forwarding Options in LAG (MX480 Only)

By default, the hash-key algorithm uses the interface as the default parameter to generate hash keys for load distribution. Forwarding options must be configured to achieve load balancing based on source and destination IP, source and destination MAC, or any other combination of Layer 3 or Layer 4 parameters.

NOTE	Although EX Series platforms can also perform hash-key-based load balancing as of Release 9.6R1.13, they do not offer the flexibility to configure the criteria for hashing.
hash-key {
    family multiservice {
        source-mac;
        destination-mac;
        payload {
            ip {
                layer-3 {
                    [source-ip-only | destination-ip-only];
                }
                layer-4;
            }
        }
        symmetric-hash;
    }
}

Link Aggregation Configuration Description


// Specify the number of aggregated devices.
aggregated-devices {
    ethernet {
        device-count X;
    }
}
// Specify the aeX interface properties, such as the minimum number of links, speed and LACP options.
aggregated-ether-options {
    minimum-links 1;
    link-speed 1g;
    lacp {
        active;
        periodic fast;
    }
}
// Define a logical unit that is a bridge-type trunk interface and its vlan-id.
unit 0 {
    family bridge {
        interface-mode trunk;
        vlan-id-list 1122;
    }
}
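The same configuration can also be expressed as set commands. The sketch below is illustrative only: the member interfaces ge-5/0/1 and ge-5/0/5 come from Figure 8.5, while the bundle name ae0 and the device count of 2 are arbitrary values assumed for the example.

// Allow aggregated Ethernet interfaces on the chassis (a device count of 2 is assumed)
set chassis aggregated-devices ethernet device-count 2
// Assign the two member links from Figure 8.5 to the assumed bundle ae0
set interfaces ge-5/0/1 gigether-options 802.3ad ae0
set interfaces ge-5/0/5 gigether-options 802.3ad ae0
// Bundle properties: minimum links, link speed, and active LACP with fast periodic PDUs
set interfaces ae0 aggregated-ether-options minimum-links 1
set interfaces ae0 aggregated-ether-options link-speed 1g
set interfaces ae0 aggregated-ether-options lacp active
set interfaces ae0 aggregated-ether-options lacp periodic fast
// Layer 2 trunk carrying VLAN 1122, matching the bridged configuration shown above
set interfaces ae0 unit 0 family bridge interface-mode trunk
set interfaces ae0 unit 0 family bridge vlan-id-list 1122

After committing, show interfaces ae0 and show lacp interfaces can be used to confirm that the member links are collecting and distributing.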

Link Failover Scenarios - LAG with LACP and NSR

Link failover between members of a LAG on the MX480 can occur in conjunction with different combinations of LACP and NSR. Various failure scenarios, such as Routing Engine, FPC, and switch fabric failover, and system upgrade with and without ISSU, are possible for each of the LACP/NSR combinations. The different LACP/NSR combinations on the MX480 include the following:

- LACP enabled, NSR enabled
- LACP enabled, NSR disabled
- LACP disabled, NSR enabled
- LACP disabled, NSR disabled

Table B.1 and Table B.2 in Appendix B of this handbook provide detailed LAG test results based on the scenarios listed above.


The salient test results, listed in Appendix B, are as follows:

- Enabling LACP provided seamless recovery from Routing Engine failover on the MX480. The Routing Engine took approximately 20 seconds to recover from a failure with LACP disabled, as opposed to no disruption when it was enabled.
- FPCs with only one LAG interface recovered more quickly (in 1.5 seconds) than FPCs with two interfaces (approximately 55 seconds).
- The switch fabric recovered immediately after a failure in all the scenarios.

A similar validation was performed using the EX4200 instead of the MX480. In this case, enabling or disabling LACP did not make a difference. The following scenarios were validated:

- Routing Engine failover
- FPC failover (two LAG links and an interface to the traffic generator)
- Switch fabric failover
- System upgrade (without ISSU or graceful Routing Engine switchover)
- System upgrade (without ISSU, with graceful Routing Engine switchover)

MORE	Table B.1 and Table B.2 in Appendix B of this handbook provide the detailed LAG test results for these scenarios.

Redundant Trunk Group


Redundant Trunk Group (RTG) is a Layer 2 redundancy mechanism, similar in purpose to STP, that is available on the EX Series switches. RTG eliminates the need for spanning tree. In its simplest form, RTG is implemented on a switch that is dual-homed to network devices. Enabling RTG makes one of the links active and the other a backup; traffic is forwarded over the active link. The backup link takes over traffic forwarding when the active link fails, thus reducing the convergence time.

There is, however, a distinction between how data and control traffic are handled by the backup link. Layer 2 control traffic, for example LLDP session messages, is permitted over the backup link, while data traffic is blocked. This behavior is consistent regardless of whether the switch is a physical or a virtual chassis.

Figure 8.6 shows an EX Series switch that has links to Switch1 and Switch2, respectively. RTG is configured on the EX Series switch so that the link to Switch1 is active and performs traffic forwarding. The link to Switch2 is the backup link and starts forwarding traffic when the active link fails.

NOTE	Given the multichassis scenario, it is better to use RTG instead of MC-LAG.


Figure 8.6	RTG-based Homing to Two Switches

Figure 8.7 shows an EX Series switch that has two links to Switch1. RTG is configured on the EX Series switch so that one of the links to Switch1 is active and performs traffic forwarding while the other link acts as the backup. The backup link starts forwarding traffic to Switch1 when the active link fails.

NOTE	In this scenario, it may be more efficient in terms of bandwidth and availability to use LAG instead of RTG. LAG provides better use of bandwidth and faster recovery because there is no flushing and relearning of MAC addresses.
Figure 8.7	RTG-homing to Single Switch

Based on these two scenarios, RTG can be used to control the flow of traffic over links from a single switch to multiple destination switches while providing link redundancy. This feature is enabled on a physical interface and, in that respect, is similar to STP. However, RTG and STP are mutually exclusive on a physical port; Junos does not permit the same interface to be part of both RTG and STP simultaneously. The significance of RTG is local rather than network-wide, since decisions are made locally on the switch.

Typically, RTG is implemented on an access switch or on a virtual chassis that is connected to two or more devices that do not operate as a virtual chassis or multichassis and do not use STP. It is configured between the access and core layers in a two-tier data center architecture, or between the access and aggregation layers in a three-tier model. There can be a maximum of 16 RTGs on a standalone switch or in a virtual chassis. Both the RTG active and backup links must be members of the same VLANs.

NOTE	Junos does not allow the configuration to take effect if there is a mismatch of VLAN IDs between the links belonging to an RTG.


Figure 8.8 shows a sample two-tier architecture with RTG and LAG enabled between the access and core layers and between the access and server layers. The core consists of two MX Series devices, MX480-A and MX480-B. Two EX4200-based virtual chassis (EX4200 VC-A and EX4200 VC-B) and two EX8200s (EX8200-A and EX8200-B) form the access layer. There are connections from each of the access layer devices to MX480-A and MX480-B, respectively.

Figure 8.8	RTG and LAG in 2-Tier Model

We enable LAG and RTG on these links to ensure redundancy and to control traffic flow. We enable LAG on the access devices for links between the following devices:

- A-ae1 (EX4200 VC-A -> MX480-A)
- A-ae2 (EX4200 VC-A -> MX480-B)
- B-ae1 (EX4200 VC-B -> MX480-A)
- B-ae2 (EX4200 VC-B -> MX480-B)
- EX-A-ae1 (EX8200-A -> MX480-A)
- EX-A-ae2 (EX8200-A -> MX480-B)
- EX-B-ae1 (EX8200-B -> MX480-A)
- EX-B-ae2 (EX8200-B -> MX480-B)

In addition, we configure LAG on EX8200-A and EX8200-B to provide aggregation on links to the IBM PowerVM servers. We enable RTG on EX4200 VC-A and VC-B so that links AL-A and AL-B to MX480-A are active and are used to forward traffic. The backup links RL-A and RL-B from the virtual chassis to MX480-B take over traffic forwarding when the active link(s) fail.


Configuration Details
To configure a redundant trunk link, an RTG first must be created. As stated earlier, RTG can be configured on the access switch that has two links: a primary (active) link and a secondary (backup) link. The secondary link automatically starts forwarding data traffic when the active link fails. Execute the following commands to configure RTG and to disable RSTP on the EX switches.

Define the RTG on the LAG interface ae1:
set ethernet-switching-options redundant-trunk-group group DC_RTG interface ae1

Define RTG on the LAG interface ae2:


set ethernet-switching-options redundant-trunk-group group DC_RTG interface ae2

Disable RSTP on interface ae1, which is a member of the RTG:


set protocols rstp interface ae1 disable

Disable RSTP on interface ae2, which is a member of the RTG:


set protocols rstp interface ae2 disable
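Once the configuration is committed, the group status can be checked from operational mode. The following sketch assumes a host name of EX4200-VC-A and omits the output:

user@EX4200-VC-A> show redundant-trunk-group
user@EX4200-VC-A> show redundant-trunk-group group-name DC_RTG

The output identifies the active and backup links of the DC_RTG group and provides a quick way to confirm that traffic has moved to the backup link after a failure.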


Appendices

Appendix A: Configuring TCP/IP Networking in Servers . . . . . . . . . . . . . 144
Appendix B: LAG Test Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
Appendix C: Acronyms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
Appendix D: References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158


Appendix A: Configuring TCP/IP Networking in Servers


Server network configuration includes many tasks, such as enabling the interface, setting an IP address and routing information, creating a logical interface, and optimizing Ethernet port settings such as speed, duplex, flow control, MTU (jumbo frames), or VLAN ID. The engineering testers used many network configuration commands in different OSs, including RHEL, SUSE, AIX, and Windows. This appendix lists the common network configuration commands with their associated OS as a convenient reference.

Table A.1 lists tasks that are associated with system-dependent commands. A command that works on one platform may not work on another; for example, the lsdev command works only on the AIX platform.

Table A.1	Network Interface Configuration Tasks on Different Server Platforms

- Physical NIC (IBM System p): Use the HMC to allocate the physical NIC to a partition. The adapter configuration in the partition depends on the OS (RHEL, SUSE or AIX).
- Virtual Ethernet Adapter (IBM PowerVM): Use the HMC to allocate the virtual Ethernet adapter to each partition. The adapter configuration in the partition depends on the OS (RHEL, SUSE or AIX).
- Host Ethernet Adapter (HEA) / Logical Host Ethernet Adapter (LHEA) (IBM PowerVM): Use the HMC to allocate the adapter to each partition. The adapter configuration in the partition depends on the OS (RHEL, SUSE or AIX).
- Shared Ethernet Adapter (SEA) (IBM PowerVM): Use the HMC to allocate the interface to the VIOS, then use VIOS commands to configure the SEA.
- Interfaces in the Ethernet Pass-Thru Module (IBM Blade Center): Use the Blade Center Management Module (GUI) to allocate the interface to the blade server. Interface configuration in the blade server depends on the OS (RHEL, SUSE, AIX or Windows).
- Physical NIC (IBM x3500): The physical NIC configuration depends on the OS (RHEL, SUSE, AIX or Windows).

NOTE	Some of these commands change IP address settings immediately, while others require a restart of the network service. Not all tools save changes in the configuration database, which means that the changes may not be preserved after a server reboot.


Configuring Red Hat Enterprise Linux Network


In Red Hat Enterprise Linux (RHEL), the configuration files for network interfaces and the scripts to activate and deactivate them are located in the /etc/sysconfig/network-scripts/ directory:

- /etc/sysconfig/network specifies routing and host information for all network interfaces.
- /etc/sysconfig/network-scripts/ifcfg-<interface-name> is the interface configuration script for each network interface on a Red Hat Linux system. Each of these files provides information specific to a particular network interface.

The following is a sample ifcfg-eth0 file for a system using a fixed IP address:
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
NETWORK=10.0.1.0
NETMASK=255.255.255.0
IPADDR=10.0.1.27
USERCTL=no

In addition, several other commands can be helpful, as listed in Table A.2.

Table A.2	Additional Commands

- ethtool: Queries and changes settings of an Ethernet device, such as auto-negotiation, speed, link mode, and flow control.
- kudzu: Detects and configures new or changed hardware on a system.
- ifconfig: Queries and changes settings of an Ethernet interface. Changes made via ifconfig take effect immediately but are not saved in the configuration database.

The following is a sample ifconfig command that assigns a fixed IP address to an interface (here the VLAN subinterface eth0.5) and brings it up.
# ifconfig eth0.5 192.168.1.100 netmask 255.255.255.0 broadcast 192.168.1.255 up


The vconfig command adds or removes a VLAN interface. When vconfig adds a VLAN interface, a new logical interface is formed from the base interface name and the VLAN ID. Below is a sample vconfig command to add a VLAN 5 interface on the eth0 interface:
#vconfig add eth0 5

The eth0.5 interface configuration file will be created in /etc/sysconfig/network-scripts/ifcfg-eth0.5.
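A minimal sketch of such a file is shown below; the address matches the earlier ifconfig example and is only an assumption, while VLAN=yes is the keyword that marks the file as a VLAN subinterface:

DEVICE=eth0.5
VLAN=yes
BOOTPROTO=none
ONBOOT=yes
IPADDR=192.168.1.100
NETMASK=255.255.255.0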

Several other commands are useful for RHEL networking:

- service network restart restarts networking.
- system-config-network launches a GUI-based network administration tool for configuring the interface.
- route allows operators to inspect the routing table or to add a static route. A static route added by the route command is not persistent after a system reboot or network service restart.
- netstat allows operators to check network configuration and activity. For instance, netstat -i shows interface statistics and netstat -r shows routing table information.
- ping allows operators to check network connectivity.
- traceroute allows operators to trace the route packets take from an IP network to a given host.

For further details concerning these commands, refer to the Red Hat Linux Reference Guide at www.redhat.com/docs/manuals/linux/RHL-9-Manual/pdf/rhl-rg-en-9.pdf.
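For example, the following commands add a non-persistent static route and then verify it; the 10.0.2.0/24 destination and 10.0.1.1 gateway are assumed values, not addresses from the test network:

# route add -net 10.0.2.0 netmask 255.255.255.0 gw 10.0.1.1
# netstat -rn
# netstat -i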

Configuring SUSE Linux Enterprise Network


Table A.3 lists and defines commonly used SUSE Linux network configuration commands.

Table A.3	SUSE Linux Enterprise Network Configuration Commands

- ifconfig: Configures network interface parameters.
- rcnetwork restart: Restarts the network service.
- netstat: Prints network connections, routing tables, interface statistics and protocol statistics.
- ping: Checks network connectivity.
- traceroute: Tracks the route packets take from an IP network on their way to a given host.

For further details concerning the SUSE Linux network configuration commands, refer to Novell's Command Line Utilities at www.novell.com/documentation/oes/tcpipenu/?page=/documentation/oes/tcpipenu/data/ajn67vf.html.


Configuring AIX Network


AIX network configuration can be performed using smitty, a system management tool that has a cursor-based text (command line) interface. Table A.4 lists and defines smitty and related AIX commands.

Table A.4	Smitty Commands and Definitions

- lscfg: Displays configuration, diagnostic and vital product data (VPD) information about the system and its resources.
- lslot: Displays dynamically reconfigurable slots, such as hot plug slots, and their characteristics.
- lsdev: Displays devices in the system and their characteristics.
- rmdev: Removes devices from the configuration database.
- cfgmgr: Configures devices and optionally installs device software by running the programs specified in the Configuration Rules object class.
- lsattr: Displays attribute characteristics and possible values of attributes for devices in the system.
- smitty: Provides a cursor-based text interface to perform system management. In addition to a hierarchy of menus, smitty supports FastPath, which takes users directly to a dialog, bypassing the interactive menus.
- smitty chgenet: Configures an adapter, determines a network adapter hardware address, sets an alternative hardware address, or enables jumbo frames.
- smit mktcpip: Sets the required values for starting TCP/IP on a host, including setting the host name, setting the IP address of the interface in the configuration database, setting the subnetwork mask, or adding a static route.
- ifconfig: Configures or displays network interface parameters for a TCP/IP network.
- netstat: Displays network status, including the number of packets received, transmitted and dropped, and the routes and their status.
- entstat: Shows Ethernet device driver and device statistics. For example, the command entstat ent0 displays the generic device statistics for ent0.
- ping: Checks network connectivity.
- traceroute: Tracks the route packets take from an IP network to a given host.
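As an illustration, the mktcpip command (the command-line equivalent of the smit mktcpip fast path) can configure basic TCP/IP settings in one step. The host name, addresses and en0 interface below are assumed values used only for the example:

mktcpip -h aixhost01 -a 10.0.1.50 -m 255.255.255.0 -i en0 -g 10.0.1.1
ifconfig en0
netstat -rn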

For the details concerning the above-listed commands, refer to publib.boulder.ibm.com/infocenter/aix/v6r1/index.jsp.


Configuring Virtual I/O Server Network


Virtual I/O Server (VIOS) network configuration is used in POWER5, POWER6 and POWER7 systems. Table A.5 lists and defines some of the more commonly used VIOS network configuration commands.

Table A.5	VIOS Commands and Definitions

- mkvdev: Creates a mapping between a virtual adapter and a physical resource. For example, the following command creates a SEA that links physical ent0 to virtual ent2:
  mkvdev -sea ent0 -vadapter ent2 -default ent1 -defaultid 1
- lsmap: Lists the mappings between virtual adapters and physical resources. For example, use the following lsmap command to list all virtual adapters attached to vhost1:
  lsmap -vadapter vhost1
- chdev: Changes an attribute on a device. For instance, use the following chdev command to enable jumbo frames on the ent0 device:
  chdev -dev ent0 -attr jumbo_frame=yes
- chtcpip: Changes the VIOS TCP/IP settings and parameters. For example, use the following command to change the current network address and mask to new settings:
  chtcpip -interface en0 -inetaddr 9.1.1.1 -netmask 255.255.255.0
- lstcpip: Displays the VIOS TCP/IP settings and parameters. For example, use the following command to list the current routing table:
  lstcpip -routetable
- oem_setup_env: Initiates the OEM installation and setup environment so that users can install and set up software in the traditional way. For example, the oem_setup_env command can place a user in a non-restricted UNIX root shell so that the user can run AIX commands to install and set up software and use most of the AIX network commands, including lsdev, rmdev, chdev, netstat, entstat, ping and traceroute.

For further details concerning VIOS network commands, refer to publib.boulder.ibm.com/infocenter/powersys/v3r1m5/index.jsp?topic=/iphcg/ iphcg_network_commands.htm .


Configuring Windows 2003 Network


Typically, Windows 2003 network configuration is performed with the network applet in the GUI-based Control Panel. NIC vendors also might provide a web GUI to configure NIC settings, including frame size. Table A.6 lists and defines some of the more commonly used Windows 2003 commands for network configuration.

Table A.6	Windows 2003 Network Commands

- ipconfig: Command line utility to display the TCP/IP network adapter configuration.
- route: Command line utility to add or remove a static route. You can make the change persistent by using the -p option when adding routes.
- ping: Used to check network connectivity.
- tracert: Used to track the route packets take from an IP network on their way to a given host.
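As a brief illustration, the following commands add a persistent static route and verify the configuration; the destination subnet and gateway are assumed values:

route -p add 10.0.2.0 mask 255.255.255.0 10.0.1.1
route print
ipconfig /all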

For details concerning Windows 2003 network commands, refer to the Windows 2003 product help at technet.microsoft.com/en-us/library/cc780339%28WS.10%29.aspx.


Appendix B: LAG Test Results


Table B.1 lists detailed LAG test results for the MX480.

NOTE	The values listed in Table B.1 represent approximations in seconds.

Table B.1	MX480 Link Aggregation Failover Scenarios (Graceful Restart Enabled)

LACP enabled, NSR enabled:
- Routing Engine failover: 0
- FPC failover (FPC with one link of the LAG): 1.5
- FPC failover (FPC with two links of the LAG and the interface to the traffic generator): 53 (53, 53)
- Switch fabric failover: Immediate
- System upgrade without ISSU (upgrade the backup first, then the primary): ~20
- System upgrade with ISSU (NSR must be enabled): 0

LACP disabled, NSR enabled:
- Routing Engine failover: ~20
- FPC failover (one LAG link): 10
- FPC failover (two LAG links and the traffic generator interface): ~63 (57, 63, 64)
- Switch fabric failover: Immediate
- System upgrade without ISSU (upgrade the backup first, then the primary): ~20
- System upgrade with ISSU (NSR must be enabled): 0

LACP enabled, NSR disabled:
- Routing Engine failover: 0
- FPC failover (one LAG link): 1.5
- FPC failover (two LAG links and the traffic generator interface): ~52 (51, 52, 53)
- Switch fabric failover: Immediate

LACP disabled, NSR disabled:
- Routing Engine failover: ~20
- FPC failover (one LAG link): 10
- FPC failover (two LAG links and the traffic generator interface): ~63 (63, 64)
- Switch fabric failover: Immediate

NOTE	The values listed in Table B.2 represent approximations in seconds.

Table B.2	EX8200 Link Aggregation Failover Scenarios

Results were the same whether LACP was enabled or disabled:
- Routing Engine failover: ~84 (82, 86)
- FPC failover (FPC with the LAG and the interface to traffic): 0
- Switch fabric failover: Immediate
- System upgrade (without ISSU, without GRES): 527
- System upgrade (without ISSU, with GRES): 152


NOTE	Refer to Table B.2 when reviewing the following system upgrade steps.

Steps associated with system upgrade (without ISSU, without GRES):

1. Break GRES between the primary and the backup device.

2. Upgrade the backup device.

3. Upgrade the primary device. (Observe the outage, in approximate seconds.)

4. Re-establish GRES between the primary and the backup device.

Steps associated with system upgrade (without ISSU, with GRES):

1. Break GRES between the primary and the backup device.

2. Upgrade the backup device.

3. Re-establish GRES between the primary and the backup device.

4. Reverse the roles of the primary and backup devices (the primary device becomes the backup and the backup device becomes the primary). Ignore the warning about the version mismatch.

5. Break GRES between the primary device and the backup device.

6. Upgrade the backup device.

7. Re-establish GRES between the primary and the backup device.

Methods for Performing Unified ISSU


The three methods for performing a unified ISSU are the following:

- Upgrading and rebooting both Routing Engines automatically
- Upgrading both Routing Engines and manually rebooting the new backup Routing Engine
- Upgrading and rebooting only one Routing Engine

Method 1: Upgrading and Rebooting Both Routing Engines Automatically

This method uses the following reboot command:
request system software in-service-upgrade package-name reboot

1. Download the software package from the Juniper Networks Support Web site.

2. Copy the package to the /var/tmp directory on the router:


user@host> file copy ftp://username:prompt@ftp.hostname.net/filename /var/tmp/filename

3. Verify the current software version on both Routing Engines, using the show version invoke-on all-routing-engines command:

{backup}
user@host> show version invoke-on all-routing-engines


4. Issue the request system software in-service-upgrade package-name reboot command on the master Routing Engine:
{master}
user@host> request system software in-service-upgrade /var/tmp/jinstall-9.0-20080114.2-domestic-signed.tgz reboot
ISSU: Validating Image
PIC 0/3 will be offlined (In-Service-Upgrade not supported)
Do you want to continue with these actions being taken ? [yes,no] (no) yes
ISSU: Preparing Backup RE
Pushing bundle to re1
Checking compatibility with configuration
. . .
ISSU: Old Master Upgrade Done
ISSU: IDLE
Shutdown NOW!
. . .
*** FINAL System shutdown message from root@host ***
System going down IMMEDIATELY
Connection to host closed.

5. Log in to the router once the new master (formerly backup Routing Engine) is online. Verify that both Routing Engines have been upgraded:
{backup} user@host> show version invoke-on all-routing-engines

6. To make the backup Routing Engine (former master Routing Engine) the primary Routing Engine, issue the following command:
{backup}
user@host> request chassis routing-engine master acquire
Attempt to become the primary routing engine ? [yes,no] (no) yes
Resolving mastership...
Complete. The local routing engine becomes the master.
{master}
user@host>

7. Issue the request system snapshot command on each of the Routing Engines to back up the system software to the router's hard disk.

Method 2: Upgrading Both Routing Engines and Manually Rebooting the New Backup Routing Engine

1. Issue the request system software in-service-upgrade command.

2. Perform steps 1 through 4 as described in Method 1.

3. Issue the show version invoke-on all-routing-engines command to verify that the new backup Routing Engine (former master) is still running the previous software image, while the new primary Routing Engine (former backup) is running the new software image:
{backup} user@host> show version

4. At this point, a choice between installing newer software or retaining the old version can be made. To retain the older version, execute the request system software delete install command.


5. To ensure that a newer version of software is activated, reboot the new backup Routing Engine, by issuing the following:
{backup}
user@host> request system reboot
Reboot the system ? [yes,no] (no) yes
Shutdown NOW!
. . .
System going down IMMEDIATELY
Connection to host closed by remote host.

6. Log in to the new backup Routing Engine and verify that both Routing Engines have been upgraded:
{backup} user@host> show version invoke-on all-routing-engines

7. To make the new backup Routing Engine the primary, issue the following command:
{backup}
user@host> request chassis routing-engine master acquire
Attempt to become the master routing engine ? [yes,no] (no) yes

8. Issue the request system snapshot command on each of the Routing Engines to back up the system software to the router's hard disk.

Method 3: Upgrading and Rebooting Only One Routing Engine

Use the request system software in-service-upgrade package-name no-old-master-upgrade command on the master Routing Engine.

1. Request an ISSU upgrade:
{master}
user@host> request system software in-service-upgrade /var/tmp/jinstall-9.0-20080116.2-domestic-signed.tgz no-old-master-upgrade

2. To install the new software version on the new backup Routing Engine, issue the request system software add command.

Troubleshooting Unified ISSU

NOTE	The following unified ISSU steps relate only to the Junos 9.6 release.

Perform the following steps if the ISSU procedure stops progressing:

1. Execute the request system software abort in-service-upgrade command on the master Routing Engine.

2. To verify that the upgrade has been aborted, check the existing router session for the following message: ISSU: aborted!


Appendix C: Acronyms
A
AFE: Application Front Ends
apsd: automatic protection switching process

B
BPDU: Bridge Protocol Data Unit
BSR: Bootstrap Router

C
CBT: Core Based Tree
CIST: Common Instance Spanning Tree
CLI: Command Line Interface
CoS: class of service

D
dcd: device control process
DDoS: Distributed Denial of Service
DHCP: Dynamic Host Configuration Protocol
DNS: Domain Name System
DSCP: Diffserv Code Points
DUT: Device Under Test
DVMRP: Distance Vector Multicast Routing Protocol

E
ESM: Ethernet Switch Module, Embedded Syslog Manager

F
FC: Fibre Channel
FCS: frame check sequence
FPC: Flexible PIC Concentrator
FSP: Flexible Service Processor

G
GRES: Graceful Routing Engine Switchover
GSL: global server load balancing

H
HBA: Host Bus Adapter
HEA: Host Ethernet Adapter
HMC: Hardware Management Console


I
IDP: Intrusion Detection and Prevention
IGMP: Internet Group Management Protocol
iSCSI: Internet Small Computer System Interface
ISSU: In-Service Software Upgrade
IVE: Instant Virtual Extranet
IVM: Integrated Virtualization Manager

L
LAG: Link Aggregation
LDAP: Lightweight Directory Access Protocol
LPAR: Logical Partitions
LHEA: Logical Host Ethernet Adapter

M
MAC: Media Access Control
MCS: Multi Core Scaling
mgd: management process
MLD: Multicast Listener Discovery
MM: Management Module
MOSPF: Multicast Open Shortest Path First
MSTI: Multiple Spanning Tree Instance
MSDP: Multicast Source Discovery Protocol
MSTP: Multiple Spanning Tree Protocol
MTA: mail transfer agent
MTTR: mean time to repair
MTU: Maximum Transmission Unit

N
NAT: Network Address Translation
NIC: Network Interface Card
NIST: National Institute of Science and Technology
NPU: network processing unit
NSB: Nonstop Bridging
NSR: nonstop active routing


O
OEM: Original Equipment Manufacturer
OSS: operation support systems

P
PDM: Power Distribution Module
PIC: Physical Interface Card
PIM: Protocol Independent Multicast
PLP: packet loss priority
PM: Pass-Through Module
PoE: Power over Ethernet
PVST: Per-VLAN Spanning Tree

Q
QoS: Quality of Service

R
RED: random early detection
ROI: return on investment
RP: rendezvous point
RPC: remote procedure call
rpd: routing protocol process
RTG: Redundant Trunk Group
RSTP: Rapid Spanning Tree Protocol
RVI: routed VLAN interface

S
SAN: storage area network
SAP: Session Announcement Protocol
SCB: Switch Control Board
SDP: Session Description Protocol
SEA: Shared Ethernet Adapter
SMT: Simultaneous Multithreading
SNMP: Simple Network Management Protocol
snmpd: simple network management protocol process
SOA: Service Oriented Architecture
SOL: Serial over LAN
SPOF: single point of failure


STP: Spanning Tree Protocol
SSH: Secure Shell
SSL: Secure Sockets Layer
SSM: source-specific multicast
syslogd: system logging process

T
TWAMP: Two-Way Active Measurement Protocol

V
VID: VLAN Identifier (IEEE 802.1Q)
VIOS: Virtual I/O Server
VLAN: Virtual LAN
VLC: VideoLAN
VPLS: virtual private LAN service
VRF: Virtual Routing and Forwarding
VRRP: Virtual Router Redundancy Protocol
VSTP: VLAN Spanning Tree Protocol

W
WPAR: Workload-based Partitioning


Appendix D: References
www.juniper.net/techpubs/software/junos/junos90/swconfig-highavailability/swconfig-high-availability.pdf
The Junos High Availability Configuration Guide, Release 9.0, presents an overview of high availability concepts and techniques. By understanding the redundancy features of Juniper Networks routing platforms and the Junos software, a network administrator can enhance the reliability of a network and deliver highly available services to customers.

IEEE 802.3ad link aggregation standard
STP - IEEE 802.1D-1998 specification
RSTP - IEEE 802.1D-2004 specification
MSTP - IEEE 802.1Q-2003 specification

www.nettedautomation.com/standardization/IEEE_802/standards_802/Summary_1999_11.html
Provides access to the IEEE 802 Organization website, with links to all 802 standards.

RFC 3768, Virtual Router Redundancy Protocol
https://datatracker.ietf.org/wg/vrrp/
Provides access to all RFCs associated with the Virtual Router Redundancy Protocol (VRRP).

RFC 2338, Virtual Router Redundancy Protocol for IPv6
https://datatracker.ietf.org/doc/draft-ietf-vrrp-ipv6-spec/
Provides access to the abstract that defines VRRP for IPv6.


Data Center Network Connectivity Handbook


This handbook serves as an easy-to-use reference tool for implementing a two-tier data center network by deploying IBM open systems as the server platform with Juniper Networks routing and switching solutions.

"A must-read, practical guide for IT professionals, network architects and engineers who wish to design and implement a high performance data center infrastructure. This book provides a step-by-step approach, with validated solution scenarios for integrating IBM open system servers and the Juniper Networks data center network, including technical concepts and sample configurations."
Scott Stevens, VP Technology, Worldwide Systems Engineering, Juniper Networks

"This book is a valuable resource for anyone interested in designing network infrastructure for next generation data centers... It provides clear, easy to understand descriptions of the unique requirements for data communication in an IBM open systems environment. Highly recommended!"
Dr. Casimer DeCusatis, IBM Distinguished Engineer

7100125-001-EN

June 2010
