
MPLS Traffic Engineering

NANOG18
Robert Raszuk - IOS Engineering raszuk@cisco.com

1999, Cisco Systems, Inc.

Location of files
This presentation, handouts & demo are located at: ftp://ftpeng.cisco.com/rraszuk/nanog18
RR_MPLS_TE_Nanog.pdf - this presentation
TE_Monitor.pdf - show & debug commands
TE_Config.pdf - full configuration syntax
TE_SampleCfg.pdf - configuration sample
TE_DEMO.tar - tarred TE offline demo (HTML)
TEisistdp_1.pdf - demo lab topology
NANOG18 - Robert Raszuk
2000, Cisco Systems, Inc.

Traffic Engineering: Motivations

Reduce the overall cost of operations through more efficient use of bandwidth resources
by preventing a situation where some parts of a service provider network are over-utilized (congested) while other parts are under-utilized

The ultimate goal is cost saving!

Traffic Engineering: Motivations

MPLS and Traffic Eng allows for one to spread the traffic and distribute it across the entire network infrastructure like magnetic fields between poles while also providing the redundancy required for high availability service.
(Eric Dean)


Without Traffic Engineering


Cars: SFO-LAX, LAX-SFO, SAN-SMF, SMF-SAN

No Traffic Engineering: analogy to human drivers

With Traffic Engineering


Cars: SFO-LAX, LAX-SFO, SAN-SMF, SMF-SAN

Traffic Engineering: analogy to auto pilot

Routing solution to Traffic Engineering


[Figure: routers R1, R2, R3]

Construct routes for traffic streams within a service provider in such a way as to avoid causing some parts of the provider's network to be over-utilized while other parts remain under-utilized

The Overlay Solution


[Figure: "Physical" topology (L3 and L2 devices) next to the "Logical" topology (L3 devices only)]

Routing at layer 2 (ATM or FR) is used for traffic engineering
Analogy to direct highways between SFO-LAX & SAN-SMF. Nobody enters the highway in between.

Traffic engineering with overlay


[Figure: routers R1, R2, R3]

PVC for R2 to R3 traffic
PVC for R1 to R3 traffic

Overlay solution: drawbacks


Extra network devices (cost)
More complex network management (cost)
two-level network without integrated network management
additional training, technical support, field engineering

IGP routing scalability issue for meshes
Additional bandwidth overhead ("cell tax")

Traffic engineering with Layer 3


[Figure: routers R1, R2, R3]

IP routing: destination-based least-cost routing

Path for R2 to R3 traffic
Path for R1 to R3 traffic
under-utilized alternate path


Traffic engineering with Layer 3: what is missing?

Path computation based just on IGP metric is not enough
Support for explicit routing (aka source routing) is not available
Analogy: [Figure: San Jose to San Jose]

MPLS Traffic Engineering


TE - key mechanisms

Explicit routing (aka source routing)

Constraint-based path selection algorithm
(Example: choose a path with no congestion, avoid highways, select scenic roads, etc.)

Extensions to OSPF/IS-IS for flooding of resource / policy information
(Analogy: live collection of traffic statistics - pilot tests in Europe)

MPLS as the forwarding mechanism
(Analogy: auto pilot programmed in each car when entering the city)

TE - key mechanisms

Explicit routing (aka source routing)

RSVP as the mechanism for establishing Label Switched Paths (LSPs)
Use of the explicitly routed LSPs in the forwarding table

What is a traffic trunk?

[Figure: traffic trunk between A and B]

Aggregation of (micro) flows that:
are forwarded along a common path (within a service provider), often from one POP to another POP
share a common QoS requirement (if L-LSPs are used)

Essential for scalability

TE basics
Traffic within a Service Provider as a collection of POP to POP traffic trunks with known bandwidth and policy requirements
TE provides traffic trunk routing that meets the goal of Traffic Engineering
via a combination of on-line and offline procedures

Requirements:
Differentiating traffic trunks:
large, critical traffic trunks must be well routed in preference to other trunks

Handling failures:
automated re-routing in the presence of failures

Pre-configured paths:
for use in conjunction with the off-line route computation procedures

Support of multiple Classes of Service



Requirements (cont.)
Constraining sub-optimality:
should re-optimize on new/restored bandwidth
in a non-disruptive fashion - maintain the existing route until the new route is established, without any double counting

Ability to spread traffic trunk across multiple Label Switched Paths (LSPs)
could provide more efficient use of networking resources

Ability to include / exclude certain links for certain traffic trunks



Design Constraints
Constrained to a single routing domain
initially constrained to a single area

Requires OSPF or IS-IS


Unicast traffic

Focus on supporting routing based on a combination of administrative + bandwidth constraints


Trunk Attributes

Trunk Attributes

Configured at the head-end of the trunk

Bandwidth

Priorities
setup priority: priority for taking a resource
holding priority: priority for holding a resource

Trunk attributes
Ordered list of Path Options
possibly administratively specified paths (via an off-line central server) - {explicit list}
constraint-based, dynamically computed paths based on a combination of bandwidth and policies

Re-optimization
each path option is enabled or not for re-optimization; the interval is given in seconds. Maximum 1 week (7*24*3600 = 604800), 0 disables, default 1 hour.

Trunk Attributes
Resource class affinity (Policy)
supports the ability to include/exclude certain links for certain traffic trunks based on a user-defined policy

Tunnel is characterized by a
32-bit resource-class affinity bit string
32-bit resource-class mask (0 = don't care, 1 = care)

Link is characterized by a 32-bit resource-class attribute string
Default value of tunnel/link bits is 0
Default value of the tunnel mask = 0x0000FFFF
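
The slide does not spell out the comparison itself. A minimal sketch, assuming the usual rule that a link is acceptable when its attribute string agrees with the tunnel affinity on every bit the mask marks as "care"; the function name and the 4-bit constants are illustrative only, chosen to match the 4-bit examples that follow.

```python
def link_allowed(link_attr: int, affinity: int, mask: int) -> bool:
    """Return True if the link may carry the tunnel.

    Assumed rule (not stated verbatim on the slide): compare only the
    bits selected by the mask; all other bits are "don't care".
    """
    return (link_attr & mask) == (affinity & mask)


# 4-bit values in the style of the examples that follow.
DEFAULT_AFFINITY = 0b0000
DEFAULT_MASK = 0b0011

PLAIN_LINK = 0b0000
LOWER_HALF_LINK = 0b0010   # bit set in the lower half (Example 1)
UPPER_HALF_LINK = 0b0100   # bit set in the upper half (Example 2)

if __name__ == "__main__":
    for name, link in [("plain", PLAIN_LINK),
                       ("lower-half", LOWER_HALF_LINK),
                       ("upper-half", UPPER_HALF_LINK)]:
        ok = link_allowed(link, DEFAULT_AFFINITY, DEFAULT_MASK)
        print(f"{name:10s} link usable by a default tunnel: {ok}")
```

Under this rule a default tunnel avoids a link with a lower-half bit set but ignores upper-half bits, which is the behaviour the examples describe.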

Example0: 4-bit string, default


[Figure: topology with nodes A, B, C, D, E; every link has attribute 0000]

Trunk A to B:
tunnel = 0000, t-mask = 0011

ADEB and ADCEB are possible

Example1a: 4-bit string


[Figure: topology with nodes A, B, C, D, E; one link has attribute 0010, all other links 0000]

Setting a link bit in the lower half drives all tunnels off the link, except those specially configured
Trunk A to B:
tunnel = 0000, t-mask = 0011

Only ADCEB is possible

Example1b: 4-bit string


[Figure: topology with nodes A, B, C, D, E; one link has attribute 0010, all other links 0000]

A specific tunnel can then be configured to allow such links by clearing the bit in its affinity attribute mask
Trunk A to B:
tunnel = 0000, t-mask = 0001

Again, ADEB and ADCEB are possible

Example1c: 4-bit string


[Figure: topology with nodes A, B, C, D, E; one link has attribute 0010, all other links 0000]

A specific tunnel can be restricted to only such links by instead turning on the bit in its affinity attribute bits
Trunk A to B:
tunnel = 0010, t-mask = 0011

No path is possible

Example2a: 4-bit string


[Figure: topology with nodes A, B, C, D, E; one link has attribute 0100, all other links 0000]

Setting a link bit in the upper half has no immediate effect
Trunk A to B:
tunnel = 0000, t-mask = 0011

ADEB and ADCEB are both possible

Example2b: 4-bit string


[Figure: topology with nodes A, B, C, D, E; one link has attribute 0100, all other links 0000]

A specific tunnel can be driven off the link by setting the bit in its mask
Trunk A to B:
tunnel = 0000, t-mask = 0111

Only ADCEB is possible

Example2c: 4-bit string


[Figure: topology with nodes A, B, C, D, E; one link has attribute 0100, all other links 0000]

A specific tunnel can be restricted to only such links
Trunk A to B:
tunnel = 0100, t-mask = 0111

No path is possible

Trunk Attribute
Resource Class Affinity (Policy)

The user defines the semantics:
e.g. "this bit/mask says low-delay path excluded"

Flexible (maybe too flexible :)
1c vs 2c? In 1c, the default tunnels will not be willing to flow via the special links

Link Attributes and their flooding


Link Resource Attributes

Resource attributes are configured on every link in a network
bandwidth
Link Attributes
TE-specific link metric

Link Resource Attributes

Resource attributes are flooded throughout the network
bandwidth per priority (0-7)
Link Attributes (Policy)
TE-specific link metric

draft-li-mpls-igp-te-00.txt

Per-Priority Available BW
AB(i) = available bandwidth at priority i

T=0: Link L, BW=100
D advertises: AB(0)=...=AB(7)=100

T=1: Setup of a tunnel over L at priority=3 for 30 units

T=2: Link L, BW=100
D advertises: AB(0)=AB(1)=AB(2)=100, AB(3)=AB(4)=...=AB(7)=70

T=3: Setup of an additional tunnel over L at priority=5 for 30 units

T=4: Link L, BW=100
D advertises: AB(0)=AB(1)=AB(2)=100, AB(3)=AB(4)=70, AB(5)=AB(6)=AB(7)=40
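
The arithmetic of this timeline can be reproduced with a few lines of Python. This is a sketch only: the function name and the list-of-tuples input are illustrative, and it assumes that a reservation made at priority p reduces the bandwidth available at p and at every numerically higher (less important) priority, which is what the advertised values above show.

```python
PRIORITIES = 8  # priority levels 0 to 7


def advertise(link_bw, reservations):
    """Return [AB(0), ..., AB(7)] for a link.

    reservations is a list of (priority, bandwidth) tuples for the
    tunnels currently admitted on the link.
    """
    available = [link_bw] * PRIORITIES
    for priority, bandwidth in reservations:
        # A tunnel at priority p is invisible to priorities 0..p-1
        # and consumes bandwidth from priorities p..7.
        for level in range(priority, PRIORITIES):
            available[level] -= bandwidth
    return available


if __name__ == "__main__":
    print("T=0", advertise(100, []))
    print("T=2", advertise(100, [(3, 30)]))
    print("T=4", advertise(100, [(3, 30), (5, 30)]))
```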



Information Distribution

Re-use the flooding service from the link-state IGP
opaque LSA for OSPF (draft-katz-yeung-ospf-traffic-00.txt)
new wide TLV for IS-IS (draft-ietf-isis-traffic-00.txt)

Information Distribution

Periodic (timer-based)
On significant changes of available bandwidth (threshold scheme)
On link configuration changes
On LSP setup failure

Periodic Timer

Periodically, a node checks whether its current TE status is the same as the one it last advertised.
If it differs, the node floods its updated TE link status

Significant Change
Each time a threshold is crossed, an update is sent

[Figure: utilization scale with thresholds at 50%, 70%, 85%, 92% and 100%; an update is sent at each crossing]

Denser population of thresholds as utilization increases
Different thresholds for up and down (more stable)
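
A minimal sketch of such a threshold scheme. The up thresholds are the values shown on the slide; the down thresholds are invented for illustration, since the slide only says that they differ. The function name is hypothetical.

```python
UP_THRESHOLDS = (50, 70, 85, 92, 100)   # values shown on the slide
DOWN_THRESHOLDS = (95, 80, 60, 40)      # hypothetical values, for illustration


def crossed(old_pct, new_pct, up=UP_THRESHOLDS, down=DOWN_THRESHOLDS):
    """Return the thresholds crossed when utilization moves old -> new.

    A non-empty result means the node should flood an update.
    Separate up/down sets give hysteresis, so a link hovering around
    one value does not flood on every small change.
    """
    if new_pct > old_pct:
        return [t for t in up if old_pct < t <= new_pct]
    return [t for t in down if new_pct <= t < old_pct]


if __name__ == "__main__":
    for old, new in [(40, 60), (86, 90), (86, 93), (90, 75)]:
        hit = crossed(old, new)
        print(f"{old}% -> {new}%: {'flood' if hit else 'no update'} {hit}")
```

Because the thresholds are packed more densely near 100%, small changes on a nearly full link are advertised while the same change on a lightly loaded link is not.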


LSP Setup Failure

Due to the threshold scheme, it is possible that a node believes it can signal an LSP tunnel via node Z while, in fact, Z does not have the required resources
When Z receives the Resv message and refuses the LSP tunnel, it floods an update of its status

Constraint-based Computation

Constraint-Based Routing
In general, path computation for an LSP may seek to satisfy a set of requirements associated with the LSP, taking into account a set of constraints imposed by administrative policies and the prevailing state of the network -- which usually relates to topology data and resource availability. Computation of an engineered path that satisfies an arbitrary set of constraints is referred to as "constraint based routing."
draft-li-mpls-igp-te-00.txt

Path Computation
On demand by the trunk's head-end:
for a new trunk
for an existing trunk whose (current) LSP failed
for an existing trunk when doing re-optimization

Path Computation
Input:
configured attributes of traffic trunks originated at this router
attributes associated with resources (available from IS-IS or OSPF)
topology state information (available from IS-IS or OSPF)

Path Computation
Prune links if:
insufficient resources (e.g., bandwidth)
violates policy constraints

Compute shortest distance path
TE uses its own metric
Tie-break: selects the path with the highest minimum bandwidth so far, then with the smallest hop-count
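
The prune-then-shortest-path procedure can be sketched as follows. This is an illustration under stated assumptions, not the router's implementation: the link tuple layout, the function name `cspf` and the sample topology are made up, links are treated as unidirectional, the policy check assumes the mask comparison `(attr & mask) == (affinity & mask)`, and the tie-break is applied greedily per node, in the spirit of "so far" on the slide.

```python
import heapq


def cspf(links, src, dst, bw, affinity=0, mask=0):
    """Constraint-based shortest path.

    links: iterable of (a, b, te_metric, available_bw, attr) for a -> b.
    Returns the path as a list of node names, or None if none qualifies.
    """
    # Step 1: prune links that lack bandwidth or violate policy.
    adjacency = {}
    for a, b, metric, available, attr in links:
        if available < bw:
            continue
        if (attr & mask) != (affinity & mask):
            continue
        adjacency.setdefault(a, []).append((b, metric, available))

    # Step 2: Dijkstra on the TE metric. The heap key is
    # (cost, -bottleneck_bw, hops), so equal-cost candidates prefer the
    # larger minimum bandwidth, then the smaller hop count.
    heap = [(0, -float("inf"), 0, src, [src])]
    settled = set()
    while heap:
        cost, neg_bottleneck, hops, node, path = heapq.heappop(heap)
        if node in settled:
            continue
        settled.add(node)
        if node == dst:
            return path
        for nxt, metric, available in adjacency.get(node, []):
            if nxt not in settled:
                bottleneck = min(-neg_bottleneck, available)
                heapq.heappush(
                    heap,
                    (cost + metric, -bottleneck, hops + 1, nxt, path + [nxt]),
                )
    return None


if __name__ == "__main__":
    # Hypothetical topology: two equal-cost paths from A to B.
    topology = [
        ("A", "D", 10, 100, 0),
        ("D", "E", 10, 20, 0),
        ("D", "C", 5, 80, 0),
        ("C", "E", 5, 80, 0),
        ("E", "B", 10, 100, 0),
    ]
    print(cspf(topology, "A", "B", bw=10))   # tie on cost, wider path wins
    print(cspf(topology, "A", "B", bw=30))   # D-E pruned for lack of bandwidth
    print(cspf(topology, "A", "B", bw=200))  # nothing qualifies
```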

Path Computation

Output:
explicit route - expressed as a sequence of router IP addresses
interface addresses for numbered links
loopback address for unnumbered links
used as an input to the path setup component

Example
[Figure: topology including nodes C, D, E; each link is labelled with a 4-bit attribute string (0000, 0100, 1000 or 0010) and BW(3), the bandwidth available at priority 3 (values between 20 and 80)]

Tunnel's request:
Priority 3, BW = 30 units, Policy string: 0000, mask: 0011

MPLS as the forwarding mechanism


MPLS Labels
Two types of MPLS Labels: Prefix Labels & Tunnel Labels

Distributed by: LDP, RSVP, CR-LDP, MP-BGP, PIM

MPLS as forwarding engine


Traffic engineering requires explicit routing capability
IP supports only destination-based routing
not adequate for traffic engineering

MPLS provides simple and efficient support for explicit routing
label swapping
separation of routing and forwarding

LSP tunnel Setup


RSVP Extensions to RFC2205 for LSP Tunnels


downstream-on-demand label distribution
instantiation of explicit label switched paths
allocation of network resources (e.g., bandwidth) to explicit LSPs
rerouting of established LSP-tunnels in a smooth fashion using the concept of make-before-break
tracking of the actual route traversed by an LSP-tunnel
diagnostics on LSP-tunnels
the concept of nodal abstraction
preemption options that are administratively controllable

draft-ietf-mpls-rsvp-lsp-tunnel-0X.txt

RSVP Extensions: new objects


LABEL_REQUEST: found in Path
LABEL: found in Resv
EXPLICIT_ROUTE: found in Path
RECORD_ROUTE: found in Path, Resv
SESSION_ATTRIBUTE: found in Path
flags: 0x01 Fast Reroute Capable, 0x02 Permit Merging, 0x04 May Reoptimize => SE

New C-Types are also assigned for the SESSION, SENDER_TEMPLATE, FILTER_SPEC, FLOWSPEC objects.
All new objects are optional with respect to RSVP (RFC2205).

The LABEL_REQUEST and LABEL objects are mandatory with respect to the MPLS LSP signalling specification.

LSP Setup

Initiated at the head-end of a trunk
Uses RSVP (with extensions) to establish Label Switched Paths (LSPs) for traffic trunks

Path Setup - Example


[Figure: routers R1 to R9; labels shown along the LSP: 49, 17, 32, 22, Pop]

Setup: Path (ERO = R1->R2->R6->R7->R4->R9)
Reply: Resv communicates labels and reserves bandwidth on each link

Path Setup - more details


[Figure: R1 (interface 2) - (interface 1) R2 (interface 2) - (interface 1) R3]

Path:
Common_Header
Session(R3-lo0, 0, R1-lo0)
PHOP(R1-2)
Label_Request(IP)
ERO (R2-1, R3-1)
Session_Attribute (S(3), H(3), 0x04)
Sender_Template(R1-lo0, 00)
Sender_Tspec(2Mbps)
Record_Route(R1-2)

Path Setup - more details


[Figure: R1 (interface 2) - (interface 1) R2 (interface 2) - (interface 1) R3]

Path State:
Session(R3-lo0, 0, R1-lo0)
PHOP(R1-2)
Label_Request(IP)
ERO (R2-1, R3-1)
Session_Attribute (S(3), H(3), 0x04)
Sender_Template(R1-lo0, 00)
Sender_Tspec(2Mbps)
Record_Route (R1-2)

Path Setup - more details


[Figure: R1 (interface 2) - (interface 1) R2 (interface 2) - (interface 1) R3]

Path:
Common_Header
Session(R3-lo0, 0, R1-lo0)
PHOP(R2-2)
Label_Request(IP)
ERO (R3-1)
Session_Attribute (S(3), H(3), 0x04)
Sender_Template(R1-lo0, 00)
Sender_Tspec(2Mbps)
Record_Route (R1-2, R2-2)

Path Setup - more details


[Figure: R1 (interface 2) - (interface 1) R2 (interface 2) - (interface 1) R3]

Path State:
Session(R3-lo0, 0, R1-lo0)
PHOP(R2-2)
Label_Request(IP)
ERO ()
Session_Attribute (S(3), H(3), 0x04)
Sender_Template(R1-lo0, 00)
Sender_Tspec(2Mbps)
Record_Route (R1-2, R2-2, R3-1)
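
The way ERO, Record_Route and PHOP change hop by hop in the four Path slides above can be replayed with a small sketch. It models a Path message as a plain dictionary and addresses as "router-interface" strings; both the data model and the function names are illustrative assumptions, not RSVP object encodings.

```python
def forward_path(msg, node, out_intf):
    """Transit node: consume its own ERO hop, then record and advertise
    the outgoing interface in Record_Route and PHOP."""
    ero = list(msg["ero"])
    if not ero or not ero[0].startswith(node + "-"):
        raise ValueError("first ERO hop does not belong to " + node)
    ero.pop(0)
    address = f"{node}-{out_intf}"
    return {**msg, "ero": ero, "rro": msg["rro"] + [address], "phop": address}


def terminate_path(msg, node):
    """Tail-end: consume the last ERO hop and record the receiving
    interface, as in the final Path State slide."""
    ero = list(msg["ero"])
    if len(ero) != 1 or not ero[0].startswith(node + "-"):
        raise ValueError("ERO does not end at " + node)
    hop = ero.pop(0)
    return {**msg, "ero": ero, "rro": msg["rro"] + [hop]}


if __name__ == "__main__":
    sent_by_r1 = {"ero": ["R2-1", "R3-1"], "rro": ["R1-2"], "phop": "R1-2"}
    sent_by_r2 = forward_path(sent_by_r1, "R2", out_intf=2)
    state_at_r3 = terminate_path(sent_by_r2, "R3")
    print(sent_by_r2)
    print(state_at_r3)
```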


Path Setup - more details


[Figure: R1 (interface 2) - (interface 1) R2 (interface 2) - (interface 1) R3]

Resv:
Common_Header
Session(R3-lo0, 0, R1-lo0)
PHOP(R3-1)
Style=SE
FlowSpec(2Mbps)
Sender_Template(R1-lo0, 00)
Label=POP
Record_Route(R3-1)

Path Setup - more details


[Figure: R1 (interface 2) - (interface 1) R2 (interface 2) - (interface 1) R3]

Resv State:
Session(R3-lo0, 0, R1-lo0)
PHOP(R3-1)
Style=SE
FlowSpec (2Mbps)
Sender_Template(R1-lo0, 00)
OutLabel=POP
InLabel=5
Record_Route(R3-1)

Path Setup - more details


[Figure: R1 (interface 2) - (interface 1) R2 (interface 2) - (interface 1) R3]

Resv:
Common_Header
Session(R3-lo0, 0, R1-lo0)
PHOP(R2-1)
Style=SE
FlowSpec (2Mbps)
Sender_Template(R1-lo0, 00)
Label=5
Record_Route(R2-1, R3-1)

Path Setup - more details


[Figure: R1 (interface 2) - (interface 1) R2 (interface 2) - (interface 1) R3]

Resv State:
Session(R3-lo0, 0, R1-lo0)
PHOP(R2-1)
Style=SE
FlowSpec (2Mbps)
Sender_Template(R1-lo0, 00)
Label=5
Record_Route(R1-2, R2-1, R3-1)

Trunk Admission Control


Performed by routers along a Label Switched Path (LSP)
Determines if resources are available
May tear down (existing) LSPs with a lower priority
Does the local accounting
Triggers IGP information distribution when resource thresholds are crossed

Link Admission Control


Already invoked by the Path message
if BW is available, this BW is put aside in a waiting pool (waiting for the Resv message)
if this process requires the pre-emption of resources, LCAC notifies RSVP of the pre-emption, which then sends PathErr and/or ResvErr for the preempted tunnel
if BW is not available, LCAC says No to RSVP and a PathErr is sent. A flooding of the node's resource info is triggered, if needed
draft-ietf-mpls-rsvp-lsp-tunnel-02.txt
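
A rough sketch of the admission decision with pre-emption. It assumes the RSVP-TE convention that 0 is the most important priority, so a new LSP may pre-empt an existing one only when its setup priority is numerically lower than the victim's holding priority. The data layout, the function name and the order in which victims are chosen (least important first) are illustrative assumptions.

```python
def admit(link_bw, existing, new):
    """Decide whether a new LSP fits on a link.

    existing: list of dicts with keys name, bw, hold.
    new: dict with keys name, bw, setup.
    Returns (admitted, names_of_preempted_lsps).
    """
    free = link_bw - sum(lsp["bw"] for lsp in existing)
    if free >= new["bw"]:
        return True, []

    # Only LSPs held at a numerically higher (less important) priority
    # than the newcomer's setup priority may be pre-empted.
    candidates = sorted(
        (lsp for lsp in existing if lsp["hold"] > new["setup"]),
        key=lambda lsp: lsp["hold"],
        reverse=True,
    )
    preempted = []
    for victim in candidates:
        free += victim["bw"]
        preempted.append(victim["name"])
        if free >= new["bw"]:
            return True, preempted
    return False, []


if __name__ == "__main__":
    tunnels = [
        {"name": "t1", "bw": 60, "hold": 5},
        {"name": "t2", "bw": 30, "hold": 7},
    ]
    print(admit(100, tunnels, {"name": "t3", "bw": 30, "setup": 3}))
    print(admit(100, tunnels, {"name": "t4", "bw": 30, "setup": 7}))
```

In the refused case the slide's behaviour would follow: a PathErr goes back and the node may flood its updated resource information.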

Path Monitoring

Use of new Record Route Object


keep track of the exact tunnel path

detects loops
copy of RRO to ERO allows for route pinning


Path Re-Optimization

Looks for opportunities to re-optimize
make before break
no double counting of reservations, via the RSVP shared explicit style!

Non-disruptive rerouting - new path setup


[Figure: routers R1 to R9; labels shown along the current LSP: 49, 17, 32, 22, Pop]

Current Path (ERO = R1->R2->R6->R7->R4->R9)
New Path (ERO = R1->R2->R3->R4->R9) - shared with Current Path
Until R9 gets the new Path message, the current Resv is refreshed

Non-disruptive rerouting - switching paths

[Figure: routers R1 to R9; labels shown along the old LSP: 49, 17, 32, 22, Pop; labels shown along the new LSP: 38, 89, 26, Pop]

Resv: allocates labels for both paths
Reserves bandwidth once per link
PathTear can then be sent to remove the old path (and release resources)
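
"Reserves bandwidth once per link" can be illustrated with a short sketch. It assumes that, under the shared explicit (SE) style, LSPs of the same session share one reservation on any common link, sized to the largest request rather than the sum. The paths are the two EROs from the previous slide; the bandwidth values (2 and 3 units) and the function names are hypothetical.

```python
def path_links(path):
    """Turn a list of routers into a list of directed links."""
    return list(zip(path, path[1:]))


def shared_explicit_reservation(lsps):
    """lsps: list of (router_path, bandwidth) belonging to ONE session.

    Returns {link: reserved_bandwidth}. On a link used by several LSPs
    of the session the reservation is shared (max), not added up.
    """
    reserved = {}
    for path, bandwidth in lsps:
        for link in path_links(path):
            reserved[link] = max(reserved.get(link, 0), bandwidth)
    return reserved


if __name__ == "__main__":
    current = ["R1", "R2", "R6", "R7", "R4", "R9"]
    new = ["R1", "R2", "R3", "R4", "R9"]
    for link, bw in shared_explicit_reservation([(current, 2), (new, 3)]).items():
        print(link, bw)
```

On the shared links R1->R2 and R4->R9 the result is 3 units, not 5, so the new path can be set up even when the link could not hold both reservations at once.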

Reroute - More Details


[Figure: R1, R2, R3 with interface numbers, showing the links used by LSP 00 and LSP 01]

Session(R3-lo0, 0, R1-lo0)
LSP 00: ERO (R2-1, R3-1), Sender_Template(R1-lo0, 00)
LSP 01: ERO (R2-1, , R3-3), Sender_Template(R1-lo0, 01)

Resource Sharing

Reroute - More Details


[Figure: R1, R2, R3 with interface numbers]

Path:
Common_Header
Session(R3-lo0, 0, R1-lo0)
PHOP(R1-2)
Label_Request(IP)
ERO (R2-1, ,R3-3)
Session_Attribute (S(3), H(3), 0x04)
Sender_Template(R1-lo0, 01)
Sender_Tspec(3Mbps)
Record_Route(R1-2)

Reroute - More Details


[Figure: R1, R2, R3 with interface numbers]

Path State:
Session(R3-lo0, 0, R1-lo0)
PHOP(R1-2)
Label_Request(IP)
ERO (R2-1, ,R3-3)
Session_Attribute (S(3), H(3), 0x04)
Sender_Template(R1-lo0, 01)
Sender_Tspec(3Mbps)
Record_Route (R1-2)


Reroute - More Details


[Figure: R1, R2, R3 with interface numbers]

Resv:
Common_Header
Session(R3-lo0, 0, R1-lo0)
PHOP(R3-3)
Style=SE
FlowSpec(3Mbps)
Sender_Template(R1-lo0, 01)
Label=POP
Record_Route(R3-3)


Reroute - More Details


[Figure: R1, R2, R3 with interface numbers]

Resv:
Common_Header
Session(R3-lo0, 0, R1-lo0)
PHOP(R2-1)
Style=SE
FlowSpec (3Mbps)
Sender_Template(R1-lo0, 01), Label=6, Record_Route(R2-1, , R3-3)
Sender_Template(R1-lo0, 00), Label=5, Record_Route(R2-1, R3-1)

Reroute - More Details


[Figure: R1, R2, R3 with interface numbers]

Resv State:
Session(R3-lo0, 0, R1-lo0)
PHOP(R2-1)
Style=SE
FlowSpec
Sender_Template(R1-lo0, 01), Label=6, Record_Route(R2-1, , R3-3)
Sender_Template(R1-lo0, 00), Label=5, Record_Route(R2-1, R3-1)

Fast Restoration

Handling link failures - two complementary mechanisms:
Path protection
Link/Node protection

Path Protection


Path Protection
Step1: link failure detection
O(depends on L2/L1)

Step2a: IGP reaction (IS-IS case)
Either via Step1 or via IGP hello expiration (30s by default for IS-IS)
5s (default) must elapse before the generation of a new LSP
5.5s (default) must elapse before a change of the LSPDB and the subsequent SPF run. The next SPF run can only occur 10s later (default)
Flooding time: LSPs are paced (16ms for the first LSP, 33ms between LSPs; depends also on link speed)
Once the RIB is updated, this change must be incorporated into CEF
The head-end finally computes the new topology and finds out that some established LSPs are affected. It schedules a re-optimization for them

Path Protection

Step2b: RSVP signalling
RSVP path states with the failed interface as outgoing interface (oif) are detected
check if another oif is available (if loose ERO)
if not, clear the path state and send a tear to the head-end

Step2: Either Step2a or Step2b alerts the head-end

Step3: Re-optimization
Dijkstra computation: O(0.5) ms per node (rule of thumb)
RSVP signalling time to install the rerouted tunnel
convergence in the order of several seconds (at least)

Path Protection - Speed it Up

Fine-tune the IGP convergence
With adequate tuning, IS-IS can converge in 2-3s, which makes the signalling time for the new tunnel the convergence bottleneck.

Several tunnels in parallel with load-balancing
if combined with the IGP convergence tuning, the path resilience could be brought to around 2-3s

One end-to-end tunnel in parallel but in backup mode
feature under development (Fast Path Protection)

Fast ReRoute (aka Link Protection) - An Overview

Objective

FRR allows for temporarily routing around a failed link or node while the head-end may reoptimize the entire LSP
rerouting under 50ms

scalable (must support lots of LSPs)



Fast reroute - Overview


Controlled by the routers at ends of a failed link
link protection is configured on a per link basis

Session_Attributes Flag 0x01 allows the use of Link Protection for the signalled LSP

Uses nested LSPs (stack of labels)


original LSP nested within link protection LSP


Static backup Tunnel


[Figure: routers R1, R2, R4 to R9; labels shown along the backup tunnel: 17, 22, Pop]

Setup: Path (R2->R6->R7->R4)
Labels established on Resv message

Routing prior to R2-R4 link failure

[Figure: routers R1, R2, R4 to R9; labels shown along the LSP: 37, 14, Pop]

Setup: Path (R1->R2->R4->R9)
Labels established on Resv message

Link Protection Active


[Figure: routers R1, R2, R4 to R9]

On failure of the link from R2 -> R4, R2 simply changes the outgoing label stack from 14 to <17, 14>

Link Protection Active


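[Figure: routers R1, R2, R4 to R9 with the label operation performed at each hop]

Label operations:
R1: Push 37
R2: Swap 37->14, Push 17
R6: Swap 17->22
R7: Pop 22
R4: Pop 14

Label stack (top label first) leaving each router:
R1: 37
R2: 17 14
R6: 22 14
R7: 14
R4: (empty)
R9: None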
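
The label operations on this slide can be replayed to check the resulting stacks. This is a sketch only; the tuple encoding of the operations and the function names are illustrative, and the top of the stack is kept at index 0.

```python
def apply_ops(stack, ops):
    """Apply push/swap/pop operations to a label stack (top at index 0)."""
    stack = list(stack)
    for op, *args in ops:
        if op == "push":
            stack.insert(0, args[0])
        elif op == "swap":
            old, new = args
            if not stack or stack[0] != old:
                raise ValueError(f"expected top label {old}, stack is {stack}")
            stack[0] = new
        elif op == "pop":
            if not stack or stack[0] != args[0]:
                raise ValueError(f"expected top label {args[0]}, stack is {stack}")
            stack.pop(0)
        else:
            raise ValueError(f"unknown operation {op}")
    return stack


# Per-router operations while link protection is active, as on the slide.
HOPS = [
    ("R1", [("push", 37)]),
    ("R2", [("swap", 37, 14), ("push", 17)]),  # FRR: swap, then push backup label
    ("R6", [("swap", 17, 22)]),
    ("R7", [("pop", 22)]),
    ("R4", [("pop", 14)]),
]


def trace(hops):
    """Return [(router, stack_after_router), ...] for a packet."""
    stack, result = [], []
    for router, ops in hops:
        stack = apply_ops(stack, ops)
        result.append((router, list(stack)))
    return result


if __name__ == "__main__":
    for router, stack in trace(HOPS):
        print(router, stack)
```

R4 receives the packet with label 14 on top, exactly as it would have over the failed R2-R4 link, which is why the protected link must use the global label space.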


Fast ReRoute - More details on Link Protection (FRR v1)

V1 Constraints
We protect the facility (link), not individual LSPs
scalability vs granularity

No node resilience
Static backup tunnel
The protected link must use the global label space
A backup tunnel can back up at most one link, but n LSPs travelling via this link

Terminology
[Figure: routers R1, R2, R4 to R9]

LSP: end-to-end tunnel onto which data normally flows (e.g. R1 to R9)
Backup tunnel: temporary route to take in the event of a failure

Terminology

Link Protection
In the event of a link failure, an LSP is rerouted to the next-hop using a preconfigured backup tunnel


How to indicate a link is protected and which tunnel is the backup?

On R2 (for LSPs flowing from R2 to R4):

interface pos <r2tor4>
 mpls traffic-eng backup tunnel 1000 link

LSPs are unidirectional, so the same protection should be enabled for the opposite direction if a reverse LSP is configured.

How to setup the backup tunnel?


Just like a normal tunnel whose head-end is R2 and tail-end is R4
v1 requires a manually configured ERO

interface Tunnel1000
 ip unnumbered Loopback0
 tunnel destination R4
 tunnel mode mpls traffic-eng
 tunnel mpls traffic-eng priority 7 7
 tunnel mpls traffic-eng bandwidth 800
 tunnel mpls traffic-eng path-option 1 explicit name backuppath1

ip explicit-path name backuppath1 enable
 next-address R6
 next-address R7
 next-address R4

Which LSPs can be rerouted on R2 in the event of R2-R4 failure?

The LSPs flowing through R2 that
have R2-R4 as outgoing interface
have been signalled by their respective head-ends with session attribute flag 0x01 = ON (may use fast-reroute tunnels)

Config on the head-end:

int tunnel 1
 tunnel mpls traffic-eng fast-reroute

Global Label Allocation


[Figure: routers R1, R2, R4 to R9; the blue LSP reaches R4 with label 14, and R4 pops it towards R9]

For the blue LSP, R4 bound a global label of 14
Any MPLS frame received by R4 with label 14 will be switched onto the link to R9 with a POP, whatever the incoming interface

How fast is fast?


Link Failure Notification
Usual PoS alarm detection
PoS driver optimisation to interrupt the RP in < 1ms
Expected call to net_cstate(idb, UP/DOWN) identifying the DOWN state of the protected interface to start our protection action

RP updates the master TFIB (replaces a swap by a swap-push): < 1ms

Master TFIB change notified to the linecards: < 1ms

Path state while Rerouting


[Figure: routers R1, R2, R4 to R9; labels in the figure: "Path (, PHOP=R2, )", "BackUp tunnel", "Path state", "PathError (Reservation in Place)"]

Path & Resv Msgs [Error & Tear]


[Figure: message exchange between R1, R2, R3 and R4]

When no link protection: Resv Tear, Conf., Path Tear, Resv Tear, Conf.

When link protection: Path Error (Resv in place); R4 waits for refresh

LSP reoptimization

Head-end notified by PathError


special flag (reservation in place) indicates that the path states must not be destroyed. It is just a hint to the head-end that the path should be reoptimized

Head-end notified by IGP



Why the PathError?

The PathError might be faster


In case of multi-area IGP, the IGP will not provide the information

In case of very fast up-down-up, the LSP will be put on the backup tunnel and will stay there as the IGP will not have originated a new LSP/LSA
a router waits a certain time before originating a new LSP/LSA after a topological change

Reliable PathErr optimization



Resv state while Rerouting


The loss of the interface does not affect the Path and Resv states for the LSPs received on that interface that are marked fast reroutable!

[Figure: routers R1, R2, R4 to R9; labels in the figure: "Resv state", "BackUp tunnel", "Resv"]

The Resv message is unicast to the PHOP (R2)
R2's path state has been informed that the Resv might arrive over a different interface than the one used by the Path message

DiffServ and LSP Reoptimization

In order to optimize the bandwidth usage, backup tunnels might be configured with 0kbps
no non-working bandwidth as in SDH!

Although the backbone is usually thought of as being congestion-free, some local congestion might occur during rerouting
Use diffserv to handle this short-term congestion
Use LSP re-optimization to handle the long-term congestion

Layer1/2 and Layer3

Backup Tunnel should not use


the protected L3 link
the protected L1/L2 links!!!

Use WANDL (loaded with both L3 and L1/2 topologies) to compute the best paths for backup tunnels
Download this as static backup tunnels to the routers

Fast ReRoute - Node Protection

Overview
[Figure: routers R1 to R9]

Backup tunnel to the next-hop of the LSP's next-hop

A few More details


Assume
R2 is configured with resilience for R3
R2 receives a Path message for a new LSP whose ERO is {R3, R4, }, whose Session is (R9, 1, R1), whose sender is (R1, 1) and whose session attribute is (0x01 ON, 0x02 OFF)
0x01: may use local fast-reroute if available
0x02: merge capable

A few More details

Then
R2 checks if it already has a tunnel to R4
If not, R2 builds a backup tunnel to R4 (currently just like in link protection - manual explicit setup)
R2 sends a Path onto the tunnel with Session (R9, 1, R1), Sender (R2, 1), Session Attribute (0x01 OFF, 0x02 ON) and PHOP R2

A few More details

When R4 receives this Path message,
it matches the session with the LSP's one
merges (and thus stops) this Path message
sends a Resv back to R2 (unicast) and allocates the appropriate label L

A few More details


When R2 detects R3's failure,
in the TFIB entry for the LSP, R2 replaces the existing swap with a swap to L and a push of the backup tunnel label

R4's states are refreshed by the secondary Path messages (over the backup tunnels)
ERO of the original path is adjusted at R2
NHOP is modified in R2 (from R3 to R4)
PHOP is modified in R4 (from R3 to R2)

A few More Details

RESV is being sent back from R4 to R2 directly If R3 is still active and just the R2-R3 link failed R4 needs to ignore & drop any Tear-Down msg R3 would be sending after the termination of reception of path refreshes from R2.

How to detect R3's failure?

A node may fail while the link is still up
A node's linecard processes might survive a main process failure (freeze of the RP process)


A possible solution
[Figure: router with two RPs and several LCs]

Keepalives between LCs
Keepalives between an LC and its master RP



Assigning traffic to Paths (aka autoroute)


Enhancement to SPF
During SPF, each new node found is moved from the TENTative list to the PATHS list. The first-hop is now determined as follows:
A. Check if there is any TE tunnel terminating at this node from the current router and, if so, do the metric check
B. If there is no TE tunnel and the node is directly connected, use the first-hop from the adjacency database
C. If none of the above applies, the first-hop is copied from the parent of this new node.
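Rules A to C above can be sketched as a small Python function. This is a simplified illustration, not the IOS implementation: the function name, the dictionaries and the interface and tunnel names are all hypothetical, and the metric check of rule A is not modelled here.

```python
def first_hops_for(node, parent, tunnels_to, adjacent, known):
    """Return the first-hop(s) for a node just moved to the PATHS list.

    tunnels_to: node -> TE tunnels from this router that terminate at node
    adjacent:   node -> first-hop from the adjacency database
    known:      node -> first-hops already computed for nodes on PATHS
    """
    if tunnels_to.get(node):               # rule A (metric check omitted)
        return list(tunnels_to[node])
    if node in adjacent:                   # rule B
        return [adjacent[node]]
    return list(known[parent])             # rule C


# Hypothetical chain: this router - R2 - R3 - R4, with Tunnel1 ending at R3.
tunnels_to = {"R3": ["Tunnel1"]}
adjacent = {"R2": "Serial0"}
known = {}
for node, parent in [("R2", None), ("R3", "R2"), ("R4", "R3")]:
    known[node] = first_hops_for(node, parent, tunnels_to, adjacent, known)
```

Because of rule C, nodes found behind the tunnel tail (R4 in this example) inherit the tunnel as their first-hop.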


Enhancement to SPF - metric check


Tunnel metric:

A. Relative +/- X
B. Absolute Y
The default is relative metric of 0.
Example: Metric of native IP path to the found node = 50
1. Tunnel with relative metric of -10 => 40
2. Tunnel with relative metric of +10 => 60
3. Tunnel with absolute metric of 10 => 10
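The relative and absolute tunnel metrics can be expressed as a one-line computation. The Python function below is a minimal sketch; its name and parameters are hypothetical and only mirror the example figures given for a native IP path metric of 50.

```python
def tunnel_metric(igp_metric, mode="relative", value=0):
    """Metric used for a TE tunnel towards a node during SPF.

    relative: offset added to the native IGP metric (default offset 0)
    absolute: fixed metric, independent of the IGP metric
    """
    if mode == "relative":
        return igp_metric + value
    if mode == "absolute":
        return value
    raise ValueError(f"unknown metric mode: {mode}")


native = 50
results = [
    tunnel_metric(native, "relative", -10),
    tunnel_metric(native, "relative", +10),
    tunnel_metric(native, "absolute", 10),
]
```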

Enhancement to SPF - metric check

If the metric of the found TE tunnel at this node is higher than the metric of other tunnels or of the native IGP path, this tunnel is not installed as next-hop

If the metric of the found TE tunnel is equal to that of other TE tunnels, the tunnel is added to the existing next-hops
If the metric of the found TE tunnel is lower than the metric of other TE tunnels or of the native IGP path, the tunnel replaces them as the only next-hop.
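The three comparison rules can be sketched as follows in Python. The `install` helper and the tunnel names are hypothetical, and the sketch only covers next-hops that are TE tunnels: a tie between a tunnel and the native IGP path is not described above, so it is not modelled.

```python
def install(current, candidate):
    """Apply the metric check for a newly found TE tunnel.

    current:   (best metric so far, list of installed tunnel next-hops)
    candidate: (tunnel metric, tunnel name)
    """
    best_metric, hops = current
    metric, name = candidate
    if metric > best_metric:       # higher: not installed
        return current
    if metric == best_metric:      # equal: added to existing next-hops
        return (best_metric, hops + [name])
    return (metric, [name])        # lower: becomes the only next-hop


state = (50, ["Tu1"])
higher = install(state, (60, "Tu2"))
equal = install(state, (50, "Tu2"))
lower = install(state, (40, "Tu2"))
```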


Other TE New Features


Auto-Bandwidth

Global command:

Monitor the 5-min average counters of marked tunnels every X seconds
default: X = 300 (seconds)
(config)# mpls traffic-eng auto-bw timers frequency <seconds>


Auto-Bandwidth
Per tunnel command:
Every Y seconds, update the BW constraint of the tunnel with the maximum of:
the largest 5-min value sampled during the last Y seconds (default Y = 24 * 3600 sec = 24h)
a configured maximum value
(config-if)# tunnel mpls traffic-eng auto-bw {frequency <seconds>} {max-bw <kbs>}

If the new BW is not available, the old one is maintained (the new BW is signalled via a second tunnel, following the make-before-break model)
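The per-tunnel update can be sketched in Python as below. The function and parameter names are hypothetical. The sketch reads the configured maximum as an upper bound on the new value, which is an interpretation of the terse wording above rather than something the slide states, and it stands in for make-before-break signalling with a simple availability callback.

```python
def next_bandwidth(samples_kbps, current_kbps, max_bw_kbps=None,
                   is_available=lambda bw: True):
    """Pick the tunnel bandwidth for the next auto-bw period.

    samples_kbps: 5-min averages collected during the last period
    max_bw_kbps:  configured maximum (assumed here to act as a cap)
    is_available: stand-in for signalling the new BW on a second tunnel
    """
    if not samples_kbps:
        return current_kbps
    target = max(samples_kbps)
    if max_bw_kbps is not None:
        target = min(target, max_bw_kbps)   # assumption: max-bw is a cap
    if not is_available(target):
        return current_kbps                 # old reservation is maintained
    return target


grown = next_bandwidth([100, 400, 250], current_kbps=200)
capped = next_bandwidth([100, 400, 250], current_kbps=200, max_bw_kbps=300)
kept = next_bandwidth([100, 400, 250], current_kbps=200,
                      is_available=lambda bw: False)
```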

Example


Verbatim
Applies to explicitly routed LSPs
Disables any check against the TE/IGP database of the head end
RSVP still checks BW (and policy, when this will be in Path) hop by hop
Application: manual TE through multi-area IGP
CLI: tunnel mpls traffic-eng path-option verbatim


In-Progress

Allows a head-end to account for BW consumed by tunnels that it has just signalled and for which the IGP LSA/LSP update has not yet reflected the available bandwidth


Example

In-Prog Bw: 55 10 Avail Bw: 100

All tunnels require 45 units of BW
In-progress counters reset upon new LSA/LSP reception
In-progress counter decremented upon receipt of path-error
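The in-progress bookkeeping can be sketched with a small Python class. The class, its method names and the link identifier are hypothetical; the sketch assumes a single link with 100 units advertised and tunnels of 45 units each, and only illustrates the reset and decrement rules above.

```python
class InProgressTracker:
    """Head-end view of bandwidth signalled but not yet seen in the IGP."""

    def __init__(self):
        self.pending = {}   # link -> in-progress bandwidth

    def signalled(self, link, bw):
        self.pending[link] = self.pending.get(link, 0) + bw

    def path_error(self, link, bw):
        # Setup failed: give the bandwidth back.
        self.pending[link] = max(0, self.pending.get(link, 0) - bw)

    def new_lsa(self, link):
        # The IGP update now reflects the reservations: reset the counter.
        self.pending.pop(link, None)

    def usable(self, link, advertised_bw):
        return advertised_bw - self.pending.get(link, 0)


tracker = InProgressTracker()
link = ("A", "B")                         # hypothetical link identifier
tracker.signalled(link, 45)
after_first = tracker.usable(link, 100)
tracker.signalled(link, 45)
after_second = tracker.usable(link, 100)
third_fits = tracker.usable(link, 100) >= 45
```

With these numbers the head-end already knows that a third 45-unit tunnel does not fit on the link, without waiting for the next LSA/LSP or for a path-error.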

Benefits

Speeds up the installation of tunnels, as it avoids spending time trying solutions that do not work
Allows for better load-balancing
IGP metric, then max(min(path-bw))!


Under/Overbook
ML: Maximum link bandwidth:
This sub-TLV contains the maximum bandwidth that can be used on this link in this direction (from the system originating the LSP to its neighbors). This is useful for traffic engineering.

MR: Maximum reservable link bandwidth:


This sub-TLV contains the maximum amount of bandwidth that can be reserved in this direction on this link. Note that for oversubscription purposes, this can be greater than the bandwidth of the link.

UR(i): Unreserved bandwidth at Priority i:

This sub-TLV contains the amount of bandwidth reservable in this direction on this link, at a certain priority. Note that for oversubscription purposes, this can be greater than the bandwidth of the link.

Under/Overbook

As config:
int s0
 bandwidth <B1>          (eg 1500 kbps)
 ip rsvp bandwidth <B2>  (eg 4000 kbps)

[Figure: interface s0 on a physical T1]

ML is set to B1 (eg 1500)
MR is set to B2 (eg 4000)

At t=0, for all i from 0 to 7, UR(i) = MR (eg 4000)


routerA's LCAC will not accept an LSP tunnel asking for more than ML, even if there is available bandwidth at the requested priority. However, LCAC would allow, for example, 5 trunks each asking for 700 kbps (thus each asking for less than ML) while the aggregate is smaller than MR: because { 700 < ML=1500 } and { 3500 < MR=4000 }
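The two admission conditions can be checked with a short Python sketch. The `lcac_admit` function is hypothetical and ignores priorities and preemption; accepting an aggregate exactly equal to MR is an assumption, since the text above only gives an example with strict inequalities.

```python
def lcac_admit(request_kbps, reserved_kbps, ml_kbps, mr_kbps):
    """Simplified link admission check (single priority).

    A request is refused if it alone exceeds ML, or if it would push the
    aggregate reservation above MR.
    """
    if request_kbps > ml_kbps:
        return False
    return reserved_kbps + request_kbps <= mr_kbps


ML, MR = 1500, 4000
reserved = 0
admitted = []
for _ in range(6):                       # six trunks of 700 kbps each
    ok = lcac_admit(700, reserved, ML, MR)
    admitted.append(ok)
    if ok:
        reserved += 700

too_big = lcac_admit(1600, 0, ML, MR)    # a single request above ML
```

Five trunks are admitted for an aggregate of 3500 kbps, which is more than the 1500 kbps link: this is the overbooking that MR > ML allows.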


Standby
Current solution
[Figure: routers A and B connected by tunnels Tu1 (bw1), Tu2 (bw2), Tu3 (bw3) and Tu4 (bw4)]

Solution:
4 tunnels from A to B:
Tu1's relative metric: -3
Tu2's and Tu3's relative metric: -2
Tu4's relative metric: -1
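How the relative metrics produce a standby order can be sketched in Python. The `active_tunnels` helper is hypothetical and assumes that, among the tunnels that are up, only those with the lowest metric carry traffic, as in the SPF metric check.

```python
def active_tunnels(relative_metrics, up):
    """Return the tunnels that carry traffic, given which tunnels are up."""
    candidates = {t: m for t, m in relative_metrics.items() if t in up}
    if not candidates:
        return []
    best = min(candidates.values())
    return sorted(t for t, m in candidates.items() if m == best)


metrics = {"Tu1": -3, "Tu2": -2, "Tu3": -2, "Tu4": -1}

all_up = active_tunnels(metrics, {"Tu1", "Tu2", "Tu3", "Tu4"})
tu1_down = active_tunnels(metrics, {"Tu2", "Tu3", "Tu4"})
only_tu4 = active_tunnels(metrics, {"Tu4"})
```

Tu1 is used while it is up, Tu2 and Tu3 share the load if Tu1 fails, and Tu4 is the last resort.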

Last hop label


IETF draft-ietf-mpls-label-encaps-07.txt
A value of 0 represents the "IPv4 Explicit NULL Label"
A value of 1 represents the "Router Alert Label"
A value of 2 represents the "IPv6 Explicit NULL Label"
A value of 3 represents the "Implicit NULL Label"

New CLI forces the tailend to send implicit-null (3) instead of explicit-null (0), which is the default.
# [no] mpls traffic-eng signalling advertise implicit-null [<acl>]
On receipt, the (n-1) node must map 0, 1 or 3 to the internal Implicit Null [1 only for historical reasons]
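The mapping applied by the penultimate (n-1) node can be written as a small lookup. The Python names below are hypothetical; the reserved values are those listed from the label encapsulation draft, and the mapping of 0, 1 and 3 follows the last line above.

```python
RESERVED_LABELS = {
    0: "IPv4 Explicit NULL",
    1: "Router Alert",
    2: "IPv6 Explicit NULL",
    3: "Implicit NULL",
}

INTERNAL_IMPLICIT_NULL = "implicit-null"


def penultimate_hop_label(advertised):
    """Label the (n-1) node uses for a label received from the tailend.

    0, 1 and 3 are mapped to the internal Implicit Null (1 only for
    historical reasons); any other value is used as advertised.
    """
    if advertised in (0, 1, 3):
        return INTERNAL_IMPLICIT_NULL
    return advertised


mapped = [penultimate_hop_label(v) for v in (0, 1, 3, 17)]
```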

QoS and RRR


QoS and RRR

MPLS TE can operate simultaneously (and orthogonally) with MPLS Diff-Serv
All Precedence/DSCP packets follow the same TE tunnels
Diff-Serv provides selective discard (via WRED), and selective scheduling (via WFQ)


QoS and RRR


Future:
Scalable per-tunnel scheduling and policing
Guaranteed PIPE in MPLS-VPN CoS

per-DSCP/per-FEC traffic engineering


diffserv backbone capacity management


DiffServ and fast-reroute/TE

In order to optimize the bandwidth usage, backup tunnels might be configured with 0 kbps
no non-working bandwidth as in SDH!

Although the backbone is usually thought of as being congestion-free, some local congestion might occur during rerouting
Use diffserv to handle this short-term congestion
Use LSP reoptimization to handle the long-term congestion

RSVP LSP Signalling Protocol for Traffic Engineering


MPLS-TE Signalling Protocol

Two proposed signalling mechanisms for MPLS traffic engineering are being considered by the IETF's MPLS working group
RSVP (Cisco and a number of Gigabit router startups (Avici, Argon, Ironbridge, Juniper, and Torrent))
CR-LDP (Ericsson, Ennovate, GDC, Nortel)


Why RSVP ?
What is needed: An IP signalling Protocol!
ability to establish and maintain Label Switched Path along an explicit route

ability to reserve resources when establishing a path

Interdependent, not independent tasks


benefit from consolidation


Do I need RSVP only for TE ?


NO !

Other uses of RSVP in today's networks:


Voice over IP call setup, Video (IPTV)

Hybrid deployments (only where needed)


QoS DiffServ Engineering (COPS)
Qualitative Service for DiffServ with RSVP (as opposed to the Quantitative RSVP IntServ model)


RSVP is a natural choice

RFC2205: provides a general facility for creating and maintaining distributed reservation state across a mesh of multicast and unicast delivery paths
TE: use as a general facility for creating and maintaining distributed forwarding & reservation state across a mesh of delivery paths


RSVP is a natural choice


RFC2205: transfers and manipulates QoS control parameters as opaque data, passing them to the appropriate traffic control module for interpretation

TE: transfer and manipulate explicit route and label control parameters as opaque data; pass the explicit route parameter to the appropriate routing module, and the label parameter to the MPLS module


RSVP is a natural choice


Leverage Standardized Protocols
PIM for Multicast MPLS
BGP for MPLS VPNs

RSVP for MPLS Traffic Engineering


LDP (TDP) has been designed because it was easier than fixing all IGPs (RIP, EIGRP, OSPF, ISIS)

fast deployments and engineering consistency

Leverage Deployed Experience


RSVP deployed since 1996 (IOS 11.2)
See ww.isi.edu/rsvp/DOCUMENTS/ietf_rsvp_qos_survey for a list of RSVP implementations

RSVP is a natural choice

RSVP easily supports


Dynamic resizing of tunnels or paths through refresh messages
Supports strict as well as loose source routes

No double counting of bandwidth when rerouting sub-optimal routes

Extensible via definition of new objects



RSVP/TE and Scalability


Very Different than IntServ context

State applies to a collection of flows (i.e. a traffic trunk), rather than to a single (micro) flow
RSVP sessions are used between routers, not hosts

Sessions are long-lived (up to a few weeks)


Paths are not bound by destination-based routing
Reference: Applicability Statement for Extensions to RSVP for LSP-Tunnels (draft-awduche-mpls-rsvp-tunnelapplicability-01.txt)


RSVP/TE and Scalability


Very Different than IntServ context
RFC2208: the resource requirements for running RSVP on a router increases proportionally with the number of separate sessions

TE: that is why using traffic trunks to aggregate flows is essential


RFC2208: supporting numerous small reservations on a high-bandwidth link may easily overtax the routers and is inadvisable
TE: n/a in the context of TE, since traffic trunks aggregate multiple flows


TE/RSVP Scalability
With basic RSVP (RFC2205), 10000 RRR LSP tunnels flowing through a 75x0 or 12000 is not a problem
Already deployed on a number of Tier-1 ISP backbones
http://www.nanog.org/mtg-9905/hanna.html
Ship with 12.0(5)S

Refresh Aggregation work will again enhance this scalability


Conclusion

Using RSVP as MPLS/TE signalling protocol is the natural and consistent choice

It is however only one part of a whole solution:


MPLS as forwarding engine
IGP (OSPF/ISIS) extensions
Constraint-Based Routing (RRR)
RSVP as MPLS/TE Signalling Protocol

Installation of Tunnels in the FIB



Summary


Traffic Eng

Provides traffic engineering capabilities at Layer 3


above and beyond what is provided with ATM

Could be used for other applications as well
Shipping and deployed in production
