
Carnegie Mellon University

Research Showcase
CyLab Research Centers and Institutes

12-28-2010

SCI-FI: Domain-based Scalability, Control and Isolation for the Future Internet (CMU-CyLab-10-020)
Hsu-Chun Hsiao
Carnegie Mellon University, hsuchunh@andrew.cmu.edu

Xin Zhang
Carnegie Mellon University, xzhang1@cmu.edu

Geoff Hasker
Carnegie Mellon University, hasker@cmu.edu

Haowen Chan
Carnegie Mellon University, haowenchan@cmu.edu

Adrian Perrig
Carnegie Mellon University, perrig@cmu.edu

See next page for additional authors

Recommended Citation
Hsiao, Hsu-Chun; Zhang, Xin; Hasker, Geoff; Chan, Haowen; Perrig, Adrian; and Andersen, David G., "SCI-FI: Domain-based
Scalability, Control and Isolation for the Future Internet (CMU-CyLab-10-020)" (2010). CyLab. Paper 78.
http://repository.cmu.edu/cylab/78

This Technical Report is brought to you for free and open access by the Research Centers and Institutes at Research Showcase. It has been accepted
for inclusion in CyLab by an authorized administrator of Research Showcase. For more information, please contact kbehrman@andrew.cmu.edu.
Authors
Hsu-Chun Hsiao, Xin Zhang, Geoff Hasker, Haowen Chan, Adrian Perrig, and David G. Andersen

This technical report is available at Research Showcase: http://repository.cmu.edu/cylab/78


SCI-FI: Domain-based Scalability, Control and Isolation
for the Future Internet

Hsu-Chun Hsiao, Xin Zhang, Geoff Hasker, Haowen Chan, Adrian Perrig, David Andersen

December 28, 2010

CMU-CyLab-10-020

CyLab
Carnegie Mellon University
Pittsburgh, PA 15213

Abstract

We present the first Internet architecture designed for control and isolation. We propose to separate ASes into groups of independent routing sub-planes which then interconnect to form complete routes. Our architecture, SCI-FI, provides superior resilience and security properties as an intrinsic consequence of good design principles, without needing additional add-on protocols or external checks to provide resilience. Our security analysis shows that SCI-FI can naturally prevent several long-standing security problems that plague existing interdomain routing protocols even with their semantics perfectly secured. Our evaluation results further demonstrate SCI-FI’s routing efficiency, path expressiveness, and substantial reliability improvements over existing (secured) routing protocols in the presence of malicious attacks.

1 Introduction

The Internet is the most geographically, administratively, and socially diverse distributed system ever invented. While today’s Internet architecture admits some administrative diversity, such as by separating routing inside a domain (intra-AS routing) from the global inter-domain routing, it falls short in handling key challenges of security and isolation that arise in this intensely heterogeneous setting. As a result, we see surprisingly frequent incidents in which communication is interrupted by actions or actors far from the communicating entities: In addition to classical examples such as YouTube being globally disrupted by routing announcements from Pakistan [1], other issues surrounding the lack of resource control and isolation are not solved by existing proposals such as S-BGP: the introduction of excessive routing churn [2]; traffic flooding; router memory resource exhaustion; and even issues of global conflicts over naming and name resolution.

In this paper, we propose a clean-slate Internet architecture that provides strong guarantees for isolation and control in ways that map well to existing geographic, political, and legal boundaries. We show that a high level of control and isolation naturally leads to security and reliability without the use of high-overhead security mechanisms, while exposing expressive and diverse communication path sets to the endpoints.

In particular, we introduce the abstract notion of a hierarchy of trust domains whose members all share a common contractual, legal, cultural, geographical, or other basis for extending limited trust between each other. Examples may be a domain of U.S. educational institutions, ISPs that participate in the same peering point who share a common, binding legal contract on their behavior, or ISPs in the same state or country who are subject to the same laws and regulations. Using this abstraction, we provide the machinery to guarantee a critical isolation property at the network level: Entities outside a trust domain cannot affect communication within that trust domain. For communication that must span trust domains, we provide the property that limits the entities who can affect the communication to a necessary and explicitly identified set of other trust domains.

Finally, SCI-FI assigns control of route selection to the source and destination. The source and destination agree jointly on which path to use from the set of ISP-supported routes. The architecture naturally controls routing information flow, and provides for explicit trust in path selection.

Contributions. We design and analyze an Internet architecture emphasizing the principles of control and isolation. The resulting architecture enables route control for ISPs, senders and receivers at an appropriate level of granularity, balancing efficiency, expressiveness, policy compliance, and security. The isolation properties dramatically reduce the TCB and provide situational awareness of which entities need to be relied upon for various network operations. We find that the resulting architecture offers strong security properties and demonstrate that the resulting routes are comparable to the routes that BGP uses. We anticipate that the proposed architecture offers a useful point in the design space for the search of next-generation Internet routing protocols.
2 Limitations of Current Routing Design

To motivate our new design, we first demonstrate the fundamental limitations in the current inter-domain routing protocols. Through concrete examples and discussion, we show that recent popular inter-domain routing protocols [3, 4, 5, 6, 7, 8, 9, 10], even with their semantics perfectly secured (e.g., via S-BGP [11] like approaches), do not provide several important security properties.

Limitation 1: Arbitrary Information Flow. We first show that existing inter-domain routing protocols give endpoints and ISPs little control over how their routing announcements are being further propagated, which can cause several security vulnerabilities. More specifically, path-vector routing is used by many inter-domain routing designs due to its ability to support rich routing policies [12]. For example, in addition to the current de facto inter-domain routing protocol BGP, MIRO [5], R-BGP [6], Route Deflection [8], and ACR [9] use path-vector routing as the underlying vehicle for route dissemination; and Pathlet routing still uses BGP for disseminating the “pathlets”. In these routing systems, however, once a certain node N announces its prefix, or path, or pathlet to its neighbors, N has no control over the way in which its routing update is further propagated and paths are constructed as they flow towards or away from N.

Figure 1(a) depicts an example scenario. A destination AS 1 is served by a provider AS 2; likewise the source AS (AS 5) is served by its provider AS 4. An intermediate AS 3 peers with both providers; the providers do not peer with each other directly. Suppose that AS 3 is malicious or compromised and wishes to control the route between the source and destination. It can forward the route it learns from AS 2 to AS 4 even if S-BGP security is used. Because routes announced by peers are generally preferred over routes announced by providers, this will likely result in the destination using the AS-PATH {1, 2, 3, 4, 5}, which violates the valley-free routing principle [13]. Using conventional routing security measures such as S-BGP (which only secures the strict semantics of path vector), it is impossible to distinguish this route from a legitimate route. Currently, the only practical method for dealing with such anomalies is to use hand-tuned ingress or egress filters to custom-configure the system, which can be extremely error-prone and cause inconsistencies [14] [15].

Figure 1(b) depicts another example where the endpoint AS E is the destination of traffic. AS E generates a route advertisement for its address prefix which is propagated through its provider A; this advertisement is further propagated into separate paths P1 and P2 going through B and M, respectively, and re-converging at AS C. Suppose AS C selects P2 to re-advertise; then P1 is discarded and all inbound traffic to E now must pass through the AS M which is less preferred by E.

[Figure 1: Arbitrary information flow. (a) Valley-free violation. (b) Lack of inbound traffic control. Small arrows indicate customer-provider (downstream/upstream) relationships. Large arrows indicate constructed paths. In each case, P1 is preferred by the endpoint but P2 is the path that is used.]

In summary, routing systems with one-directional and unregulated flow of routing update dissemination can suffer from the following problems: (i) paths can be constructed from provider to customer and back to a different provider (a “valley” situation); (ii) paths can traverse untrusted ISPs; and (iii) the routing system is generally subject to arbitrary blackhole and wormhole attacks. This unprincipled manner of path construction is a well-known source of persistent Internet route fluctuation [13].

Limitation 2: No Joint Path Selection between Source and Destination. We argue that the lack of joint path selection between the source and destination nodes prevents effective defenses against long-standing Denial-of-Service attacks. More specifically, traditional path-vector routing places more control with the intermediate ASes: after receiving advertisements from their neighbors, each AS selects its most-preferred routes which are then advertised to its peer and customer ASes. In such a system, endpoints have no control over path construction. Newer proposals for multi-path routing [3, 4, 5, 8] recognize that the users of a path – the communicating endpoints – should have the final say over whether a route is acceptable. These proposals give the source the ability to select from a set of diverse paths, but they do not similarly empower the destination to control its inbound traffic, as illustrated in Figure 1(b). Consequently, the routing infrastructure provides no built-in mechanisms for a destination to block unwanted traffic early in the network before it reaches the destination. Without inbound traffic control, a destination lacks the intrinsic ability to defend against malicious incoming traffic such as Denial-of-Service attacks.

Limitation 3: Lack of Routing Isolation. A central tenet of current inter-domain routing architectures has been to achieve global reachability, where a routing announcement or prefix advertisement from a certain AS can potentially be propagated throughout the entire Internet. In other words, most (if not all) ASes are in the
same flat routing dissemination domain. For example, in addition to the aforementioned multipath routing protocols, NIRA [16] organizes all the ASes in one tree-based routing domain, and Landmark routing [17] also makes the routing “landmarks” available throughout the network. While global visibility simplifies achieving global routing reachability, it also lends great help to individual malicious ASes which can easily launch attacks influential to the entire Internet. For example, two distant colluding ASes can announce a (non-existing) wormhole link between each other to create a (bogus) short path, which can be seen potentially by the entire Internet and thus attract a considerable amount of traffic.

Limitation 4: Lack of Route Freshness. In today’s routing protocols, an adversary who can cause messages to be delayed or dropped can force traffic to continue to use an older path p with obsolete state. More specifically, due to the global visibility of routing updates from each individual AS, current inter-domain routing protocol designs avoid the use of pro-active, periodic routing updates (like in link-state protocols) to achieve scalability, but use incremental routing updates which are sent out only after route changes. However, this incremental manner of routing updates compromises route freshness, as the loss (or malicious drop) of incremental updates concerning a certain path p (such as path withdrawal messages) can prevent other ASes from receiving the update. These ASes may thus keep the path p with the obsolete state. Consider the example in Figure 1(b) for illustration, where the AS PATH {C, M, A, E} is active to reach destination E. Suppose that A withdraws E’s prefix, but the malicious AS M intentionally suppresses this withdrawal message from C. Consequently, the same AS PATH still remains active, because B’s withdrawal of the prefix does not affect the path through M.

3 Design Principles and Overview

SCI-FI is based upon three grounding principles: domain-based isolation, proactive and scalable route update propagation, and mutually controllable path selection. These principles, which we detail below, provide a framework within which SCI-FI achieves high resilience as a natural outcome. Figure 2 illustrates SCI-FI’s domain-based architecture.

[Figure 2: Proposed Internet Architecture. Labels: an inter-top-TD route between two top-level trust domains; source-selected egress routes from the source in its local trust domain; destination-controlled ingress routes to the destination in its local trust domain; routing “localized” within each TD; strict inter-TD isolation.]

3.1 Design Principles

Principle 1: Domain-based isolation – Dividing the routing control plane into independent domains. With properly designed isolation among independent domains, routing in one domain cleanly protects itself from malicious activities and routing churn stemming from other domains, benefiting both security and scalability while retaining reachability and path diversity across domains. SCI-FI also lets domains have a well-specified influence perimeter, knowing exactly which other domains hold influence over their routing and forwarding.

Principle 2: Scalable route update propagation – Well-structured information flow and high path freshness. SCI-FI provides a scalable architecture where routing updates can be sent proactively to periodically refresh path state, so that each node always maintains a fresh (and accurate) network topology on which routing decisions can be more efficiently made.

Principle 3: Mutually controllable path selection – Joint path selection between source and destination. SCI-FI greatly increases both the source and destination’s ability to affect, select and control the construction of the routes to and from themselves, while still fully respecting intermediate ISPs’ preference and not exposing the remainder of the network to the potential for endpoint-based source routing attacks.

4 Architecture Overview

4.1 Hierarchical Decomposition (Section 5)

SCI-FI first divides the Internet into a hierarchy of trust domains, or TDs, as shown in Figure 2, used to provide the domain-based isolation property. All nodes in the architecture know a set of paths from themselves to a set of exit/entrance nodes from their trust domains, and they make these paths available via a lookup service. To send data to a node in another TD, the sender selects an “up-path” from themselves to one of the egress nodes from their (highest-level) trust domain, and can pick from among the published paths back to the receiver within that domain. In this way, nodes retain complete control over the paths their packets take within their own trust domains, and the sender is given the freedom to choose from among those allowed paths. Below, we discuss how each of these aspects works.

Each TD has a TD Core, a set of specifically designated Autonomous Domains (ADs) forming a mutually reachable clique that interfaces with other TDs. In the current
Internet, the top-tier ISPs would constitute the TD Core. The architecture defines the AD, or autonomous domain, as the atomic failure unit (AFU), representing both ISPs (or transit ADs) and endpoint ADs.

4.2 Routing and Policy Enforcement

Inter-TD routing takes place using human-configured routes or a path-vector protocol such as BGP. Given the envisioned small size of TD-level topology (e.g., a few hundred TDs), scalability and security are no longer major concerns for Inter-TD routing. We envision the effort to establish a TD to closely mirror that of starting a certificate authority. The cost involved remains moderately high, so after a few hundred TDs, the benefit of creating additional TDs when considering the present amount of ADs will not justify the effort. ADs will employ IGP or OSPF for routing within their confines.

Up-path Construction (Section 6) SCI-FI uses “up-paths,” a set of valley-free paths, to reach the core. The nodes in the TD Core transmit multiple one-hop paths to their customer ADs via up-path construction beacons. These providers then disseminate core reachability information downwards to their customers. The endpoint ADs then select among the k up-paths to the core received from each provider, forming k, ideally maximally disjoint, paths. Finally, the endpoint ADs publish these paths on the TD’s Path Server, a service located in the TD Core, queried by local and foreign ADs for routing information. For added flexibility, ADs may announce paths to specific ingress/egress points.

Lookup (Section 7) Name resolution occurs intrinsically within the architecture. Each TD will provide address servers within the TD Core that resolve human readable names to addresses and TDs.

Route joining (Section 8) An AD chooses one of the up-paths constructed earlier to reach the TD Core and queries the destination TD’s Path Server for the destination’s inbound path list. The source then splices one of the offered paths onto its own choice of up-path to reach the destination. The destination can simply reverse the embedded path or query the source’s path server for alternative down-paths to reach the source. Before naively splicing an up-path to a down-path, the principal will compare the two paths for common ancestors or peering neighbors along the path.

Policy Decisions SCI-FI reduces policy decisions by all stakeholders to three levels:

1. Transit ADs apply their local policy when deciding which paths to propagate downward via the up-path construction beacon.

2. Endpoint ADs apply policy in their selection of k up-paths to publish as inbound to the TD’s Path Server.

3. Source nodes apply policy to select an outbound path to the TD Core and a path from the list of k inbound paths retrieved from the Path Server.

[Figure 3: A Top-level trust domain. Black nodes are ADs in the TD Core. Arrows indicate customer-provider relationships. Dashed lines indicate peering relationships. Labels: Inter TLTD routing; Intra TD Core routing; sub-TD; Top Level Trust Domain.]

4.3 Forwarding

Forwarding remains simple, as this scheme eliminates the need for AD-level forwarding tables, making them only necessary at the TD level. Forwarding takes place by simply verifying the specified route in the packet and delivering the packet on the specified interface.

5 Anatomy of a Trust Domain

A trust domain (TD) is the fundamental unit of trust in the SCI-FI architecture. TDs are communities of network entities held together by enforceable rules such as contracts, shared legislative and judicial frameworks, or physical locality. SCI-FI groups entities into these aggregates to avoid a non-scalable network architecture in which every entity (ISP, endpoint, or application) must pairwise evaluate the trustworthiness of every other administrative entity in the network. Given these aggregates, the fundamental goal of the architecture is to enforce isolation between TDs while allowing interconnection. Each trust domain can be considered an independent networking plane which can be shielded from the external influence of entities in other trust domains. The global goal of the architecture is to allow any endpoint to explicitly specify which set of these networking planes it wishes to use and facilitate a connection based on these requirements.
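As an illustration of the route-joining step outlined in Section 4.2 (splicing an up-path onto a down-path after checking for common ancestors or peering neighbors), the following Python sketch shows one possible way to perform that comparison. It is not taken from the report: the function name `join_paths`, the list-of-AD-names path encoding, the `peerings` set, and the rule of preferring the shortcut closest to the source are all assumptions made for illustration. The fallback case also assumes that the two TD Core ADs can reach each other directly.

```python
from typing import FrozenSet, List, Set


def join_paths(up_path: List[str],
               down_path: List[str],
               peerings: Set[FrozenSet[str]] = frozenset()) -> List[str]:
    """Splice an up-path (source AD ... core AD) onto a down-path
    (core AD ... destination AD), taking a shortcut when one exists.

    Hypothetical helper: the encoding and tie-breaking are assumptions.
    """
    # 1. Common ancestor: an AD that appears on both paths lets the route
    #    turn around without climbing all the way to the TD Core.
    down_index = {ad: i for i, ad in enumerate(down_path)}
    for i, ad in enumerate(up_path):
        if ad in down_index:
            return up_path[:i + 1] + down_path[down_index[ad] + 1:]

    # 2. Peering neighbors: cross over on a peering link between an AD on
    #    the up-path and an AD on the down-path, lowest crossing first.
    for i, a in enumerate(up_path):
        for j in range(len(down_path) - 1, -1, -1):
            if frozenset((a, down_path[j])) in peerings:
                return up_path[:i + 1] + down_path[j:]

    # 3. No shortcut: traverse the TD Core (assumed mutually reachable).
    return up_path + down_path
```

For example, with the up-path `["S", "A", "C1"]` and the down-path `["C2", "A", "D"]`, the shared provider `A` is detected and the joined route avoids the core entirely. Source and destination policy, as well as verification of the path authenticators, would be applied before this step and are deliberately left out of the sketch.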
The architecture of a trust domain is shown in Figure 3. Conceptually, it is a contiguous set of ADs, along with explicitly marked customer-provider relationships. A specially designated set of ADs, the top-level ADs or TD Core ADs, represent the top level of the AD hierarchy: this set contains the entities that perform several authoritative functions of the trust domain in the various protocols described in this paper. We enforce the constraint that there cannot be any cycles in the customer-provider graph; furthermore, any AD that does not have a provider in the TD must be a TD Core AD. This ensures that every node in the network has a strictly upstream path (i.e., a path that traverses edges only in the customer-to-provider direction) that leads to one of the TD Core ADs. The set of TD Core ADs must be connected and mutually reachable in the AD-graph. Since most TD Core ADs in the current Internet are mutually peering, this topology is expected to be extremely simple. Mutual reachability can be either implemented as a set of human-configured primary and backup routes for TDs with a small set of TD Core ADs, or a separate path-vector protocol (similar to S-BGP) can be used for larger numbers of TD Core ADs.

A trust domain can be a top-level trust domain (TLTD), or a sub-TD. A sub-TD is wholly contained within a TLTD and may contain other sub-TDs. A top-level TD is not contained in any other TDs (although its member set may overlap partially with other TDs). We assume that there will be relatively few top-level TDs in the world (between 10 and 100), with each TD corresponding to a large, globally identifiable real-world group (such as a country, or a well-known international organization).

The TD Core ADs in the top-level TDs facilitate interconnection between top-level TDs using the Inter-TLTD routing protocol. This protocol is essentially a path-vector protocol identical to S-BGP, except at the granularity of TLTDs instead of ADs; since this topology is extremely small and densely connected (the majority of routes should not need to traverse more than 2 TDs), most of the routes are static and can be directly configured. When automatic route discovery is needed we assume that TD-level routing policy (e.g., which TD to use to reach a distant TD) is agreed upon among the TD Core ADs beforehand. To facilitate the Inter-TLTD routing protocol, the TD Core ADs engage in a protocol to discover their mutual interfaces to other TLTDs in a manner similar to IGP; since there are only a few of these TD Core ADs per TLTD, each TD Core AD can simply keep a table of what TLTDs are reachable from each of its fellow TD Core ADs. Inter-TD packets (from ADs in one TD to ADs in another TD) are tunnelled across this protocol via encrypting at the TD Core AD gateways such that forwarding routers from third-party TDs are unable to inspect the packets that belong to the endpoint TDs.

5.1 Trust Domain Membership

We assume that each trust domain is administered by some kind of identifiable organization (a government, an industry consortium, etc.). When a new AD wishes to join a TD, the TD authority must first ensure that it is able to enforce whatever rules and guidelines that membership entails regarding the TD (for example, a country-based TD may require that the corresponding ISPs be registered and headquartered within a given country). Then the TD authority determines the topological relationship of the joining ISP with respect to the current TD. To join an existing trust domain, the new AD should either 1) have one provider already inside this TD, or 2) be capable of being a core AD in this TD. Specifically, to meet the second requirement, the new AD needs to be able to directly reach other core ADs by communicating with only other core ADs, as well as satisfy a subjective assessment of being a sufficiently well-connected AD: this requirement is similar to customer / traffic requirements for peering.

When an AD establishes a new service connection with an AD in a different trust domain, it needs to inherit one or more of the TD associations of the provider in order to access the relevant sets of upstream paths of that provider. The exact TD assignment is dependent on the terms of the service and contingent on whether the child AD can satisfy the conditions of joining the new TDs.

5.2 Subsidiary Trust Domains

A top-level TD may contain subsidiary trust domains (“sub-TDs”). A sub-TD is depicted inside the main top-level TD of Figure 3. The purpose of a sub-TD is to allow finer-grained trust domain selection (for example, the armed forces of a country may operate a sub-TD within its own country’s TD, to support a higher level of assurance than civilian ISPs). A sub-TD is internally structured similarly to a top-level TD, with its own TD Core ADs which are mutually connected, its own route servers, and so on. The only difference between a sub-TD and a top-level TD is that the TD Core ADs of a sub-TD do not take part in the inter-top-level TD (inter TLTD) protocol.

5.3 Benefits of Using Trust Domains

We end this section with a list of intrinsic benefits of building the inter-domain routing architecture based on the notion of TDs.

Security against malicious attacks. Due to the strong isolation and control that TDs provide, multiple long-standing attacks are naturally eliminated in SCI-FI. For example, the worm-hole attacks at large scale are no longer possible in SCI-FI, because routing paths are constructed and authenticated within each domain separately to provide a natural defense against cross-domain
notation meaning
worm-hole paths. In SCI-FI, two colluding ADs can P-C link Pij AD i → AD j AD i is AD j ’s provider
only create worm-hole up-paths within the same domain, peering Qij AD i ↔ AD j AD i and AD j are peers
t
which can only incur limited damage given the “shallow- timestamping r
AD i −→AD j link is up at time tr
TD
tree” structure of the Internet [18]. link scoping AD i −→AD j link is valid in TD
AD cert Certi∈T D TD certifies AD i is in TD
Resilience against human misconfiguration. The sub-TD cert CertT D1⊆T D TD certifies TD1 is its sub-TD
TD Core cert CertT D→i TD certifies AD i is TD Core AD
intrinsic isolation provided by the division of TDs peering cert Signi(Qi,j ) AD i certifies a peer AD j
achieves an in-depth resilience against human misconfig- up-path cert Signi(Pj ) AD i certifies one of its up-path Pj
uration, which proves to be the most popular reason for
current routing system outages [19]. First, designing a Table 1: Notation.
robust inter-domain routing architecture based on TDs is
conceptually simple and convenient, which can help re-
duce human configuration errors compared to “ad hoc” to TDs and their respective public keys.
engineering hacks used in the current practice. Second,
even if a misconfiguration happens, the damage is only Up-path definition. The fundamental unit of rout-
confined within that TD because a routing update is only ing in the SCI-FI architecture is the upstream path, or
propagated within the local TD. “up-path” for short. To support multi-path routing, the
protocol enables all ADs to be supplied with multiple dis-
Elimination of a single point of trust/failure. tinct policy-compliant AD-level paths to reach a TD Core.
The existence of multiple TDs eliminates the need for To keep route lookup overhead practical, we restrict the
a single authority of the entire Internet, which causes de- number of paths that each endpoint AD maintains in its
ployment issues and a single point of failure. reachability record to at most k per TD. In facilitating the
construction of these k up-paths, the upstream ADs are
not required to exhaustively enumerate all possible paths
6 up-path Construction to all possible TD Core, but simply provide a sufficient
number of alternatives.
In this section we describe how each AD learns of its
AS-level paths to the TD Core ADs through periodic up- Overview of up-path construction. The path
path construction beacons containing routing information. discovery process starts at the TD Core in each TD where
The process of obtaining this information consists first of each TD Core AD initiates an up-path construction
trust bootstrap to establish the identity of the TD core; beacon in each time period. Each AD passes along
second, of the beaconing process itself; and third, of us- an up-path construction beacon to each customer after
ing the beacon-derived information to construct one or receiving such a message from its provider and appending
more up-paths. Also, SCI-FI supports routing at the necessary information. Specifically, a up-path construc-
ingress/egress point level to enable finer-grain routing tion beacon has three components: U = {P, S, M }. P
control. Finally, to reach nodes outside the TD, SCI-FI contains an up-path in a containing TD. Each link on
provides a mechanism to traverse TD boundaries. this up-path is timestamped and scoped to a TD. S
In the next section we describe how to lookup destina- is a cryptographic authenticator signing the path and
tion AD’s reachability information. Combining these two certifying every on-path AD’ TD membership. Finally,
parts enables paths to be discovered between endpoints. M represents a capability token enabling efficient for-
warding control as proposed in prior work [20, 21, 22].
6.1 Management and Trust Bootstrap
Fig. 4 illustrates the high-level idea of up-path con-
We assume that the TD Core members of each TD act struction for a simple AD topology where all ADs belong
as a central authority that administers the TD. This au- to the same TD and there is no peering links. AD1 is
thority may in fact be implemented using a distributed the TD Core initiates periodic up-path construction bea-
functionality; for simplicity we assume it is a single en- cons. AD1 sends U12 and U13 to its customer AD2 and
tity. Each top-level trust domain is associated with a AD3 , respectively. After receiving U13 , AD3 appends
fixed human readable identifier as well as a well-known information regarding the link to its customer AD4 to
public/private keypair. Due to the high level of visibility construct U134 . AD4 , which receives two up-path con-
of the top-level trust domains, we assume that bootstrap- struction beacon from its upstream providers, can decide
ping the well-known TD authority public key onto the which one(s) to export to the downstream AD based on
relevant principals (specifically, the members of the TD the routing policy. In the remainder of this section, we
as well as other top-level TD authorities) is secure. For will extent our up-path construction protocol to support
example, the public key could be passed on by trusted peering links, fine-grain routing based on ingress/egress
service providers. The TD authority then operates a PKI points, and links that cross TD boundaries. Table 1 sum-
CA for the member ADs of the TD, signing certificates of marizes the notation used to express the detailed con-
membership binding ISP identification and AD numbers struction of up-path construction beacon.

[Figure 4 diagram: nodes 1 to 5, with arrows labeled $U_{12}$, $U_{13}$, $U_{124}$, $U_{134}$, and $U_{1245}$ and/or $U_{1345}$. The beacon contents shown in the figure are:]

$P_{13} = AD_1 \xrightarrow{t_1,\,TD} AD_3$
$S_{13} = Cert_{TD \to 1} \,\|\, Cert_{3 \in TD} \,\|\, Sign_1(P_{13})$
$M_{13} = MAC_{K_1}(P_{13} \,\|\, TD)$

$P_{134} = P_{13} \,\|\, AD_2 \xrightarrow{t_3,\,TD} AD_4$
$S_{134} = S_{13} \,\|\, Cert_{4 \in TD} \,\|\, Sign_3(P_{134})$
$M_{134} = M_{13} \,\|\, MAC_{K_3}(P_{134} \,\|\, M_{13})$

Figure 4: An example of up-path construction beacon dissemination. Arrows indicate the flow of up-path construction beacons. All ADs are in the same TD. $U_{13} = \{P_{13}, S_{13}, M_{13}\}$ and $U_{134} = \{P_{134}, S_{134}, M_{134}\}$.

6.2 Constructing up-path in one TD

The following protocol is used to construct the up-paths for each AD. We first assume every provider AD in the following protocol description is part of the trust domain TD. We discuss extensions to multiple TDs later.

The protocol starts with the top-level ADs, that is, those ADs that do not have any upstream providers that are also in the trust domain. Each top-level AD $AD_r$ sends a message $(P_i, S_i, M_i)$ to each of its customer ADs ($AD_i$) as follows:

$P_i = AD_r \xrightarrow{t_r,\,TD} AD_i$
$S_i = Cert_{TD \to r} \,\|\, Cert_{i \in TD} \,\|\, Sign_r(P_i)$
$M_i = MAC_{K_r}(P_i \,\|\, TD)$

The path information portion of the message, $P_i$, indicates a one-hop path from $AD_r$ to $AD_i$ that is timestamped with (internal) time $t_r$, which makes it valid for a time depending on the AD’s internal policy. The edge is also labeled with a trust domain scoping term, TD, indicating the trust domain in which it is valid. The path authentication information $S_i$ contains a certificate (signed by the TD authority) authenticating $AD_r$ as a TD Core AD in the trust domain TD, as well as a membership certificate authenticating $AD_i$ as a member of TD, and a signature by $AD_r$ on the path information. The message authentication code (MAC) portion $M_i$ is computed over the message using a secret key $K_r$ known only to $AD_r$, and is essentially a data plane capability used to remind $AD_r$ of its own decision that $P_i$ is an approved path when used to carry data packets of the trust domain TD. Note that since these capability MACs are only verified by the issuing AD ($AD_r$), there is no requirement for time synchronization with other ADs, as long as the time synchronization within the routers of each AD is sufficient to enforce its individual policy regarding route announcement timeouts. Since all IDs are self-certifying, any third party can extract the relevant public key of $AD_r$ from the ID of $AD_r$ contained in $M_i$, and can thus verify the signature.

As an intermediate AD receives paths from its upstream providers, it can then disseminate them to subsequent downstream customers in the natural way. Suppose an intermediate AD ($AD_i$) receives a set of paths from its upstream providers. It checks the signatures on each of the paths, and discards any ill-formed or unauthenticated paths with bad signatures.

To identify more shortcuts (as described below), each pair of peering ADs (h, i) also exchanges a peering certifier as follows:

$Q_{h,i} = AD_h \overset{t_h}{\leftrightarrow} AD_i$
$S_{h,i} = Sign_h(Q_{h,i})$
$M_{h,i} = MAC_{K_h}(Q_{h,i})$

Note that peering certifiers are unscoped by TD since the TD membership of the AD on each end of the peering edge should be separately certified on the path authenticator.

For each downstream AD ($AD_j$), the parent AD ($AD_i$) then chooses a (preferably maximally disjoint) path set of m paths $P_1, \ldots, P_m$, where $m \le k$. Each of these paths necessarily terminates at the parent AD, $AD_i$. For each path $P_p$ with path authentication information $S_p$, $AD_i$ considers the set of peering links that it will support for downstream customers from $AD_j$ and attaches this information to the path. It then appends $AD_j$ onto the end of the path, and computes new authentication information resulting in the message tuple $U_{p,j} = \{P_{p,j}, S_{p,j}, M_{p,j}\}$ as below (this example shows only one peering edge (h, i), but additional peering edges can be appended as needed):

$P_{p,j} = P_p \,\|\, Q_{h,i} \xrightarrow{t_i,\,TD} AD_j$
$S_{p,j} = S_p \,\|\, S_{h,i} \,\|\, Cert_{j \in TD} \,\|\, Sign_i(P_{p,j} \,\|\, S_p \,\|\, S_{h,i})$
$M_{p,j} = M_p \,\|\, M_{h,i} \,\|\, MAC_{K_i}(P_{p,j} \,\|\, M_p \,\|\, M_{h,i})$

This process continues until each AD has obtained a set of paths terminating at itself, where each edge in the path is authenticated by each parent AD. Each of these authenticated edges essentially represents a network capability given from a parent provider to a customer; when these routes are used, the parent AD will check the corresponding self-MAC to ensure that the provided path truly corresponds to a path that it supports for that customer.

As an example, Fig. 4 shows the up-path construction beacon that is delivered to $AD_4$ for the path $AD_1, AD_3, AD_4$. It can be seen that the respective signatures and MACs are constructed in an onion fashion, with each succeeding authentication tag authenticating not just the newly extended path, but also all the previous authentication steps involved in creating that path.

Once an endpoint AD receives the up-path construction beacons from its providers, it selects up to k up-paths, signs the entire set of k paths and their authenticators, and uploads it to a lookup server provided by the trust domain to enable reachability. To send a packet, the

source AD queries the Path Server to find the k paths associated with the destination AD and thus splice together a route. This process is described in Section 7.

6.3 Managing multiple ingress/egress points

Often, routing at the AD-level granularity is insufficient. For example, consider a large carrier with continental reach but only a single AD number. A route passing through the AD could be entirely local, or it could cross the entire continent. To distinguish these cases, BGP supports an attribute called the “multi-exit discriminator” or MED, which essentially allows an AD to annotate its path advertisements with which exits are supported.

Since the up-path construction is essentially also a path-vector algorithm, the concept of MED can be seamlessly integrated into the protocol. When an AD wishes to constrain an up-path to a certain ingress/egress point, it can simply annotate the path announcement with this fact (e.g., ignoring peering edges):

$P_{p,j} = P_p \xrightarrow{t_i,\,TD} AD_j : MED_x$
$S_{p,j} = S_p \,\|\, Sign_i(P_{p,j} \,\|\, S_p)$
$M_{p,j} = M_p \,\|\, MAC_{K_i}(P_{p,j} \,\|\, M_p)$

This embedding of the MED into the path description prevents a downstream customer from selecting routes that violate the ingress/egress semantics of the provider AD. The use of this field can also be extended to indicate policies regarding how the provider wishes to support transit between two customers. For example, if a provider wishes to restrict a path to only some subset of its descendants, it could annotate this in a similar way to the MED. For brevity, we do not describe policy annotations in detail in this paper.

6.4 Traversing TD Boundaries

The protocol above guarantees that a customer AD can discover a route from a provider as long as there exists a common trust domain to which both of the ADs belong. However, the protocol does not describe how a customer AD ($AD_j$) in some trust domain (TD1) could use a route $P_p$ from a provider ($AD_i$) in a different trust domain (TD2), that is, when no common trust domain contains both ADs.

In some cases two ADs may still want to communicate even though they are in different TDs. SCI-FI supports the use of routes that cross TD boundaries at the cost of isolation by expanding the scope of a path to a “universal” trust domain, TD*. TD* is a virtual domain that contains all TDs. By introducing the TD* domain, every pair of ADs now has at least one common containing domain. For example, consider the case that the next largest containing TD of TD1 and TD2 is TD*. In this case, the provider $AD_i$ will re-scope the advertisement from TD2 to TD* by changing the context of the edge $i \to j$ to TD*. Subsequently this route can be identified as a valid TD* route but not a valid TD2 route. By using the TD* domain, we can still support use of the route at a lower assurance level. Since TD* is a virtual domain without authorities or servers, routes for TD* are hosted at the route servers of the originating TD.

7 Lookup

Once an endpoint D has selected its k up-paths, D must publish this reachability information in a reliable manner such that source endpoints may determine suitable routes to D. To support this lookup, each TD Core runs replicated Address Server(s) and Path Server(s) that are reachable via a well-known address.

The following steps take place in the lookup process. We start with a source endpoint wishing to reach a destination endpoint, which is identified using a human-readable label and a TD identifier, defaulting to the local TD. The source endpoint issues a query to the local Address Server for resolution of the label. The TD’s Address Server resolves this query by returning an identifier indicating which AD(s) serve an endpoint associated with the label, as well as which sub-TDs contain this AD ID. The source can use this TD-membership information to query the Path Server in an appropriate TD for the k up-paths of the destination. After discovering the set of up-paths, the source constructs a route from itself to the destination endpoint by selecting a segment of one of the k paths.

We assume that knowledge of the identity of the TD Core implies possession of the public key of that TD Core; this is similar to the assumption of, e.g., browsers having ICANN’s root public key for DNSSEC. Also, to certify the mapping between the endpoint identifier and the address, each TD Core effectively maintains its own autonomous endpoint identifier space. Hence, an endpoint identifier can be an unambiguous human-readable name (as long as it includes at least one TD identifier). For example, an identifier for the US-based car manufacturer Ford may be “Ford” qualified with the “USA” TD. Naturally, this system would also support cryptographic identifiers made of non-human-readable bitstrings, such as AIP [23]. For simplicity, we assume that human-readable names are used in the rest of this paper.

7.1 Address Resolution Service

Name resolution in SCI-FI follows the same foundational principles as the routing design, namely the isolation of influence to the TDs involved. The architecture devolves control to the endhosts, allowing them to scope name resolution. Furthermore, the design settles disputes by resolving non-TD specified addresses at the local domain, where presumably an enforceable dispute resolution process exists.

Every TD will provide its own internal Address Server within the TD Core, accessible at a default address, which

will resolve the name locally if possible, or query the appropriate foreign Address Server should the user specify a different TD. Should the local Address Server not contain the name, the server will query the other TD’s Address Server on the user’s behalf. While SCI-FI remains agnostic to the exact scheme for human-readable naming, DNS provides the most accessible example. Consider the domains ABC.us and ABC.cn, residing within the US and CN TDs, respectively. A user within the US TD will query the local Address Server and receive address information for ABC within the US TD. Should the user in the US TD request ABC.cn, the US Address Server queries the CN Address Server, and returns this information to the user.

The namespace can be flat or hierarchical. For now, we can assume that the service is structured similarly to DNS: there is one canonical root server associated with each trust domain, which can delegate the lookup to a number of sub-servers, possibly in other sub-trust domains, until a query is resolved.

To look up a path, the source queries the local Address Server with the destination identifier. The destination identifier consists of a human-readable label, such as a DNS name, and a TD identifier, which defaults to that of the local TD. The Address Server returns an identifier indicating which AD(s) serve an endpoint associated with the label, along with which sub-TDs contain the AD ID. If the local Address Server cannot resolve an identifier locally, the Address Server queries the other TDs’ Address Servers.

The name resolution service takes as input a human-readable name $N_E$ (e.g., “Ford”) and a TD identifier C (e.g., “USA”), and outputs a list of (self-certifying) AD numbers and endpoint identifiers: each result is an AID:EID pair constructed in a way similar to that of AIP. Optionally associated with each AID:EID pair is a hierarchical nesting of trust domains (e.g., local, regional, and continental trust domains) that can be used to delegate the reachability functionality to sub-trust domains. Specifically, a query on $N_E$ at TD C should produce the following record:

$N_E$ in TD C resolves to:
  $AID_1 : EID_1$, $AID_1 : EID_2$     $AID_1 \in TD_{1,1} \subseteq TD_{1,2} \subseteq TLD_C$
  $AID_2 : EID_3$                      $AID_2 \in TLD_C$

Table 2: Example of a lookup record for name $N_E$ in TD C

In the example of Table 2, the lookup of $N_E$ under TD C returned three records: two endpoints $EID_1$, $EID_2$ in the same AD ($AID_1$) and another in a second AD, $AID_2$. The trust domain memberships of each AD are indicated: for example, $AID_1$ is contained in $TD_{1,1}$, which is a sub-domain of $TD_{1,2}$, which is a sub-domain of the top level domain $TLD_C$. $AID_2$ is simply indicated as a member of the top level domain $TLD_C$.

The top level TD C signs a certificate authenticating the correctness of this name resolution lookup. Each self-certifying AD identifier (or AID) comes with a co-signed certificate attesting to the membership of the AID in each of its respective trust domains, in particular C. For each AID:EID pair there is also a co-signed certificate indicating that the EID is part of the AID.

The name resolution query is routed using the identifier of the TD C. A querying source endpoint sends this query to the targeted top level TD C via one of its up-paths P, into the top level routing infrastructure, which sends it to the correct ingress point at C. TLTD C then resolves the query (possibly by querying sub-TDs) and returns the response via the originating up-path P.

7.2 k-Path Resolution

At this point the endpoint is in possession of the AID:EID of the label that was looked up, as well as possibly a hierarchy of nested trust domains that contain this AID, but not the actual route itself. The source now issues a route lookup query to the respective trust domains as appropriate (e.g., if the AID is contained in TD1 which is contained in TD2, and the source knows how to reach the Path Servers of TD1, it can contact TD1 directly). In the following description we assume that the source has no such advance information and can only reach the top level TD.

The destination AD actively uploads its k up-paths to any or all of its containing TDs’ Path Servers. The route resolution query from the source first goes to the top level TD ingress point, which may then query one of the top level Path Servers to see if the route has been uploaded at top level. If not, or if the top level trust domain prefers not to resolve individual AD routes, it may also delegate the lookup based on the TD containment information provided in the lookup query. Eventually, a trust domain is found that contains the destination AD and whose Path Servers have a fresh copy of the k up-paths of the AID.

Trust-scoped resolution. A trust-scoped query enforces that a given route computation should only involve (and be restricted to paths using) ADs within a specific set of trust domains. For example, if a query is scoped to within top level domain C, any AD that is not either explicitly specified by the querier, or in C, should not be able to directly affect the communication or the results of the query. For top-level scoped reachability queries, the implementation is straightforward. The querier sets a flag in its query indicating that this is a trust-scoped query restricted to the top level trust domain C. Then, when a Path Server is queried for routes, it simply withholds any paths that are not scoped for C. If the Path Server needs to follow a delegation path (e.g., querying the Path Server of a sub-TD), the Path Server ensures that the delegation path never exits C (in terms of forwarders or communication principals). This effectively restricts the

set of participants in the route computation strictly to members of C. A label lookup can be scoped to a top level domain in a similar way.

Subtleties arise if a narrower scoping is desired. For example, suppose a source issues a route query $(AID_1 : EID_1)(AID_1 \in TD_{1,1} \subseteq TD_{1,2} \subseteq TLD_C)$, with the trust scoping constraint of $TD_{1,2}$ (where $TD_{1,2} \subseteq TLD_C$). Since $TD_{1,2}$ may not be a top level TD, and may not participate in the inter-TD top level routing layer, this implies that the route query is expected to traverse $TLD_C$ but that the ADs in $TLD_C$ (and, indeed, the authority of $TLD_C$ itself) may not be trusted. In such a situation, the source is required to designate a particular explicitly source-selected route to reach $TD_{1,2}$. The ADs in this route are allowed to drop the packet if it violates their routing policy but are expected to strictly follow the semantics of the routing otherwise. It is the responsibility of the source to discover a trusted route to reach the trusted TD $TD_{1,2}$; this can involve a separate route query (possibly scoped to a group which contains only the high-level members of the top level domain $TLD_C$), or the route can be hand-configured based on source preferences.

In-bound traffic control. Another way for the policy to be reflected is via in-bound traffic control. A destination AD can attach distinct sets of k-path up-graphs depending on the identity of the source endpoints that are performing the query. For example, if a query originates from outside the local TD, the destination AD can instruct the Path Server to only serve up a set of paths that pass through a set of specific high-security gateway ADs; whereas if the query originates inside the TD then a more general and efficient set of paths can be served up.

8 Route Joining

Section 6 described the process of route construction (where a destination AD determines its upstream topology) and Section 7 described route lookup (where a source AD discovers the k up-paths of the destination AD). It remains to show how this information can be used to construct a working end-to-end route. The joining algorithm runs at the source endpoint. It can be implemented either on the network stack of the endpoint device, or on the endpoint routers of the source AD. It takes as inputs the k up-paths of both the source and destination hosts. Each up-graph is a list of k path vectors, where each path vector is in the following format:

$\langle A \,\|\, Q(A) \to B \,\|\, Q(B) \to C \,\|\, Q(C) \to \ldots \rangle$

where A, B, C, ... are the nodes (in order) in the path, and Q(A) is the set of peering neighbors of node A that A attached onto the path. The use of Q(A) is to enable the source and destination to find not only joining points at a common ancestor, but also join points at a common peering link. For example, in Figure 5, without considering the peering links, nodes A and B can only find a route A, C, E, D, B. When also comparing the peering nodes at the up-paths, A and B are able to find a shorter path A, C, D, B.

[Figure 5 diagram: nodes Root, E, C, D, A, B.]
Figure 5: Short cut example. The link between C and D is a peering link; other links are customer-provider links.

The source endpoint needs to find a common ancestor provider or a common peering link to splice together the up-paths of the source and the destination AD. If the source and destination ADs are in distinct trust domains, then the route may need to traverse top-level TDs via the inter-TLTD protocol. Similarly, if source and destination are in the same trust domain but have distinct TD Cores, then the message will need to traverse the TD Cores in the same TD via the inter-TD Core routing protocol. To reflect this, a common ancestor C is attached as the parent of all top-level ADs in both path sets. This ensures that a common ancestor can always be found.

Finding the common join point can be accomplished via a number of methods. Since we anticipate that the paths will be reasonably short and k is also quite small (less than 10), a simple way is to hash the IDs of each provider and peering link of the destination AD up-paths into a hash table and look up the providers and peering links of the source AD. This takes time and space proportional to the total number of provider ADs and peering links named in both source and destination.

Once a common transition point X (either a common ancestor or a common peering edge) is found between an up-path $P_A$ of the destination AD A and an up-path $P_B$ of the source AD B, the endpoints are ready to initiate communication. The source endpoint attaches $P_A$ and $P_B$ to the packet, along with their MAC authentication information ($M_A$ and $M_B$) as constructed in Section 6. The header also contains the transition point X that was selected by the source endpoint. The packet now contains enough information to be routed to the destination. On-route, routers inspect the header to determine the next AD hop, and check the MACs that locally authenticate that this is a well-constructed, legitimate route. When the packet has reached the destination, the destination can reverse the paths to construct a symmetric return path, thus facilitating two-way communication. During the lifetime of the connection, the source endpoint can monitor path quality and switch to any of the other $k^2$ combinations of alternative path choices to improve latency, throughput, or drop rates.

9 Security Properties

Rather than design our architecture with specific countermeasures against known attacks, we have designed it using sound principles of isolation, control and scalability such that security follows as a natural intrinsic property rather than as an external property added on by a separate subprotocol. This section discusses some of the natural resilience afforded by our architecture in contrast with the current Internet infrastructure.

Isolation: enabling attack localization. As discussed in Section 2, the primary weakness of current interdomain routing protocols (even with their semantics perfectly secured) is that the system is vulnerable to various forms of routing plane attack by any on-path adversary, i.e., any adversary that is, or could make itself be, on any path between the source and the destination. Since the routing announcements in those routing protocols are globally disseminated without isolation and control, this includes every AD in the world to a varying degree; closer ADs have increased attack power. In contrast, SCI-FI provides strict isolation and control properties. Since customers never “announce” any routing control messages to providers, influence over route construction flows strictly downstream: this means that any malicious provider can only attack routes belonging to its own customers. Furthermore, since routing control is divided by trust domain, a malicious AD can only attack routes which have at least one endpoint inside its trust domain. The attacker is restricted to a limited capability of falsifying upstream routes within the same domain to downstream customer ADs of the malicious AD: it cannot, for example, attempt to forge a false route lookup result from another TD. Finally, notice that the targets of these attacks have to be downstream endpoint ADs, who are the final approvers of the route computation (each endpoint must explicitly select k up-paths, sign the selection and upload it to a route server). Because the target is actively involved in the final route approval stage, attacking the up-path computation cannot be done stealthily: the target of the attack always has a chance to examine the forged route and always has to approve it explicitly. Compare this, for example, with the case of path computation in BGP, where a destination has no control over, and is often unaware of, what routes to itself are eventually adopted by the rest of the Internet.

Control: providing resilience and flexibility. Another advantage of the architecture is resilience and flexibility. When an AD is unable to determine or control the route that it uses to another AD, it is unable to facilitate a well-quantified level of trustworthiness or reliability for its network service provision. For example, consider an ISP attempting to implement a highly reliable service to other important ISPs. Currently, without setting up a dedicated sub-network for this transport, it cannot do this in existing routing protocols. We would like to facilitate this naturally within inter-domain routing, by providing route control and selection properties at the AD level.

In existing interdomain routing protocols, a destination AD has essentially no control over its route: it must simply advertise its prefix to its providers and hope for the best. The source AD has very limited control: if it is multi-homed, it can pick any one of its providers to serve the route. If it is not multi-homed (as many stub ADs are) then it likewise has no route choice once it has selected its provider. In the SCI-FI architecture, in contrast, both endpoint ADs get to select a set of k well-defined paths from the path set provided by their respective providers. This choice is made explicitly, with full knowledge of the exact identities supplying the path, and the full authentication information of each path is provided. A source AD gets both these sets to choose from, yielding potentially up to $k^2$ end-to-end AD-level paths. A destination AD can in fact select different sets of k paths to serve up to different source endpoints; for example, the destination AD can approve a separate set of high-assurance paths for trusted entities; this can be provided even if the TD route servers are untrusted by encrypting the route record using a group key. Furthermore, these route sets are all separated by trust domain, with each domain maintaining a different set of k paths. An AD that is in more than one TD can thus not only switch paths but also change the routing context to a different TD.

Control: facilitating resource control. In terms of resource management, TD separation can also be used to provide a form of resilience against resource consumption attacks and DoS. For example, routers could allocate independent bandwidth to each TD. An adversary that does not control any malicious routers in a given TD X will never be able to eavesdrop on route advertisements belonging in X. Hence, the adversary cannot consume router resources dedicated to TD X unless it specifically targets an endpoint inside X. However, since this endpoint is legitimate, it can shut down the adversary’s access to its routes in a number of ways: it could blacklist the malicious AD at the route server, causing it to be unable to receive new capabilities to send packets once its current capabilities time out; or, if the endpoint determines that it is under DoS, it could ensure that all unauthorized traffic passes through a dedicated lower-priority AD by serving up alternative paths to well-established trusted flows. Existing interdomain routing protocols have only a single routing plane and are thus unable to perform this kind of isolation.

Scalability: preventing stale routes. SCI-FI also supports a much more flexible and fine-grained ability to manage route updating. In existing interdomain routing

systems, the only mechanism for route changes is to incrementally push out another routing update or a route withdrawal. Because there is no route control from the destination, these updates may or may not propagate correctly to other ADs. Hence, stale (invalid) routes may remain in routing tables for a long time. Furthermore, attempting to secure these explicit updates is problematic. An attacker could re-order or re-inject route update messages, causing invalid and inconsistent routes to propagate in the network. Path-vector based route announcements cannot have short timeouts, since a path-vector update requires a destination AD to push its announcement to the entire network, and the protocol is not scalable if every AD is performing this broadcast at a high rate like link-state routing, since this would cause $O(n^2)$ communication overhead per update (it involves all-to-all communication). In SCI-FI, on the other hand, all routes come essentially directly from the destination AD via the route servers of the TD. There is no global distributed consistency issue since the route servers are centrally administered. Since all route discovery occurs in an upstream-to-downstream direction, frequent updates can be done highly scalably ($O(n)$ communication per update) in a coordinated fashion. Hence, published routes can have very short timeouts and be updated frequently and securely.

These observations are not exhaustive; they are simply examples of a number of BGP flaws that are naturally prevented through good design supporting well-founded isolation and control. Note that none of these useful properties required an additional add-on protocol to provide functional security, and the validity of each property is quite clear without requiring extensive cryptographic reduction proofs. A summary of the properties discussed in this section is in Table 3.

Isolation
  BGP      No isolation; can be attacked by any AD on any path.
  SCI-FI   Isolation to restrict attack scope; can only be given false route data by upstream providers in the same TD.
Path control
  BGP      Dest: no path choice.
           Source: pick one out of d paths.
  SCI-FI   Source/Dest: pick k out of kd path segments.
           Dest: can enforce different paths for different traffic classes.
           Source: pick $k^2$ ways to combine segments into path.
Resource control
  BGP      Unsupported.
  SCI-FI   Reliably segregate traffic by TD on data and control planes.
Scalability
  BGP      Attacker can drop route withdrawal to keep stale routes.
  SCI-FI   Fresh routes due to scalably frequent route update.

Table 3: SCI-FI Security Advantages. d is the number of providers for a multihomed source/dest.

10 Evaluation

Due to the infeasibility of evaluating a completely new architecture on the current Internet, we have constructed an AD topology based on real-world datasets to evaluate the effectiveness of SCI-FI. Specifically, we simulate SCI-FI on a measured Internet AD topology from the CAIDA dataset to evaluate the routing efficiency, security, and expressiveness.

10.1 Evaluation Methodology

We evaluate our protocol on the measured AD-level topology annotated with the business relationships obtained from the CAIDA dataset.¹

Trust domains. Given the measured AD topology, we virtually group the ADs into several trust domains and assign some of the ADs as the TD Core ADs in order to simulate our trust-domain-based routing. In this proof-of-concept evaluation, we consider a two-level trust domain hierarchy in which the lower level consists of five local and non-overlapping trust domains, and the upper level is the TD* domain containing all the lower level TDs. Each of these five local TDs is associated with one of the Regional Internet Registries (RIRs), which are the regional organizations allocating AD numbers. In other words, ADs registered to the same RIR belong to the same trust domain. Such a division reflects the geographical and administrative relationships to some extent. Table 4 summarizes the size of each of the trust domains. The TD Core ADs are defined as top-level ADs that have no providers themselves in their respective trust domain.

BGP Routing. When simulating BGP (and S-BGP) routing,² we assume that in benign cases ADs respect and make routing decisions based on the business relationships with their neighbors, and then use path length as the tie-breaking factor. We also assume that the TD Core ADs form a clique, and thus the length of any inter-TD Core routing path is 1. This is accomplished by adding a peering link (if it does not yet exist) between every pair of top-level ADs.

Finding k up-paths. In practice, each AD running SCI-FI can have a different set of policies in determining which k paths to export, and the infrastructure can vary the value of k as well as the algorithm for finding the (presumably disjoint) k paths. However, the optimization of the export policies, the k value, and the finding-disjoint-path algorithm are outside the scope of this paper. Instead, for the purpose of simulation, we implement a simple k-path discovery algorithm that takes a source AD, the complete AD-level topology, and a Trust Domain as inputs, and yields an up-path containing k maximally edge-disjoint paths to the TD Core ADs in the specified Trust Domain. The existing algorithms for “finding k disjoint paths” focus on minimizing the total cost of the k paths. Such algorithms are inapplicable in the context of

¹ CAIDA. http://as-rank.caida.org/data/
² The reason we compare with BGP is that it is the current de facto interdomain protocol and the CAIDA dataset directly reflects BGP paths.

Trust Domain                                                      Number of ADs   Number of top-level ADs
African Network Information Centre (AfriNIC)                                613                        39
American Registry for Internet Numbers (ARIN)                             21619                        38
Asia-Pacific Network Information Centre (APNIC)                            6039                        29
Latin America and Caribbean Network Information Centre (LACNIC)            1912                        60
RIPE NCC                                                                  19569                        34

Table 4: Results of trust domain formation.
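
The simple k-path discovery used in the simulation is described in this section as an iterative greedy procedure: at each step, take the current shortest path, then raise the weight of its edges so that later iterations prefer other edges. The following is a minimal sketch of that idea, not the authors' implementation. It assumes the topology has already been restricted to one trust domain and is given as a hypothetical `providers` mapping (each AD to its in-domain providers), that every link starts with weight 1, and that used edges are penalized by a multiplicative factor `penalty`; none of these details are specified in the paper.

```python
import heapq

def shortest_up_path(providers, weight, source, core_ads):
    """Dijkstra over provider links; returns the path to the closest core AD, or None."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u in core_ads:
            # Walk the predecessor chain back to the source.
            path = [u]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return path[::-1]
        for v in providers.get(u, ()):
            nd = d + weight[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return None

def greedy_k_up_paths(providers, source, core_ads, k, penalty=10.0):
    """Return up to k distinct up-paths, found greedily with edge re-weighting."""
    # Assumption: unit initial weight on every customer-to-provider link.
    weight = {(u, v): 1.0 for u in providers for v in providers[u]}
    paths = []
    for _ in range(k):
        path = shortest_up_path(providers, weight, source, core_ads)
        if path is None:
            break  # no core AD is reachable from the source
        if path not in paths:
            paths.append(path)  # skip repeats when fewer than k paths exist
        # Step 2 of the greedy loop: make the edges just used less preferred.
        for u, v in zip(path, path[1:]):
            weight[(u, v)] *= penalty
    return paths

# Toy topology with made-up AD names: S has providers P1 and P2.
toy = {"S": ["P1", "P2"], "P1": ["C1"], "P2": ["C1", "C2"], "C1": [], "C2": []}
print(greedy_k_up_paths(toy, "S", {"C1", "C2"}, k=3))
```

Because re-weighting only discourages reuse rather than forbidding it, the paths are "maximally" rather than strictly edge-disjoint: in the toy topology the third path must share the S to P2 link with the second one.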

finding routing paths in SCI-FI because the probability of using a backup path may be much lower than that of using a high-priority path. For example, one may prefer a set of paths with costs {2,10} to another set with costs {6,6}, even though the total costs are the same. Therefore, rather than minimizing the total cost, our k-up-path discovery algorithm selects the disjoint paths using an iterative, greedy algorithm. At step i, the greedy algorithm 1) finds the current shortest path as the ith maximally disjoint path, and 2) increases the weight of all the edges on the ith path so that these edges become less preferred in the next iteration. Fig. 6 shows the distribution of the number of available paths. More than 90% and 80% of ADs have fewer than 10 available paths in the sub TDs and TD*, respectively.

Figure 6: Measurement results of AD-level end-to-end path diversity. (Plot: number of ASes versus number of available paths, logarithmic x-axis; series: in TD*, in local TD.)

10.2 SCI-FI Efficiency: Route Stretch

We evaluate the efficiency of SCI-FI based on its route stretch, which is defined as the ratio of the average path length in SCI-FI to the average path length in BGP. Our simulation takes 1000 random pairs of source and destination ADs. For each pair, we measure the length of the BGP route and the SCI-FI route between the source and the destination. The results show that the average path length is 3.9 hops in BGP and 4.72 hops in SCI-FI when k = 1, i.e., a route stretch of about 1.21. Routing in SCI-FI thus adds only a small amount of overhead in terms of path length, even without considering the shortcut paths that do not go through the TD Core ADs.

10.3 Expressiveness

As explained in Section V, with the ability to re-scope to the TD* universal domain, SCI-FI can express all the valley-free BGP paths, because the TD* domain contains all the ADs and thus routing in the TD* domain is essentially equivalent to routing in the Internet using BGP. Specifically, we can show that every policy-compliant BGP path between a pair of source and destination ADs is always contained in the union of their full up-paths (i.e., k → ∞) in the TD* domain. We consider the full up-paths because the selection of k (and also which k paths) is a policy decision, and designing an algorithm to optimize such a selection is outside the scope of our paper.

However, re-scoping to the TD* domain is sometimes undesirable because it provides only minimal security assurance. Hence in our simulation we evaluate how often the endpoint ADs have to use a TD* path to reach each other. The evaluation proceeds as follows: we randomly select 1000 pairs of source and destination ADs. For every pair, we compute the BGP path between the pair as well as the SCI-FI up-paths of the source and the destination ADs when k → ∞, i.e., the full up-paths in their respective local trust domain (not the TD* universal domain). Again, considering full up-paths eliminates the uncertainty and result variance due to the k-path selection policies. Then we check whether the BGP path between the pair is contained in the union of the up-paths of the source and the destination. The ratio of contained paths represents the expressiveness without using the TD* domain. We also evaluate the expressiveness as a function of k based on our simple k-path selection algorithm to demonstrate the practicability of SCI-FI and the trend as k increases. Fig. 7 summarizes the results of SCI-FI’s expressiveness experiments, from which we can see that with only a small k = 5 SCI-FI can already capture more than 85% of BGP path diversity without re-scoping to the TD* domain.

10.4 Security

In Section 9, we have discussed attacks that are naturally infeasible in our architecture. In this section, we quantitatively investigate how severe such attacks are in networks lacking our well-defined properties. As an example, we consider the impact of long-standing wormhole attacks when TD isolation is not enforced, i.e., an AD in

Figure 8: Impact of wormhole attacks for S-BGP, which are intrinsically mitigated in SCI-FI. (Three panels, each plotted against the number of malicious ASes from 2 to 20: ratio of unavoidable hijacked traffic, ratio of total hijacked traffic, and ratio of attractable traffic. Legend: AfriNIC, ARIN, APNIC, LACNIC, RIPE NCC.)
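
To make the two metrics behind Figure 8 concrete, the sketch below computes the ratio of unavoidable paths and the ratio of attractable paths as they are defined in Section 10.4. It is an illustration under stated assumptions, not the simulator used in the paper: the function name `hijack_ratios` is hypothetical, the benign and under-attack AD-level paths for the same source-destination pairs are assumed to have been computed already by some routing simulation, and a path counts as attracted only if it avoids every compromised AD in the benign case but traverses one once bogus announcements are made.

```python
def hijack_ratios(benign_paths, attacked_paths, compromised):
    """Return (unavoidable_ratio, attractable_ratio) over paired path lists.

    benign_paths[i] and attacked_paths[i] are the AD-level paths (lists of AD
    names) for the same source-destination pair, without and with the bogus
    wormhole announcements. compromised is the set of colluding ADs.
    """
    if len(benign_paths) != len(attacked_paths):
        raise ValueError("path lists must describe the same pairs")
    if not benign_paths:
        raise ValueError("need at least one source-destination pair")
    compromised = set(compromised)
    unavoidable = 0
    attracted = 0
    for benign, attacked in zip(benign_paths, attacked_paths):
        if compromised.intersection(benign):
            # A passive attacker already sits on the normal route.
            unavoidable += 1
        elif compromised.intersection(attacked):
            # The route reaches a compromised AD only because of the attack.
            attracted += 1
    total = len(benign_paths)
    return unavoidable / total, attracted / total

# Toy data with made-up AD names: "m" and "n" collude.
benign = [["a", "m", "b"], ["a", "c", "b"], ["a", "d", "b"], ["e", "f"]]
attacked = [["a", "m", "b"], ["a", "m", "n", "b"], ["a", "d", "b"], ["e", "f"]]
print(hijack_ratios(benign, attacked, {"m", "n"}))
```

In this toy input one of four paths is unavoidable and one more is attracted. Whether the total hijacked traffic in Figure 8 is the sum of the two ratios is an assumption the source does not confirm.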

TD_1 would prefer a route through an untrusted TD_2 as long as the route is shorter than others that are contained entirely in its trust domain. In a wormhole attack, the attacker attempts to attract routes by announcing a non-existing shortcut (or “wormhole”). Clearly, in SCI-FI, it is infeasible for an outsider to pull traffic out of a TD with a strong isolation property, whereas in TDs where no isolation is enforced, a wormhole residing in any corner of the Internet can possibly attract a significant portion of traffic from these TDs.

Figure 7: Measurement results of SCI-FI expressiveness. (Plot: expressiveness versus K = 1, ..., 10 and "all"; series: with TD*, w/o TD*.)

In our simulation, a wormhole attack is done by a group of colluding ADs that announce a minimum cost to route between each other. We consider such attacks by the most influential ADs in a compromised domain, except the TD Core ADs, which we assume to be trusted. The influence score of an AD in TD_x is evaluated by the number of ADs that see that AD as the shortest path to TD_x. The observation is that the attacker would be able to compromise a handful of the ADs in a less trustworthy domain. We apply two metrics to evaluate the power of route attraction attacks by a small number of compromised ADs: the ratio of unavoidable paths and the ratio of attractable paths. Unavoidable means that the normal BGP traffic traverses a passive attacker that does not announce bogus short paths. Attractable means that an active attacker that sends bogus route announcements can attract BGP routes that would otherwise go through some benign ADs.

Given a set of compromised ADs, our simulation repeats 1000 random selections of source and destination ADs. For each simulation we measure whether any of the compromised nodes would see the packet, in which case the route is passively compromised (unavoidable). For the active case (attractable), we simply allow the compromised nodes to form a wormhole between each other by announcing bogus paths, and only count “attracted” paths that result from these bogus path announcements.

Fig. 8 shows the impact of malicious ADs on the path-vector-based routing protocols without isolation. The x-axis represents the number of ADs that collude and attempt to attract traffic. A value of i on the x-axis means that we select the i most influential ADs, where the influence score of an AD is evaluated based on how many other AD nodes would consider this AD the shortest hop to reach outside their respective TD. Each of the data points is an average over 1000 runs of simulation with randomly selected pairs of source and destination. The result shows that without strong isolation the Internet is extremely fragile because the attacker can control over 50% of Internet traffic with as few as five compromised ADs. In contrast, by designing around the isolation, control, and scalability principles, SCI-FI can intrinsically prevent these attacks as analyzed in Section 9.

11 Related Work

While no previously proposed solution seamlessly addresses all the security issues SCI-FI addresses simultaneously, prior work falls into a few distinct categories.

Routing security. Goldberg et al. [2] analyze the weaknesses of BGP and S-BGP, quantifying their efficacy in defending against traffic attraction attacks. Indeed, existing secure routing protocols such as soBGP [24], psBGP [25], and SPV [26] only address the security of route announcement semantics, which, at best, only guarantees the paths are topologically valid but fails to ensure the logical trustworthiness and contractual legitimacy of the routes.

Routing flexibility and scalability. Researchers have proposed many inter-domain routing protocols attempting to improve the security of the current routing standard from a system’s point of view [27, 28, 29]. Also, while next-generation interdomain routing proposals abound [3, 4, 5, 6, 7, 8, 9, 10], none of these proposals
focus on providing inherent routing security. In contrast, we propose a new architecture that solves the attraction weaknesses in S-BGP, intrinsically eliminates IP spoofing, and easily extends to provide DDoS limiting.

DoS resilience. Many capability-based architectures and forwarding protocols such as SIFF [20], TVA [30], and NetFence [31] employ access control and rate control to regulate inbound traffic to the victim destination. In addition, other DoS-resilience architectures rely on building overlay networks (such as Phalanx [32]) to empower the destination with inbound traffic control. As a separate line of research, AIP [23] provides a building block – source accountability – for defending against DoS and IP spoofing by the use of self-certifying addresses. However, all of these proposals simply provide a patch to the existing Internet. Instead we argue the necessity of an entirely new architecture to achieve strong security properties.

12 Conclusion

We have presented SCI-FI, an architecture for supporting well-principled isolation in Internet routing. The architecture provides a number of useful properties to inter-domain routing, including path choice, local accountability, active route control at both endpoints, flexible path choice without exposing ISPs to source-routing attacks, and effective isolation, which allows endpoints to establish routes limited to a certain subset of ADs on the Internet using computations that involve only those ADs, so that ADs outside of the set are unable to influence the result. Rather than relying on add-on mechanisms for security, each of these properties is provided as a natural outcome of structuring and controlling the flow of the protocol computation in such a way that it respects and reinforces real-world relationships and dependencies.

Our evaluation results show that on the current Internet structure, SCI-FI incurs only about one additional hop in average routing path length. With a small number of up-paths (e.g., k = 5), SCI-FI can capture more than 85% of current BGP paths, thus exhibiting high path expressiveness with small overhead. Thanks to the strong security properties that are an intrinsic consequence of built-in isolation, scalability, and control, our evaluation results show that SCI-FI can improve end-to-end availability by up to 80% compared to existing (secure) interdomain routing protocols.

The SCI-FI architecture demonstrates the importance of building in good control and isolation properties in a network protocol for a large number of heterogeneous principals. Control properties empower the principals that have the highest stake in the communication, while isolation protects the network elements that facilitate this communication establishment from external influence. Together, these elements allow endpoints and network providers to take an active role in explicitly determining the trusted elements in each route computation. This makes it feasible to reason about the assurance level of a given path, and to generate different routes with resilience and security appropriate to each application.

References

[1] “Insecure routing redirects youtube to pakistan,” February 2008, http://arstechnica.com/old/content/2008/02/insecure-routing-redirects-youtube-to-pakistan.ars.
[2] S. Goldberg, M. Schapira, P. Hummon, and J. Rexford, “How secure are secure interdomain routing protocols,” SIGCOMM Comput. Commun. Rev., vol. 40, no. 4, pp. 87–98, 2010.
[3] P. B. Godfrey, I. Ganichev, S. Shenker, and I. Stoica, “Pathlet routing,” in Proc. SIGCOMM Workshop on Hot Topics in Networking, 2008.
[4] M. Motiwala, M. Elmore, N. Feamster, and S. Vempala, “Path splicing,” in ACM SIGCOMM, 2008.
[5] W. Xu and J. Rexford, “MIRO: Multi-path Interdomain Routing,” in ACM SIGCOMM, 2006.
[6] N. Kushman, S. Kandula, D. Katabi, and B. M. Maggs, “R-BGP: Staying Connected In a Connected World,” in USENIX NSDI, 2007.
[7] X. Zhang, A. Perrig, and H. Zhang, “Centaur: A hybrid approach for reliable policy-based routing,” in Proceedings of the International Conference on Distributed Computing Systems (ICDCS), Jun. 2009.
[8] X. Yang and D. Wetherall, “Source Selectable Path Diversity via Routing Deflections,” in ACM SIGCOMM, 2006.
[9] D. Wendlandt, I. Avramopoulos, D. Andersen, and J. Rexford, “Don’t secure routing protocols, secure data delivery,” in ACM Hotnets, 2006.
[10] X. Zhang and A. Perrig, “Correlation-resilient path selection in multi-path routing,” in IEEE Globecom, 2010.
[11] S. Kent, C. Lynn, J. Mikkelson, and K. Seo, “Secure border gateway protocol (S-BGP) – real world performance and deployment issues,” in Symposium on Network and Distributed Systems Security (NDSS), Feb. 2000.
[12] M. Caesar and J. Rexford, “BGP routing policies in ISP networks,” IEEE Network Magazine, vol. Special issue on interdomain Routing, Dec 2005.
[13] L. Gao and J. Rexford, “Stable internet routing without global coordination,” IEEE/ACM Trans. Netw., vol. 9, no. 6, pp. 681–692, 2001.
[14] A. Greenberg, G. Hjalmtysson, D. A. Maltz, A. Myers, J. Rexford, G. Xie, H. Yan, J. Zhan, and H. Zhang, “A clean slate 4D approach to network control and management,” in ACM SIGCOMM CCR, 2005.
[15] R. Mahajan, D. Wetherall, and T. E. Anderson, “Understanding BGP misconfiguration,” in ACM SIGCOMM, 2002.
[16] X. Yang, “NIRA: A new routing architecture,” in ACM SIGCOMM FDNA Workshop, 2003.
[17] M. Gerla, X. Hong, and G. Pei, “Landmark routing for large ad hoc wireless networks,” 2000.
[18] P. Smith, “Internet routing table analysis update,” www.sanog.org/resources/sanog14/sanog14-pfs-RoutingTable.pdf, 2009.
[19] R. Mahajan, D. Wetherall, and T. Anderson, “Understanding BGP misconfiguration,” in ACM SIGCOMM, 2002.
[20] A. Yaar, A. Perrig, and D. Song, “SIFF: A stateless Internet flow filter to mitigate DDoS flooding attacks,” in IEEE Symposium on Security and Privacy, 2004.
[21] X. Liu, X. Yang, and Y. Lu, “To filter or to authorize: Network-layer DoS defense against multimillion-node botnets,” in Proceedings of ACM SIGCOMM, 2008.
[22] B. Parno, D. Wendlandt, E. Shi, A. Perrig, B. Maggs, and Y.-C. Hu, “Portcullis: Protecting connection setup from denial-of-capability attacks,” in Proceedings of ACM SIGCOMM, 2007.
[23] D. G. Andersen, H. Balakrishnan, N. Feamster, T. Koponen, D. Moon, and S. Shenker, “Accountable Internet Protocol (AIP),” in Proc. ACM SIGCOMM, Aug. 2008.
[24] R. White, “Securing BGP through secure origin BGP,” Cisco Internet Protocol Journal, Tech. Rep., Sep. 2003.
[25] T. Wan, E. Kranakis, and P. van Oorschot, “Pretty secure BGP (psBGP),” in Proceedings of Symposium on Network and Distributed System Security (NDSS’05), 2005.
[26] Y.-C. Hu, A. Perrig, and M. Sirbu, “SPV: Secure path vector routing for securing BGP,” in Proceedings of ACM SIGCOMM, Sep. 2004.
[27] K. Butler, P. McDaniel, and W. Aiello, “Optimizing BGP security by exploiting path stability,” in Proceedings of the 13th ACM conference on Computer and communications security, ser. CCS ’06, 2006, pp. 298–310.
[28] J. Karlin, S. Forrest, and J. Rexford, “Pretty good BGP: Improving BGP by cautiously adopting routes,” in Proceedings of the 2006 IEEE International Conference on Network Protocols, Washington, DC, USA, 2006, pp. 290–299.
[29] J. P. John, E. Katz-Bassett, A. Krishnamurthy, T. Anderson, and A. Venkataramani, “Consensus routing: the internet as a distributed system,” in Proceedings of the 5th USENIX Symposium on Networked Systems Design and Implementation, ser. NSDI’08, 2008, pp. 351–364.
[30] X. Yang, D. Wetherall, and T. Anderson, “TVA: A DoS-limiting network architecture,” in IEEE/ACM Transactions on Networking, 2008.
[31] X. Liu, X. Yang, and Y. Xia, “NetFence: Preventing internet denial of service from inside out,” in ACM SIGCOMM, 2010.
[32] C. Dixon, T. Anderson, and A. Krishnamurthy, “Phalanx: Withstanding multimillion-node botnets,” in USENIX NSDI, 2008.
