
On Evaluating Policy-Based Bandwidth Management Devices

Huan-Yun Wei* and Ying-Dar Lin
Department of Computer and Information Science
National Chiao Tung University, Hsinchu, Taiwan
Tel: +886-3-5712121 ext. 56667
Fax: +886-3-5721490
Email: {hywei,ydlin}@cis.nctu.edu.tw
Policy-based bandwidth management defines how to allocate bandwidth resources according to
organizational policy rules. Enterprises often employ such policy-based devices at their organizational
edges to manage the narrow but expensive Internet access links. This work designs a novel testbed and
uses it to evaluate the functionality and performance of many such devices, including six commercial
products and one open source solution. Their policy rules can be categorized into (1) class-based rule;
(2) connection rule within a class; (3) bandwidth borrowing rule among classes. The testbed mimics the
real-life Internet with heterogeneous Internet delays/delay jitters/packet loss rates, and evaluates the
effectiveness of policy enforcement of the above three policy types in terms of accuracy, fairness,
stability, robustness, bandwidth borrowing, and voice over IP (VoIP) quality. The test results reveal that
(1) explicitly sizing the TCP window can cause performance or fairness degradation even under slight
packet loss rates; (2) the open source solution can compete with commercial products in accurately
limiting flow aggregates; (3) voice quality over IP networks depends significantly on the packet sizes of
all other traffic when using a narrowband (125kbps) access link.
Keywords: policy-based, bandwidth management, TCP, testbed, emulator

* Corresponding author
Note: All test results are verified by the vendors and are reproducible through our open tools. Nowadays most benchmark
reports are financed by vendors, lack practical testbeds, and may therefore be biased. Guided by this neutral test, readers can
obtain in-depth insights when examining bandwidth management devices.
1. Introduction
Internet services provide an economical and convenient way to carry out business, such as
efficient information exchange among branch offices, or efficient customer/provider access to the
services. However, the importance of these services varies, and enterprises often fail to utilize the
narrow but expensive WAN link bandwidth effectively. For instance, the bandwidth required by ERP
(Enterprise Resource Planning), voice over IP (VoIP), and e-business may be occupied by less-important
applications such as FTP. Since end-to-end Internet QoS such as DiffServ [1] is still experimental,
enterprises seek to at least manage their own inbound and outbound links. Thus, policy-based bandwidth
management devices are deployed at organizational edges to set and enforce organizational policies in
pursuit of the utmost benefit.
Network administrators define policy rules to achieve the enterprise's resource management
objectives. Each policy rule contains condition and action fields that define specific actions for
specific conditions. The condition defines the packet-matching criteria, such as a certain subnet, application,
or protocol. The action defines the bandwidth parameters, such as at least 100kbps or at most 200kbps.
Each policy rule is therefore class-based: it groups a set of traffic flows into a per-class queue according to
the specified packet filter (condition), and the class of traffic is then scheduled out at its corresponding
specified bandwidth (action). Moreover, the class-based rules can be further configured with bandwidth
borrowing among the classes to utilize available bandwidth dynamically and effectively. Additionally, each
connection within a class can be guaranteed at least a certain amount of bandwidth. Throughout
this work we evaluate the effectiveness of policy enforcement for the above three policy types:
(1) class-based rule; (2) connection rule within a class; (3) bandwidth borrowing rule among classes.
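As a concrete illustration, a condition/action pairing can be sketched as data. The field names and values below are invented for illustration and do not reflect any particular product's console:

```python
import ipaddress

# Hypothetical policy rules (field names invented for illustration): each
# rule pairs a packet-matching condition with a bandwidth action.
rules = [
    {"condition": {"subnet": "192.168.88.0/24", "protocol": "tcp", "port": 21},
     "action": {"min_kbps": 100, "max_kbps": 200}},
]

def matches(rule, pkt):
    """Return True if the packet satisfies every field of the rule's condition."""
    cond = rule["condition"]
    return (ipaddress.ip_address(pkt["src"]) in ipaddress.ip_network(cond["subnet"])
            and pkt["protocol"] == cond["protocol"]
            and pkt["port"] == cond["port"])
```

A matching packet would be queued into the rule's class and scheduled out at the configured rate.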
The following subsections review traditional and prevalent technologies to enforce these policy rules.

Traditional Technology: Queuing
A straightforward method for bandwidth management is to queue less-important traffic and pass
important traffic as soon as possible. Queuing can be roughly categorized into (1) priority-based queuing
and (2) rate-based queuing. Priority-based queuing sets priorities among the classes, and the highest-priority
class is scheduled out first. This is suitable for short-lived, extremely important, or transaction-oriented
flows. However, priority-based queuing cannot quantitatively guarantee or limit the bandwidth of
a class. As an analogy, if everyone is a VIP, then no one is really a VIP. In contrast, rate-based queuing
employs various packet scheduling algorithms [2] that decide from which class the next packet is
transmitted. This can effectively limit senders who try to overburden the resource.
Moreover, the minimum bandwidth for important applications can be quantitatively guaranteed. Floyd and
Jacobson [3] further investigate bandwidth borrowing among the classes. Queuing has different
impacts upon UDP and TCP data flows; next we briefly review the two protocols.

Queuing the Internet Traffic: TCP vs. UDP


The majority of software applications today use TCP (Transmission Control Protocol) for data
transmission because TCP establishes a reliable end-to-end connection. TCP receivers acknowledge
the successful reception of each data packet by replying with an Ack to their TCP senders; the Ack packets
in turn trigger the senders to send out new data packets. Unacknowledged data packets are retransmitted to
guarantee reliable data transfer. TCP also incorporates flow control mechanisms that prevent a
sender from overburdening the network capacity or overflowing its receiver's buffer. Each TCP
sender therefore keeps two window values, the congestion window (CWND) and the receiver advertised
window (RWND), which respectively track the network capacity (congestion control) and the receiver's
capability of receiving the data. A TCP sender never has more than min(CWND, RWND) of
unacknowledged data outstanding. RWND is advertised by the receiver in TCP Ack packets and ranges widely
among operating systems. CWND keeps increasing, exponentially during the slow-start phase and
linearly during the congestion avoidance phase, probing for available bandwidth until packet losses occur.
Loss behavior differs among TCP versions, mainly in how the CWND is shrunk and regrown, and in how the
lost segments are accurately retransmitted. Fall and Floyd [4] give a good overview of the behavior and
problems of the Tahoe, Reno, NewReno, and SACK TCP versions. Padhye and Floyd [5] further investigate the TCP
version distribution among 4550 Web servers. Unlike TCP, UDP (User Datagram Protocol) lacks
connection establishment, reliable data transfer, and flow control. UDP only provides port-number
multiplexing and is commonly used by real-time applications such as video conferencing and Voice over
IP (VoIP).
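The interplay of the two windows can be made concrete with a toy model of our own (in MSS units, ignoring loss and recovery):

```python
def tcp_send_window(rtts, ssthresh, rwnd):
    """Toy model of a TCP sender's usable window, in MSS units: CWND
    doubles each RTT in slow start (while below ssthresh), grows by one
    MSS per RTT in congestion avoidance, and the sender is always bounded
    by min(CWND, RWND). Loss handling is deliberately omitted."""
    cwnd, windows = 1, []
    for _ in range(rtts):
        windows.append(min(cwnd, rwnd))               # usable window this RTT
        cwnd = cwnd * 2 if cwnd < ssthresh else cwnd + 1
    return windows
```

For example, with ssthresh = 8 and a generous RWND, the window grows 1, 2, 4, 8 and then linearly; with a small RWND the receiver's limit caps the sender regardless of CWND.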
Queuing has different impacts upon UDP and TCP flows. For real-time UDP traffic, the bit rate
is often fixed, and the video/voice quality heavily depends on the loss rate, delay, and delay jitter. The
packet scheduler must precisely allocate enough bandwidth for real-time UDP traffic to minimize packet
losses and delay at the controlling device. Moreover, the packets of real-time traffic need to be
scheduled out smoothly, at even intervals, to minimize the delay jitter. As for TCP traffic, TCP
flows competing for the same queue can leave a great amount of data queued in the device,
resulting in a high buffer requirement and large packet latency at the device. Moreover, the TCP flows
may not share the class bandwidth fairly, especially when their round-trip times (RTTs) differ.
Thus many vendors apply specific algorithms for regulating TCP traffic.

Specific Algorithms for TCP Traffic


To guarantee the bandwidth of each TCP connection within a class, and hence achieve fairness among
the flows within the class, the ideal solution is to actively control the sending rate of each sender in
the class instead of letting the senders compete with each other. Queuing, and thus queuing delay and buffer
requirements, can thereby be reduced. Other types of traffic such as UDP can only resort to the primitive solution,
queuing, to passively control their bandwidth. Two methods exist for controlling each TCP connection: (1)
window-sizing and (2) packet-dropping.

1. Window Sizing: Since a TCP connection can be actively controlled through the feedback
Acks, the window-sizing method directly limits the amount of data a sender may transmit by shrinking the
RWND carried in the TCP Acks. In this test, iPolicer, PacketShaper, WiseWAN, QoSWorks, and Guardian
Pro belong to this type. Karandikar et al. [6], sponsored by Packeteer, investigate the window-sizing
technique. Though window-sizing can directly control per-connection bandwidth, it must
readjust its Ack regulation whenever a connection enters or leaves the class.
2. Packet Dropping: Because a TCP sender slows down its transmission rate in response to
network congestion by halving its congestion window, the packet-dropping method drops
packets and expects the sender to slow down upon detecting the packet loss events
[7]. In this test, FloodGate (which uses per-flow queuing) and ALTQ_CBQ+RED belong to this type.
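The core of the window-sizing idea, rewriting the 16-bit window field in a passing Ack, can be sketched as follows. This is a minimal illustration of the mechanism (the function name and framing are ours, and a real device must additionally handle window scaling and per-class state); the checksum is patched incrementally per RFC 1624 so the segment stays valid:

```python
import struct

def resize_ack_window(tcp_header: bytes, max_window: int) -> bytes:
    """Clamp the receive-window field (bytes 14-15) of a TCP header to
    max_window and patch the checksum (bytes 16-17) incrementally
    (RFC 1624), so the rewritten segment still checksums correctly."""
    old_win = struct.unpack_from("!H", tcp_header, 14)[0]
    new_win = min(old_win, max_window)
    if new_win == old_win:
        return tcp_header                     # nothing to rewrite
    old_sum = struct.unpack_from("!H", tcp_header, 16)[0]
    # RFC 1624 Eqn. 3: HC' = ~(~HC + ~m + m'), in one's-complement arithmetic.
    s = (~old_sum & 0xFFFF) + (~old_win & 0xFFFF) + new_win
    s = (s & 0xFFFF) + (s >> 16)
    s = (s & 0xFFFF) + (s >> 16)              # fold carries
    out = bytearray(tcp_header)
    struct.pack_into("!H", out, 14, new_win)
    struct.pack_into("!H", out, 16, ~s & 0xFFFF)
    return bytes(out)
```

Because the update is incremental, the one's-complement sum over the whole header is unchanged, which is exactly what keeps the receiver's checksum verification passing.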

This work designs a novel testbed for evaluating the effectiveness of various policy enforcement
techniques used by existing products or solutions. The testbed mimics the real-life Internet
characteristics such as WAN delay, delay jitter, and packet loss. Section 2 compares the relevant
information of the devices under test (DUT). Section 3 then describes the design of our testbed and the
test methodology. Section 4 demonstrates the test results. Finally, a summary of the test results and
conclusions are given in Section 5.

2. Device under Test (DUT)


This test project invited nine vendors, six of which joined the test. Table 1 compares the
relevant information of all the DUTs. Most DUTs are installed on the LAN-router link to prevent router
queues from overflowing and causing congestion. Because the grade of each DUT differs, only low-bandwidth
configurations (below 1.544Mbps) are tested. This minimizes hardware differences so that the test
results can reflect the true management capability of each DUT.
[Table 1 (flattened layout not fully recoverable) records, for each DUT: vendor and model, announced grade (from 5Mbps to 100Mbps), software version, operating system (FreeBSD, NT 4.0, embedded NT/Linux/FreeBSD, or proprietary), software or hardware form factor, installation point (on the LAN side, or on the WAN link for WiseWAN's V.35 interface), hardware platform and RAM, boot device (hard disk or flash), network interfaces (10/100Mbps Ethernet or V.35), fail-over, and high-availability support. ALTQ 2.2 runs on our own P-III 700MHz PC with 256MB RAM, booting from a hard disk.]

Note 1: Invited vendors also included Lucent's Access Point and Allot's NetEnforcer (these two decided not to join after examining our test plan) and Cisco's Cisco Assure (did not want to join from the beginning).
Note 2: Fail-over is defined as the capability of bypassing traffic when the power is off. HA means an optional high-availability module.
Note 3: Sitara revealed to us that QoSWorks uses ALTQ_CBQ.

Table 1: Product information and software/hardware platforms

2.1 Functionality of Policy Console


Network administrators use the policy console to define organizational bandwidth policy rules. Table 2
lists the functionality of each policy console. All DUTs can limit the bandwidth of a class. Moreover,
most DUTs can guarantee the minimum bandwidth of each connection within the class, except for
Guardian Pro and ALTQ. These two settings can be complemented by (a) inter-class bandwidth borrowing
and (b) intra-class bandwidth borrowing, respectively. In (a), the DUTs redistribute any bandwidth
left unused by some classes to other active classes; in (b), if any flow in a class terminates, its
bandwidth is fairly redistributed to the other flows.
Vendor/Model                Direction (In/Out)   Inter-class borrowing   Intra-class borrowing
ALTQ                        Both                 Auto                    Compete(2)
NetGuard's Guardian Pro     Both                 Degree(1)               Compete
CheckPoint's FloodGate      Both                 Degree                  Degree
NetReality's WiseWan        Both                 Auto                    Auto
Acute/BroadWeb's iPolicer   Both                 --                      --
Packeteer's PacketShaper    Both                 Degree                  Degree
Sitara's QoSWorks           Both                 Auto                    Auto

(The source table also records each DUT's packet classifier fields (src/dst IP/port, mask, protocol ID, host list), UDP traffic control, WAN link speed setup, per-class bandwidth limit, and per-connection bandwidth guarantee; these check-mark columns could not be recovered from the flattened layout.)

(1) Degree means that administrators can manually specify the degree of bandwidth borrowing.
(2) DUTs without a connection guarantee let the flows within the class compete with each other.

Table 2: Functionality Comparison of the Devices under Test

2.2 Protocol Support


Table 3 compares the protocol support of each DUT. Most Internet services/protocols can be
recognized by layer-4 TCP/UDP port numbers. However, layer-7 awareness can increase the simplicity
and capability of bandwidth management. For example, the FTP protocol includes a passive mode, in
which the FTP-data port (normally port 20, for sending data) is dynamically renegotiated over the
FTP control connection (port 21, for sending FTP commands). If the DUT cannot recognize the negotiation
on the FTP control connection, it obviously cannot control the connection that actually sends the data.
PacketShaper and WiseWAN have the richest layer-7 awareness. In terms of the number of built-in
port-service mapping entries, WiseWAN and PacketShaper are the richest, followed by FloodGate and
Guardian Pro. iPolicer, QoSWorks, and ALTQ have few or no built-in port-service mapping entries and
require manual lookups in the port-service mapping table. Although iPolicer can identify UDP, it cannot
control its bandwidth.

[Table 3 (flattened layout not fully recoverable) records, for each DUT, its layer awareness (layer-4 and, where supported, layer-7 type) and the number of built-in port-service mapping entries for TCP, UDP, ICMP/IPX, and other protocols. The recoverable entries are: ALTQ and QoSWorks have no built-in mappings (ports must be assigned manually); iPolicer also requires manual port assignment and cannot control UDP; FloodGate and WiseWAN support URL/MIME-TYPE layer-7 classification; PacketShaper supports URL classification with over 200 built-in entries in total (layers 2~7); the remaining per-cell counts could not be unambiguously recovered.]

Note: This table lists only the protocols that the DUTs can control, rather than merely recognize.

Table 3: Comparison of Protocol Support


Appendix A-1 and A-2 further compare the policy console user interfaces and special functions of
the DUTs. Most DUTs mix priority-based and rate-based queuing; however, this test focuses on rate-based
policies that control TCP connections flowing from the enterprise (LAN) to the WAN, since TCP
traffic occupies most of the Internet traffic. As for UDP traffic, this test focuses on real-time applications
such as Voice over IP (VoIP). Differences between the configured bandwidth and the measured results
will be quantified.

3. Testbed and Test Methodology


The testbed and test methodology significantly influence the test results and require careful examination
to avoid misinterpretation of the results.

3.1 Testbed: Mimicking the Real-Life Internet


The Internet is highly dynamic. Different connections take different paths and therefore experience
different distances and path qualities. Our testbed mimics these properties by assigning a WAN delay,
WAN delay jitter, and WAN packet loss rate to each routing path. Figure 1 and Table 6 show complete
information about our testbed and testing tools. Testing data flows run from X to Y, passing through the DUT,
routers, monitoring point, and WAN emulator. The Cisco routers are installed specifically for WiseWAN
because of its V.35 interface. Each DUT is individually tested on this testbed. Appendix B displays a photo
of our testbed. IP-aliasing employed at A and I in Fig. 1 emulates multiple competing sources and their
corresponding sinks, respectively. A self-written wan-emu virtual interface driver emulates the
dynamics of the Internet. These are detailed as follows:
[Figure 1 depicts the testbed: emulated sources 1~99 (IP aliases on a Linux 2.2.14, P-III 700 host, subnet 192.168.88.X) connect through a 100M hub to the software DUTs (FloodGate, Guardian Pro on NT servers) and hardware DUTs (QoSWorks, PacketShaper, iPolicer), or, for WiseWAN, to a V.35 cable between two Cisco 2514 routers. Traffic then passes the monitoring point and the WAN emulator (Linux 2.2.14, P-III 700) toward emulated destinations 1~99 and an FTP server on the far side (subnets 172.16.86-89.X and 10.1.1.X). Two Cisco 1750 VoIP gateways with attached telephones exchange G.729 RTP voice over the same path, and SmartBits with SmartVoIpQoS (Windows 2000, P-III 700, COM2) generates and measures VoIP flows.]

Figure 1: The Testbed: Mimicking the Real-life Internet


Note: All PCs are equipped with Intel Express Pro 10/100Mbps network interface cards. The V.35 serial clock rate between the Cisco routers is set to 2Mbps. Each DUT is individually tested on this testbed.

Tool: Ncftpput [16]
  Function: TCP traffic generator.
  Description: 20 ncftpput flows from subnet X to subnet Y; packet size: 1,500 bytes; TCP options: SACK/timestamp/window-scaling disabled.
  Position in Fig. 1: A.

Tool: SmartVoIpQoS [17]
  Function: VoIP (UDP) traffic generator.
  Description: a single VoIP flow with RTP-format UDP packets; codec: G.729 (50 frames/sec, frame size = 74 bytes, around 30kbps).

Tool: VoIP Gateway
  Function and description: same as above.

Tool: ttt [18]
  Function: real-time traffic bandwidth monitor.
  Description: monitors the bandwidth of the traffic passing through it by protocol, source/destination IP, etc.
  Position in Fig. 1: G, K, and N.

Tool: Tcpdump [19]
  Function: packet sniffer.
  Description: dumps each packet's header to a RAM disk to avoid I/O overheads.
  Position in Fig. 1: A and H.

Tool: Self-written AWK scripts [20]
  Function: data analyzer.
  Description: calculates statistics from the tcpdump results.

Tool: Self-written wan-emu [20]
  Function: WAN emulator.
  Description: imposes different delays, delay jitters, and random/periodic packet loss rate impairments on different flows.

Table 6: Testing Tools


1. IP-aliasing: In Linux, each network interface card (NIC) can emulate 100 NICs, with each virtual NIC
having a unique IP address. With a proper routing table setup at A in Fig. 1, we can direct flows
destined to a certain virtual NIC at I through a given virtual NIC at A. The virtual NICs generate packets with
their corresponding IP addresses, so the DUT perceives outgoing TCP data packets as coming from
different local hosts and incoming TCP Acks as coming from different remote hosts. Moreover, packets are
sent without link-layer collisions since only a single physical NIC is present at each of A and I.
2. wan-emu: Wan-emu is a Linux virtual interface driver that resides between the IP layer and the NIC
driver. In this testbed, multiple wan-emu virtual devices are attached to the sink-side last-hop NIC
driver (at H, with IP 10.1.1.254) to impose different impairments on different routes. With proper static
routes, we can direct flows destined to a virtual NIC at I through a specific wan-emu interface that has
the desired link characteristics. Each packet passing through is stamped with the time at which it should
be sent out; an interrupt is triggered every 1ms to examine how many packets are due and
should be forwarded. The timer granularity can easily be tuned to 8192 Hz in Linux. Impairments such
as random/periodic loss rates and delay jitter are also implemented.
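The wan-emu mechanism, stamping each packet with a departure time and draining due packets on a periodic tick, can be sketched as a user-space model (this is an illustrative model of the idea, not the kernel driver itself):

```python
import random
from collections import deque

class WanEmu:
    """Sketch of the wan-emu idea: each packet is stamped with the time at
    which it may leave, and a periodic 1 ms tick forwards the packets that
    are due. Delay/jitter are in ms; loss_rate is in [0, 1]. Packets stay
    in FIFO order per route, as on a real link."""
    def __init__(self, delay_ms, jitter_ms=0.0, loss_rate=0.0, seed=None):
        self.delay, self.jitter, self.loss = delay_ms, jitter_ms, loss_rate
        self.rng = random.Random(seed)
        self.queue = deque()                  # (due_time_ms, packet)

    def enqueue(self, now_ms, packet):
        if self.rng.random() < self.loss:
            return                            # random-loss impairment: drop
        due = now_ms + self.delay + self.rng.uniform(-self.jitter, self.jitter)
        self.queue.append((due, packet))

    def tick(self, now_ms):
        """Called every 1 ms (timer interrupt); returns the packets now due."""
        out = []
        while self.queue and self.queue[0][0] <= now_ms:
            out.append(self.queue.popleft()[1])
        return out
```

One emulator instance per route reproduces the per-path delay, jitter, and loss settings described above.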
Note: Some operating systems, such as FreeBSD and Windows 2000, support alias IP addresses but cannot support alias interfaces.

3.2 Test Methodology


This test includes three sub-tests: the Basic Test, the Robustness Test, and the Advanced Test.

A. Basic Test
This test evaluates the accuracy of the class bandwidth and the fairness among the connections
within each class, and also investigates the stability of each DUT across its five runs. The total WAN link
bandwidth is set to T1 (1.544Mbps) and is partitioned into five classes (20, 40, 128, 256, and 1100kbps),
with each class matching four TCP connections. (BroadWeb/Acute iPolicer has no WAN link speed setup.)
Each class is set to guarantee each connection 1/4 of the class bandwidth. All settings are fixed, without
any bandwidth borrowing. The test is repeated in five consecutive runs, with 200-second intervals in
between. Within each run, 20 FTP connections flow simultaneously from A to I (Table 6), with each class
matching 4 connections. After 250 seconds, all ncftpput processes are killed. Data from 30 to 230 seconds
are analyzed. The statistics are explained in Table 7; Appendix C illustrates them with an intuitive
example.
Statistic: Accuracy
  Quantifies: the difference between (1) the class bandwidth setting and (2) the measured class bandwidth.
  Definition: averaged normalized goodput* = (1/5) * Sum over the 5 runs of (measured class goodput for Run i / given class goodput for Run i).
  Comparison standard: the closer to 1, the better.

Statistic: Stability of accuracy
  Quantifies: the differences of the accuracy statistic among the five runs.
  Definition: CoV** of the normalized goodput among the five runs (same ratio as above, but taking the CoV among the 5 runs instead of the average).
  Comparison standard: it depends***.

Statistic: Fairness
  Quantifies: fairness of bandwidth usage among the 4 connections in each class.
  Definition: averaged CoV of the 4 connections' goodputs = (1/5) * Sum over the 5 runs of (CoV of the goodputs among the 4 connections in Run i).
  Comparison standard: the closer to 0, the better.

Statistic: Stability of fairness
  Quantifies: the differences of the fairness statistic among the five runs.
  Definition: same as above, but taking the standard deviation among the 5 runs instead of the average.
  Comparison standard: it depends***.

Statistic: Retransmission ratio
  Quantifies: the retransmission ratio in each class.
  Definition: (1/5) * Sum over the 5 runs of (retransmitted packet count for Run i / total packets sent for Run i).
  Comparison standard: the closer to 0, the better.

* Goodput is the effective throughput (bytes/time), excluding the bandwidth consumed by retransmissions.
** CoV denotes the coefficient of variation: standard deviation over mean.
*** If the accuracy tends to 1, its stability should ideally tend to 0, implying the DUT always performs accurately. However, if the accuracy tends to 0 and its stability also tends to 0, the DUT always performs inaccurately. The same applies to fairness and its stability (Appendix C).

Table 7: Basic Test Statistics
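For one class, the Table 7 statistics can be computed roughly as follows (the five-run numbers below are invented for illustration):

```python
from statistics import mean, pstdev

def cov(xs):
    """Coefficient of variation: standard deviation over mean."""
    m = mean(xs)
    return pstdev(xs) / m if m else 0.0

# Hypothetical data for one 128kbps class over the five runs: the measured
# class goodput per run, and the goodputs of its four connections per run.
given_kbps = 128.0
class_goodput = [126.0, 129.0, 127.0, 128.0, 125.0]
conn_goodputs = [[31, 32, 33, 30]] * 5

normalized = [g / given_kbps for g in class_goodput]
accuracy = mean(normalized)                 # Table 7: closer to 1 is better
acc_stability = cov(normalized)             # CoV of the ratio over the 5 runs
per_run_fairness = [cov(run) for run in conn_goodputs]
fairness = mean(per_run_fairness)           # closer to 0 is better
fair_stability = pstdev(per_run_fairness)   # std. dev. instead of the average
```

A DUT with accuracy near 1 and both CoV-based statistics near 0 would land in the best group of the Basic Test.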

B. Robustness Test
Packets may be generated by different operating systems, hence by different TCP implementations,
and may pass through paths with various delays and loss rates. Long-distance TCP connections are expected
to be vulnerable to Internet losses because they require more time to obtain the Acks needed to recover to
their target bandwidth. Since many DUTs regulate TCP Acks, it is our concern whether they are compatible
with the major operating systems. Table 8 describes our test methodology.

(Note: NetGuard Guardian Pro cannot accept per-connection settings.)



Test Item: Under Heterogeneous Internet Delays
  DUT settings: same as the Basic Test.
  Test methodology: WAN delays of the four connections in each class are 10ms, 50ms, 100ms, and 150ms.
  Comparison standard: same as the Basic Test.

Test Item: Under Various Internet Loss Rates
  DUT settings: 200kbps for the test flow.
  Test methodology: a single TCP connection is tested under 0.5%, 1%, 2%, 4%, and 8% periodic loss rates.
  Comparison standard: whether the goodput degrades smoothly.

Test Item: Under Different Sending Operating Systems
  DUT settings: 80kbps for the test flow.
  Test methodology: (1) WAN: delay = 50ms, periodic loss rate = 1%. (2) TCP source OS = {Linux 2.2.14, Windows 2000, FreeBSD 4.0, Solaris 8}. (3) TCP receiver OS = Linux 2.2.14. (4) A single TCP connection is tested each time.
  Comparison standard: how closely the byte-time lines of the operating systems overlap with each other.

Table 8: Robustness Test Methodology

C. Advanced Test
This test includes a bandwidth borrowing test and a VoIP quality test. Bandwidth borrowing has been
described in Section 2. VoIP quality is separately tested through SmartBits and a VoIP gateway to
evaluate whether the DUTs can precisely allocate adequate bandwidth for voice traffic. Each test is
conducted under heavily loaded FTP traffic. Detailed test methodologies are given in Table 9.
Test Item: Inter-class Bandwidth Borrowing
  DUT settings: (1) Link speed = T1 (1.544Mbps), divided into two classes A and B, with A = B = 777kbps. (2) Class A matches connection 1; class B matches connection 2. (3) A and B can borrow from each other.
  Test methodology: connections 1 and 2 are started and stopped in sequence.
  Comparison standard: (1) stability of each connection; (2) how seamless the total bandwidth line is when connection 1 terminates.

Test Item: Intra-class Bandwidth Borrowing
  DUT settings: (1) Link speed = T1 (1.544Mbps), with a single class A = 1.544Mbps. (2) The class matches connections 1 and 2. (3) Per-connection bandwidth: at least 777kbps, at most 1.544Mbps.
  Test methodology and comparison standard: same as above.

Test Item: VoIP test using SmartVoIpQoS
  DUT settings: (1) Link speed = {T1, 125kbps}, divided into two classes A and B. (2) A = 30kbps for voice traffic; B = {T1, 125kbps} - 30kbps for FTP traffic. (3) FTP traffic can occupy the voice class until voice traffic begins.
  Test methodology: background: 20 FTP connections; foreground: a 30kbps G.729 VoIP flow.
  Comparison standard: PSQM(1), delay, jitter, and loss.

Test Item: VoIP test using VoIP Gateway (Cisco 1750)
  DUT settings: same as above.
  Test methodology: background: 20 FTP connections; foreground: dial a phone call (G.729 codec), hold the phones at both ends, speak "1" to "10" at 2 words/sec, and judge the voice quality.
  Comparison standard: listening with ears(2).

(1) PSQM (Perceptual Speech Quality Measurement) is calculated from delay, jitter, and loss statistics. A PSQM rating of 6.5 denotes the poorest quality.
(2) The VoIP gateway is set to continuously sample the sound even when the primary tester keeps silent. Thus the data flow is always around 30kbps.

Table 9: Advanced Test Methodology

4. Benchmark Test Results


A. Basic Test Results
A-1. Accuracy and Stability of Accuracy
Figure 2 (A1 is accuracy and B1 is its stability; A2 and B2 will be discussed in the robustness test)
reveals that the DUTs fall into three groups: ALTQ_CBQ, PacketShaper, and QoSWorks
have the most accurate and stable control for each class; WiseWAN and FloodGate are less effective in
the narrowband class (20kbps) because of their large retransmission ratios, as will be shown in Section
A-3; iPolicer and Guardian Pro are the least effective. iPolicer had several connections terminate in the
middle of each run; those connections, no longer sending data, waste bandwidth and cause instability
among the five runs.

Note: The test crew performed many five-run tests on iPolicer. Only after the above phenomenon had been
verified did we include the most representative of the five-run tests in our analysis.

Figure 2: Results of accuracy and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)
A-2. Fairness and Stability of Fairness
Figure 3 (A1 is fairness and B1 is its stability; A2 and B2 will be discussed in the robustness test) also
distinguishes three groups: PacketShaper is the most fair and stable; QoSWorks is less fair but stable
in the 20kbps class, implying that it is less fair in the 20kbps class in all five test runs (Appendix C).
FloodGate and WiseWAN are less fair and less stable in the 20kbps class. iPolicer, Guardian Pro, and
ALTQ_CBQ+RED provide poor fairness. Pure CBQ has the poorest fairness in the narrowband
(20~40kbps) classes; however, the unfairness is somewhat alleviated after applying RED to each class,
because RED tends to drop more packets from the connections that send data more aggressively.


Figure 3: Results of fairness and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)
A-3. Retransmission Ratio
Figure 4 A1 (A2 will be discussed in the robustness test) shows large retransmission ratios in the
narrowband (20~40kbps) classes, except for PacketShaper and QoSWorks, and especially for WiseWAN,
iPolicer, FloodGate, and ALTQ_CBQ+RED. As an analogy, a small exit often keeps many people
waiting in front of it. FloodGate and ALTQ_CBQ+RED use packet dropping to slow down TCP flows, so
they have high retransmission ratios. WiseWAN suffers enormous packet losses at the Cisco router before
it can control the traffic at the WAN link. The results of iPolicer are hard to comprehend in
terms of the technology it claims to use (adjusting the TCP window size).

Figure 4: Test results of retransmission ratio (A1: No Internet Delay; A2: With Internet Delay)

B. Robustness Test Results


B-1. Under Heterogeneous Internet Delays
For easy comparison with the Basic Test, the results are listed alongside those of the Basic Test:
Figure 2 (A2, B2), Figure 3 (A2, B2), and Figure 4 (A2) demonstrate the results. Most results
amplify the differences among the DUTs observed in the Basic Test, especially for iPolicer and ALTQ_CBQ in
the fairness statistic. Long-distance connections are vulnerable to packet losses due to buffer overflows
at the controlling device, as described in Section 3.2 B. ALTQ_CBQ+RED can alleviate the unfairness
of ALTQ_CBQ because the short-distance connections, which send data more aggressively,
have more packets dropped by the RED mechanism. Guardian Pro cannot guarantee each
connection and thus shows significant instability between the Basic Test and this test. QoSWorks is less fair
under the broadband class (1.1Mbps).
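The RED behavior cited above, dropping proportionally more packets from the more aggressive connections, follows from its drop curve. Below is a sketch of textbook RED (not necessarily ALTQ's exact implementation; threshold and weight values are illustrative):

```python
def red_drop_probability(avg_q, min_th, max_th, max_p):
    """Classic RED drop curve: no drops below min_th, certain drop at or
    above max_th, and a linearly increasing probability in between. A
    connection sending more aggressively both keeps the average queue
    higher and contributes more arrivals, so it absorbs more of the drops."""
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)

def update_avg_queue(avg_q, inst_q, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue
    length, as used by RED to smooth over short bursts."""
    return (1.0 - weight) * avg_q + weight * inst_q
```

Because the drop probability applies per arriving packet, a flow contributing twice as many packets to the queue sees roughly twice as many drops, which is what nudges the class back toward fairness.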
B-2. Under Various Packet Loss Rates
Normally a TCP flow slows down its transmission rate when packet losses occur. Figure 5 shows
the goodput of each DUT under different Internet packet loss rates (each flow is limited to 200kbps, and the
measured goodput is averaged over 200 seconds as in the Basic Test). Almost all the DUTs can smoothly
lower their goodput as the packet loss rate increases, except for PacketShaper and iPolicer. These two
devices give up sizing the TCP window once they detect TCP loss events (triple duplicate
Acks). The TCP sending window then suddenly bumps up and causes a burst of packets to flow toward the
controlling device, resulting in a higher goodput at the 0.5% loss rate. This phenomenon is alleviated as the
packet loss rate increases.

Figure 5: Robustness Test: goodput under various packet loss rates (one curve per DUT: iPolicer, FloodGate, PacketShaper, WiseWAN, GuardianPro, QoSWorks, ALTQ_CBQ; y-axis: bandwidth, 160~200kbps; x-axis: loss rate, 0~8%)


B-3. Under Different Sending Operating Systems
In this compatibility test (see Fig. 6; the x-axis is time and the y-axis is the bytes sent, so the slope is the
bandwidth), TCP connections sent from different operating systems through PacketShaper
behave differently. PacketShaper shrinks the TCP window so far that no more than 4
packets are in the WAN pipe. Thus, each packet loss resorts to a retransmission timeout instead of
fast retransmit [21]. Since BSD-derived UNIX systems use a coarse-grained (500ms) retransmission timer
[21], they retransmit the lost packets slowly. In contrast, Linux keeps a fine-grained
retransmission timer and has the best performance when packet losses occur. iPolicer has a serious bug
when sending data from Windows 2000 to Linux 2.2.14: tcpdump revealed that the TCP Ack
header length is miscalculated when passing through iPolicer, incorrectly triggering data
packets from the TCP senders. TCP has many options and various implementations, so explicitly modifying
the packet header requires extensive compatibility testing. The other products treat TCP flows from
different operating systems fairly.

Figure 6: Robustness Test under Different Sending Operating Systems
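The impact of timer granularity on recovery can be sketched with an RFC 6298-style RTO computation, where the final timeout is rounded up to the retransmission timer's tick. The RTT sample values below are hypothetical, not measured from the DUTs:

```python
import math

def rto_ms(srtt_ms, rttvar_ms, tick_ms):
    """RFC 6298-style retransmission timeout: RTO = SRTT + max(G, 4*RTTVAR),
    rounded up to the next tick of the retransmission timer (granularity G).
    A sketch for intuition, not a complete TCP timer implementation."""
    raw = srtt_ms + max(tick_ms, 4 * rttvar_ms)
    return math.ceil(raw / tick_ms) * tick_ms

# Hypothetical path: 60 ms smoothed RTT, 10 ms RTT variance.
print(rto_ms(60, 10, 500))  # BSD-style coarse 500 ms tick -> 1000 ms timeout
print(rto_ms(60, 10, 10))   # fine-grained 10 ms tick -> 100 ms timeout
```

With a coarse 500 ms timer, every timeout costs far more than the path RTT, which explains why BSD-derived senders recover slowly once the tiny window disables fast retransmit.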

C. Advanced Test Results


C-1. Bandwidth Borrowing Test Results
This test uses ttt to observe the effectiveness of bandwidth borrowing. In each figure we focus
on only three lines: the total bandwidth (ip/ether line), the bandwidth of connection 1 (xxxx/tcp line),
and the bandwidth of connection 2 (yyyy/tcp line). The test crew draws another baseline indicating the
ideal total link bandwidth (1.544 Mbps) for comparison.
Inter-Class Bandwidth Borrowing Test Results
Figure 7 shows the inter-class bandwidth borrowing benchmark results. iPolicer does not have
this function, so we set the bandwidth of both classes to 1.544 Mbps. However, the Cisco
routers' link is set to 2 Mbps, so the two 1.544 Mbps flows through iPolicer exceed the baseline
bandwidth. After connection 1 terminates, the total bandwidth narrows down to around 1.5 Mbps
with some fluctuation. WiseWAN and ALTQ automatically borrow bandwidth among
classes, while the others can be further configured with the degree of bandwidth borrowing. Guardian
Pro behaves unstably when connection 2 starts to obtain a bandwidth share. ALTQ_CBQ and
ALTQ_CBQ+RED can only borrow limited bandwidth (from 777 kbps to 1.1 Mbps). FloodGate,
PacketShaper, and QoSWorks perform inter-class bandwidth borrowing seamlessly.

(a) Acute/Broadweb iPolicer (b) CheckPoint FloodGate (c) NetGuard GuardianPro (d) ALTQ_CBQ
(e) Packeteer PacketShaper (f) Sitara QoSWorks (g) NetReality WiseWAN (h) ALTQ_CBQ+RED

Figure 7: Inter-class Bandwidth Borrowing Test
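The borrowing behavior the figures illustrate can be mimicked with a simple allocation model: each active class first receives min(guarantee, demand), and leftover link capacity is lent to still-hungry classes in proportion to their guarantees. This is a simplified sketch for intuition (class names and rates are hypothetical), not any vendor's actual scheduler:

```python
def allocate(link_kbps, guarantees, demands):
    """Distribute link capacity: guaranteed share first, then lend the
    spare capacity to classes whose demand is still unmet, weighted by
    their guarantees.  All rates are in kbps."""
    alloc = {c: min(g, demands.get(c, 0)) for c, g in guarantees.items()}
    spare = link_kbps - sum(alloc.values())
    hungry = {c: demands[c] - alloc[c]
              for c in alloc if demands.get(c, 0) > alloc[c]}
    while spare > 1e-6 and hungry:
        weight = sum(guarantees[c] for c in hungry)
        for c in list(hungry):
            share = min(hungry[c], spare * guarantees[c] / weight)
            alloc[c] += share
            hungry[c] -= share
            if hungry[c] <= 1e-6:
                del hungry[c]
        spare = link_kbps - sum(alloc.values())
    return alloc

classes = {"classA": 1000, "classB": 544}   # guarantees on a 1544 kbps link
# classB idle: classA borrows the whole link.
print(allocate(1544, classes, {"classA": 2000, "classB": 0}))
# classB active again: the allocation returns to the guarantees.
print(allocate(1544, classes, {"classA": 2000, "classB": 2000}))
```

When connection 2's class goes idle, its guaranteed share is lent out, and it is reclaimed as soon as the class becomes active again, which is the seamless behavior FloodGate, PacketShaper, and QoSWorks exhibit.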


Intra-Class Bandwidth Borrowing
Figure 8 shows the intra-class bandwidth borrowing benchmark results. iPolicer lacks this
function, so after connection 1 terminates, connection 2 cannot occupy the newly available
bandwidth within the class. Guardian Pro and ALTQ_CBQ show fluctuating bandwidth sharing
between the two connections since they cannot guarantee per-connection bandwidth. This
phenomenon in ALTQ_CBQ is again slightly alleviated after applying RED. The other four products
perform quite similarly in this test, except that PacketShaper and FloodGate show small gaps.


(a) Acute/Broadweb iPolicer (b) CheckPoint FloodGate (c) NetGuard Guardian Pro (d) ALTQ_CBQ
(e) Packeteer PacketShaper (f) Sitara QoSWorks (g) NetReality WiseWAN (h) ALTQ_CBQ+RED

Figure 8: Intra-class Bandwidth Borrowing Test


C-2. VoIP Quality Test
This test does not include iPolicer since it presently cannot control UDP traffic. The test is
performed with the Smartbits and with the Cisco 1750 VoIP gateways; the former gives quantitative results
while the latter judges the voice quality by ear.
Figure 9(a) shows that under the T1 WAN link (1.544 Mbps) the DUTs differ in latency and jitter.
However, the ultimate voice quality grades (PSQM) are similar except for ALTQ_CBQ. This is also
verified by the VoIP gateway test (Table 10). We thus conclude that under a T1 access link the G.729 bit
rate can be easily allocated. In contrast, under the 125 kbps WAN link (Fig. 9(b) and Table 10), the voice
can only barely be recognized with PacketShaper. Transmitting a large packet (1500 bytes) onto the
narrowband WAN link (125 kbps) takes so long that the following small voice packet (74 bytes)
has to wait until the previous large packet is completely scheduled out.

[Figure 9 charts, for each DUT (Base, PacketShaper, FloodGate, WiseWAN, GuardianPro, QoSWorks, QoSWorks2, ALTQ_CBQ), the average latency, maximum latency, jitter (latency variation), PSQM grade, and loss rate: (a) under the T1 WAN link (1.544 Mbps), where the PSQM grades stay between 2.2 and 2.7; (b) under the 125 kbps WAN link, where most PSQM grades rise to around 6.5.]

(a) T1 WAN link (1.544 Mbps) (b) 125 kbps WAN link
Note: Base results are measured on the clean testbed without enabling any DUT. The G.729 codec is not lossless compression; even when jitter and loss are minimal, the PSQM is at least 2.2.

Figure 9: VoIP Test Results

However, after QoSWorks
exercises Packet Size Optimization (reducing the maximum transmission unit of FTP connections when
they are established), the voice quality approaches the original voice in both the Smartbits and the
gateway tests. While this is promising, readers should be aware that reducing the packet size of all other
TCP connections incurs a large overhead. As an analogy, several small trucks carrying goods incur more
overhead than one big truck carrying the same goods. This tradeoff is left to the network
administrator's judgment.
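The waiting time is simple serialization-delay arithmetic: at 125 kbps, a single 1500-byte packet occupies the link for 96 ms, so any voice packet queued behind it inherits that delay. A short sketch with the link speeds and packet sizes used in this test:

```python
def serialization_ms(packet_bytes, link_bps):
    """Time needed to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps

print(serialization_ms(1500, 125_000))    # full-size data packet, 125 kbps link: 96 ms
print(serialization_ms(74, 125_000))      # voice packet, 125 kbps link: ~4.7 ms
print(serialization_ms(1500, 1_544_000))  # full-size data packet, T1 link: ~7.8 ms
```

This is why shrinking the MTU of competing TCP connections restores voice quality on the narrowband link, at the cost of the higher per-packet header overhead discussed above, and why the T1 link never exhibits the problem.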
                           T1 WAN link speed                           125 kbps WAN link speed
Vendor/Model               Calling time  Delay time  Voice quality     Calling time  Delay time  Voice quality
                                                     by ear                                      by ear
Baseline (only voice)      ~0.2 sec      <0.1 sec    Very good         <1 sec        <0.1 sec    Very good
Baseline (background FTP)  Cannot establish the connection             Cannot establish the connection
iPolicer                   Cannot be tested (no UDP traffic control)   Cannot be tested (no UDP traffic control)
FloodGate                  ~0.5 sec      <0.1 sec    Very good         ~7 sec        ~1 sec      Very poor (<10% legible)
Guardian Pro               ~0.5 sec      <0.1 sec    Very good         ~3 sec        ~1.5 sec    Ultra poor (<1% legible)
WiseWAN                    ~0.5 sec      <0.1 sec    Very good         ~7 sec        ~1.5 sec    Ultra poor (<1% legible)
PacketShaper               ~0.5 sec      <0.1 sec    Very good         ~1 sec        ~1 sec      Poor (60% legible)
ALTQ_CBQ                   ~2 sec        <0.1 sec    Very good         ~18 sec       ~1 sec      Very poor (<10% legible)
QoSWorks                   ~1 sec        <0.1 sec    Very good         ~17 sec       ~1 sec      Very poor (<10% legible)
QoSWorks (optimized)       Not tested (no need)                        ~6 sec        <0.2 sec    Very good

Table 10: VoIP Test Results Through the VoIP Gateway

5. Conclusions
This work designs a novel testbed that mimics real-life Internet conditions, such as multiple
connections, heterogeneous WAN delays/delay jitters/packet loss rates, and different TCP source
implementations. Most test reports, such as those by the Tolly Group [22], are financed by the vendors
and may be biased; additionally, the testbeds in those reports are over-simplified, without in-depth test
items or with an inadequate number of connections. This work first classifies the policy rules into three
major types: (1) class-based rule; (2) connection rule within a class; (3) bandwidth borrowing rule
among classes. The test methodology then quantifies the effectiveness of the above policy rule types for
each device in terms of accuracy, fairness, stability, robustness, bandwidth borrowing, and VoIP quality.
The test results reveal several findings, all reproducible with our open tools: (1) a narrowband
class-based rule and fairness among its flows are harder to enforce when multiple TCP connections
compete for the same queue, resulting in large queue lengths and TCP retransmissions; (2) explicitly
sizing the TCP window can cause performance or fairness degradation even under slight packet loss
rates; (3) the open source solution can compete with commercial products in accurately limiting flow
aggregates; (4) the video/voice quality of real-time applications significantly depends on the packet
sizes of all other traffic when using a narrowband (125 kbps) access link. The detailed functionality
comparison among the DUTs gives further directions for enhancing open source solutions, such as
Packeteer's traffic discovery and QoSWorks's intuitive user interface. The ALTQ package lacks
per-connection bandwidth guarantees within a class and needs further refinement to satisfy enterprise
demands. Some vendors in this test build on open source, yet they never release their kernel
patches. We are currently patching ALTQ with per-connection bandwidth guarantees and will feed the
results back to the open source community. After all, open source should be open.

6. References
[1] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, "An Architecture for Differentiated Services," RFC 2475, Dec. 1998.
[2] D. Stiliadis and A. Varma, "Latency-Rate Servers: A General Model for Analysis of Traffic Scheduling Algorithms," IEEE/ACM Transactions on Networking, Vol. 6, No. 5, pp. 611-624, Oct. 1998.
[3] S. Floyd and V. Jacobson, "Link-sharing and Resource Management Models for Packet Networks," IEEE/ACM Transactions on Networking, Vol. 3, No. 4, pp. 365-386, 1995.
[4] K. Fall and S. Floyd, "Simulation-based Comparisons of Tahoe, Reno, and SACK TCP," ACM Computer Communication Review, Vol. 26, No. 3, pp. 5-21, Jul. 1996.
[5] J. Padhye and S. Floyd, "On Inferring TCP Behavior," ACM SIGCOMM 2001, San Diego, USA, Aug. 2001. http://www.acm.org/sigcomm/sigcomm2001/p23.html
[6] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp. 397-413, Aug. 1993.
[7] S. Karandikar, S. Kalyanaraman, P. Bagal, and B. Packer, "TCP Rate Control," ACM Computer Communication Review, Vol. 30, No. 1, Jan. 2000.
[8] K. Cho, "Alternate Queueing for BSD UNIX (ALTQ)," http://www.csl.sony.co.jp/person/kjc
[9] NetGuard Corporation, http://www.netguard.com
[10] Check Point Software Technologies, http://www.checkpoint.com
[11] BroadWeb Corporation, http://www.broadweb.com.tw
[12] Acute Communication Corporation, http://www.acutecomm.com
[13] Packeteer Corporation, http://www.packeteer.com
[14] Sitara Networks, http://www.sitaranetworks.com
[15] NetReality Corporation, http://www.net-reality.com
[16] Ncftpput Software, http://www.ncftp.com
[17] K. Cho, "Tele Traffic Tapper (ttt)," http://www.csl.sony.co.jp/person/kjc
[18] Spirent Communications, http://www.netcomsystems.com
[19] Lawrence Berkeley National Laboratory, "tcpdump," http://www-nrg.ee.lbl.gov
[20] H. Y. Wei, "WAN Emulator," http://speed.cis.nctu.edu.tw/wanemu/
[21] W. R. Stevens, TCP/IP Illustrated, Volume 1: The Protocols, Addison-Wesley, 1994.
[22] The Tolly Group, http://www.tolly.com

Acknowledgements
We thank the vendors who so generously provided us with the devices and their verifications of
the test results. We are grateful to Ching-Chuan Chiang and Yi-Chung Liu for their help on the
preliminary tests and functionality comparisons.

Appendix
Appendix A. Detailed Functionality Comparison
A-1. Policy Console User Interface
Regarding the policy console user interface (Table A), a notable feature is how many devices one
management console can control. The policy consoles of PacketShaper and QoSWorks can control only
one device each, since they use built-in web servers configured through Web browsers. The policy
consoles of the others (except ALTQ) can remotely control multiple devices located at different places.
As for schedule control, per-rule schedule control is more effective: for example, some rules can be
inactive during non-office hours, but the VoIP rule should always stay active to guarantee voice quality.
ALTQ
  Console: configuration file. Schedule control: none. Devices managed: single device. Console OS: FreeBSD 4.0.
  Monitor/statistics: per-class bandwidth usage. Alert: N/A.
NetGuard's Guardian Pro
  Console: GUI (Win32 application). Schedule control: per-rule. Devices managed: multiple devices. Console OS: Win NT/2000.
  Monitor/statistics: line statistics report, response time report, protocol distribution report. Alert: log.
CheckPoint's FloodGate
  Console: GUI (Win32 application). Schedule control: per-rule. Devices managed: multiple devices. Console OS: Win NT/2000.
  Monitor/statistics: line statistics report, response time report, protocol distribution report. Alert: N/A.
NetReality's WiseWAN
  Console: GUI (Java application). Schedule control: per-rule. Devices managed: multiple devices. Console OS: Win NT/Solaris.
  Monitor/statistics: line statistics report, port report, response time report, protocol distribution report, VoIP report, top ten talkers, top ten protocols or applications. Alert: SNMP trap.
BroadWeb/Acute's iPolicer
  Console: Web browser (Java applet; web server on a separate NT machine, IE 5.0 client). Schedule control: per-rule. Devices managed: multiple devices.
  Monitor/statistics: line statistics report, top ten report, top ten talkers, top ten protocols. Alert: email trap.
Packeteer's PacketShaper
  Console: Web browser (HTML; embedded web server, any client). Schedule control: per-device. Devices managed: single device.
  Monitor/statistics: utilization, network efficiency, top ten classes, top twenty talkers, per-class bandwidth usage, response time report. Alert: SNMP trap.
Sitara's QoSWorks
  Console: Web browser (HTML; embedded web server, any client). Schedule control: per-device. Devices managed: single device.
  Monitor/statistics: per-class bandwidth usage, link statistics, top classes per link, top applications, protocol distribution, traffic by address. Alert: SNMP trap.

Table A: Management Interface and Flow Statistics


A-2. Special Functions
PacketShaper is superior in its Traffic Discovery, which automatically identifies the protocols
of the traffic passing through it and provides instant feedback to the network administrator for further
bandwidth settings. With the others, one has to manually verify whether a newly specified packet filter
captures its intended traffic. WiseWAN is installed directly on the WAN link (V.35 cable) and thus
can verify whether the measured bandwidth matches the subscribed bandwidth. Additionally, it can
detect PVCs in a frame relay network, so a single WiseWAN device can control all the traffic on the
mesh-structured frame relay links among branch offices. QoSWorks focuses notably on controlling
VoIP traffic: by shrinking the TCP data packet size, VoIP (UDP) traffic can pass through
QoSWorks smoothly, especially on a narrowband WAN link. Moreover, QoSWorks has a built-in Web cache
(not verified in this report). Both FloodGate and Guardian Pro can be integrated with their vendors'
firewall, VPN, and NAT packages; integrated solutions may reduce management costs.

Appendix B. Testbed Photo

Figure B: Testbed Photo


Appendix C. Intuitive Example for Basic Test Statistics


This intuitive example illustrates how the Basic Test statistics of the 20 kbps class are derived. As
described in Section 3.2, each class matches four connections, and the test repeats for five runs. Ideally,
within each run each connection receives 1/4 of the class bandwidth. The example shows that the
accuracy statistic of 19, although it approaches the ideal value of 20, does not reflect the real conditions;
with the aid of the poor stability of accuracy, we can judge that the DUT is actually not accurate. On the
other hand, "Not fair" combined with good stability of fairness means that the DUT fails to treat the
flows fairly almost all the time.
[Figure C tabulates the five-round per-connection goodputs for the 20 kbps, 40 kbps, 128 kbps, 256 kbps, and 1.1 Mbps classes on the 1.544 Mbps link, together with the ideal per-connection shares. For the 20 kbps class, the per-round class totals are 14, 7, 30, 5, and 39 kbps: their mean, (14+7+30+5+39)/5 = 19, looks accurate, but the large CoV of these totals reveals poor stability. The per-round fairness CoVs among the four connections are 0.23, 0.25, 0.36, 0.35, and 0.39: their mean, 0.32, means "not fair," while their standard deviation, 0.063, shows that the unfairness is consistent across rounds.]

Figure C: Intuitive Example for Basic Test Statistics
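The statistics in Figure C can be recomputed with Python's statistics module. The fairness mean (0.32) and its standard deviation (about 0.063) match the figure; computing the stability of accuracy as the CoV of the per-round totals is our assumption about the paper's convention:

```python
import statistics

# Per-round class totals (kbps) and per-round fairness CoVs taken from
# the 20 kbps example in Figure C:
totals = [14, 7, 30, 5, 39]
run_covs = [0.23, 0.25, 0.36, 0.35, 0.39]

accuracy = statistics.mean(totals)                # 19, close to the ideal 20
fairness = statistics.mean(run_covs)              # 0.32 -> "not fair"
fairness_stability = statistics.pstdev(run_covs)  # ~0.063 -> stable unfairness
# The per-round totals vary wildly, so the accuracy figure alone is
# misleading; their CoV exposes the poor stability:
accuracy_cov = statistics.pstdev(totals) / accuracy
print(accuracy, round(fairness, 2), accuracy_cov > 0.5)
```

Reading the two statistics of each pair together, as the appendix argues, prevents a single averaged number from hiding wildly varying per-round behavior.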
