
On Evaluating Policy-Based Bandwidth Management Devices

Huan-Yun Wei* and Ying-Dar Lin
Department of Computer and Information Science
National Chiao Tung University, Hsinchu, Taiwan
Tel: +886-3-5712121 ext. 56667
Fax: +886-3-5721490
Email: {hywei,ydlin}@cis.nctu.edu.tw
Policy-based bandwidth management defines how to allocate bandwidth resources according to
organizational policy rules. Enterprises often employ such policy-based devices at their organizational
edges to manage the narrow but expensive Internet access links. This work designs a novel testbed and
uses it to evaluate the functionality and performance of many such devices, including six commercial
products and one open source solution. Their policy rules can be categorized into (1) class-based rule;
(2) connection rule within a class; (3) bandwidth borrowing rule among classes. The testbed mimics the
real-life Internet with heterogeneous Internet delays/delay jitters/packet loss rates, and evaluates the
effectiveness of policy enforcement of the above three policy types in terms of accuracy, fairness,
stability, robustness, bandwidth borrowing, and voice over IP (VoIP) quality. The test results reveal that
(1) explicitly sizing the TCP window can cause performance or fairness degradation even under slight
packet loss rates; (2) the open source solution can compete with commercial products in accurately
limiting flow aggregates; (3) voice quality over IP networks depends significantly on the packet sizes of
all other traffic when using a narrowband (125kbps) access link.
Keywords: policy-based, bandwidth management, TCP, testbed, emulator

* Corresponding author
Note: All test results are verified by the vendors and are reproducible through our open tools. Nowadays most benchmark
reports are financed by vendors, lack practical testbeds, and may therefore be biased. Guided by this neutral test, readers can
obtain in-depth insights when examining bandwidth management devices.
1. Introduction
Internet services provide an economical and convenient way to carry out business, such as
efficient information exchange among branch offices, or efficient customer/provider access to the
services. However, the importance of these services varies, and enterprises often fail to utilize the
narrow but expensive WAN link bandwidth effectively. For instance, the bandwidth required by ERP
(Enterprise Resource Planning), voice over IP (VoIP), and e-business may be occupied by less-important
applications such as FTP. Since end-to-end Internet QoS such as DiffServ [1] is still experimental,
enterprises seek to at least manage their own inbound and outbound links. Thus, policy-based bandwidth
management devices are deployed at organizational edges to set and enforce organizational policies in
pursuit of the utmost benefit.
Network administrators define policy rules to achieve the enterprise's resource management
objectives. Each policy rule contains condition and action fields that define specific actions for
specific conditions. The condition defines the packet-matching criteria, such as a certain subnet, application,
or protocol. The action defines the bandwidth parameters, such as at least 100kbps or at most 200kbps.
Each policy rule is therefore class-based: it groups a set of traffic flows into a per-class queue according to
the specified packet filter (condition), and the class of traffic is then scheduled out at its corresponding
specified bandwidth (action). Moreover, the class-based rules can be further configured with bandwidth
borrowing among the classes to utilize available bandwidth dynamically and effectively. Additionally, each
connection within a class can be guaranteed at least a certain amount of bandwidth. Throughout
this work we evaluate the effectiveness of policy enforcement for the above three policy types:
(1) class-based rule; (2) connection rule within a class; (3) bandwidth borrowing rule among classes.
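As a concrete illustration, a condition/action pairing can be sketched as data. The field names and values below are invented for illustration and do not reflect any particular product's console:

```python
import ipaddress

# Hypothetical policy rules (field names invented for illustration): each
# rule pairs a packet-matching condition with a bandwidth action.
rules = [
    {"condition": {"subnet": "192.168.88.0/24", "protocol": "tcp", "port": 21},
     "action": {"min_kbps": 100, "max_kbps": 200}},
]

def matches(rule, pkt):
    """Return True if the packet satisfies every field of the rule's condition."""
    cond = rule["condition"]
    return (ipaddress.ip_address(pkt["src"]) in ipaddress.ip_network(cond["subnet"])
            and pkt["protocol"] == cond["protocol"]
            and pkt["port"] == cond["port"])
```

A matching packet would be queued into the rule's class and scheduled out at the configured rate.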
The following subsections review traditional and prevalent technologies to enforce these policy rules.

Traditional Technology: Queuing
A straightforward method for bandwidth management is to queue less-important traffic and pass
important traffic as soon as possible. Queuing can be roughly categorized into (1) priority-based queuing
and (2) rate-based queuing. Priority-based queuing sets priorities among the classes, and the highest-priority
class is scheduled out first. This is suitable for short-lived, extremely important, or transaction-oriented
flows. However, priority-based queuing cannot quantitatively guarantee or limit the bandwidth of
a class. As an analogy, if everyone is a VIP, then no one is really a VIP. In contrast, rate-based queuing
employs various packet scheduling algorithms [2] that decide from which class the next packet is
transmitted. This can effectively limit senders who try to overburden the resource.
Moreover, the minimum bandwidth for important applications can be quantitatively guaranteed. Floyd and
Jacobson [3] further investigate bandwidth borrowing among the classes. Queuing has different
impacts upon UDP and TCP data flows; next we briefly review the two protocols.

Queuing the Internet Traffic: TCP vs. UDP


The majority of software applications today use TCP (Transmission Control Protocol) for data
transmission because TCP establishes a reliable end-to-end connection. TCP receivers acknowledge
the successful reception of each data packet by replying with an Ack to their TCP senders; the Ack packets
in turn trigger the senders to send out new data packets. Unacknowledged data packets are retransmitted to
guarantee reliable data transfer. TCP also incorporates flow control mechanisms that prevent a
sender from overburdening the network capacity or overflowing its receiver's buffer. Each TCP
sender therefore keeps two window values, the congestion window (CWND) and the receiver advertised
window (RWND), which respectively track the network capacity (congestion control) and the receiver's
capability of receiving the data. A TCP sender never has more than min(CWND, RWND) of
unacknowledged data outstanding. RWND is advertised by the receiver in TCP Ack packets and ranges widely
among operating systems. CWND keeps increasing, exponentially during the slow-start phase and
linearly during the congestion avoidance phase, probing for available bandwidth until packet losses occur.
Loss behavior differs among TCP versions, mainly in how the CWND is shrunk and regrown, and in how the
lost segments are accurately retransmitted. Fall and Floyd [4] give a good overview of the behavior and
problems of the Tahoe, Reno, NewReno, and SACK TCP versions. Padhye and Floyd [5] further investigate the TCP
version distribution among 4550 Web servers. Unlike TCP, UDP (User Datagram Protocol) lacks
connection establishment, reliable data transfer, and flow control. UDP only provides port-number
multiplexing and is commonly used by real-time applications such as video conferencing and Voice over
IP (VoIP).
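The interplay of the two windows can be made concrete with a toy model of our own (in MSS units, ignoring loss and recovery):

```python
def tcp_send_window(rtts, ssthresh, rwnd):
    """Toy model of a TCP sender's usable window, in MSS units: CWND
    doubles each RTT in slow start (while below ssthresh), grows by one
    MSS per RTT in congestion avoidance, and the sender is always bounded
    by min(CWND, RWND). Loss handling is deliberately omitted."""
    cwnd, windows = 1, []
    for _ in range(rtts):
        windows.append(min(cwnd, rwnd))               # usable window this RTT
        cwnd = cwnd * 2 if cwnd < ssthresh else cwnd + 1
    return windows
```

For example, with ssthresh = 8 and a generous RWND, the window grows 1, 2, 4, 8 and then linearly; with a small RWND the receiver's limit caps the sender regardless of CWND.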
Queuing has different impacts upon UDP and TCP flows. For real-time UDP traffic, the bit rate
is often fixed, and the video/voice quality heavily depends on the loss rate, delay, and delay jitter. The
packet scheduler must precisely allocate enough bandwidth for real-time UDP traffic to minimize packet
losses and delay at the controlling device. Moreover, the packets of real-time traffic need to be
scheduled out smoothly, at even intervals, to minimize the delay jitter. As for TCP traffic, TCP
flows competing for the same queue can leave a great amount of data queued in the device,
resulting in a high buffer requirement and large packet latency at the device. Moreover, the TCP flows
may not share the class bandwidth fairly, especially when their round-trip times (RTTs) differ.
Thus many vendors apply specific algorithms for regulating TCP traffic.

Specific Algorithms for TCP Traffic


To guarantee the bandwidth of each TCP connection within a class, and hence achieve fairness among
the flows within the class, the ideal solution is to actively control the sending rate of each sender in
the class instead of letting the senders compete with each other. Queuing, and thus queuing delay and buffer
requirements, can thereby be reduced. Other types of traffic such as UDP can only resort to the primitive solution,
queuing, to passively control their bandwidth. Two methods exist for controlling each TCP connection: (1)
window-sizing and (2) packet-dropping.

1. Window Sizing: Since a TCP connection can be actively controlled through the feedback
Acks, the window-sizing method directly limits the amount of data a sender may transmit by shrinking the
RWND carried in the TCP Acks. In this test, iPolicer, PacketShaper, WiseWAN, QoSWorks, and Guardian
Pro belong to this type. Karandikar et al. [6], sponsored by Packeteer, investigate the window-sizing
technique. Though window-sizing can directly control per-connection bandwidth, it must
readjust its Ack regulation whenever a connection enters or leaves the class.
2. Packet Dropping: Because a TCP sender slows down its transmission rate in response to
network congestion by halving its congestion window, the packet-dropping method drops
packets and expects the sender to slow down upon detecting the packet loss events
[7]. In this test, FloodGate (which uses per-flow queuing) and ALTQ_CBQ+RED belong to this type.
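The core of the window-sizing idea, rewriting the 16-bit window field in a passing Ack, can be sketched as follows. This is a minimal illustration of the mechanism (the function name and framing are ours, and a real device must additionally handle window scaling and per-class state); the checksum is patched incrementally per RFC 1624 so the segment stays valid:

```python
import struct

def resize_ack_window(tcp_header: bytes, max_window: int) -> bytes:
    """Clamp the receive-window field (bytes 14-15) of a TCP header to
    max_window and patch the checksum (bytes 16-17) incrementally
    (RFC 1624), so the rewritten segment still checksums correctly."""
    old_win = struct.unpack_from("!H", tcp_header, 14)[0]
    new_win = min(old_win, max_window)
    if new_win == old_win:
        return tcp_header                     # nothing to rewrite
    old_sum = struct.unpack_from("!H", tcp_header, 16)[0]
    # RFC 1624 Eqn. 3: HC' = ~(~HC + ~m + m'), in one's-complement arithmetic.
    s = (~old_sum & 0xFFFF) + (~old_win & 0xFFFF) + new_win
    s = (s & 0xFFFF) + (s >> 16)
    s = (s & 0xFFFF) + (s >> 16)              # fold carries
    out = bytearray(tcp_header)
    struct.pack_into("!H", out, 14, new_win)
    struct.pack_into("!H", out, 16, ~s & 0xFFFF)
    return bytes(out)
```

Because the update is incremental, the one's-complement sum over the whole header is unchanged, which is exactly what keeps the receiver's checksum verification passing.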

This work designs a novel testbed for evaluating the effectiveness of various policy enforcement
techniques used by existing products or solutions. The testbed mimics the real-life Internet
characteristics such as WAN delay, delay jitter, and packet loss. Section 2 compares the relevant
information of the devices under test (DUT). Section 3 then describes the design of our testbed and the
test methodology. Section 4 demonstrates the test results. Finally, a summary of the test results and
conclusions are given in Section 5.

2. Device under Test (DUT)


This test project invited nine vendors, six of which joined the test. Table 1 compares the
relevant information of all the DUTs. Most DUTs are installed on the LAN-router link to prevent router
queues from overflowing and causing congestion. Because the grade of each DUT differs, only low-bandwidth
configurations (below 1.544Mbps) are tested. This minimizes hardware differences so that the test
results can reflect the true management capability of each DUT.
[Table 1 (flattened layout not fully recoverable) records, for each DUT: vendor and model, announced grade (from 5Mbps to 100Mbps), software version, operating system (FreeBSD, NT 4.0, embedded NT/Linux/FreeBSD, or proprietary), software or hardware form factor, installation point (on the LAN side, or on the WAN link for WiseWAN's V.35 interface), hardware platform and RAM, boot device (hard disk or flash), network interfaces (10/100Mbps Ethernet or V.35), fail-over, and high-availability support. ALTQ 2.2 runs on our own P-III 700MHz PC with 256MB RAM, booting from a hard disk.]

Note 1: Invited vendors also included Lucent's Access Point and Allot's NetEnforcer (these two decided not to join after examining our test plan) and Cisco's Cisco Assure (did not want to join from the beginning).
Note 2: Fail-over is defined as the capability of bypassing traffic when the power is off. HA means an optional high-availability module.
Note 3: Sitara revealed to us that QoSWorks uses ALTQ_CBQ.

Table 1: Product information and software/hardware platforms

2.1 Functionality of Policy Console


Network administrators use the policy console to define organizational bandwidth policy rules. Table 2
lists the functionality of each policy console. All DUTs can limit the bandwidth of a class. Moreover,
most DUTs can guarantee the minimum bandwidth of each connection within the class, except for
Guardian Pro and ALTQ. These two settings can be complemented by (a) inter-class bandwidth borrowing
and (b) intra-class bandwidth borrowing, respectively. In (a), the DUTs redistribute any bandwidth
left unused by some classes to other active classes; in (b), if any flow in a class terminates, its
bandwidth is fairly redistributed to the other flows.
Vendor/Model                Direction (In/Out)   Inter-class borrowing   Intra-class borrowing
ALTQ                        Both                 Auto                    Compete(2)
NetGuard's Guardian Pro     Both                 Degree(1)               Compete
CheckPoint's FloodGate      Both                 Degree                  Degree
NetReality's WiseWan        Both                 Auto                    Auto
Acute/BroadWeb's iPolicer   Both                 --                      --
Packeteer's PacketShaper    Both                 Degree                  Degree
Sitara's QoSWorks           Both                 Auto                    Auto

(The source table also records each DUT's packet classifier fields (src/dst IP/port, mask, protocol ID, host list), UDP traffic control, WAN link speed setup, per-class bandwidth limit, and per-connection bandwidth guarantee; these check-mark columns could not be recovered from the flattened layout.)

(1) Degree means that administrators can manually specify the degree of bandwidth borrowing.
(2) DUTs without a connection guarantee let the flows within the class compete with each other.

Table 2: Functionality Comparison of the Devices under Test

2.2 Protocol Support


Table 3 compares the protocol support of each DUT. Most Internet services/protocols can be
recognized by layer-4 TCP/UDP port numbers. However, layer-7 awareness can increase the simplicity
and capability of bandwidth management. For example, the FTP protocol includes a passive mode, in
which the FTP-data port (normally port 20, for sending data) is dynamically renegotiated over the
FTP control connection (port 21, for sending FTP commands). If the DUT cannot recognize the negotiation
on the FTP control connection, it obviously cannot control the connection that actually sends the data.
PacketShaper and WiseWAN have the richest layer-7 awareness. In terms of the number of built-in
port-service mapping entries, WiseWAN and PacketShaper are the richest, followed by FloodGate and
Guardian Pro. iPolicer, QoSWorks, and ALTQ have few or no built-in port-service mapping entries and
require manual lookups in the port-service mapping table. Although iPolicer can identify UDP, it cannot
control its bandwidth.

[Table 3 (flattened layout not fully recoverable) records, for each DUT, its layer awareness (layer-4 and, where supported, layer-7 type) and the number of built-in port-service mapping entries for TCP, UDP, ICMP/IPX, and other protocols. The recoverable entries are: ALTQ and QoSWorks have no built-in mappings (ports must be assigned manually); iPolicer also requires manual port assignment and cannot control UDP; FloodGate and WiseWAN support URL/MIME-TYPE layer-7 classification; PacketShaper supports URL classification with over 200 built-in entries in total (layers 2~7); the remaining per-cell counts could not be unambiguously recovered.]

Note: This table lists only the protocols that the DUTs can control, rather than merely recognize.

Table 3: Comparison of Protocol Support


Appendix A-1 and A-2 further compare the policy console user interfaces and special functions of
the DUTs. Most DUTs mix priority-based and rate-based queuing; however, this test focuses on rate-based
policies that control TCP connections flowing from the enterprise (LAN) to the WAN, since TCP
traffic occupies most of the Internet traffic. As for UDP traffic, this test focuses on real-time applications
such as Voice over IP (VoIP). Differences between the configured bandwidth and the measured results
will be quantified.

3. Testbed and Test Methodology


The testbed and test methodology significantly influence the test results and require careful examination
to avoid misinterpretation of the results.

3.1 Testbed: Mimicking the Real-Life Internet


The Internet is highly dynamic. Different connections take different paths and therefore experience
different distances and path qualities. Our testbed mimics these properties by assigning a WAN delay,
WAN delay jitter, and WAN packet loss rate to each routing path. Figure 1 and Table 6 show complete
information about our testbed and testing tools. Testing data flows run from X to Y, passing through the DUT,
routers, monitoring point, and WAN emulator. The Cisco routers are installed specifically for WiseWAN
because of its V.35 interface. Each DUT is individually tested on this testbed. Appendix B displays a photo
of our testbed. IP-aliasing employed at A and I in Fig. 1 emulates multiple competing sources and their
corresponding sinks, respectively. A self-written wan-emu virtual interface driver emulates the
dynamics of the Internet. These are detailed as follows:
[Figure 1 depicts the testbed: emulated sources 1~99 (IP aliases on a Linux 2.2.14, P-III 700 host, subnet 192.168.88.X) connect through a 100M hub to the software DUTs (FloodGate, Guardian Pro on NT servers) and hardware DUTs (QoSWorks, PacketShaper, iPolicer), or, for WiseWAN, to a V.35 cable between two Cisco 2514 routers. Traffic then passes the monitoring point and the WAN emulator (Linux 2.2.14, P-III 700) toward emulated destinations 1~99 and an FTP server on the far side (subnets 172.16.86-89.X and 10.1.1.X). Two Cisco 1750 VoIP gateways with attached telephones exchange G.729 RTP voice over the same path, and SmartBits with SmartVoIpQoS (Windows 2000, P-III 700, COM2) generates and measures VoIP flows.]

Figure 1: The Testbed: Mimicking the Real-life Internet


Note: All PCs are equipped with Intel Express Pro 10/100Mbps network interface cards. The V.35 serial clock rate between the Cisco routers is set to 2Mbps. Each DUT is individually tested on this testbed.

Tool: Ncftpput [16]
  Function: TCP traffic generator.
  Description: 20 ncftpput flows from subnet X to subnet Y; packet size: 1,500 bytes; TCP options: SACK/timestamp/window-scaling disabled.
  Position in Fig. 1: A.

Tool: SmartVoIpQoS [17]
  Function: VoIP (UDP) traffic generator.
  Description: a single VoIP flow with RTP-format UDP packets; codec: G.729 (50 frames/sec, frame size = 74 bytes, around 30kbps).

Tool: VoIP Gateway
  Function and description: same as above.

Tool: ttt [18]
  Function: real-time traffic bandwidth monitor.
  Description: monitors the bandwidth of the traffic passing through it by protocol, source/destination IP, etc.
  Position in Fig. 1: G, K, and N.

Tool: Tcpdump [19]
  Function: packet sniffer.
  Description: dumps each packet's header to a RAM disk to avoid I/O overheads.
  Position in Fig. 1: A and H.

Tool: Self-written AWK scripts [20]
  Function: data analyzer.
  Description: calculates statistics from the tcpdump results.

Tool: Self-written wan-emu [20]
  Function: WAN emulator.
  Description: imposes different delays, delay jitters, and random/periodic packet loss rate impairments on different flows.

Table 6: Testing Tools


1. IP-aliasing: In Linux, each network interface card (NIC) can emulate 100 NICs, with each virtual NIC
having a unique IP address. With a proper routing table setup at A in Fig. 1, we can direct flows
destined to a certain virtual NIC at I through a given virtual NIC at A. The virtual NICs generate packets with
their corresponding IP addresses, so the DUT perceives outgoing TCP data packets as coming from
different local hosts and incoming TCP Acks as coming from different remote hosts. Moreover, packets are
sent without link-layer collisions since only a single physical NIC is present at each of A and I.
2. wan-emu: Wan-emu is a Linux virtual interface driver that resides between the IP layer and the NIC
driver. In this testbed, multiple wan-emu virtual devices are attached to the sink-side last-hop NIC
driver (at H, with IP 10.1.1.254) to impose different impairments on different routes. With proper static
routes, we can direct flows destined to a virtual NIC at I through a specific wan-emu interface that has
the desired link characteristics. Each packet passing through is stamped with the time at which it should
be sent out; an interrupt is triggered every 1ms to examine how many packets are due and
should be forwarded. The timer granularity can easily be tuned to 8192 Hz in Linux. Impairments such
as random/periodic loss rates and delay jitter are also implemented.
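The wan-emu mechanism, stamping each packet with a departure time and draining due packets on a periodic tick, can be sketched as a user-space model (this is an illustrative model of the idea, not the kernel driver itself):

```python
import random
from collections import deque

class WanEmu:
    """Sketch of the wan-emu idea: each packet is stamped with the time at
    which it may leave, and a periodic 1 ms tick forwards the packets that
    are due. Delay/jitter are in ms; loss_rate is in [0, 1]. Packets stay
    in FIFO order per route, as on a real link."""
    def __init__(self, delay_ms, jitter_ms=0.0, loss_rate=0.0, seed=None):
        self.delay, self.jitter, self.loss = delay_ms, jitter_ms, loss_rate
        self.rng = random.Random(seed)
        self.queue = deque()                  # (due_time_ms, packet)

    def enqueue(self, now_ms, packet):
        if self.rng.random() < self.loss:
            return                            # random-loss impairment: drop
        due = now_ms + self.delay + self.rng.uniform(-self.jitter, self.jitter)
        self.queue.append((due, packet))

    def tick(self, now_ms):
        """Called every 1 ms (timer interrupt); returns the packets now due."""
        out = []
        while self.queue and self.queue[0][0] <= now_ms:
            out.append(self.queue.popleft()[1])
        return out
```

One emulator instance per route reproduces the per-path delay, jitter, and loss settings described above.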
Note: Some operating systems, such as FreeBSD and Windows 2000, support alias IP addresses but cannot support alias interfaces.

3.2 Test Methodology


This test includes three sub-tests: the Basic Test, the Robustness Test, and the Advanced Test.

A. Basic Test
This test evaluates the accuracy of the class bandwidth and the fairness among the connections
within each class, and also investigates the stability of each DUT across its five runs. The total WAN link
bandwidth is set to T1 (1.544Mbps) and is partitioned into five classes (20, 40, 128, 256, and 1100kbps),
with each class matching four TCP connections. (BroadWeb/Acute iPolicer has no WAN link speed setup.)
Each class is set to guarantee each connection 1/4 of the class bandwidth. All settings are fixed, without
any bandwidth borrowing. The test is repeated in five consecutive runs, with 200-second intervals in
between. Within each run, 20 FTP connections flow simultaneously from A to I (Table 6), with each class
matching 4 connections. After 250 seconds, all ncftpput processes are killed. Data from 30 to 230 seconds
are analyzed. The statistics are explained in Table 7; Appendix C illustrates them with an intuitive
example.
Statistic: Accuracy
  Quantifies: the difference between (1) the class bandwidth setting and (2) the measured class bandwidth.
  Definition: averaged normalized goodput* = (1/5) * Sum over the 5 runs of (measured class goodput for Run i / given class goodput for Run i).
  Comparison standard: the closer to 1, the better.

Statistic: Stability of accuracy
  Quantifies: the differences of the accuracy statistic among the five runs.
  Definition: CoV** of the normalized goodput among the five runs (same ratio as above, but taking the CoV among the 5 runs instead of the average).
  Comparison standard: it depends***.

Statistic: Fairness
  Quantifies: fairness of bandwidth usage among the 4 connections in each class.
  Definition: averaged CoV of the 4 connections' goodputs = (1/5) * Sum over the 5 runs of (CoV of the goodputs among the 4 connections in Run i).
  Comparison standard: the closer to 0, the better.

Statistic: Stability of fairness
  Quantifies: the differences of the fairness statistic among the five runs.
  Definition: same as above, but taking the standard deviation among the 5 runs instead of the average.
  Comparison standard: it depends***.

Statistic: Retransmission ratio
  Quantifies: the retransmission ratio in each class.
  Definition: (1/5) * Sum over the 5 runs of (retransmitted packet count for Run i / total packets sent for Run i).
  Comparison standard: the closer to 0, the better.

* Goodput is the effective throughput (bytes/time), excluding the bandwidth consumed by retransmissions.
** CoV denotes the coefficient of variation: standard deviation over mean.
*** If the accuracy tends to 1, its stability should ideally tend to 0, implying the DUT always performs accurately. However, if the accuracy tends to 0 and its stability also tends to 0, the DUT always performs inaccurately. The same applies to fairness and its stability (Appendix C).

Table 7: Basic Test Statistics
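For one class, the Table 7 statistics can be computed roughly as follows (the five-run numbers below are invented for illustration):

```python
from statistics import mean, pstdev

def cov(xs):
    """Coefficient of variation: standard deviation over mean."""
    m = mean(xs)
    return pstdev(xs) / m if m else 0.0

# Hypothetical data for one 128kbps class over the five runs: the measured
# class goodput per run, and the goodputs of its four connections per run.
given_kbps = 128.0
class_goodput = [126.0, 129.0, 127.0, 128.0, 125.0]
conn_goodputs = [[31, 32, 33, 30]] * 5

normalized = [g / given_kbps for g in class_goodput]
accuracy = mean(normalized)                 # Table 7: closer to 1 is better
acc_stability = cov(normalized)             # CoV of the ratio over the 5 runs
per_run_fairness = [cov(run) for run in conn_goodputs]
fairness = mean(per_run_fairness)           # closer to 0 is better
fair_stability = pstdev(per_run_fairness)   # std. dev. instead of the average
```

A DUT with accuracy near 1 and both CoV-based statistics near 0 would land in the best group of the Basic Test.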

B. Robustness Test
Packets may be generated by different operating systems, hence by different TCP implementations,
and may pass through paths with various delays and loss rates. Long-distance TCP connections are expected
to be vulnerable to Internet losses because they require more time to obtain the Acks needed to recover to
their target bandwidth. Since many DUTs regulate TCP Acks, it is our concern whether they are compatible
with the major operating systems. Table 8 describes our test methodology.

(Note: NetGuard Guardian Pro cannot accept per-connection settings.)



Test Item: Under Heterogeneous Internet Delays
  DUT settings: same as the Basic Test.
  Test methodology: WAN delays of the four connections in each class are 10ms, 50ms, 100ms, and 150ms.
  Comparison standard: same as the Basic Test.

Test Item: Under Various Internet Loss Rates
  DUT settings: 200kbps for the test flow.
  Test methodology: a single TCP connection is tested under 0.5%, 1%, 2%, 4%, and 8% periodic loss rates.
  Comparison standard: whether the goodput degrades smoothly.

Test Item: Under Different Sending Operating Systems
  DUT settings: 80kbps for the test flow.
  Test methodology: (1) WAN: delay = 50ms, periodic loss rate = 1%. (2) TCP source OS = {Linux 2.2.14, Windows 2000, FreeBSD 4.0, Solaris 8}. (3) TCP receiver OS = Linux 2.2.14. (4) A single TCP connection is tested each time.
  Comparison standard: how closely the byte-time lines of the operating systems overlap with each other.

Table 8: Robustness Test Methodology

C. Advanced Test
This test includes a bandwidth borrowing test and a VoIP quality test. Bandwidth borrowing has been
described in Section 2. VoIP quality is separately tested through SmartBits and a VoIP gateway to
evaluate whether the DUTs can precisely allocate adequate bandwidth for voice traffic. Each test is
conducted under heavily loaded FTP traffic. Detailed test methodologies are given in Table 9.
Test Item: Inter-class Bandwidth Borrowing
  DUT settings: (1) Link speed = T1 (1.544Mbps), divided into two classes A and B, with A = B = 777kbps. (2) Class A matches connection 1; class B matches connection 2. (3) A and B can borrow from each other.
  Test methodology: connections 1 and 2 are started and stopped in sequence.
  Comparison standard: (1) stability of each connection; (2) how seamless the total bandwidth line is when connection 1 terminates.

Test Item: Intra-class Bandwidth Borrowing
  DUT settings: (1) Link speed = T1 (1.544Mbps), with a single class A = 1.544Mbps. (2) The class matches connections 1 and 2. (3) Per-connection bandwidth: at least 777kbps, at most 1.544Mbps.
  Test methodology and comparison standard: same as above.

Test Item: VoIP test using SmartVoIpQoS
  DUT settings: (1) Link speed = {T1, 125kbps}, divided into two classes A and B. (2) A = 30kbps for voice traffic; B = {T1, 125kbps} - 30kbps for FTP traffic. (3) FTP traffic can occupy the voice class until voice traffic begins.
  Test methodology: background: 20 FTP connections; foreground: a 30kbps G.729 VoIP flow.
  Comparison standard: PSQM(1), delay, jitter, and loss.

Test Item: VoIP test using VoIP Gateway (Cisco 1750)
  DUT settings: same as above.
  Test methodology: background: 20 FTP connections; foreground: dial a phone call (G.729 codec), hold the phones at both ends, speak "1" to "10" at 2 words/sec, and judge the voice quality.
  Comparison standard: listening with ears(2).

(1) PSQM (Perceptual Speech Quality Measurement) is calculated from delay, jitter, and loss statistics. A PSQM rating of 6.5 denotes the poorest quality.
(2) The VoIP gateway is set to continuously sample the sound even when the primary tester keeps silent. Thus the data flow is always around 30kbps.

Table 9: Advanced Test Methodology

4. Benchmark Test Results


A. Basic Test Results
A-1. Accuracy and Stability of Accuracy
Figure 2 (A1 is accuracy and B1 is its stability; A2 and B2 will be discussed in the robustness test)
reveals that the DUTs fall into three groups: ALTQ_CBQ, PacketShaper, and QoSWorks
have the most accurate and stable control for each class; WiseWAN and FloodGate are less effective in
the narrowband class (20kbps) because of their large retransmission ratios, as will be shown in Section
A-3; iPolicer and Guardian Pro are the least effective. iPolicer had several connections terminate in the
middle of each run; those connections, no longer sending data, waste bandwidth and cause instability
among the five runs.

Note: The test crew performed many five-run tests on iPolicer. Only after the above phenomenon had been
verified did we include the most representative of the five-run tests in our analysis.

Figure 2: Results of accuracy and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)
A-2. Fairness and Stability of Fairness
Figure 3 (A1 is fairness and B1 is its stability; A2 and B2 will be discussed in the robustness test) also
distinguishes three groups: PacketShaper is the most fair and stable; QoSWorks is less fair but stable
in the 20kbps class, implying that it is less fair in the 20kbps class in all five test runs (Appendix C).
FloodGate and WiseWAN are less fair and less stable in the 20kbps class. iPolicer, Guardian Pro, and
ALTQ_CBQ+RED provide poor fairness. Pure CBQ has the poorest fairness in the narrowband
(20~40kbps) classes; however, the unfairness is somewhat alleviated after applying RED to each class,
because RED tends to drop more packets from the connections that send data more aggressively.


Figure 3: Results of fairness and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)
A-3. Retransmission Ratio
Figure 4 A1 (A2 will be discussed in the robustness test) shows large retransmission ratios in the
narrowband (20~40kbps) classes, except for PacketShaper and QoSWorks, and especially for WiseWAN,
iPolicer, FloodGate, and ALTQ_CBQ+RED. As an analogy, a small exit often keeps many people
waiting in front of it. FloodGate and ALTQ_CBQ+RED use packet dropping to slow down TCP flows, so
they have high retransmission ratios. WiseWAN suffers enormous packet losses at the Cisco router before
it can control the traffic at the WAN link. The results of iPolicer are hard to comprehend in
terms of the technology it claims to use (adjusting the TCP window size).

Figure 4: Test results of retransmission ratio (A1: No Internet Delay; A2: With Internet Delay)

B. Robustness Test Results


B-1. Under Heterogeneous Internet Delays
For easy comparison with the Basic Test, the results are listed alongside those of the Basic Test:
Figure 2 (A2, B2), Figure 3 (A2, B2), and Figure 4 (A2) demonstrate the results. Most results
amplify the differences among the DUTs observed in the Basic Test, especially for iPolicer and ALTQ_CBQ in
the fairness statistic. Long-distance connections are vulnerable to packet losses due to buffer overflows
at the controlling device, as described in Section 3.2 B. ALTQ_CBQ+RED can alleviate the unfairness
of ALTQ_CBQ because the short-distance connections, which send data more aggressively,
have more packets dropped by the RED mechanism. Guardian Pro cannot guarantee each
connection and thus shows significant instability between the Basic Test and this test. QoSWorks is less fair
under the broadband class (1.1Mbps).
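The RED behavior cited above, dropping proportionally more packets from the more aggressive connections, follows from its drop curve. Below is a sketch of textbook RED (not necessarily ALTQ's exact implementation; threshold and weight values are illustrative):

```python
def red_drop_probability(avg_q, min_th, max_th, max_p):
    """Classic RED drop curve: no drops below min_th, certain drop at or
    above max_th, and a linearly increasing probability in between. A
    connection sending more aggressively both keeps the average queue
    higher and contributes more arrivals, so it absorbs more of the drops."""
    if avg_q < min_th:
        return 0.0
    if avg_q >= max_th:
        return 1.0
    return max_p * (avg_q - min_th) / (max_th - min_th)

def update_avg_queue(avg_q, inst_q, weight=0.002):
    """Exponentially weighted moving average of the instantaneous queue
    length, as used by RED to smooth over short bursts."""
    return (1.0 - weight) * avg_q + weight * inst_q
```

Because the drop probability applies per arriving packet, a flow contributing twice as many packets to the queue sees roughly twice as many drops, which is what nudges the class back toward fairness.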
B-2. Under Various Packet Loss Rates
Normally a TCP flow slows down its transmission rate when packet losses occur. Figure 5 shows
the goodput of each DUT under different Internet packet loss rates (each flow is limited to 200kbps, and the
measured goodput is averaged over 200 seconds as in the Basic Test). Almost all the DUTs can smoothly
lower their goodput as the packet loss rate increases, except for PacketShaper and iPolicer. These two
devices give up sizing the TCP window once they detect TCP loss events (triple duplicate
Acks). The TCP sending window then suddenly bumps up and causes a burst of packets to flow toward the
controlling device, resulting in a higher goodput at the 0.5% loss rate. This phenomenon is alleviated as the
packet loss rate increases.

Figure 5: Robustness Test: goodput under various packet loss rates (one curve per DUT: iPolicer, FloodGate, PacketShaper, WiseWAN, GuardianPro, QoSWorks, ALTQ_CBQ; y-axis: bandwidth, 160~200kbps; x-axis: loss rate, 0~8%)


B-3. Under Different Sending Operating Systems
In this compatibility test (see Fig. 6; the x-axis is time and the y-axis is the bytes sent, so the slope is the
bandwidth), TCP connections sent from different operating systems through PacketShaper
behave differently. PacketShaper shrinks the TCP window so far that no more than 4
packets are in the WAN pipe. Thus, each packet loss resorts to a retransmission timeout instead of
fast retransmit [21]. Since BSD-derived UNIX systems use a coarse-grained (500ms) retransmission timer
[21], they retransmit the lost packets slowly. In contrast, Linux keeps a fine-grained
retransmission timer and has the best performance when packet losses occur. iPolicer has a serious bug
when sending data from Windows 2000 to Linux 2.2.14: tcpdump revealed that the TCP Ack
header length is miscalculated when passing through iPolicer, incorrectly triggering data
packets from the TCP senders. TCP has many options and various implementations, so explicitly modifying
the packet header requires extensive compatibility testing. The other products treat TCP flows from
different operating systems fairly.

Figure 6: Robustness Test under Different Sending Operating Systems
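The impact of timer granularity on recovery can be sketched with an RFC 6298-style RTO computation, where the final timeout is rounded up to the retransmission timer's tick. The RTT sample values below are hypothetical, not measured from the DUTs:

```python
import math

def rto_ms(srtt_ms, rttvar_ms, tick_ms):
    """RFC 6298-style retransmission timeout: RTO = SRTT + max(G, 4*RTTVAR),
    rounded up to the next tick of the retransmission timer (granularity G).
    A sketch for intuition, not a complete TCP timer implementation."""
    raw = srtt_ms + max(tick_ms, 4 * rttvar_ms)
    return math.ceil(raw / tick_ms) * tick_ms

# Hypothetical path: 60 ms smoothed RTT, 10 ms RTT variance.
print(rto_ms(60, 10, 500))  # BSD-style coarse 500 ms tick -> 1000 ms timeout
print(rto_ms(60, 10, 10))   # fine-grained 10 ms tick -> 100 ms timeout
```

With a coarse 500 ms timer, every timeout costs far more than the path RTT, which explains why BSD-derived senders recover slowly once the tiny window disables fast retransmit.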

C. Advanced Test Results


C-1. Bandwidth Borrowing Test Results
This test uses ttt to observe the effectiveness of bandwidth borrowing. In each figure we focus
on only three lines: the total bandwidth (ip/ether line), the bandwidth of connection 1 (xxxx/tcp line),
and the bandwidth of connection 2 (yyyy/tcp line). The test crew draws another baseline indicating the
ideal total link bandwidth (1.544 Mbps) for comparison.
Inter-Class Bandwidth Borrowing Test Results
Figure 7 shows the inter-class bandwidth borrowing benchmark results. iPolicer does not have
this function, so we set the bandwidth of both classes to 1.544 Mbps. However, the Cisco
routers' link is set to 2 Mbps, so the two 1.544 Mbps flows through iPolicer exceed the baseline
bandwidth. After connection 1 terminates, the total bandwidth narrows down to around 1.5 Mbps
with some fluctuation. WiseWAN and ALTQ automatically borrow bandwidth among
classes, while the others can be further configured with the degree of bandwidth borrowing. Guardian
Pro behaves unstably when connection 2 starts to obtain a bandwidth share. ALTQ_CBQ and
ALTQ_CBQ+RED can only borrow limited bandwidth (from 777 kbps to 1.1 Mbps). FloodGate,
PacketShaper, and QoSWorks perform inter-class bandwidth borrowing seamlessly.

(a) Acute/Broadweb iPolicer (b) CheckPoint FloodGate (c) NetGuard GuardianPro (d) ALTQ_CBQ
(e) Packeteer PacketShaper (f) Sitara QoSWorks (g) NetReality WiseWAN (h) ALTQ_CBQ+RED

Figure 7: Inter-class Bandwidth Borrowing Test
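The borrowing behavior the figures illustrate can be mimicked with a simple allocation model: each active class first receives min(guarantee, demand), and leftover link capacity is lent to still-hungry classes in proportion to their guarantees. This is a simplified sketch for intuition (class names and rates are hypothetical), not any vendor's actual scheduler:

```python
def allocate(link_kbps, guarantees, demands):
    """Distribute link capacity: guaranteed share first, then lend the
    spare capacity to classes whose demand is still unmet, weighted by
    their guarantees.  All rates are in kbps."""
    alloc = {c: min(g, demands.get(c, 0)) for c, g in guarantees.items()}
    spare = link_kbps - sum(alloc.values())
    hungry = {c: demands[c] - alloc[c]
              for c in alloc if demands.get(c, 0) > alloc[c]}
    while spare > 1e-6 and hungry:
        weight = sum(guarantees[c] for c in hungry)
        for c in list(hungry):
            share = min(hungry[c], spare * guarantees[c] / weight)
            alloc[c] += share
            hungry[c] -= share
            if hungry[c] <= 1e-6:
                del hungry[c]
        spare = link_kbps - sum(alloc.values())
    return alloc

classes = {"classA": 1000, "classB": 544}   # guarantees on a 1544 kbps link
# classB idle: classA borrows the whole link.
print(allocate(1544, classes, {"classA": 2000, "classB": 0}))
# classB active again: the allocation returns to the guarantees.
print(allocate(1544, classes, {"classA": 2000, "classB": 2000}))
```

When connection 2's class goes idle, its guaranteed share is lent out, and it is reclaimed as soon as the class becomes active again, which is the seamless behavior FloodGate, PacketShaper, and QoSWorks exhibit.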


Intra-Class Bandwidth Borrowing
Figure 8 shows the intra-class bandwidth borrowing benchmark results. iPolicer lacks this
function, so after connection 1 terminates, connection 2 cannot occupy the newly available
bandwidth within the class. Guardian Pro and ALTQ_CBQ show fluctuating bandwidth sharing
between the two connections since they cannot guarantee per-connection bandwidth. This
phenomenon in ALTQ_CBQ is again slightly alleviated after applying RED. The other four products
perform quite similarly in this test, except that PacketShaper and FloodGate show small gaps.


(a) Acute/Broadweb iPolicer (b) CheckPoint FloodGate (c) NetGuard Guardian Pro (d) ALTQ_CBQ
(e) Packeteer PacketShaper (f) Sitara QoSWorks (g) NetReality WiseWAN (h) ALTQ_CBQ+RED

Figure 8: Intra-class Bandwidth Borrowing Test


C-2. VoIP Quality Test
This test does not include iPolicer since it presently cannot control UDP traffic. The test is
performed with the Smartbits and with the Cisco 1750 VoIP gateways; the former gives quantitative results
while the latter judges the voice quality by ear.
Figure 9(a) shows that under the T1 WAN link (1.544 Mbps) the DUTs differ in latency and jitter.
However, the ultimate voice quality grades (PSQM) are similar except for ALTQ_CBQ. This is also
verified by the VoIP gateway test (Table 10). We thus conclude that under a T1 access link the G.729 bit
rate can be easily allocated. In contrast, under the 125 kbps WAN link (Fig. 9(b) and Table 10), the voice
can only barely be recognized with PacketShaper. Transmitting a large packet (1500 bytes) onto the
narrowband WAN link (125 kbps) takes so long that the following small voice packet (74 bytes)
has to wait until the previous large packet is completely scheduled out.

[Figure 9 charts, for each DUT (Base, PacketShaper, FloodGate, WiseWAN, GuardianPro, QoSWorks, QoSWorks2, ALTQ_CBQ), the average latency, maximum latency, jitter (latency variation), PSQM grade, and loss rate: (a) under the T1 WAN link (1.544 Mbps), where the PSQM grades stay between 2.2 and 2.7; (b) under the 125 kbps WAN link, where most PSQM grades rise to around 6.5.]

(a) T1 WAN link (1.544 Mbps) (b) 125 kbps WAN link
Note: Base results are measured on the clean testbed without enabling any DUT. The G.729 codec is not lossless compression; even when jitter and loss are minimal, the PSQM is at least 2.2.

Figure 9: VoIP Test Results

However, after QoSWorks
exercises Packet Size Optimization (reducing the maximum transmission unit of FTP connections when
they are established), the voice quality approaches the original voice in both the Smartbits and the
gateway tests. While this is promising, readers should be aware that reducing the packet size of all other
TCP connections incurs a large overhead. As an analogy, several small trucks carrying goods incur more
overhead than one big truck carrying the same goods. This tradeoff is left to the network
administrator's judgment.
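The waiting time is simple serialization-delay arithmetic: at 125 kbps, a single 1500-byte packet occupies the link for 96 ms, so any voice packet queued behind it inherits that delay. A short sketch with the link speeds and packet sizes used in this test:

```python
def serialization_ms(packet_bytes, link_bps):
    """Time needed to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps

print(serialization_ms(1500, 125_000))    # full-size data packet, 125 kbps link: 96 ms
print(serialization_ms(74, 125_000))      # voice packet, 125 kbps link: ~4.7 ms
print(serialization_ms(1500, 1_544_000))  # full-size data packet, T1 link: ~7.8 ms
```

This is why shrinking the MTU of competing TCP connections restores voice quality on the narrowband link, at the cost of the higher per-packet header overhead discussed above, and why the T1 link never exhibits the problem.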
                           T1 WAN link speed                           125 kbps WAN link speed
Vendor/Model               Calling time  Delay time  Voice quality     Calling time  Delay time  Voice quality
                                                     by ear                                      by ear
Baseline (only voice)      ~0.2 sec      <0.1 sec    Very good         <1 sec        <0.1 sec    Very good
Baseline (background FTP)  Cannot establish the connection             Cannot establish the connection
iPolicer                   Cannot be tested (no UDP traffic control)   Cannot be tested (no UDP traffic control)
FloodGate                  ~0.5 sec      <0.1 sec    Very good         ~7 sec        ~1 sec      Very poor (<10% legible)
Guardian Pro               ~0.5 sec      <0.1 sec    Very good         ~3 sec        ~1.5 sec    Ultra poor (<1% legible)
WiseWAN                    ~0.5 sec      <0.1 sec    Very good         ~7 sec        ~1.5 sec    Ultra poor (<1% legible)
PacketShaper               ~0.5 sec      <0.1 sec    Very good         ~1 sec        ~1 sec      Poor (60% legible)
ALTQ_CBQ                   ~2 sec        <0.1 sec    Very good         ~18 sec       ~1 sec      Very poor (<10% legible)
QoSWorks                   ~1 sec        <0.1 sec    Very good         ~17 sec       ~1 sec      Very poor (<10% legible)
QoSWorks (optimized)       Not tested (no need)                        ~6 sec        <0.2 sec    Very good

Table 10: VoIP Test Results Through the VoIP Gateway

5. Conclusions
This work designs a novel testbed that mimics real-life Internet conditions, such as multiple
connections, heterogeneous WAN delays/delay jitters/packet loss rates, and different TCP source
implementations. Most test reports, such as those by the Tolly Group [22], are financed by the vendors
and may be biased; additionally, the testbeds in those reports are over-simplified, without in-depth test
items or with an inadequate number of connections. This work first classifies the policy rules into three
major types: (1) class-based rule; (2) connection rule within a class; (3) bandwidth borrowing rule
among classes. The test methodology then quantifies the effectiveness of the above policy rule types for
each device in terms of accuracy, fairness, stability, robustness, bandwidth borrowing, and VoIP quality.
The test results reveal several findings, all reproducible with our open tools: (1) a narrowband
class-based rule and fairness among its flows are harder to enforce when multiple TCP connections
compete for the same queue, resulting in large queue lengths and TCP retransmissions; (2) explicitly
sizing the TCP window can cause performance or fairness degradation even under slight packet loss
rates; (3) the open source solution can compete with commercial products in accurately limiting flow
aggregates; (4) the video/voice quality of real-time applications significantly depends on the packet
sizes of all other traffic when using a narrowband (125 kbps) access link. The detailed functionality
comparison among the DUTs gives further directions for enhancing open source solutions, such as
Packeteer's traffic discovery and QoSWorks's intuitive user interface. The ALTQ package lacks
per-connection bandwidth guarantees within a class and needs further refinement to satisfy enterprise
demands. Some vendors in this test build on open source, yet they never release their kernel
patches. We are currently patching ALTQ with per-connection bandwidth guarantees and will feed the
results back to the open source community. After all, open source should be open.

6. References
[1] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, "An Architecture for Differentiated Services," RFC 2475, Dec. 1998.
[2] D. Stiliadis and A. Varma, "Latency-Rate Servers: A General Model for Analysis of Traffic Scheduling Algorithms," IEEE/ACM Transactions on Networking, Vol. 6, No. 5, pp. 611-624, Oct. 1998.
[3] S. Floyd and V. Jacobson, "Link-sharing and Resource Management Models for Packet Networks," IEEE/ACM Transactions on Networking, Vol. 3, No. 4, pp. 365-386, 1995.
[4] K. Fall and S. Floyd, "Simulation-based Comparisons of Tahoe, Reno, and SACK TCP," ACM Computer Communication Review, Vol. 26, No. 3, pp. 5-21, Jul. 1996.
[5] J. Padhye and S. Floyd, "On Inferring TCP Behavior," ACM SIGCOMM 2001, San Diego, USA, Aug. 2001. http://www.acm.org/sigcomm/sigcomm2001/p23.html
[6] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp. 397-413, Aug. 1993.
[7] S. Karandikar, S. Kalyanaraman, P. Bagal, and B. Packer, "TCP Rate Control," ACM Computer Communication Review, Vol. 30, No. 1, Jan. 2000.
[8] K. Cho, "Alternate Queueing for BSD UNIX (ALTQ)," http://www.csl.sony.co.jp/person/kjc
[9] NetGuard Corporation, http://www.netguard.com
[10] Check Point Software Technologies, http://www.checkpoint.com
[11] BroadWeb Corporation, http://www.broadweb.com.tw
[12] Acute Communication Corporation, http://www.acutecomm.com
[13] Packeteer Corporation, http://www.packeteer.com
[14] Sitara Networks, http://www.sitaranetworks.com
[15] NetReality Corporation, http://www.net-reality.com
[16] Ncftpput Software, http://www.ncftp.com
[17] K. Cho, "Tele Traffic Tapper (ttt)," http://www.csl.sony.co.jp/person/kjc
[18] Spirent Communications, http://www.netcomsystems.com
[19] Lawrence Berkeley National Laboratory, "tcpdump," http://www-nrg.ee.lbl.gov
[20] H. Y. Wei, "WAN Emulator," http://speed.cis.nctu.edu.tw/wanemu/
[21] W. R. Stevens, TCP/IP Illustrated, Volume 1: The Protocols, Addison-Wesley, 1994.
[22] The Tolly Group, http://www.tolly.com

Acknowledgements
We thank the vendors who so generously provided us with the devices and their verifications of
the test results. We are grateful to Ching-Chuan Chiang and Yi-Chung Liu for their help on the
preliminary tests and functionality comparisons.

Appendix
Appendix A. Detailed Functionality Comparison
A-1. Policy Console User Interface
Regarding the policy console user interface (Table A), a notable feature is how many devices one
management console can control. The policy consoles of PacketShaper and QoSWorks can control only
one device each, since they use built-in web servers configured through Web browsers. The policy
consoles of the others (except ALTQ) can remotely control multiple devices located at different places.
As for schedule control, per-rule schedule control is more effective: for example, some rules can be
inactive during non-office hours, but the VoIP rule should always stay active to guarantee voice quality.
ALTQ
  Console: configuration file. Schedule control: none. Devices managed: single device. Console OS: FreeBSD 4.0.
  Monitor/statistics: per-class bandwidth usage. Alert: N/A.
NetGuard's Guardian Pro
  Console: GUI (Win32 application). Schedule control: per-rule. Devices managed: multiple devices. Console OS: Win NT/2000.
  Monitor/statistics: line statistics report, response time report, protocol distribution report. Alert: log.
CheckPoint's FloodGate
  Console: GUI (Win32 application). Schedule control: per-rule. Devices managed: multiple devices. Console OS: Win NT/2000.
  Monitor/statistics: line statistics report, response time report, protocol distribution report. Alert: N/A.
NetReality's WiseWAN
  Console: GUI (Java application). Schedule control: per-rule. Devices managed: multiple devices. Console OS: Win NT/Solaris.
  Monitor/statistics: line statistics report, port report, response time report, protocol distribution report, VoIP report, top ten talkers, top ten protocols or applications. Alert: SNMP trap.
BroadWeb/Acute's iPolicer
  Console: Web browser (Java applet; web server on a separate NT machine, IE 5.0 client). Schedule control: per-rule. Devices managed: multiple devices.
  Monitor/statistics: line statistics report, top ten report, top ten talkers, top ten protocols. Alert: email trap.
Packeteer's PacketShaper
  Console: Web browser (HTML; embedded web server, any client). Schedule control: per-device. Devices managed: single device.
  Monitor/statistics: utilization, network efficiency, top ten classes, top twenty talkers, per-class bandwidth usage, response time report. Alert: SNMP trap.
Sitara's QoSWorks
  Console: Web browser (HTML; embedded web server, any client). Schedule control: per-device. Devices managed: single device.
  Monitor/statistics: per-class bandwidth usage, link statistics, top classes per link, top applications, protocol distribution, traffic by address. Alert: SNMP trap.

Table A: Management Interface and Flow Statistics


A-2. Special Functions
PacketShaper is superior in its Traffic Discovery, which automatically identifies the protocols
of the traffic passing through it and provides instant feedback to the network administrator for further
bandwidth settings. With the others, one has to manually verify whether a newly specified packet filter
captures its intended traffic. WiseWAN is installed directly on the WAN link (V.35 cable) and thus
can verify whether the measured bandwidth matches the subscribed bandwidth. Additionally, it can
detect PVCs in a frame relay network, so a single WiseWAN device can control all the traffic on the
mesh-structured frame relay links among branch offices. QoSWorks focuses notably on controlling
VoIP traffic: by shrinking the TCP data packet size, VoIP (UDP) traffic can pass through
QoSWorks smoothly, especially on a narrowband WAN link. Moreover, QoSWorks has a built-in Web cache
(not verified in this report). Both FloodGate and Guardian Pro can be integrated with their vendors'
firewall, VPN, and NAT packages; integrated solutions may reduce management costs.

Appendix B. Testbed Photo

Figure B: Testbed Photo


Appendix C. Intuitive Example for Basic Test Statistics


This intuitive example illustrates how the Basic Test statistics of the 20 kbps class are derived. As
described in Section 3.2, each class matches four connections, and the test repeats for five runs. Ideally,
within each run each connection receives 1/4 of the class bandwidth. The example shows that the
accuracy statistic of 19, although it approaches the ideal value of 20, does not reflect the real conditions;
with the aid of the poor stability of accuracy, we can judge that the DUT is actually not accurate. On the
other hand, "Not fair" combined with good stability of fairness means that the DUT fails to treat the
flows fairly almost all the time.
[Figure C tabulates the five-round per-connection goodputs for the 20 kbps, 40 kbps, 128 kbps, 256 kbps, and 1.1 Mbps classes on the 1.544 Mbps link, together with the ideal per-connection shares. For the 20 kbps class, the per-round class totals are 14, 7, 30, 5, and 39 kbps: their mean, (14+7+30+5+39)/5 = 19, looks accurate, but the large CoV of these totals reveals poor stability. The per-round fairness CoVs among the four connections are 0.23, 0.25, 0.36, 0.35, and 0.39: their mean, 0.32, means "not fair," while their standard deviation, 0.063, shows that the unfairness is consistent across rounds.]

Figure C: Intuitive Example for Basic Test Statistics
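The statistics in Figure C can be recomputed with Python's statistics module. The fairness mean (0.32) and its standard deviation (about 0.063) match the figure; computing the stability of accuracy as the CoV of the per-round totals is our assumption about the paper's convention:

```python
import statistics

# Per-round class totals (kbps) and per-round fairness CoVs taken from
# the 20 kbps example in Figure C:
totals = [14, 7, 30, 5, 39]
run_covs = [0.23, 0.25, 0.36, 0.35, 0.39]

accuracy = statistics.mean(totals)                # 19, close to the ideal 20
fairness = statistics.mean(run_covs)              # 0.32 -> "not fair"
fairness_stability = statistics.pstdev(run_covs)  # ~0.063 -> stable unfairness
# The per-round totals vary wildly, so the accuracy figure alone is
# misleading; their CoV exposes the poor stability:
accuracy_cov = statistics.pstdev(totals) / accuracy
print(accuracy, round(fairness, 2), accuracy_cov > 0.5)
```

Reading the two statistics of each pair together, as the appendix argues, prevents a single averaged number from hiding wildly varying per-round behavior.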
